All of lore.kernel.org
 help / color / mirror / Atom feed
* Couple of PMD questions
@ 2016-04-19 20:16 Jay Rolette
  2016-04-20  9:10 ` Bruce Richardson
  2016-04-20  9:35 ` Andriy Berestovskyy
  0 siblings, 2 replies; 9+ messages in thread
From: Jay Rolette @ 2016-04-19 20:16 UTC (permalink / raw)
  To: DPDK

In ixgbe_dev_rx_init(), there is this bit of code:

	/*
	 * Configure the RX buffer size in the BSIZEPACKET field of
	 * the SRRCTL register of the queue.
	 * The value is in 1 KB resolution. Valid values can be from
	 * 1 KB to 16 KB.
	 */
	buf_size = (uint16_t)(rte_pktmbuf_data_room_size(rxq->mb_pool) -
		RTE_PKTMBUF_HEADROOM);
	srrctl |= ((buf_size >> IXGBE_SRRCTL_BSIZEPKT_SHIFT) &
		   IXGBE_SRRCTL_BSIZEPKT_MASK);

	IXGBE_WRITE_REG(hw, IXGBE_SRRCTL(rxq->reg_idx), srrctl);

	buf_size = (uint16_t) ((srrctl & IXGBE_SRRCTL_BSIZEPKT_MASK) <<
			       IXGBE_SRRCTL_BSIZEPKT_SHIFT);

	/* It adds dual VLAN length for supporting dual VLAN */
	if (dev->data->dev_conf.rxmode.max_rx_pkt_len +
				    2 * IXGBE_VLAN_TAG_SIZE > buf_size)
		dev->data->scattered_rx = 1;


Couple of questions I have about it:

1) If the caller configured the MBUF memory pool data room size to be
something other than a multiple of 1K (+ RTE_PKTMBUF_HEADROOM), then the
driver ends up silently programming the NIC to use a smaller max RX size
than the caller specified.

Should the driver error out in that case instead of only "sort of" working?
If not, it seems like it should be logging an error message.

2) Second question is about the "/* It adds dual VLAN length for supporting
dual VLAN */" bit...

As I understand it, dev_conf.rxmode.max_rx_pkt_len is supposed to be set to
the max frame size you support (although it says it's only used if jumbo
frame is enabled). That same value is generally used when calculating the
size that mbuf elements should be created at.

The description for the data_room_size parameter of
rte_pktmbuf_pool_create():

"Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM."


If I support a max frame size of 9216 bytes (exactly a 1K multiple to make
the NIC happy), then max_rx_pkt_len is going to be 9216 and data_room_size
will be 9216 + RTE_PKTMBUF_HEADROOM.

ixgbe_dev_rx_init() will calculate normalize that back to 9216, which will
fail the dual VLAN length check. The really nasty part about that is it has
a side-effect of enabling scattered RX regardless of the fact that I didn't
enable scattered RX in dev_conf.rxmode.

The code in the e1000 PMD is similar, so nothing unique to ixgbe.

Is that check correct? It seems wrong to be adding space for q-in-q on top
of your specified max frame size...

Thanks,
Jay

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Couple of PMD questions
  2016-04-19 20:16 Couple of PMD questions Jay Rolette
@ 2016-04-20  9:10 ` Bruce Richardson
  2016-04-20 12:22   ` Jay Rolette
  2016-04-20  9:35 ` Andriy Berestovskyy
  1 sibling, 1 reply; 9+ messages in thread
From: Bruce Richardson @ 2016-04-20  9:10 UTC (permalink / raw)
  To: Jay Rolette; +Cc: DPDK

On Tue, Apr 19, 2016 at 03:16:37PM -0500, Jay Rolette wrote:
> In ixgbe_dev_rx_init(), there is this bit of code:
> 
> 	/*
> 	 * Configure the RX buffer size in the BSIZEPACKET field of
> 	 * the SRRCTL register of the queue.
> 	 * The value is in 1 KB resolution. Valid values can be from
> 	 * 1 KB to 16 KB.
> 	 */
> 	buf_size = (uint16_t)(rte_pktmbuf_data_room_size(rxq->mb_pool) -
> 		RTE_PKTMBUF_HEADROOM);
> 	srrctl |= ((buf_size >> IXGBE_SRRCTL_BSIZEPKT_SHIFT) &
> 		   IXGBE_SRRCTL_BSIZEPKT_MASK);
> 
> 	IXGBE_WRITE_REG(hw, IXGBE_SRRCTL(rxq->reg_idx), srrctl);
> 
> 	buf_size = (uint16_t) ((srrctl & IXGBE_SRRCTL_BSIZEPKT_MASK) <<
> 			       IXGBE_SRRCTL_BSIZEPKT_SHIFT);
> 
> 	/* It adds dual VLAN length for supporting dual VLAN */
> 	if (dev->data->dev_conf.rxmode.max_rx_pkt_len +
> 				    2 * IXGBE_VLAN_TAG_SIZE > buf_size)
> 		dev->data->scattered_rx = 1;
> 
> 
> Couple of questions I have about it:
> 
> 1) If the caller configured the MBUF memory pool data room size to be
> something other than a multiple of 1K (+ RTE_PKTMBUF_HEADROOM), then the
> driver ends up silently programming the NIC to use a smaller max RX size
> than the caller specified.
> 
> Should the driver error out in that case instead of only "sort of" working?
> If not, it seems like it should be logging an error message.

Well, it does log at the end of the whole rx init process what RX function is
being used, so there is that.
However, looking at it now, given that there is an explicit setting in the config
to request scattered mode, I think you may be right and that we should error out
here if scattered mode is needed and not set. It could help prevent unexpected
problems with these drivers.

> 
> 2) Second question is about the "/* It adds dual VLAN length for supporting
> dual VLAN */" bit...
> 
> As I understand it, dev_conf.rxmode.max_rx_pkt_len is supposed to be set to
> the max frame size you support (although it says it's only used if jumbo
> frame is enabled). That same value is generally used when calculating the
> size that mbuf elements should be created at.
> 
> The description for the data_room_size parameter of
> rte_pktmbuf_pool_create():
> 
> "Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM."
> 
> 
> If I support a max frame size of 9216 bytes (exactly a 1K multiple to make
> the NIC happy), then max_rx_pkt_len is going to be 9216 and data_room_size
> will be 9216 + RTE_PKTMBUF_HEADROOM.
> 
> ixgbe_dev_rx_init() will calculate normalize that back to 9216, which will
> fail the dual VLAN length check. The really nasty part about that is it has
> a side-effect of enabling scattered RX regardless of the fact that I didn't
> enable scattered RX in dev_conf.rxmode.
> 
> The code in the e1000 PMD is similar, so nothing unique to ixgbe.
> 
> Is that check correct? It seems wrong to be adding space for q-in-q on top
> of your specified max frame size...

The issue here is what the correct behaviour needs to be. If we have the user
specify the maximum frame size including all vlan tags, then we hit the problem
where we have to subtract the VLAN tag sizes when writing the value to the NIC.
In that case, we will hit a problem where we get a e.g. 9210 byte frame - to 
reuse your example - without any VLAN tags, which will be rejected by the
hardware as being oversized. If we don't do the subtraction, and we get the
same 9210 byte packet with 2 VLAN tags on it, the hardware will accept it and
then split it across multiple descriptors because the actual DMA size is
9218 bytes.

I'm not sure there is a works-in-all-cases solution here.

/Bruce

> 
> Thanks,
> Jay

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Couple of PMD questions
  2016-04-19 20:16 Couple of PMD questions Jay Rolette
  2016-04-20  9:10 ` Bruce Richardson
@ 2016-04-20  9:35 ` Andriy Berestovskyy
  2016-04-20 12:45   ` Jay Rolette
  1 sibling, 1 reply; 9+ messages in thread
From: Andriy Berestovskyy @ 2016-04-20  9:35 UTC (permalink / raw)
  To: Jay Rolette; +Cc: DPDK

Hi Jay,

On Tue, Apr 19, 2016 at 10:16 PM, Jay Rolette <rolette@infinite.io> wrote:
> Should the driver error out in that case instead of only "sort of" working?

+1, we hit the same issue. Error or log message would help.

> If I support a max frame size of 9216 bytes (exactly a 1K multiple to make
> the NIC happy), then max_rx_pkt_len is going to be 9216 and data_room_size
> will be 9216 + RTE_PKTMBUF_HEADROOM.

Try to set max_rx_pkt_len <= 9K - 8 and mempool element size to 9K +
headroom + size of structures.

> Is that check correct?

Datasheet says:
The MFS does not include the 4 bytes of the VLAN header. Packets with
VLAN header can be as large as MFS + 4. When double VLAN is enabled,
the device adds 8 to the MFS for any packets.


Regards,
Andriy

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Couple of PMD questions
  2016-04-20  9:10 ` Bruce Richardson
@ 2016-04-20 12:22   ` Jay Rolette
  2016-04-20 14:05     ` Bruce Richardson
  0 siblings, 1 reply; 9+ messages in thread
From: Jay Rolette @ 2016-04-20 12:22 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: DPDK

On Wed, Apr 20, 2016 at 4:10 AM, Bruce Richardson <
bruce.richardson@intel.com> wrote:

> On Tue, Apr 19, 2016 at 03:16:37PM -0500, Jay Rolette wrote:
> > In ixgbe_dev_rx_init(), there is this bit of code:
> >
> >       /*
> >        * Configure the RX buffer size in the BSIZEPACKET field of
> >        * the SRRCTL register of the queue.
> >        * The value is in 1 KB resolution. Valid values can be from
> >        * 1 KB to 16 KB.
> >        */
> >       buf_size = (uint16_t)(rte_pktmbuf_data_room_size(rxq->mb_pool) -
> >               RTE_PKTMBUF_HEADROOM);
> >       srrctl |= ((buf_size >> IXGBE_SRRCTL_BSIZEPKT_SHIFT) &
> >                  IXGBE_SRRCTL_BSIZEPKT_MASK);
> >
> >       IXGBE_WRITE_REG(hw, IXGBE_SRRCTL(rxq->reg_idx), srrctl);
> >
> >       buf_size = (uint16_t) ((srrctl & IXGBE_SRRCTL_BSIZEPKT_MASK) <<
> >                              IXGBE_SRRCTL_BSIZEPKT_SHIFT);
> >
> >       /* It adds dual VLAN length for supporting dual VLAN */
> >       if (dev->data->dev_conf.rxmode.max_rx_pkt_len +
> >                                   2 * IXGBE_VLAN_TAG_SIZE > buf_size)
> >               dev->data->scattered_rx = 1;
> >
> >
> > Couple of questions I have about it:
> >
> > 1) If the caller configured the MBUF memory pool data room size to be
> > something other than a multiple of 1K (+ RTE_PKTMBUF_HEADROOM), then the
> > driver ends up silently programming the NIC to use a smaller max RX size
> > than the caller specified.
> >
> > Should the driver error out in that case instead of only "sort of"
> working?
> > If not, it seems like it should be logging an error message.
>
> Well, it does log at the end of the whole rx init process what RX function
> is
> being used, so there is that.
> However, looking at it now, given that there is an explicit setting in the
> config
> to request scattered mode, I think you may be right and that we should
> error out
> here if scattered mode is needed and not set. It could help prevent
> unexpected
> problems with these drivers.
>

+1. A hard fail at init if I've misconfigured things is much preferred to
something that mostly works for typical test cases and only fails on
corner/limit testing.


> > 2) Second question is about the "/* It adds dual VLAN length for
> supporting
> > dual VLAN */" bit...
> >
> > As I understand it, dev_conf.rxmode.max_rx_pkt_len is supposed to be set
> to
> > the max frame size you support (although it says it's only used if jumbo
> > frame is enabled). That same value is generally used when calculating the
> > size that mbuf elements should be created at.
> >
> > The description for the data_room_size parameter of
> > rte_pktmbuf_pool_create():
> >
> > "Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM."
> >
> >
> > If I support a max frame size of 9216 bytes (exactly a 1K multiple to
> make
> > the NIC happy), then max_rx_pkt_len is going to be 9216 and
> data_room_size
> > will be 9216 + RTE_PKTMBUF_HEADROOM.
> >
> > ixgbe_dev_rx_init() will calculate normalize that back to 9216, which
> will
> > fail the dual VLAN length check. The really nasty part about that is it
> has
> > a side-effect of enabling scattered RX regardless of the fact that I
> didn't
> > enable scattered RX in dev_conf.rxmode.
> >
> > The code in the e1000 PMD is similar, so nothing unique to ixgbe.
> >
> > Is that check correct? It seems wrong to be adding space for q-in-q on
> top
> > of your specified max frame size...
>
> The issue here is what the correct behaviour needs to be. If we have the
> user
> specify the maximum frame size including all vlan tags, then we hit the
> problem
> where we have to subtract the VLAN tag sizes when writing the value to the
> NIC.
> In that case, we will hit a problem where we get a e.g. 9210 byte frame -
> to
> reuse your example - without any VLAN tags, which will be rejected by the
> hardware as being oversized. If we don't do the subtraction, and we get the
> same 9210 byte packet with 2 VLAN tags on it, the hardware will accept it
> and
> then split it across multiple descriptors because the actual DMA size is
> 9218 bytes.
>

As an app developer, I didn't realize the max frame size didn't include
VLAN tags. I expected max frame size to be the size of the ethernet frame
on the wire, which I would expect to include space used by any VLAN or MPLS
tags.

Is there anything in the docs or example apps about that? I did some
digging as I was debugging this and didn't notice it, but entirely possible
I just missed it.


> I'm not sure there is a works-in-all-cases solution here.
>

Andriy's suggestion seems like it points in the right direction.

>From an app developer point of view, I'd expect to have a single max frame
size value to track and the APIs should take care of any adjustments
required internally. Maybe have rte_pktmbuf_pool_create() add the
additional bytes when it calls rte_mempool_create() under the covers? Then
it's nice and clean for the API without unexpected side-effects.

Jay


> /Bruce
>
> >
> > Thanks,
> > Jay
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Couple of PMD questions
  2016-04-20  9:35 ` Andriy Berestovskyy
@ 2016-04-20 12:45   ` Jay Rolette
  0 siblings, 0 replies; 9+ messages in thread
From: Jay Rolette @ 2016-04-20 12:45 UTC (permalink / raw)
  To: Andriy Berestovskyy; +Cc: DPDK, Richardson, Bruce

On Wed, Apr 20, 2016 at 4:35 AM, Andriy Berestovskyy <aber@semihalf.com>
wrote:

> Hi Jay,
>
> On Tue, Apr 19, 2016 at 10:16 PM, Jay Rolette <rolette@infinite.io> wrote:
> > Should the driver error out in that case instead of only "sort of"
> working?
>
> +1, we hit the same issue. Error or log message would help.
>
> > If I support a max frame size of 9216 bytes (exactly a 1K multiple to
> make
> > the NIC happy), then max_rx_pkt_len is going to be 9216 and
> data_room_size
> > will be 9216 + RTE_PKTMBUF_HEADROOM.
>
> Try to set max_rx_pkt_len <= 9K - 8 and mempool element size to 9K +
> headroom + size of structures.
>

Thanks, Andriy. That makes sense given how the code + NIC are working.


> > Is that check correct?
>
> Datasheet says:
> The MFS does not include the 4 bytes of the VLAN header. Packets with
> VLAN header can be as large as MFS + 4. When double VLAN is enabled,
> the device adds 8 to the MFS for any packets.
>

Gotcha. Hopefully we can get the docs and example code up to where app
developers don't need to crawl through driver source and NIC datasheets.
That or have rte_pktmbuf_pool_create() take care of that sort of detail
automatically. :-)

Jay


Regards,
> Andriy
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Couple of PMD questions
  2016-04-20 12:22   ` Jay Rolette
@ 2016-04-20 14:05     ` Bruce Richardson
  2016-04-20 15:47       ` Jay Rolette
  0 siblings, 1 reply; 9+ messages in thread
From: Bruce Richardson @ 2016-04-20 14:05 UTC (permalink / raw)
  To: Jay Rolette; +Cc: DPDK

On Wed, Apr 20, 2016 at 07:22:57AM -0500, Jay Rolette wrote:
> On Wed, Apr 20, 2016 at 4:10 AM, Bruce Richardson <
> bruce.richardson@intel.com> wrote:
> 
> > On Tue, Apr 19, 2016 at 03:16:37PM -0500, Jay Rolette wrote:
> > > In ixgbe_dev_rx_init(), there is this bit of code:
> > >
> > >       /*
> > >        * Configure the RX buffer size in the BSIZEPACKET field of
> > >        * the SRRCTL register of the queue.
> > >        * The value is in 1 KB resolution. Valid values can be from
> > >        * 1 KB to 16 KB.
> > >        */
> > >       buf_size = (uint16_t)(rte_pktmbuf_data_room_size(rxq->mb_pool) -
> > >               RTE_PKTMBUF_HEADROOM);
> > >       srrctl |= ((buf_size >> IXGBE_SRRCTL_BSIZEPKT_SHIFT) &
> > >                  IXGBE_SRRCTL_BSIZEPKT_MASK);
> > >
> > >       IXGBE_WRITE_REG(hw, IXGBE_SRRCTL(rxq->reg_idx), srrctl);
> > >
> > >       buf_size = (uint16_t) ((srrctl & IXGBE_SRRCTL_BSIZEPKT_MASK) <<
> > >                              IXGBE_SRRCTL_BSIZEPKT_SHIFT);
> > >
> > >       /* It adds dual VLAN length for supporting dual VLAN */
> > >       if (dev->data->dev_conf.rxmode.max_rx_pkt_len +
> > >                                   2 * IXGBE_VLAN_TAG_SIZE > buf_size)
> > >               dev->data->scattered_rx = 1;
> > >
> > >
> > > Couple of questions I have about it:
> > >
> > > 1) If the caller configured the MBUF memory pool data room size to be
> > > something other than a multiple of 1K (+ RTE_PKTMBUF_HEADROOM), then the
> > > driver ends up silently programming the NIC to use a smaller max RX size
> > > than the caller specified.
> > >
> > > Should the driver error out in that case instead of only "sort of"
> > working?
> > > If not, it seems like it should be logging an error message.
> >
> > Well, it does log at the end of the whole rx init process what RX function
> > is
> > being used, so there is that.
> > However, looking at it now, given that there is an explicit setting in the
> > config
> > to request scattered mode, I think you may be right and that we should
> > error out
> > here if scattered mode is needed and not set. It could help prevent
> > unexpected
> > problems with these drivers.
> >
> 
> +1. A hard fail at init if I've misconfigured things is much preferred to
> something that mostly works for typical test cases and only fails on
> corner/limit testing.
> 
> 
> > > 2) Second question is about the "/* It adds dual VLAN length for
> > supporting
> > > dual VLAN */" bit...
> > >
> > > As I understand it, dev_conf.rxmode.max_rx_pkt_len is supposed to be set
> > to
> > > the max frame size you support (although it says it's only used if jumbo
> > > frame is enabled). That same value is generally used when calculating the
> > > size that mbuf elements should be created at.
> > >
> > > The description for the data_room_size parameter of
> > > rte_pktmbuf_pool_create():
> > >
> > > "Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM."
> > >
> > >
> > > If I support a max frame size of 9216 bytes (exactly a 1K multiple to
> > make
> > > the NIC happy), then max_rx_pkt_len is going to be 9216 and
> > data_room_size
> > > will be 9216 + RTE_PKTMBUF_HEADROOM.
> > >
> > > ixgbe_dev_rx_init() will calculate normalize that back to 9216, which
> > will
> > > fail the dual VLAN length check. The really nasty part about that is it
> > has
> > > a side-effect of enabling scattered RX regardless of the fact that I
> > didn't
> > > enable scattered RX in dev_conf.rxmode.
> > >
> > > The code in the e1000 PMD is similar, so nothing unique to ixgbe.
> > >
> > > Is that check correct? It seems wrong to be adding space for q-in-q on
> > top
> > > of your specified max frame size...
> >
> > The issue here is what the correct behaviour needs to be. If we have the
> > user
> > specify the maximum frame size including all vlan tags, then we hit the
> > problem
> > where we have to subtract the VLAN tag sizes when writing the value to the
> > NIC.
> > In that case, we will hit a problem where we get a e.g. 9210 byte frame -
> > to
> > reuse your example - without any VLAN tags, which will be rejected by the
> > hardware as being oversized. If we don't do the subtraction, and we get the
> > same 9210 byte packet with 2 VLAN tags on it, the hardware will accept it
> > and
> > then split it across multiple descriptors because the actual DMA size is
> > 9218 bytes.
> >
> 
> As an app developer, I didn't realize the max frame size didn't include
> VLAN tags. I expected max frame size to be the size of the ethernet frame
> on the wire, which I would expect to include space used by any VLAN or MPLS
> tags.
> 
> Is there anything in the docs or example apps about that? I did some
> digging as I was debugging this and didn't notice it, but entirely possible
> I just missed it.
> 
> 
> > I'm not sure there is a works-in-all-cases solution here.
> >
> 
> Andriy's suggestion seems like it points in the right direction.
> 
> From an app developer point of view, I'd expect to have a single max frame
> size value to track and the APIs should take care of any adjustments
> required internally. Maybe have rte_pktmbuf_pool_create() add the
> additional bytes when it calls rte_mempool_create() under the covers? Then
> it's nice and clean for the API without unexpected side-effects.
>

It will still have unintended side-effects I think, depending on the resolution
of the NIC buffer length paramters. For drivers like ixgbe or e1000, the mempool
create call could potentially have to add an additional 1k to each buffer just
to be able to store the extra eight bytes.

/Bruce

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Couple of PMD questions
  2016-04-20 14:05     ` Bruce Richardson
@ 2016-04-20 15:47       ` Jay Rolette
  2016-04-21  9:13         ` Andriy Berestovskyy
  2016-04-21 10:54         ` Bruce Richardson
  0 siblings, 2 replies; 9+ messages in thread
From: Jay Rolette @ 2016-04-20 15:47 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: DPDK, Andriy Berestovskyy

On Wed, Apr 20, 2016 at 9:05 AM, Bruce Richardson <
bruce.richardson@intel.com> wrote:

> On Wed, Apr 20, 2016 at 07:22:57AM -0500, Jay Rolette wrote:
> > On Wed, Apr 20, 2016 at 4:10 AM, Bruce Richardson <
> > bruce.richardson@intel.com> wrote:
> >
> > > On Tue, Apr 19, 2016 at 03:16:37PM -0500, Jay Rolette wrote:
> > > > In ixgbe_dev_rx_init(), there is this bit of code:
> > > >
> > > >       /*
> > > >        * Configure the RX buffer size in the BSIZEPACKET field of
> > > >        * the SRRCTL register of the queue.
> > > >        * The value is in 1 KB resolution. Valid values can be from
> > > >        * 1 KB to 16 KB.
> > > >        */
> > > >       buf_size = (uint16_t)(rte_pktmbuf_data_room_size(rxq->mb_pool)
> -
> > > >               RTE_PKTMBUF_HEADROOM);
> > > >       srrctl |= ((buf_size >> IXGBE_SRRCTL_BSIZEPKT_SHIFT) &
> > > >                  IXGBE_SRRCTL_BSIZEPKT_MASK);
> > > >
> > > >       IXGBE_WRITE_REG(hw, IXGBE_SRRCTL(rxq->reg_idx), srrctl);
> > > >
> > > >       buf_size = (uint16_t) ((srrctl & IXGBE_SRRCTL_BSIZEPKT_MASK) <<
> > > >                              IXGBE_SRRCTL_BSIZEPKT_SHIFT);
> > > >
> > > >       /* It adds dual VLAN length for supporting dual VLAN */
> > > >       if (dev->data->dev_conf.rxmode.max_rx_pkt_len +
> > > >                                   2 * IXGBE_VLAN_TAG_SIZE > buf_size)
> > > >               dev->data->scattered_rx = 1;
> > > >
> > > >
> > > > Couple of questions I have about it:
> > > >
> > > > 1) If the caller configured the MBUF memory pool data room size to be
> > > > something other than a multiple of 1K (+ RTE_PKTMBUF_HEADROOM), then
> the
> > > > driver ends up silently programming the NIC to use a smaller max RX
> size
> > > > than the caller specified.
> > > >
> > > > Should the driver error out in that case instead of only "sort of"
> > > working?
> > > > If not, it seems like it should be logging an error message.
> > >
> > > Well, it does log at the end of the whole rx init process what RX
> function
> > > is
> > > being used, so there is that.
> > > However, looking at it now, given that there is an explicit setting in
> the
> > > config
> > > to request scattered mode, I think you may be right and that we should
> > > error out
> > > here if scattered mode is needed and not set. It could help prevent
> > > unexpected
> > > problems with these drivers.
> > >
> >
> > +1. A hard fail at init if I've misconfigured things is much preferred to
> > something that mostly works for typical test cases and only fails on
> > corner/limit testing.
> >
> >
> > > > 2) Second question is about the "/* It adds dual VLAN length for
> > > supporting
> > > > dual VLAN */" bit...
> > > >
> > > > As I understand it, dev_conf.rxmode.max_rx_pkt_len is supposed to be
> set
> > > to
> > > > the max frame size you support (although it says it's only used if
> jumbo
> > > > frame is enabled). That same value is generally used when
> calculating the
> > > > size that mbuf elements should be created at.
> > > >
> > > > The description for the data_room_size parameter of
> > > > rte_pktmbuf_pool_create():
> > > >
> > > > "Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM."
> > > >
> > > >
> > > > If I support a max frame size of 9216 bytes (exactly a 1K multiple to
> > > make
> > > > the NIC happy), then max_rx_pkt_len is going to be 9216 and
> > > data_room_size
> > > > will be 9216 + RTE_PKTMBUF_HEADROOM.
> > > >
> > > > ixgbe_dev_rx_init() will calculate normalize that back to 9216, which
> > > will
> > > > fail the dual VLAN length check. The really nasty part about that is
> it
> > > has
> > > > a side-effect of enabling scattered RX regardless of the fact that I
> > > didn't
> > > > enable scattered RX in dev_conf.rxmode.
> > > >
> > > > The code in the e1000 PMD is similar, so nothing unique to ixgbe.
> > > >
> > > > Is that check correct? It seems wrong to be adding space for q-in-q
> on
> > > top
> > > > of your specified max frame size...
> > >
> > > The issue here is what the correct behaviour needs to be. If we have
> the
> > > user
> > > specify the maximum frame size including all vlan tags, then we hit the
> > > problem
> > > where we have to subtract the VLAN tag sizes when writing the value to
> the
> > > NIC.
> > > In that case, we will hit a problem where we get a e.g. 9210 byte
> frame -
> > > to
> > > reuse your example - without any VLAN tags, which will be rejected by
> the
> > > hardware as being oversized. If we don't do the subtraction, and we
> get the
> > > same 9210 byte packet with 2 VLAN tags on it, the hardware will accept
> it
> > > and
> > > then split it across multiple descriptors because the actual DMA size
> is
> > > 9218 bytes.
> > >
> >
> > As an app developer, I didn't realize the max frame size didn't include
> > VLAN tags. I expected max frame size to be the size of the ethernet frame
> > on the wire, which I would expect to include space used by any VLAN or
> MPLS
> > tags.
> >
> > Is there anything in the docs or example apps about that? I did some
> > digging as I was debugging this and didn't notice it, but entirely
> possible
> > I just missed it.
> >
> >
> > > I'm not sure there is a works-in-all-cases solution here.
> > >
> >
> > Andriy's suggestion seems like it points in the right direction.
> >
> > From an app developer point of view, I'd expect to have a single max
> frame
> > size value to track and the APIs should take care of any adjustments
> > required internally. Maybe have rte_pktmbuf_pool_create() add the
> > additional bytes when it calls rte_mempool_create() under the covers?
> Then
> > it's nice and clean for the API without unexpected side-effects.
> >
>
> It will still have unintended side-effects I think, depending on the
> resolution
> of the NIC buffer length paramters. For drivers like ixgbe or e1000, the
> mempool
> create call could potentially have to add an additional 1k to each buffer
> just
> to be able to store the extra eight bytes.


The comments in the ixgbe driver say that the value programmed into SRRCTL
must be on a 1K boundary. Based on your previous response, it sounded like
the NIC ignores that limit for VLAN tags, hence the check for the extra 8
bytes on the mbuf element size. Are you worried about the size resolution
on mempool elements?

Sounds like I've got to go spend some quality time in the NIC data
sheets... Maybe I should back up and just ask the higher level question:

What's the right incantation in both the dev_conf structure and in creating
the mbuf pool to support jumbo frames of some particular size on the wire,
with or without VLAN tags, without requiring scattered_rx support in an app?

Thanks,
Jay

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Couple of PMD questions
  2016-04-20 15:47       ` Jay Rolette
@ 2016-04-21  9:13         ` Andriy Berestovskyy
  2016-04-21 10:54         ` Bruce Richardson
  1 sibling, 0 replies; 9+ messages in thread
From: Andriy Berestovskyy @ 2016-04-21  9:13 UTC (permalink / raw)
  To: Jay Rolette; +Cc: Bruce Richardson, DPDK

Hi Jay,
Max RX frame size and buffer size are independent bits. You may set
MAXFRS to 9K and feed 512B buffers or vice versa. The issue is that
you are trying to tune both of the options to avoid buf chains.

Here is another way to enable jumbo frames and avoid buf chains:
1. Disable scattering and jumbo frames at rte_eth_dev_configure().

2. Make sure the rte_pktmbuf_data_room_size(mpool)
-RTE_PKTMBUF_HEADROOM is aligned to 1K.

3. Use rte_eth_dev_set_mtu() to enable jumbo frames. Since the
scattering is disabled, set_mtu() will make sure the new packet size
fits into a single mbuf.

Andriy

On Wed, Apr 20, 2016 at 5:47 PM, Jay Rolette <rolette@infinite.io> wrote:
>
> On Wed, Apr 20, 2016 at 9:05 AM, Bruce Richardson
> <bruce.richardson@intel.com> wrote:
>>
>> On Wed, Apr 20, 2016 at 07:22:57AM -0500, Jay Rolette wrote:
>> > On Wed, Apr 20, 2016 at 4:10 AM, Bruce Richardson <
>> > bruce.richardson@intel.com> wrote:
>> >
>> > > On Tue, Apr 19, 2016 at 03:16:37PM -0500, Jay Rolette wrote:
>> > > > In ixgbe_dev_rx_init(), there is this bit of code:
>> > > >
>> > > >       /*
>> > > >        * Configure the RX buffer size in the BSIZEPACKET field of
>> > > >        * the SRRCTL register of the queue.
>> > > >        * The value is in 1 KB resolution. Valid values can be from
>> > > >        * 1 KB to 16 KB.
>> > > >        */
>> > > >       buf_size = (uint16_t)(rte_pktmbuf_data_room_size(rxq->mb_pool)
>> > > > -
>> > > >               RTE_PKTMBUF_HEADROOM);
>> > > >       srrctl |= ((buf_size >> IXGBE_SRRCTL_BSIZEPKT_SHIFT) &
>> > > >                  IXGBE_SRRCTL_BSIZEPKT_MASK);
>> > > >
>> > > >       IXGBE_WRITE_REG(hw, IXGBE_SRRCTL(rxq->reg_idx), srrctl);
>> > > >
>> > > >       buf_size = (uint16_t) ((srrctl & IXGBE_SRRCTL_BSIZEPKT_MASK)
>> > > > <<
>> > > >                              IXGBE_SRRCTL_BSIZEPKT_SHIFT);
>> > > >
>> > > >       /* It adds dual VLAN length for supporting dual VLAN */
>> > > >       if (dev->data->dev_conf.rxmode.max_rx_pkt_len +
>> > > >                                   2 * IXGBE_VLAN_TAG_SIZE >
>> > > > buf_size)
>> > > >               dev->data->scattered_rx = 1;
>> > > >
>> > > >
>> > > > Couple of questions I have about it:
>> > > >
>> > > > 1) If the caller configured the MBUF memory pool data room size to
>> > > > be
>> > > > something other than a multiple of 1K (+ RTE_PKTMBUF_HEADROOM), then
>> > > > the
>> > > > driver ends up silently programming the NIC to use a smaller max RX
>> > > > size
>> > > > than the caller specified.
>> > > >
>> > > > Should the driver error out in that case instead of only "sort of"
>> > > working?
>> > > > If not, it seems like it should be logging an error message.
>> > >
>> > > Well, it does log at the end of the whole rx init process what RX
>> > > function
>> > > is
>> > > being used, so there is that.
>> > > However, looking at it now, given that there is an explicit setting in
>> > > the
>> > > config
>> > > to request scattered mode, I think you may be right and that we should
>> > > error out
>> > > here if scattered mode is needed and not set. It could help prevent
>> > > unexpected
>> > > problems with these drivers.
>> > >
>> >
>> > +1. A hard fail at init if I've misconfigured things is much preferred
>> > to
>> > something that mostly works for typical test cases and only fails on
>> > corner/limit testing.
>> >
>> >
>> > > > 2) Second question is about the "/* It adds dual VLAN length for
>> > > supporting
>> > > > dual VLAN */" bit...
>> > > >
>> > > > As I understand it, dev_conf.rxmode.max_rx_pkt_len is supposed to be
>> > > > set
>> > > to
>> > > > the max frame size you support (although it says it's only used if
>> > > > jumbo
>> > > > frame is enabled). That same value is generally used when
>> > > > calculating the
>> > > > size that mbuf elements should be created at.
>> > > >
>> > > > The description for the data_room_size parameter of
>> > > > rte_pktmbuf_pool_create():
>> > > >
>> > > > "Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM."
>> > > >
>> > > >
>> > > > If I support a max frame size of 9216 bytes (exactly a 1K multiple
>> > > > to
>> > > make
>> > > > the NIC happy), then max_rx_pkt_len is going to be 9216 and
>> > > data_room_size
>> > > > will be 9216 + RTE_PKTMBUF_HEADROOM.
>> > > >
>> > > > ixgbe_dev_rx_init() will calculate normalize that back to 9216,
>> > > > which
>> > > will
>> > > > fail the dual VLAN length check. The really nasty part about that is
>> > > > it
>> > > has
>> > > > a side-effect of enabling scattered RX regardless of the fact that I
>> > > didn't
>> > > > enable scattered RX in dev_conf.rxmode.
>> > > >
>> > > > The code in the e1000 PMD is similar, so nothing unique to ixgbe.
>> > > >
>> > > > Is that check correct? It seems wrong to be adding space for q-in-q
>> > > > on
>> > > top
>> > > > of your specified max frame size...
>> > >
>> > > The issue here is what the correct behaviour needs to be. If we have
>> > > the
>> > > user
>> > > specify the maximum frame size including all vlan tags, then we hit
>> > > the
>> > > problem
>> > > where we have to subtract the VLAN tag sizes when writing the value to
>> > > the
>> > > NIC.
>> > > In that case, we will hit a problem where we get a e.g. 9210 byte
>> > > frame -
>> > > to
>> > > reuse your example - without any VLAN tags, which will be rejected by
>> > > the
>> > > hardware as being oversized. If we don't do the subtraction, and we
>> > > get the
>> > > same 9210 byte packet with 2 VLAN tags on it, the hardware will accept
>> > > it
>> > > and
>> > > then split it across multiple descriptors because the actual DMA size
>> > > is
>> > > 9218 bytes.
>> > >
>> >
>> > As an app developer, I didn't realize the max frame size didn't include
>> > VLAN tags. I expected max frame size to be the size of the ethernet
>> > frame
>> > on the wire, which I would expect to include space used by any VLAN or
>> > MPLS
>> > tags.
>> >
>> > Is there anything in the docs or example apps about that? I did some
>> > digging as I was debugging this and didn't notice it, but entirely
>> > possible
>> > I just missed it.
>> >
>> >
>> > > I'm not sure there is a works-in-all-cases solution here.
>> > >
>> >
>> > Andriy's suggestion seems like it points in the right direction.
>> >
>> > From an app developer point of view, I'd expect to have a single max
>> > frame
>> > size value to track and the APIs should take care of any adjustments
>> > required internally. Maybe have rte_pktmbuf_pool_create() add the
>> > additional bytes when it calls rte_mempool_create() under the covers?
>> > Then
>> > it's nice and clean for the API without unexpected side-effects.
>> >
>>
>> It will still have unintended side-effects I think, depending on the
>> resolution
>> of the NIC buffer length paramters. For drivers like ixgbe or e1000, the
>> mempool
>> create call could potentially have to add an additional 1k to each buffer
>> just
>> to be able to store the extra eight bytes.
>
>
> The comments in the ixgbe driver say that the value programmed into SRRCTL
> must be on a 1K boundary. Based on your previous response, it sounded like
> the NIC ignores that limit for VLAN tags, hence the check for the extra 8
> bytes on the mbuf element size. Are you worried about the size resolution on
> mempool elements?
>
> Sounds like I've got to go spend some quality time in the NIC data sheets...
> Maybe I should back up and just ask the higher level question:
>
> What's the right incantation in both the dev_conf structure and in creating
> the mbuf pool to support jumbo frames of some particular size on the wire,
> with or without VLAN tags, without requiring scattered_rx support in an app?
>
> Thanks,
> Jay



-- 
Andriy Berestovskyy

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Couple of PMD questions
  2016-04-20 15:47       ` Jay Rolette
  2016-04-21  9:13         ` Andriy Berestovskyy
@ 2016-04-21 10:54         ` Bruce Richardson
  1 sibling, 0 replies; 9+ messages in thread
From: Bruce Richardson @ 2016-04-21 10:54 UTC (permalink / raw)
  To: Jay Rolette; +Cc: DPDK, Andriy Berestovskyy

On Wed, Apr 20, 2016 at 10:47:59AM -0500, Jay Rolette wrote:
> On Wed, Apr 20, 2016 at 9:05 AM, Bruce Richardson <
> bruce.richardson@intel.com> wrote:
> 
> > On Wed, Apr 20, 2016 at 07:22:57AM -0500, Jay Rolette wrote:
> > > On Wed, Apr 20, 2016 at 4:10 AM, Bruce Richardson <
> > > bruce.richardson@intel.com> wrote:
> > >
> > > > On Tue, Apr 19, 2016 at 03:16:37PM -0500, Jay Rolette wrote:
> > > > > In ixgbe_dev_rx_init(), there is this bit of code:
> > > > >
> > > > >       /*
> > > > >        * Configure the RX buffer size in the BSIZEPACKET field of
> > > > >        * the SRRCTL register of the queue.
> > > > >        * The value is in 1 KB resolution. Valid values can be from
> > > > >        * 1 KB to 16 KB.
> > > > >        */
> > > > >       buf_size = (uint16_t)(rte_pktmbuf_data_room_size(rxq->mb_pool)
> > -
> > > > >               RTE_PKTMBUF_HEADROOM);
> > > > >       srrctl |= ((buf_size >> IXGBE_SRRCTL_BSIZEPKT_SHIFT) &
> > > > >                  IXGBE_SRRCTL_BSIZEPKT_MASK);
> > > > >
> > > > >       IXGBE_WRITE_REG(hw, IXGBE_SRRCTL(rxq->reg_idx), srrctl);
> > > > >
> > > > >       buf_size = (uint16_t) ((srrctl & IXGBE_SRRCTL_BSIZEPKT_MASK) <<
> > > > >                              IXGBE_SRRCTL_BSIZEPKT_SHIFT);
> > > > >
> > > > >       /* It adds dual VLAN length for supporting dual VLAN */
> > > > >       if (dev->data->dev_conf.rxmode.max_rx_pkt_len +
> > > > >                                   2 * IXGBE_VLAN_TAG_SIZE > buf_size)
> > > > >               dev->data->scattered_rx = 1;
> > > > >
> > > > >
> > > > > Couple of questions I have about it:
> > > > >
> > > > > 1) If the caller configured the MBUF memory pool data room size to be
> > > > > something other than a multiple of 1K (+ RTE_PKTMBUF_HEADROOM), then
> > the
> > > > > driver ends up silently programming the NIC to use a smaller max RX
> > size
> > > > > than the caller specified.
> > > > >
> > > > > Should the driver error out in that case instead of only "sort of"
> > > > working?
> > > > > If not, it seems like it should be logging an error message.
> > > >
> > > > Well, it does log at the end of the whole rx init process what RX
> > function
> > > > is
> > > > being used, so there is that.
> > > > However, looking at it now, given that there is an explicit setting in
> > the
> > > > config
> > > > to request scattered mode, I think you may be right and that we should
> > > > error out
> > > > here if scattered mode is needed and not set. It could help prevent
> > > > unexpected
> > > > problems with these drivers.
> > > >
> > >
> > > +1. A hard fail at init if I've misconfigured things is much preferred to
> > > something that mostly works for typical test cases and only fails on
> > > corner/limit testing.
> > >
> > >
> > > > > 2) Second question is about the "/* It adds dual VLAN length for
> > > > supporting
> > > > > dual VLAN */" bit...
> > > > >
> > > > > As I understand it, dev_conf.rxmode.max_rx_pkt_len is supposed to be
> > set
> > > > to
> > > > > the max frame size you support (although it says it's only used if
> > jumbo
> > > > > frame is enabled). That same value is generally used when
> > calculating the
> > > > > size that mbuf elements should be created at.
> > > > >
> > > > > The description for the data_room_size parameter of
> > > > > rte_pktmbuf_pool_create():
> > > > >
> > > > > "Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM."
> > > > >
> > > > >
> > > > > If I support a max frame size of 9216 bytes (exactly a 1K multiple to
> > > > make
> > > > > the NIC happy), then max_rx_pkt_len is going to be 9216 and
> > > > data_room_size
> > > > > will be 9216 + RTE_PKTMBUF_HEADROOM.
> > > > >
> > > > > ixgbe_dev_rx_init() will calculate normalize that back to 9216, which
> > > > will
> > > > > fail the dual VLAN length check. The really nasty part about that is
> > it
> > > > has
> > > > > a side-effect of enabling scattered RX regardless of the fact that I
> > > > didn't
> > > > > enable scattered RX in dev_conf.rxmode.
> > > > >
> > > > > The code in the e1000 PMD is similar, so nothing unique to ixgbe.
> > > > >
> > > > > Is that check correct? It seems wrong to be adding space for q-in-q
> > on
> > > > top
> > > > > of your specified max frame size...
> > > >
> > > > The issue here is what the correct behaviour needs to be. If we have
> > the
> > > > user
> > > > specify the maximum frame size including all vlan tags, then we hit the
> > > > problem
> > > > where we have to subtract the VLAN tag sizes when writing the value to
> > the
> > > > NIC.
> > > > In that case, we will hit a problem where we get a e.g. 9210 byte
> > frame -
> > > > to
> > > > reuse your example - without any VLAN tags, which will be rejected by
> > the
> > > > hardware as being oversized. If we don't do the subtraction, and we
> > get the
> > > > same 9210 byte packet with 2 VLAN tags on it, the hardware will accept
> > it
> > > > and
> > > > then split it across multiple descriptors because the actual DMA size
> > is
> > > > 9218 bytes.
> > > >
> > >
> > > As an app developer, I didn't realize the max frame size didn't include
> > > VLAN tags. I expected max frame size to be the size of the ethernet frame
> > > on the wire, which I would expect to include space used by any VLAN or
> > MPLS
> > > tags.
> > >
> > > Is there anything in the docs or example apps about that? I did some
> > > digging as I was debugging this and didn't notice it, but entirely
> > possible
> > > I just missed it.
> > >
> > >
> > > > I'm not sure there is a works-in-all-cases solution here.
> > > >
> > >
> > > Andriy's suggestion seems like it points in the right direction.
> > >
> > > From an app developer point of view, I'd expect to have a single max
> > frame
> > > size value to track and the APIs should take care of any adjustments
> > > required internally. Maybe have rte_pktmbuf_pool_create() add the
> > > additional bytes when it calls rte_mempool_create() under the covers?
> > Then
> > > it's nice and clean for the API without unexpected side-effects.
> > >
> >
> > It will still have unintended side-effects I think, depending on the
> > resolution
> > of the NIC buffer length paramters. For drivers like ixgbe or e1000, the
> > mempool
> > create call could potentially have to add an additional 1k to each buffer
> > just
> > to be able to store the extra eight bytes.
> 
> 
> The comments in the ixgbe driver say that the value programmed into SRRCTL
> must be on a 1K boundary. Based on your previous response, it sounded like
> the NIC ignores that limit for VLAN tags, hence the check for the extra 8
> bytes on the mbuf element size. Are you worried about the size resolution
> on mempool elements?
> 
> Sounds like I've got to go spend some quality time in the NIC data
> sheets... Maybe I should back up and just ask the higher level question:
> 
> What's the right incantation in both the dev_conf structure and in creating
> the mbuf pool to support jumbo frames of some particular size on the wire,
> with or without VLAN tags, without requiring scattered_rx support in an app?
> 
I think that the answer to that may be that it's hard to do, or impossible,
depending on the NIC. From what I can see from the datasheet for some of the
ixgbe NICs, the max-frame-size (MAXFRS) field that you specify always excludes
VLAN tags. Therefore you can't tell the NIC to accept a max size of X irrespective
of VLAN tags. The max size will always vary depending on the tags.

/Bruce

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2016-04-21 10:54 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-19 20:16 Couple of PMD questions Jay Rolette
2016-04-20  9:10 ` Bruce Richardson
2016-04-20 12:22   ` Jay Rolette
2016-04-20 14:05     ` Bruce Richardson
2016-04-20 15:47       ` Jay Rolette
2016-04-21  9:13         ` Andriy Berestovskyy
2016-04-21 10:54         ` Bruce Richardson
2016-04-20  9:35 ` Andriy Berestovskyy
2016-04-20 12:45   ` Jay Rolette

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.