All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Sanford, Robert" <rsanford@akamai.com>
To: "Dumitrescu, Cristian" <cristian.dumitrescu@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Cc: "Venkatesan, Venky" <venky.venkatesan@intel.com>,
	"Wiles, Keith" <keith.wiles@intel.com>,
	"Liang, Cunming" <cunming.liang@intel.com>,
	"Zhang, Helin" <helin.zhang@intel.com>,
	"Richardson, Bruce" <bruce.richardson@intel.com>,
	"Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	Olivier MATZ <olivier.matz@6wind.com>,
	"adrien.mazarguil@6wind.com" <adrien.mazarguil@6wind.com>,
	Stephen Hemminger <stephen@networkplumber.org>,
	"Jerin Jacob (Cavium) (jerin.jacob@caviumnetworks.com)"
	<jerin.jacob@caviumnetworks.com>
Subject: Re: [PATCH 3/4] port: fix full burst checks in f_tx_bulk ops
Date: Fri, 1 Apr 2016 19:31:55 +0000	[thread overview]
Message-ID: <D3242DFD.11970%rsanford@akamai.com> (raw)
In-Reply-To: <3EB4FA525960D640B5BDFFD6A3D8912647975059@IRSMSX108.ger.corp.intel.com>

Hi Cristian,

In hindsight, I was overly agressive in proposing the same change
(approach #2, as you call it below) for rte_port_ring and rte_port_sched.
Changing local variable bsz_mask to uint64_t should be sufficient.

Please see additional comments inline below.



On 3/31/16 11:41 AM, "Dumitrescu, Cristian"
<cristian.dumitrescu@intel.com> wrote:
>> -----Original Message-----
>> From: Robert Sanford [mailto:rsanford2@gmail.com]
>> Sent: Monday, March 28, 2016 9:52 PM
>> To: dev@dpdk.org; Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
>> Subject: [PATCH 3/4] port: fix full burst checks in f_tx_bulk ops
>> 
>> For several f_tx_bulk functions in rte_port_{ethdev,ring,sched}.c,
>> it appears that the intent of the bsz_mask logic is to test whether
>> pkts_mask contains a full burst (i.e., the <tx_burst_sz> least
>> significant bits are set).
>> 
>> There are two problems with the bsz_mask code: 1) It truncates
>> by using the wrong size for local variable uint32_t bsz_mask, and
>
>This is indeed a bug: although port->bsz_mask is always defined as
>uint64_t, there are several places where we cache it to a local variable
>which is defined as 32-bit by mistake: uint32_t bsz = p->bsz_mask.
>Thanks, Robert!
>
>> 2) We may pass oversized bursts to the underlying ethdev/ring/sched,
>> e.g., tx_burst_sz=16, bsz_mask=0x8000, and pkts_mask=0x1ffff
>> (17 packets), results in expr==0, and we send a burst larger than
>> desired (and non-power-of-2) to the underlying tx burst interface.
>> 
>
>As stated in another related email, this is done by design, with the key
>assumption being that larger TX burst sizes will always be beneficial. So
>tx_burst_size is, by design, a requirement for the *minimal* value of the
>TX burst size rather than the *exact* value for the burst size.
>As an example, when the TX burst size of 32 is set, then larger burst
>sizes of 33, 34, ..., 40, 41, ..., 48, ..., 64 are welcomed and sent out
>as a single burst rather than breaking in into multiple fixed size
>32-packet bursts. 
>For PMDs, burst size (smaller than 64) is typically much lower than the
>TX ring size (typical value for IXGBE: 512). Same for rte_ring.
>
>So what we are debating here is which of the following two approaches is
>better:
>Approach 1: Consider tx_burst_sz as the minimal burst size, welcome
>larger bursts and send them as a single burst (i.e. do not break them
>into fixed tx_burst_sz bursts). This is the existing approach used
>consistently everywhere in librte_port.
>Approach 2: Consider tx_burst_sz as an exact burst size requirement, any
>larger incoming burst is broken into fixed size bursts of exactly
>tx_burst_sz packets before send. This is the approach suggested by Robert.
>
>I think we should go for the approach that gives the best performance.
>Personally, I think Approach 1 (existing) is doing this, but I would like
>to get more fact-based opinions from the people on this mail list (CC-ing
>a few key folks), especially PMD and ring maintainers. What is your
>experience, guys?


I only advocate approach #2 (for rte_port_ethdev) if we can't be certain
of the semantics of the underlying rte_eth_tx_burst(). e.g., if we attempt
to send 33 packets, and it never enqueues more than 32.

Off topic, general points about PMD APIs:
* From the viewpoint of an API user, having rte_eth_{rx,tx}_burst()
unexpectedly truncate a burst request is an unwelcome surprise, and should
at least be well-documented.
* We previously encountered this problem on the RX-side of IXGBE: We asked
rte_eth_rx_burst() for 64 packets, but it never returned more than 32,
even when there were hundreds "done". This was a little unexpected.

* From a quick look at the latest code, it appears that
ixgbe_xmit_pkts_vec() and ixgbe_recv_pkts_vec() still do this. Yes, these
are the vector versions, but do we need to study the drivers of every
device that we might use, when deciding things such as burst sizes? :)
* One simple enhancement idea: ethdev info-get API could convey related
info, e.g., whether rx/tx APIs truncate bursts, if so, how big, etc.


>
>> We propose to effectively set bsz_mask = (1 << tx_burst_sz) - 1
>> (while avoiding truncation for tx_burst_sz=64), to cache the mask
>> value of a full burst, and then do a simple compare with pkts_mask
>> in each f_tx_bulk.
>> 
>> Signed-off-by: Robert Sanford <rsanford@akamai.com>
>> ---
>>  lib/librte_port/rte_port_ethdev.c |   15 ++++-----------
>>  lib/librte_port/rte_port_ring.c   |   16 ++++------------
>>  lib/librte_port/rte_port_sched.c  |    7 ++-----
>>  3 files changed, 10 insertions(+), 28 deletions(-)
>> 
>> diff --git a/lib/librte_port/rte_port_ethdev.c
>> b/lib/librte_port/rte_port_ethdev.c
>> index 1c34602..3fb4947 100644
>> --- a/lib/librte_port/rte_port_ethdev.c
>> +++ b/lib/librte_port/rte_port_ethdev.c
>> @@ -188,7 +188,7 @@ rte_port_ethdev_writer_create(void *params, int
>> socket_id)
>>  	port->queue_id = conf->queue_id;
>>  	port->tx_burst_sz = conf->tx_burst_sz;
>>  	port->tx_buf_count = 0;
>> -	port->bsz_mask = 1LLU << (conf->tx_burst_sz - 1);
>> +	port->bsz_mask = UINT64_MAX >> (64 - conf->tx_burst_sz);
>
>Another way to write this is: port->bsz_mask =
>RTE_LEN2MASK(conf->tx_burst_sz, uint64_t);

True, didn't know about that macro.

--
Thanks,
Robert

  reply	other threads:[~2016-04-01 19:31 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-28 20:51 [PATCH 0/4] port: fix and test bugs in tx_bulk ops Robert Sanford
2016-03-28 20:51 ` [PATCH 1/4] app/test: enhance test_port_ring_writer Robert Sanford
2016-04-01 19:42   ` Sanford, Robert
2016-04-06 16:46     ` Dumitrescu, Cristian
2016-03-28 20:51 ` [PATCH 2/4] port: fix ring writer buffer overflow Robert Sanford
2016-03-31 11:21   ` Dumitrescu, Cristian
2016-03-28 20:51 ` [PATCH 3/4] port: fix full burst checks in f_tx_bulk ops Robert Sanford
2016-03-31 15:41   ` Dumitrescu, Cristian
2016-04-01 19:31     ` Sanford, Robert [this message]
2016-03-28 20:51 ` [PATCH 4/4] port: fix ethdev writer burst too big Robert Sanford
2016-03-31 13:22   ` Dumitrescu, Cristian
2016-03-30 11:00 ` [PATCH 0/4] port: fix and test bugs in tx_bulk ops Thomas Monjalon
2016-03-30 11:58   ` Dumitrescu, Cristian

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=D3242DFD.11970%rsanford@akamai.com \
    --to=rsanford@akamai.com \
    --cc=adrien.mazarguil@6wind.com \
    --cc=bruce.richardson@intel.com \
    --cc=cristian.dumitrescu@intel.com \
    --cc=cunming.liang@intel.com \
    --cc=dev@dpdk.org \
    --cc=helin.zhang@intel.com \
    --cc=jerin.jacob@caviumnetworks.com \
    --cc=keith.wiles@intel.com \
    --cc=konstantin.ananyev@intel.com \
    --cc=olivier.matz@6wind.com \
    --cc=stephen@networkplumber.org \
    --cc=venky.venkatesan@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.