All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pavel Odintsov <pavel.odintsov@gmail.com>
To: Anuj Kalia <anujkaliaiitd@gmail.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: Could not achieve wire speed for 40GE with any DPDK version on XL710 NIC's
Date: Fri, 3 Jul 2015 11:35:45 +0300	[thread overview]
Message-ID: <CALgsdbfWibuZ-Znn4p7USB=GDT5WgbChJB=4au-fzzCXDpCeKA@mail.gmail.com> (raw)
In-Reply-To: <CADPSxAhXQWUgqkBacwSNU=qYUv61k4Ox2EN6qKq_xOhi3KMTkQ@mail.gmail.com>

Hello, folks!

We have found root of issue.

Intel do not offer wire speed for 64b packets in XL710 at all.

As mentioned in data sheet
http://www.intel.ru/content/dam/www/public/us/en/documents/product-briefs/xl710-10-40-gbe-controller-brief.pdf
we have:

Small packet performance: Maintains wire-rate throughput on smaller
payload sizes (>128 Bytes for 40 GbE and >64 Bytes for 10 GbE

Could anybody recommend NIC's which could truly achieve wire rate for 40GE?


On Wed, Jul 1, 2015 at 9:01 PM, Anuj Kalia <anujkaliaiitd@gmail.com> wrote:
> Thanks for the comments.
>
> On Wed, Jul 1, 2015 at 1:32 PM, Vladimir Medvedkin <medvedkinv@gmail.com> wrote:
>> Hi Anuj,
>>
>> Thanks for fixes!
>> I have 2 comments
>> - from i40e_ethdev.h : #define I40E_DEFAULT_RX_WTHRESH      0
>> - (26 + 32) / 4 (batched descriptor writeback) should be (26 + 4 * 32) / 4
>> (batched descriptor writeback)
>> , thus we have 135 bytes/packet
>>
>> This corresponds to 58.8 Mpps
>>
>> Regards,
>> Vladimir
>>
>> 2015-07-01 17:22 GMT+03:00 Anuj Kalia <anujkaliaiitd@gmail.com>:
>>>
>>> Vladimir,
>>>
>>> Few possible fixes to your PCIe analysis (let me know if I'm wrong):
>>> - ECRC is probably disabled (check using sudo lspci -vvv | grep
>>> CGenEn-), so TLP header is 26 bytes
>>> - Descriptor writeback can be batched using high value of WTHRESH,
>>> which is what DPDK uses by default
>>> - Read request contains full TLP header (26 bytes)
>>>
>>> Assuming WTHRESH = 4, bytes transferred from NIC to host per packet =
>>> 26 + 64 (packet itself) +
>>> (26 + 32) / 4 (batched descriptor writeback) +
>>> (26 / 4) (read request for new descriptors) =
>>> 111 bytes / packet
>>>
>>> This corresponds to 70.9 Mpps over PCIe 3.0 x8. Assuming 5% DLLP
>>> overhead, rate = 67.4 Mpps
>>>
>>> --Anuj
>>>
>>>
>>>
>>> On Wed, Jul 1, 2015 at 9:40 AM, Vladimir Medvedkin <medvedkinv@gmail.com>
>>> wrote:
>>> > In case with syn flood you should take into account return syn-ack
>>> > traffic,
>>> > which generates PCIe DLLP's from NIC to host, thus pcie bandwith exceeds
>>> > faster. And don't forget about DLLP's generated by rx traffic, which
>>> > saturates host-to-NIC bus.
>>> >
>>> > 2015-07-01 16:05 GMT+03:00 Pavel Odintsov <pavel.odintsov@gmail.com>:
>>> >
>>> >> Yes, Bruce, we understand this. But we are working with huge SYN
>>> >> attacks processing and they are 64byte only :(
>>> >>
>>> >> On Wed, Jul 1, 2015 at 3:59 PM, Bruce Richardson
>>> >> <bruce.richardson@intel.com> wrote:
>>> >> > On Wed, Jul 01, 2015 at 03:44:57PM +0300, Pavel Odintsov wrote:
>>> >> >> Thanks for answer, Vladimir! So we need look for x16 NIC if we want
>>> >> >> achieve 40GE line rate...
>>> >> >>
>>> >> > Note that this would only apply for your minimal i.e. 64-byte, packet
>>> >> sizes.
>>> >> > Once you go up to larger e.g. 128B packets, your PCI bandwidth
>>> >> requirements
>>> >> > are lower and you can easier achieve line rate.
>>> >> >
>>> >> > /Bruce
>>> >> >
>>> >> >> On Wed, Jul 1, 2015 at 3:06 PM, Vladimir Medvedkin <
>>> >> medvedkinv@gmail.com> wrote:
>>> >> >> > Hi Pavel,
>>> >> >> >
>>> >> >> > Looks like you ran into pcie bottleneck. So let's calculate xl710
>>> >> >> > rx
>>> >> only
>>> >> >> > case.
>>> >> >> > Assume we have 32byte descriptors (if we want more offload).
>>> >> >> > DMA makes one pcie transaction with packet payload, one descriptor
>>> >> writeback
>>> >> >> > and one memory request for free descriptors for every 4 packets.
>>> >> >> > For
>>> >> >> > Transaction Layer Packet (TLP) there is 30 bytes overhead (4 PHY +
>>> >> >> > 6
>>> >> DLL +
>>> >> >> > 16 header + 4 ECRC). So for 1 rx packet dma sends 30 + 64(packet
>>> >> itself) +
>>> >> >> > 30 + 32 (writeback descriptor) + (16 / 4) (read request for new
>>> >> >> > descriptors). Note that we do not take into account PCIe
>>> >> >> > ACK/NACK/FC
>>> >> Update
>>> >> >> > DLLP. So we have 160 bytes per packet. One lane PCIe 3.0 transmits
>>> >> >> > 1
>>> >> byte in
>>> >> >> > 1 ns, so x8 transmits 8 bytes  in 1 ns. 1 packet transmits in 20
>>> >> >> > ns.
>>> >> Thus
>>> >> >> > in theory pcie 3.0 x8 may transfer not more than 50mpps.
>>> >> >> > Correct me if I'm wrong.
>>> >> >> >
>>> >> >> > Regards,
>>> >> >> > Vladimir
>>> >> >> >
>>> >> >> >
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Sincerely yours, Pavel Odintsov
>>> >>
>>
>>



-- 
Sincerely yours, Pavel Odintsov

      reply	other threads:[~2015-07-03  8:35 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-28 10:34 Could not achieve wire speed for 40GE with any DPDK version on XL710 NIC's Pavel Odintsov
2015-06-28 23:35 ` Keunhong Lee
2015-06-29  6:59   ` Pavel Odintsov
2015-06-29 15:06     ` Keunhong Lee
2015-06-29 15:38       ` Andrew Theurer
2015-06-29 15:41         ` Pavel Odintsov
2015-07-01 12:06           ` Vladimir Medvedkin
2015-07-01 12:44             ` Pavel Odintsov
2015-07-01 12:59               ` Bruce Richardson
2015-07-01 13:05                 ` Pavel Odintsov
2015-07-01 13:40                   ` Vladimir Medvedkin
2015-07-01 14:22                     ` Anuj Kalia
2015-07-01 17:32                       ` Vladimir Medvedkin
2015-07-01 18:01                         ` Anuj Kalia
2015-07-03  8:35                           ` Pavel Odintsov [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CALgsdbfWibuZ-Znn4p7USB=GDT5WgbChJB=4au-fzzCXDpCeKA@mail.gmail.com' \
    --to=pavel.odintsov@gmail.com \
    --cc=anujkaliaiitd@gmail.com \
    --cc=dev@dpdk.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.