netdev.vger.kernel.org archive mirror
From: Aya Levin <ayal@mellanox.com>
To: Saeed Mahameed <saeedm@mellanox.com>,
	"kuba@kernel.org" <kuba@kernel.org>,
	Bjorn Helgaas <helgaas@kernel.org>
Cc: "mkubecek@suse.cz" <mkubecek@suse.cz>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	Tariq Toukan <tariqt@mellanox.com>
Subject: Re: [net-next 10/10] net/mlx5e: Add support for PCI relaxed ordering
Date: Wed, 24 Jun 2020 10:34:40 +0300
Message-ID: <082c6bfe-5146-c213-9220-65177717c342@mellanox.com>
In-Reply-To: <dda5c2b729bbaf025592aa84e2bdb84d0cda7570.camel@mellanox.com>



On 6/24/2020 9:56 AM, Saeed Mahameed wrote:
> On Tue, 2020-06-23 at 14:31 -0700, Jakub Kicinski wrote:
>> On Tue, 23 Jun 2020 12:52:29 -0700 Saeed Mahameed wrote:
>>> From: Aya Levin <ayal@mellanox.com>
>>>
>>> The concept of Relaxed Ordering in the PCI Express environment allows
>>> switches in the path between the Requester and Completer to reorder
>>> some transactions just received before others that were previously
>>> enqueued.
>>>
>>> In the ETH driver, there is no question of write integrity since each
>>> memory segment is written only once per cycle. In addition, the driver
>>> doesn't access the memory shared with the hardware until the
>>> corresponding CQE arrives indicating all PCI transactions are done.
>>
> 
> Hi Jakub, sorry I missed your comments on this patch.
> 
>> Assuming the device sets the RO bits appropriately, right? Otherwise
>> CQE write could theoretically surpass the data write, no?
>>
> 
> Yes HW guarantees correctness of correlated queues and transactions.
> 
>>> With relaxed ordering set, traffic on the remote-numa is at the same
>>> level as when on the local numa.
>>
>> Same level of? Achievable bandwidth?
>>
> 
> Yes, bandwidth. Given the explanation below, I see that the message
> needs improvement.
> 
>>> Running TCP single stream over ConnectX-4 LX, ARM CPU on remote-numa
>>> has 300% improvement in the bandwidth.
>>> With relaxed ordering turned off: BW:10 [GB/s]
>>> With relaxed ordering turned on:  BW:40 [GB/s]
>>>
>>> The driver turns relaxed ordering off by default. It exposes 2 boolean
>>> private-flags in ethtool: pci_ro_read and pci_ro_write for user
>>> control.
>>>
>>> $ ethtool --show-priv-flags eth2
>>> Private flags for eth2:
>>> ...
>>> pci_ro_read        : off
>>> pci_ro_write       : off
>>>
>>> $ ethtool --set-priv-flags eth2 pci_ro_write on
>>> $ ethtool --set-priv-flags eth2 pci_ro_read on
>>
>> I think Michal will rightly complain that this does not belong in
>> private flags any more. As (/if?) ARM deployments take a foothold
>> in DC this will become a common setting for most NICs.
> 
> Initially we used pcie_relaxed_ordering_enabled() to programmatically
> turn this on or off at boot, but this seems to introduce some
> degradation on some Intel CPUs, since the list of faulty Intel CPUs is
> not up to date. Aya is discussing this with Bjorn.
Adding Bjorn Helgaas
> 
> So until we figure this out, we will keep this off by default.
> 
> As for the private flags, we want to keep them for performance analysis,
> as we do with all other mlx5 special performance features and flags.
> 
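
For readers following the thread: as far as I know, the
pcie_relaxed_ordering_enabled() check mentioned above comes down to
testing the Enable Relaxed Ordering bit (bit 4) of the PCIe Device
Control register. Below is a minimal user-space sketch of that test
only, not the kernel code. The helper name is hypothetical, and the
caller is assumed to have already read the 16-bit register value
(for example from config space).

```c
#include <stdbool.h>
#include <stdint.h>

/* Enable Relaxed Ordering bit of the PCIe Device Control register.
 * Same value as PCI_EXP_DEVCTL_RELAX_EN in the kernel's pci_regs.h. */
#define DEVCTL_RELAX_EN 0x0010u

/* Hypothetical helper (not a kernel API): given an already-read
 * Device Control value, report whether relaxed ordering is enabled. */
static inline bool devctl_relaxed_ordering_enabled(uint16_t devctl)
{
	return (devctl & DEVCTL_RELAX_EN) != 0;
}
```

A value with bit 4 set, such as 0x0010, is reported as enabled, and a
value with that bit clear is reported as disabled. Note that this only
says what the device is permitted to do. It says nothing about whether
the root complex behaves well with RO, which is the open question with
the Intel list.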


Thread overview: 42+ messages
2020-06-23 19:52 [pull request][net-next 00/10] mlx5 updates 2020-06-23 Saeed Mahameed
2020-06-23 19:52 ` [net-next 01/10] net/mlx5: Avoid eswitch header inclusion in fs core layer Saeed Mahameed
2020-06-23 21:00   ` Jakub Kicinski
2020-06-23 19:52 ` [net-next 02/10] net/mlx5: FWTrace: Add missing space Saeed Mahameed
2020-06-23 19:52 ` [net-next 03/10] net/mlx5: Add a missing macro undefinition Saeed Mahameed
2020-06-23 19:52 ` [net-next 04/10] net/mlx5: Use kfree(ft->g) in arfs_create_groups() Saeed Mahameed
2020-06-23 19:52 ` [net-next 05/10] net/mlx5e: Remove unused mlx5e_xsk_first_unused_channel Saeed Mahameed
2020-06-23 19:52 ` [net-next 06/10] net/mlx5e: Move including net/arp.h from en_rep.c to rep/neigh.c Saeed Mahameed
2020-06-23 21:02   ` Jakub Kicinski
2020-06-23 19:52 ` [net-next 07/10] net/mlx5e: Move TC-specific function definitions into MLX5_CLS_ACT Saeed Mahameed
2020-06-23 21:03   ` Jakub Kicinski
2020-06-23 21:26     ` Saeed Mahameed
2020-06-23 21:33       ` Jakub Kicinski
2020-06-23 19:52 ` [net-next 08/10] net/mlx5e: vxlan: Use RCU for vxlan table lookup Saeed Mahameed
2020-06-23 19:52 ` [net-next 09/10] net/mlx5e: vxlan: Return bool instead of opaque ptr in port_lookup() Saeed Mahameed
2020-06-23 19:52 ` [net-next 10/10] net/mlx5e: Add support for PCI relaxed ordering Saeed Mahameed
2020-06-23 21:31   ` Jakub Kicinski
2020-06-24  6:56     ` Saeed Mahameed
2020-06-24  7:34       ` Aya Levin [this message]
2020-06-24 17:22         ` Jakub Kicinski
2020-06-24 20:15           ` Saeed Mahameed
     [not found]             ` <20200624133018.5a4d238b@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
2020-07-06 13:00               ` Aya Levin
2020-07-06 16:52                 ` Jakub Kicinski
2020-07-06 19:49                 ` David Miller
2020-07-08  8:22                   ` Aya Levin
2020-07-08 23:16                     ` Bjorn Helgaas
2020-07-08 23:26                       ` Jason Gunthorpe
2020-07-09 17:35                         ` Jonathan Lemon
2020-07-09 18:20                           ` Jason Gunthorpe
2020-07-09 19:47                             ` Jakub Kicinski
2020-07-10  2:18                               ` Saeed Mahameed
2020-07-10 12:21                                 ` Jason Gunthorpe
2020-07-09 20:33                             ` Jonathan Lemon
2020-07-14 10:47                       ` Aya Levin
2020-07-23 21:03                     ` Alexander Duyck
2020-06-26 20:12           ` Bjorn Helgaas
2020-06-26 20:24             ` David Miller
2020-06-29  9:32             ` Aya Levin
2020-06-29 19:33               ` Bjorn Helgaas
2020-06-29 19:57                 ` Raj, Ashok
2020-06-30  7:32                   ` Ding Tianhong
2020-07-05 11:15                     ` Aya Levin
