From: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
To: intel-wired-lan@lists.osuosl.org
Cc: netdev@vger.kernel.org, horms@kernel.org, przemyslaw.kitszel@intel.com,
	Michal Wilczynski <michal.wilczynski@intel.com>,
	Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Subject: [Intel-wired-lan] [PATCH iwl-next v4 5/5] ice: Document tx_scheduling_layers parameter
Date: Mon, 19 Feb 2024 05:05:59 -0500
Message-ID: <20240219100555.7220-6-mateusz.polchlopek@intel.com> (raw)
In-Reply-To: <20240219100555.7220-1-mateusz.polchlopek@intel.com>

From: Michal Wilczynski <michal.wilczynski@intel.com>

A new driver-specific parameter, 'tx_scheduling_layers', was introduced.
Describe the parameter in the documentation.

Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com>
Co-developed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
---
 Documentation/networking/devlink/ice.rst | 41 ++++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/Documentation/networking/devlink/ice.rst b/Documentation/networking/devlink/ice.rst
index efc6be109dc3..1ae46dee0fd5 100644
--- a/Documentation/networking/devlink/ice.rst
+++ b/Documentation/networking/devlink/ice.rst
@@ -36,6 +36,47 @@ Parameters
      The latter allows for bandwidth higher than external port speed
      when looping back traffic between VFs. Works with 8x10G and 4x25G
      cards.
+   * - ``tx_scheduling_layers``
+     - permanent
+     - The ice hardware uses hierarchical scheduling for Tx with a fixed
+       number of layers in the scheduling tree. The root node represents a
+       port, while the leaves represent the queues. This way of configuring
+       the Tx scheduler allows features like DCB or devlink-rate (documented
+       below) to provide fine-grained control over how much bandwidth is
+       given to any queue or group of queues, as scheduling parameters can
+       be configured at any layer of the tree.
+
+       By default, the 9-layer tree topology was deemed best for most
+       workloads, as it gives an optimal ratio of performance to
+       configurability. However, for some specific cases this might not be
+       true. A notable example is sending traffic to a number of queues
+       that is not a multiple of 8. Since in the 9-layer topology the
+       maximum number of children per node is limited to 8, the 9th queue
+       has a different parent than the rest and is given more bandwidth
+       credits. This causes a problem when the system is sending traffic
+       to 9 queues:
+
+       | tx_queue_0_packets: 24163396
+       | tx_queue_1_packets: 24164623
+       | tx_queue_2_packets: 24163188
+       | tx_queue_3_packets: 24163701
+       | tx_queue_4_packets: 24163683
+       | tx_queue_5_packets: 24164668
+       | tx_queue_6_packets: 23327200
+       | tx_queue_7_packets: 24163853
+       | tx_queue_8_packets: 91101417 < Too much traffic is sent to the 9th
+
+       Sometimes this is a significant concern, so the idea is to empower
+       the user to switch to a 5-layer topology, enabling performance
+       gains at the cost of configurability for features like DCB and
+       devlink-rate.
+
+       This parameter gives the user the flexibility to choose the 5-layer
+       transmit scheduler topology. After changing the parameter, a reboot
+       is required for the change to take effect.
+
+       The user can choose 9 (the default) or 5 as the parameter value,
+       e.g.:
+       $ devlink dev param set pci/0000:16:00.0 name tx_scheduling_layers
+         value 5 cmode permanent
+
+       And verify that the value has been set:
+       $ devlink dev param show pci/0000:16:00.0 name tx_scheduling_layers

 Info versions
 =============
--
2.38.1
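The queue-8 skew in the stats above follows directly from the tree shape: with a branching factor of 8, queues 0-7 share one parent node while the 9th queue gets a parent to itself. A minimal fair-share model (an illustrative sketch only, not the driver's actual credit arithmetic) shows where the asymmetry comes from:

```python
import math

def leaf_shares(num_queues, max_children=8):
    """Model the bandwidth share each queue receives when queues are
    grouped under parent nodes holding at most max_children children,
    and every node splits its bandwidth equally among its children."""
    num_parents = math.ceil(num_queues / max_children)
    shares = []
    for q in range(num_queues):
        parent = q // max_children
        siblings = min(max_children, num_queues - parent * max_children)
        # Each parent gets an equal slice of the port; each queue an
        # equal slice of its parent.
        shares.append(1 / num_parents / siblings)
    return shares

# 8 queues: one parent, every queue gets 1/8 of the port.
# 9 queues: two parents, so queues 0-7 get 1/16 each while the lone
# 9th queue inherits its parent's entire 1/2 share.
print(leaf_shares(8))
print(leaf_shares(9))
```

In this simplified model the 9th queue receives 1/2 of the port bandwidth versus 1/16 for each of the others; the observed skew in the stats above is smaller (roughly 3.8x), since real scheduling credits are not a strict equal split, but the source of the imbalance is the same uneven parenting.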