* [PATCH net-next 0/1] net: stmmac: add per-q coalesce support
From: Ong Boon Leong @ 2021-03-15 6:44 UTC
To: Giuseppe Cavallaro, Alexandre Torgue, Jose Abreu, David S . Miller, Jakub Kicinski, Maxime Coquelin
Cc: netdev, linux-stm32, linux-arm-kernel, linux-kernel, Ong Boon Leong

Hi,

This patch adds per-queue RX & TX coalesce control so that the user can adjust RX & TX interrupt moderation per queue. This is beneficial for mixed-criticality control (according to VLAN priority) by user applications.

The patch has been tested with the following steps, and the ethtool output looks good.

########################################################################
> ethtool --show-coalesce eth0
Coalesce parameters for eth0: Adaptive RX: off TX: off stats-block-usecs: 0 sample-interval: 0 pkt-rate-low: 0 pkt-rate-high: 0 rx-usecs: 202 rx-frames: 0 rx-usecs-irq: 0 rx-frames-irq: 0 tx-usecs: 1000 tx-frames: 25 tx-usecs-irq: 0 tx-frames-irq: 0 rx-usecs-low: 0 rx-frames-low: 0 tx-usecs-low: 0 tx-frames-low: 0 rx-usecs-high: 0 rx-frames-high: 0 tx-usecs-high: 0 tx-frames-high: 0
> ethtool --per-queue eth0 queue_mask 0xFF --show-coalesce
Queue: 0 Adaptive RX: off TX: off stats-block-usecs: 0 sample-interval: 0 pkt-rate-low: 0 pkt-rate-high: 0 rx-usecs: 202 rx-frames: 0 rx-usecs-irq: 0 rx-frames-irq: 0 tx-usecs: 1000 tx-frames: 25 tx-usecs-irq: 0 tx-frames-irq: 0 rx-usecs-low: 0 rx-frames-low: 0 tx-usecs-low: 0 tx-frames-low: 0 rx-usecs-high: 0 rx-frames-high: 0 tx-usecs-high: 0 tx-frames-high: 0
Queue: 1 Adaptive RX: off TX: off stats-block-usecs: 0 sample-interval: 0 pkt-rate-low: 0 pkt-rate-high: 0 rx-usecs: 202 rx-frames: 0 rx-usecs-irq: 0 rx-frames-irq: 0 tx-usecs: 1000 tx-frames: 25 tx-usecs-irq: 0 tx-frames-irq: 0 rx-usecs-low: 0 rx-frames-low: 0
tx-usecs-low: 0 tx-frames-low: 0 rx-usecs-high: 0 rx-frames-high: 0 tx-usecs-high: 0 tx-frames-high: 0 Queue: 2 Adaptive RX: off TX: off stats-block-usecs: 0 sample-interval: 0 pkt-rate-low: 0 pkt-rate-high: 0 rx-usecs: 202 rx-frames: 0 rx-usecs-irq: 0 rx-frames-irq: 0 tx-usecs: 1000 tx-frames: 25 tx-usecs-irq: 0 tx-frames-irq: 0 rx-usecs-low: 0 rx-frames-low: 0 tx-usecs-low: 0 tx-frames-low: 0 rx-usecs-high: 0 rx-frames-high: 0 tx-usecs-high: 0 tx-frames-high: 0 Queue: 3 Adaptive RX: off TX: off stats-block-usecs: 0 sample-interval: 0 pkt-rate-low: 0 pkt-rate-high: 0 rx-usecs: 202 rx-frames: 0 rx-usecs-irq: 0 rx-frames-irq: 0 tx-usecs: 1000 tx-frames: 25 tx-usecs-irq: 0 tx-frames-irq: 0 rx-usecs-low: 0 rx-frames-low: 0 tx-usecs-low: 0 tx-frames-low: 0 rx-usecs-high: 0 rx-frames-high: 0 tx-usecs-high: 0 tx-frames-high: 0 Queue: 4 Adaptive RX: off TX: off stats-block-usecs: 0 sample-interval: 0 pkt-rate-low: 0 pkt-rate-high: 0 rx-usecs: 202 rx-frames: 0 rx-usecs-irq: 0 rx-frames-irq: 0 tx-usecs: 1000 tx-frames: 25 tx-usecs-irq: 0 tx-frames-irq: 0 rx-usecs-low: 0 rx-frames-low: 0 tx-usecs-low: 0 tx-frames-low: 0 rx-usecs-high: 0 rx-frames-high: 0 tx-usecs-high: 0 tx-frames-high: 0 Queue: 5 Adaptive RX: off TX: off stats-block-usecs: 0 sample-interval: 0 pkt-rate-low: 0 pkt-rate-high: 0 rx-usecs: 202 rx-frames: 0 rx-usecs-irq: 0 rx-frames-irq: 0 tx-usecs: 1000 tx-frames: 25 tx-usecs-irq: 0 tx-frames-irq: 0 rx-usecs-low: 0 rx-frames-low: 0 tx-usecs-low: 0 tx-frames-low: 0 rx-usecs-high: 0 rx-frames-high: 0 tx-usecs-high: 0 tx-frames-high: 0 Queue: 6 Adaptive RX: off TX: off stats-block-usecs: 0 sample-interval: 0 pkt-rate-low: 0 pkt-rate-high: 0 rx-usecs: 202 rx-frames: 0 rx-usecs-irq: 0 rx-frames-irq: 0 tx-usecs: 1000 tx-frames: 25 tx-usecs-irq: 0 tx-frames-irq: 0 rx-usecs-low: 0 rx-frames-low: 0 tx-usecs-low: 0 tx-frames-low: 0 rx-usecs-high: 0 rx-frames-high: 0 tx-usecs-high: 0 tx-frames-high: 0 Queue: 7 Adaptive RX: off TX: off stats-block-usecs: 0 sample-interval: 
0 pkt-rate-low: 0 pkt-rate-high: 0 rx-usecs: 202 rx-frames: 0 rx-usecs-irq: 0 rx-frames-irq: 0 tx-usecs: 1000 tx-frames: 25 tx-usecs-irq: 0 tx-frames-irq: 0 rx-usecs-low: 0 rx-frames-low: 0 tx-usecs-low: 0 tx-frames-low: 0 rx-usecs-high: 0 rx-frames-high: 0 tx-usecs-high: 0 tx-frames-high: 0 > ethtool --per-queue eth0 queue_mask 0x02 --coalesce rx-usecs 100 rx-frames 5 > ethtool --per-queue eth0 queue_mask 0x20 --coalesce rx-usecs 100 rx-frames 5 > ethtool --per-queue eth0 queue_mask 0x22 --show-coalesce Queue: 1 Adaptive RX: off TX: off stats-block-usecs: 0 sample-interval: 0 pkt-rate-low: 0 pkt-rate-high: 0 rx-usecs: 99 rx-frames: 5 rx-usecs-irq: 0 rx-frames-irq: 0 tx-usecs: 1000 tx-frames: 25 tx-usecs-irq: 0 tx-frames-irq: 0 rx-usecs-low: 0 rx-frames-low: 0 tx-usecs-low: 0 tx-frames-low: 0 rx-usecs-high: 0 rx-frames-high: 0 tx-usecs-high: 0 tx-frames-high: 0 Queue: 5 Adaptive RX: off TX: off stats-block-usecs: 0 sample-interval: 0 pkt-rate-low: 0 pkt-rate-high: 0 rx-usecs: 99 rx-frames: 5 rx-usecs-irq: 0 rx-frames-irq: 0 tx-usecs: 1000 tx-frames: 25 tx-usecs-irq: 0 tx-frames-irq: 0 rx-usecs-low: 0 rx-frames-low: 0 tx-usecs-low: 0 tx-frames-low: 0 rx-usecs-high: 0 rx-frames-high: 0 tx-usecs-high: 0 tx-frames-high: 0 > ethtool --per-queue eth0 queue_mask 0x04 --coalesce tx-usecs 156 tx-frames 26 > ethtool --per-queue eth0 queue_mask 0x40 --coalesce tx-usecs 156 tx-frames 26 > ethtool --per-queue eth0 queue_mask 0x44 --show-coalesce Queue: 2 Adaptive RX: off TX: off stats-block-usecs: 0 sample-interval: 0 pkt-rate-low: 0 pkt-rate-high: 0 rx-usecs: 200 rx-frames: 0 rx-usecs-irq: 0 rx-frames-irq: 0 tx-usecs: 156 tx-frames: 26 tx-usecs-irq: 0 tx-frames-irq: 0 rx-usecs-low: 0 rx-frames-low: 0 tx-usecs-low: 0 tx-frames-low: 0 rx-usecs-high: 0 rx-frames-high: 0 tx-usecs-high: 0 tx-frames-high: 0 Queue: 6 Adaptive RX: off TX: off stats-block-usecs: 0 sample-interval: 0 pkt-rate-low: 0 pkt-rate-high: 0 rx-usecs: 200 rx-frames: 0 rx-usecs-irq: 0 rx-frames-irq: 0 tx-usecs: 
156 tx-frames: 26 tx-usecs-irq: 0 tx-frames-irq: 0 rx-usecs-low: 0 rx-frames-low: 0 tx-usecs-low: 0 tx-frames-low: 0 rx-usecs-high: 0 rx-frames-high: 0 tx-usecs-high: 0 tx-frames-high: 0 > ethtool --per-queue eth0 queue_mask 0xFF --coalesce rx-usecs 204 rx-frames 0 rx-frames unmodified, ignoring rx-frames unmodified, ignoring rx-frames unmodified, ignoring rx-frames unmodified, ignoring rx-frames unmodified, ignoring rx-frames unmodified, ignoring > ethtool --per-queue eth0 queue_mask 0xFF --coalesce tx-usecs 1000 tx-frames 25 tx-usecs unmodified, ignoring tx-frames unmodified, ignoring Queue 0, no coalesce parameters changed tx-usecs unmodified, ignoring tx-frames unmodified, ignoring Queue 1, no coalesce parameters changed tx-usecs unmodified, ignoring tx-frames unmodified, ignoring Queue 3, no coalesce parameters changed tx-usecs unmodified, ignoring tx-frames unmodified, ignoring Queue 4, no coalesce parameters changed tx-usecs unmodified, ignoring tx-frames unmodified, ignoring Queue 5, no coalesce parameters changed tx-usecs unmodified, ignoring tx-frames unmodified, ignoring Queue 7, no coalesce parameters changed > ethtool --show-coalesce eth0 Coalesce parameters for eth0: Adaptive RX: off TX: off stats-block-usecs: 0 sample-interval: 0 pkt-rate-low: 0 pkt-rate-high: 0 rx-usecs: 202 rx-frames: 0 rx-usecs-irq: 0 rx-frames-irq: 0 tx-usecs: 1000 tx-frames: 25 tx-usecs-irq: 0 tx-frames-irq: 0 rx-usecs-low: 0 rx-frames-low: 0 tx-usecs-low: 0 tx-frames-low: 0 rx-usecs-high: 0 rx-frames-high: 0 tx-usecs-high: 0 tx-frames-high: 0 > ethtool --per-queue eth0 queue_mask 0xFF --show-coalesce Queue: 0 Adaptive RX: off TX: off stats-block-usecs: 0 sample-interval: 0 pkt-rate-low: 0 pkt-rate-high: 0 rx-usecs: 202 rx-frames: 0 rx-usecs-irq: 0 rx-frames-irq: 0 tx-usecs: 1000 tx-frames: 25 tx-usecs-irq: 0 tx-frames-irq: 0 rx-usecs-low: 0 rx-frames-low: 0 tx-usecs-low: 0 tx-frames-low: 0 rx-usecs-high: 0 rx-frames-high: 0 tx-usecs-high: 0 tx-frames-high: 0 Queue: 1 Adaptive 
RX: off TX: off stats-block-usecs: 0 sample-interval: 0 pkt-rate-low: 0 pkt-rate-high: 0 rx-usecs: 202 rx-frames: 0 rx-usecs-irq: 0 rx-frames-irq: 0 tx-usecs: 1000 tx-frames: 25 tx-usecs-irq: 0 tx-frames-irq: 0 rx-usecs-low: 0 rx-frames-low: 0 tx-usecs-low: 0 tx-frames-low: 0 rx-usecs-high: 0 rx-frames-high: 0 tx-usecs-high: 0 tx-frames-high: 0 Queue: 2 Adaptive RX: off TX: off stats-block-usecs: 0 sample-interval: 0 pkt-rate-low: 0 pkt-rate-high: 0 rx-usecs: 202 rx-frames: 0 rx-usecs-irq: 0 rx-frames-irq: 0 tx-usecs: 1000 tx-frames: 25 tx-usecs-irq: 0 tx-frames-irq: 0 rx-usecs-low: 0 rx-frames-low: 0 tx-usecs-low: 0 tx-frames-low: 0 rx-usecs-high: 0 rx-frames-high: 0 tx-usecs-high: 0 tx-frames-high: 0 Queue: 3 Adaptive RX: off TX: off stats-block-usecs: 0 sample-interval: 0 pkt-rate-low: 0 pkt-rate-high: 0 rx-usecs: 202 rx-frames: 0 rx-usecs-irq: 0 rx-frames-irq: 0 tx-usecs: 1000 tx-frames: 25 tx-usecs-irq: 0 tx-frames-irq: 0 rx-usecs-low: 0 rx-frames-low: 0 tx-usecs-low: 0 tx-frames-low: 0 rx-usecs-high: 0 rx-frames-high: 0 tx-usecs-high: 0 tx-frames-high: 0 Queue: 4 Adaptive RX: off TX: off stats-block-usecs: 0 sample-interval: 0 pkt-rate-low: 0 pkt-rate-high: 0 rx-usecs: 202 rx-frames: 0 rx-usecs-irq: 0 rx-frames-irq: 0 tx-usecs: 1000 tx-frames: 25 tx-usecs-irq: 0 tx-frames-irq: 0 rx-usecs-low: 0 rx-frames-low: 0 tx-usecs-low: 0 tx-frames-low: 0 rx-usecs-high: 0 rx-frames-high: 0 tx-usecs-high: 0 tx-frames-high: 0 Queue: 5 Adaptive RX: off TX: off stats-block-usecs: 0 sample-interval: 0 pkt-rate-low: 0 pkt-rate-high: 0 rx-usecs: 202 rx-frames: 0 rx-usecs-irq: 0 rx-frames-irq: 0 tx-usecs: 1000 tx-frames: 25 tx-usecs-irq: 0 tx-frames-irq: 0 rx-usecs-low: 0 rx-frames-low: 0 tx-usecs-low: 0 tx-frames-low: 0 rx-usecs-high: 0 rx-frames-high: 0 tx-usecs-high: 0 tx-frames-high: 0 Queue: 6 Adaptive RX: off TX: off stats-block-usecs: 0 sample-interval: 0 pkt-rate-low: 0 pkt-rate-high: 0 rx-usecs: 202 rx-frames: 0 rx-usecs-irq: 0 rx-frames-irq: 0 tx-usecs: 1000 tx-frames: 
25 tx-usecs-irq: 0 tx-frames-irq: 0 rx-usecs-low: 0 rx-frames-low: 0 tx-usecs-low: 0 tx-frames-low: 0 rx-usecs-high: 0 rx-frames-high: 0 tx-usecs-high: 0 tx-frames-high: 0
Queue: 7 Adaptive RX: off TX: off stats-block-usecs: 0 sample-interval: 0 pkt-rate-low: 0 pkt-rate-high: 0 rx-usecs: 202 rx-frames: 0 rx-usecs-irq: 0 rx-frames-irq: 0 tx-usecs: 1000 tx-frames: 25 tx-usecs-irq: 0 tx-frames-irq: 0 rx-usecs-low: 0 rx-frames-low: 0 tx-usecs-low: 0 tx-frames-low: 0 rx-usecs-high: 0 rx-frames-high: 0 tx-usecs-high: 0 tx-frames-high: 0
########################################################################

Thanks,
Boon Leong

Ong Boon Leong (1):
  net: stmmac: add per-queue TX & RX coalesce ethtool support

 .../ethernet/stmicro/stmmac/dwmac1000_dma.c | 2 +-
 .../net/ethernet/stmicro/stmmac/dwmac4_dma.c | 7 +-
 .../ethernet/stmicro/stmmac/dwxgmac2_dma.c | 7 +-
 drivers/net/ethernet/stmicro/stmmac/hwif.h | 2 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac.h | 8 +-
 .../ethernet/stmicro/stmmac/stmmac_ethtool.c | 132 ++++++++++++++++--
 .../net/ethernet/stmicro/stmmac/stmmac_main.c | 48 ++++---
 7 files changed, 157 insertions(+), 49 deletions(-)

--
2.25.1
* [PATCH net-next 1/1] net: stmmac: add per-queue TX & RX coalesce ethtool support
From: Ong Boon Leong @ 2021-03-15 6:44 UTC
To: Giuseppe Cavallaro, Alexandre Torgue, Jose Abreu, David S . Miller, Jakub Kicinski, Maxime Coquelin
Cc: netdev, linux-stm32, linux-arm-kernel, linux-kernel, Ong Boon Leong

Extend the driver to support per-queue RX and TX coalesce settings, so that the commands below work:

To show the per-queue coalesce settings:
$ ethtool --per-queue <DEVNAME> queue_mask <MASK> --show-coalesce

To set the per-queue coalesce settings:
$ ethtool --per-queue <DEVNAME> queue_mask <MASK> --coalesce \
     [rx-usecs N] [rx-frames M] [tx-usecs P] [tx-frames Q]

Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
---
 .../ethernet/stmicro/stmmac/dwmac1000_dma.c | 2 +-
 .../net/ethernet/stmicro/stmmac/dwmac4_dma.c | 7 +-
 .../ethernet/stmicro/stmmac/dwxgmac2_dma.c | 7 +-
 drivers/net/ethernet/stmicro/stmmac/hwif.h | 2 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac.h | 8 +-
 .../ethernet/stmicro/stmmac/stmmac_ethtool.c | 132 ++++++++++++++++--
 .../net/ethernet/stmicro/stmmac/stmmac_main.c | 48 ++++---
 7 files changed, 157 insertions(+), 49 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
index 2bac49b49f73..90383abafa66 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
@@ -255,7 +255,7 @@ static void dwmac1000_get_hw_feature(void __iomem *ioaddr,
 }
 
 static void dwmac1000_rx_watchdog(void __iomem *ioaddr, u32 riwt,
-				  u32 number_chan)
+				  u32 queue)
 {
 	writel(riwt, ioaddr + DMA_RX_WATCHDOG);
 }
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c
index
62aa0e95beb7..8958778d16b7 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_dma.c @@ -210,12 +210,9 @@ static void dwmac4_dump_dma_regs(void __iomem *ioaddr, u32 *reg_space) _dwmac4_dump_dma_regs(ioaddr, i, reg_space); } -static void dwmac4_rx_watchdog(void __iomem *ioaddr, u32 riwt, u32 number_chan) +static void dwmac4_rx_watchdog(void __iomem *ioaddr, u32 riwt, u32 queue) { - u32 chan; - - for (chan = 0; chan < number_chan; chan++) - writel(riwt, ioaddr + DMA_CHAN_RX_WATCHDOG(chan)); + writel(riwt, ioaddr + DMA_CHAN_RX_WATCHDOG(queue)); } static void dwmac4_dma_rx_chan_op_mode(void __iomem *ioaddr, int mode, diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c index 77308c5c5d29..f2cab5b76732 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_dma.c @@ -441,12 +441,9 @@ static void dwxgmac2_get_hw_feature(void __iomem *ioaddr, dma_cap->frpsel = (hw_cap & XGMAC_HWFEAT_FRPSEL) >> 3; } -static void dwxgmac2_rx_watchdog(void __iomem *ioaddr, u32 riwt, u32 nchan) +static void dwxgmac2_rx_watchdog(void __iomem *ioaddr, u32 riwt, u32 queue) { - u32 i; - - for (i = 0; i < nchan; i++) - writel(riwt & XGMAC_RWT, ioaddr + XGMAC_DMA_CH_Rx_WATCHDOG(i)); + writel(riwt & XGMAC_RWT, ioaddr + XGMAC_DMA_CH_Rx_WATCHDOG(queue)); } static void dwxgmac2_set_rx_ring_len(void __iomem *ioaddr, u32 len, u32 chan) diff --git a/drivers/net/ethernet/stmicro/stmmac/hwif.h b/drivers/net/ethernet/stmicro/stmmac/hwif.h index 979ac9fca23c..da9996a985f6 100644 --- a/drivers/net/ethernet/stmicro/stmmac/hwif.h +++ b/drivers/net/ethernet/stmicro/stmmac/hwif.h @@ -206,7 +206,7 @@ struct stmmac_dma_ops { void (*get_hw_feature)(void __iomem *ioaddr, struct dma_features *dma_cap); /* Program the HW RX Watchdog */ - void (*rx_watchdog)(void __iomem *ioaddr, u32 riwt, u32 number_chan); + void (*rx_watchdog)(void __iomem 
*ioaddr, u32 riwt, u32 queue); void (*set_tx_ring_len)(void __iomem *ioaddr, u32 len, u32 chan); void (*set_rx_ring_len)(void __iomem *ioaddr, u32 len, u32 chan); void (*set_rx_tail_ptr)(void __iomem *ioaddr, u32 tail_ptr, u32 chan); diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h index e553b9a1f785..74ecd80fec1b 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h @@ -147,9 +147,9 @@ struct stmmac_flow_entry { struct stmmac_priv { /* Frequently used values are kept adjacent for cache effect */ - u32 tx_coal_frames; - u32 tx_coal_timer; - u32 rx_coal_frames; + u32 tx_coal_frames[MTL_MAX_TX_QUEUES]; + u32 tx_coal_timer[MTL_MAX_TX_QUEUES]; + u32 rx_coal_frames[MTL_MAX_TX_QUEUES]; int tx_coalesce; int hwts_tx_en; @@ -160,7 +160,7 @@ struct stmmac_priv { unsigned int dma_buf_sz; unsigned int rx_copybreak; - u32 rx_riwt; + u32 rx_riwt[MTL_MAX_TX_QUEUES]; int hwts_rx_en; void __iomem *ioaddr; diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c index c5642985ef95..5fadd8f42d29 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c @@ -756,28 +756,89 @@ static u32 stmmac_riwt2usec(u32 riwt, struct stmmac_priv *priv) return (riwt * 256) / (clk / 1000000); } -static int stmmac_get_coalesce(struct net_device *dev, - struct ethtool_coalesce *ec) +static int __stmmac_get_coalesce(struct net_device *dev, + struct ethtool_coalesce *ec, + int queue) { struct stmmac_priv *priv = netdev_priv(dev); + u32 max_cnt; + u32 rx_cnt; + u32 tx_cnt; - ec->tx_coalesce_usecs = priv->tx_coal_timer; - ec->tx_max_coalesced_frames = priv->tx_coal_frames; + rx_cnt = priv->plat->rx_queues_to_use; + tx_cnt = priv->plat->tx_queues_to_use; + max_cnt = max(rx_cnt, tx_cnt); - if (priv->use_riwt) { - ec->rx_max_coalesced_frames = priv->rx_coal_frames; - 
ec->rx_coalesce_usecs = stmmac_riwt2usec(priv->rx_riwt, priv); + if (queue < 0) + queue = 0; + else if (queue >= max_cnt) + return -EINVAL; + + if (queue < tx_cnt) { + ec->tx_coalesce_usecs = priv->tx_coal_timer[queue]; + ec->tx_max_coalesced_frames = priv->tx_coal_frames[queue]; + } else { + ec->tx_coalesce_usecs = -1; + ec->tx_max_coalesced_frames = -1; + } + + if (priv->use_riwt && queue < rx_cnt) { + ec->rx_max_coalesced_frames = priv->rx_coal_frames[queue]; + ec->rx_coalesce_usecs = stmmac_riwt2usec(priv->rx_riwt[queue], + priv); + } else { + ec->rx_max_coalesced_frames = -1; + ec->rx_coalesce_usecs = -1; } return 0; } -static int stmmac_set_coalesce(struct net_device *dev, +static int stmmac_get_coalesce(struct net_device *dev, struct ethtool_coalesce *ec) +{ + return __stmmac_get_coalesce(dev, ec, -1); +} + +static int stmmac_get_per_queue_coalesce(struct net_device *dev, u32 queue, + struct ethtool_coalesce *ec) +{ + return __stmmac_get_coalesce(dev, ec, queue); +} + +static int __stmmac_set_coalesce(struct net_device *dev, + struct ethtool_coalesce *ec, + int queue) { struct stmmac_priv *priv = netdev_priv(dev); - u32 rx_cnt = priv->plat->rx_queues_to_use; + bool all_queues = false; unsigned int rx_riwt; + u32 max_cnt; + u32 rx_cnt; + u32 tx_cnt; + + rx_cnt = priv->plat->rx_queues_to_use; + tx_cnt = priv->plat->tx_queues_to_use; + max_cnt = max(rx_cnt, tx_cnt); + + if (queue < 0) + all_queues = true; + else if (queue >= max_cnt) + return -EINVAL; + + /* Check not supported parameters */ + if (ec->rx_coalesce_usecs_irq || + ec->rx_max_coalesced_frames_irq || ec->tx_coalesce_usecs_irq || + ec->use_adaptive_rx_coalesce || ec->use_adaptive_tx_coalesce || + ec->pkt_rate_low || ec->rx_coalesce_usecs_low || + ec->rx_max_coalesced_frames_low || ec->tx_coalesce_usecs_high || + ec->tx_max_coalesced_frames_low || ec->pkt_rate_high || + ec->tx_coalesce_usecs_low || ec->rx_coalesce_usecs_high || + ec->rx_max_coalesced_frames_high || + ec->tx_max_coalesced_frames_irq || 
+ ec->stats_block_coalesce_usecs || + ec->tx_max_coalesced_frames_high || ec->rate_sample_interval) + return -EOPNOTSUPP; if (priv->use_riwt && (ec->rx_coalesce_usecs > 0)) { rx_riwt = stmmac_usec2riwt(ec->rx_coalesce_usecs, priv); @@ -785,8 +846,23 @@ static int stmmac_set_coalesce(struct net_device *dev, if ((rx_riwt > MAX_DMA_RIWT) || (rx_riwt < MIN_DMA_RIWT)) return -EINVAL; - priv->rx_riwt = rx_riwt; - stmmac_rx_watchdog(priv, priv->ioaddr, priv->rx_riwt, rx_cnt); + if (all_queues) { + int i; + + for (i = 0; i < rx_cnt; i++) { + priv->rx_riwt[i] = rx_riwt; + stmmac_rx_watchdog(priv, priv->ioaddr, + rx_riwt, i); + priv->rx_coal_frames[i] = + ec->rx_max_coalesced_frames; + } + } else if (queue < rx_cnt) { + priv->rx_riwt[queue] = rx_riwt; + stmmac_rx_watchdog(priv, priv->ioaddr, + rx_riwt, queue); + priv->rx_coal_frames[queue] = + ec->rx_max_coalesced_frames; + } } if ((ec->tx_coalesce_usecs == 0) && @@ -797,13 +873,37 @@ static int stmmac_set_coalesce(struct net_device *dev, (ec->tx_max_coalesced_frames > STMMAC_TX_MAX_FRAMES)) return -EINVAL; - /* Only copy relevant parameters, ignore all others. 
*/ - priv->tx_coal_frames = ec->tx_max_coalesced_frames; - priv->tx_coal_timer = ec->tx_coalesce_usecs; - priv->rx_coal_frames = ec->rx_max_coalesced_frames; + if (all_queues) { + int i; + + for (i = 0; i < tx_cnt; i++) { + priv->tx_coal_frames[i] = + ec->tx_max_coalesced_frames; + priv->tx_coal_timer[i] = + ec->tx_coalesce_usecs; + } + } else if (queue < tx_cnt) { + priv->tx_coal_frames[queue] = + ec->tx_max_coalesced_frames; + priv->tx_coal_timer[queue] = + ec->tx_coalesce_usecs; + } + return 0; } +static int stmmac_set_coalesce(struct net_device *dev, + struct ethtool_coalesce *ec) +{ + return __stmmac_set_coalesce(dev, ec, -1); +} + +static int stmmac_set_per_queue_coalesce(struct net_device *dev, u32 queue, + struct ethtool_coalesce *ec) +{ + return __stmmac_set_coalesce(dev, ec, queue); +} + static int stmmac_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *rxnfc, u32 *rule_locs) { @@ -1001,6 +1101,8 @@ static const struct ethtool_ops stmmac_ethtool_ops = { .get_ts_info = stmmac_get_ts_info, .get_coalesce = stmmac_get_coalesce, .set_coalesce = stmmac_set_coalesce, + .get_per_queue_coalesce = stmmac_get_per_queue_coalesce, + .set_per_queue_coalesce = stmmac_set_per_queue_coalesce, .get_channels = stmmac_get_channels, .set_channels = stmmac_set_channels, .get_tunable = stmmac_get_tunable, diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index e58ff652e95f..5d8601b8b809 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -2185,7 +2185,8 @@ static int stmmac_tx_clean(struct stmmac_priv *priv, int budget, u32 queue) /* We still have pending packets, let's call for a new scheduling */ if (tx_q->dirty_tx != tx_q->cur_tx) - hrtimer_start(&tx_q->txtimer, STMMAC_COAL_TIMER(priv->tx_coal_timer), + hrtimer_start(&tx_q->txtimer, + STMMAC_COAL_TIMER(priv->tx_coal_timer[queue]), HRTIMER_MODE_REL); 
__netif_tx_unlock_bh(netdev_get_tx_queue(priv->dev, queue)); @@ -2470,7 +2471,8 @@ static void stmmac_tx_timer_arm(struct stmmac_priv *priv, u32 queue) { struct stmmac_tx_queue *tx_q = &priv->tx_queue[queue]; - hrtimer_start(&tx_q->txtimer, STMMAC_COAL_TIMER(priv->tx_coal_timer), + hrtimer_start(&tx_q->txtimer, + STMMAC_COAL_TIMER(priv->tx_coal_timer[queue]), HRTIMER_MODE_REL); } @@ -2511,18 +2513,21 @@ static enum hrtimer_restart stmmac_tx_timer(struct hrtimer *t) static void stmmac_init_coalesce(struct stmmac_priv *priv) { u32 tx_channel_count = priv->plat->tx_queues_to_use; + u32 rx_channel_count = priv->plat->rx_queues_to_use; u32 chan; - priv->tx_coal_frames = STMMAC_TX_FRAMES; - priv->tx_coal_timer = STMMAC_COAL_TX_TIMER; - priv->rx_coal_frames = STMMAC_RX_FRAMES; - for (chan = 0; chan < tx_channel_count; chan++) { struct stmmac_tx_queue *tx_q = &priv->tx_queue[chan]; + priv->tx_coal_frames[chan] = STMMAC_TX_FRAMES; + priv->tx_coal_timer[chan] = STMMAC_COAL_TX_TIMER; + hrtimer_init(&tx_q->txtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL); tx_q->txtimer.function = stmmac_tx_timer; } + + for (chan = 0; chan < rx_channel_count; chan++) + priv->rx_coal_frames[chan] = STMMAC_RX_FRAMES; } static void stmmac_set_rings_length(struct stmmac_priv *priv) @@ -2827,10 +2832,15 @@ static int stmmac_hw_setup(struct net_device *dev, bool init_ptp) priv->tx_lpi_timer = eee_timer * 1000; if (priv->use_riwt) { - if (!priv->rx_riwt) - priv->rx_riwt = DEF_DMA_RIWT; + u32 queue; + + for (queue = 0; queue < rx_cnt; queue++) { + if (!priv->rx_riwt[queue]) + priv->rx_riwt[queue] = DEF_DMA_RIWT; - ret = stmmac_rx_watchdog(priv, priv->ioaddr, priv->rx_riwt, rx_cnt); + stmmac_rx_watchdog(priv, priv->ioaddr, + priv->rx_riwt[queue], queue); + } } if (priv->hw->pcs) @@ -3319,11 +3329,12 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev) if ((skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) && priv->hwts_tx_en) set_ic = true; - else if (!priv->tx_coal_frames) + else if 
(!priv->tx_coal_frames[queue]) set_ic = false; - else if (tx_packets > priv->tx_coal_frames) + else if (tx_packets > priv->tx_coal_frames[queue]) set_ic = true; - else if ((tx_q->tx_count_frames % priv->tx_coal_frames) < tx_packets) + else if ((tx_q->tx_count_frames % + priv->tx_coal_frames[queue]) < tx_packets) set_ic = true; else set_ic = false; @@ -3548,11 +3559,12 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev) if ((skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) && priv->hwts_tx_en) set_ic = true; - else if (!priv->tx_coal_frames) + else if (!priv->tx_coal_frames[queue]) set_ic = false; - else if (tx_packets > priv->tx_coal_frames) + else if (tx_packets > priv->tx_coal_frames[queue]) set_ic = true; - else if ((tx_q->tx_count_frames % priv->tx_coal_frames) < tx_packets) + else if ((tx_q->tx_count_frames % + priv->tx_coal_frames[queue]) < tx_packets) set_ic = true; else set_ic = false; @@ -3751,11 +3763,11 @@ static inline void stmmac_rx_refill(struct stmmac_priv *priv, u32 queue) stmmac_refill_desc3(priv, rx_q, p); rx_q->rx_count_frames++; - rx_q->rx_count_frames += priv->rx_coal_frames; - if (rx_q->rx_count_frames > priv->rx_coal_frames) + rx_q->rx_count_frames += priv->rx_coal_frames[queue]; + if (rx_q->rx_count_frames > priv->rx_coal_frames[queue]) rx_q->rx_count_frames = 0; - use_rx_wd = !priv->rx_coal_frames; + use_rx_wd = !priv->rx_coal_frames[queue]; use_rx_wd |= rx_q->rx_count_frames > 0; if (!priv->use_riwt) use_rx_wd = false; -- 2.25.1 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH net-next 1/1] net: stmmac: add per-queue TX & RX coalesce ethtool support 2021-03-15 6:44 ` [PATCH net-next 1/1] net: stmmac: add per-queue TX & RX coalesce ethtool support Ong Boon Leong @ 2021-03-15 19:50 ` Jakub Kicinski 2021-03-16 4:44 ` Ong, Boon Leong 0 siblings, 1 reply; 4+ messages in thread From: Jakub Kicinski @ 2021-03-15 19:50 UTC (permalink / raw) To: Ong Boon Leong Cc: Giuseppe Cavallaro, Alexandre Torgue, Jose Abreu, David S . Miller, Maxime Coquelin, netdev, linux-stm32, linux-arm-kernel, linux-kernel On Mon, 15 Mar 2021 14:44:48 +0800 Ong Boon Leong wrote: > Extending the driver to support per-queue RX and TX coalesce settings in > order to support below commands: > > To show per-queue coalesce setting:- > $ ethtool --per-queue <DEVNAME> queue_mask <MASK> --show-coalesce > > To set per-queue coalesce setting:- > $ ethtool --per-queue <DEVNAME> queue_mask <MASK> --coalesce \ > [rx-usecs N] [rx-frames M] [tx-usecs P] [tx-frames Q] > > Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> > -static int stmmac_get_coalesce(struct net_device *dev, > - struct ethtool_coalesce *ec) > +static int __stmmac_get_coalesce(struct net_device *dev, > + struct ethtool_coalesce *ec, > + int queue) > { > struct stmmac_priv *priv = netdev_priv(dev); > + u32 max_cnt; > + u32 rx_cnt; > + u32 tx_cnt; > > - ec->tx_coalesce_usecs = priv->tx_coal_timer; > - ec->tx_max_coalesced_frames = priv->tx_coal_frames; > + rx_cnt = priv->plat->rx_queues_to_use; > + tx_cnt = priv->plat->tx_queues_to_use; > + max_cnt = max(rx_cnt, tx_cnt); > > - if (priv->use_riwt) { > - ec->rx_max_coalesced_frames = priv->rx_coal_frames; > - ec->rx_coalesce_usecs = stmmac_riwt2usec(priv->rx_riwt, priv); > + if (queue < 0) > + queue = 0; > + else if (queue >= max_cnt) > + return -EINVAL; > + > + if (queue < tx_cnt) { > + ec->tx_coalesce_usecs = priv->tx_coal_timer[queue]; > + ec->tx_max_coalesced_frames = priv->tx_coal_frames[queue]; > + } else { > + ec->tx_coalesce_usecs = -1; > + 
ec->tx_max_coalesced_frames = -1; > + } > + > + if (priv->use_riwt && queue < rx_cnt) { > + ec->rx_max_coalesced_frames = priv->rx_coal_frames[queue]; > + ec->rx_coalesce_usecs = stmmac_riwt2usec(priv->rx_riwt[queue], > + priv); > + } else { > + ec->rx_max_coalesced_frames = -1; > + ec->rx_coalesce_usecs = -1; Why the use of negative values? why not leave them as 0? > } > > return 0; > } > > -static int stmmac_set_coalesce(struct net_device *dev, > +static int stmmac_get_coalesce(struct net_device *dev, > struct ethtool_coalesce *ec) > +{ > + return __stmmac_get_coalesce(dev, ec, -1); > +} > + > +static int stmmac_get_per_queue_coalesce(struct net_device *dev, u32 queue, > + struct ethtool_coalesce *ec) > +{ > + return __stmmac_get_coalesce(dev, ec, queue); > +} > + > +static int __stmmac_set_coalesce(struct net_device *dev, > + struct ethtool_coalesce *ec, > + int queue) > { > struct stmmac_priv *priv = netdev_priv(dev); > - u32 rx_cnt = priv->plat->rx_queues_to_use; > + bool all_queues = false; > unsigned int rx_riwt; > + u32 max_cnt; > + u32 rx_cnt; > + u32 tx_cnt; > + > + rx_cnt = priv->plat->rx_queues_to_use; > + tx_cnt = priv->plat->tx_queues_to_use; > + max_cnt = max(rx_cnt, tx_cnt); > + > + if (queue < 0) > + all_queues = true; > + else if (queue >= max_cnt) > + return -EINVAL; > + > + /* Check not supported parameters */ > + if (ec->rx_coalesce_usecs_irq || > + ec->rx_max_coalesced_frames_irq || ec->tx_coalesce_usecs_irq || > + ec->use_adaptive_rx_coalesce || ec->use_adaptive_tx_coalesce || > + ec->pkt_rate_low || ec->rx_coalesce_usecs_low || > + ec->rx_max_coalesced_frames_low || ec->tx_coalesce_usecs_high || > + ec->tx_max_coalesced_frames_low || ec->pkt_rate_high || > + ec->tx_coalesce_usecs_low || ec->rx_coalesce_usecs_high || > + ec->rx_max_coalesced_frames_high || > + ec->tx_max_coalesced_frames_irq || > + ec->stats_block_coalesce_usecs || > + ec->tx_max_coalesced_frames_high || ec->rate_sample_interval) > + return -EOPNOTSUPP; This shouldn't be 
needed now that supported types are expressed in dev->ethtool_ops->supported_coalesce_params, no?

>  	if (priv->use_riwt && (ec->rx_coalesce_usecs > 0)) {
>  		rx_riwt = stmmac_usec2riwt(ec->rx_coalesce_usecs, priv);
> @@ -785,8 +846,23 @@ static int stmmac_set_coalesce(struct net_device *dev,
>  		if ((rx_riwt > MAX_DMA_RIWT) || (rx_riwt < MIN_DMA_RIWT))
>  			return -EINVAL;
>
> -		priv->rx_riwt = rx_riwt;
> -		stmmac_rx_watchdog(priv, priv->ioaddr, priv->rx_riwt, rx_cnt);
> +		if (all_queues) {
> +			int i;
> +
> +			for (i = 0; i < rx_cnt; i++) {
> +				priv->rx_riwt[i] = rx_riwt;
> +				stmmac_rx_watchdog(priv, priv->ioaddr,
> +						   rx_riwt, i);
> +				priv->rx_coal_frames[i] =
> +					ec->rx_max_coalesced_frames;
> +			}
> +		} else if (queue < rx_cnt) {
> +			priv->rx_riwt[queue] = rx_riwt;
> +			stmmac_rx_watchdog(priv, priv->ioaddr,
> +					   rx_riwt, queue);
> +			priv->rx_coal_frames[queue] =
> +				ec->rx_max_coalesced_frames;
> +		}
>  	}
>
>  	if ((ec->tx_coalesce_usecs == 0) &&
> @@ -797,13 +873,37 @@ static int stmmac_set_coalesce(struct net_device *dev,
>  	    (ec->tx_max_coalesced_frames > STMMAC_TX_MAX_FRAMES))
>  		return -EINVAL;
>
> -	/* Only copy relevant parameters, ignore all others. */
> -	priv->tx_coal_frames = ec->tx_max_coalesced_frames;
> -	priv->tx_coal_timer = ec->tx_coalesce_usecs;
> -	priv->rx_coal_frames = ec->rx_max_coalesced_frames;
> +	if (all_queues) {
> +		int i;
> +
> +		for (i = 0; i < tx_cnt; i++) {
> +			priv->tx_coal_frames[i] =
> +				ec->tx_max_coalesced_frames;
> +			priv->tx_coal_timer[i] =
> +				ec->tx_coalesce_usecs;
> +		}
> +	} else if (queue < tx_cnt) {
> +		priv->tx_coal_frames[queue] =
> +			ec->tx_max_coalesced_frames;
> +		priv->tx_coal_timer[queue] =
> +			ec->tx_coalesce_usecs;
> +	}
> +
>  	return 0;
>  }
* RE: [PATCH net-next 1/1] net: stmmac: add per-queue TX & RX coalesce ethtool support
From: Ong, Boon Leong @ 2021-03-16 4:44 UTC
To: Jakub Kicinski
Cc: Giuseppe Cavallaro, Alexandre Torgue, Jose Abreu, David S . Miller, Maxime Coquelin, netdev, linux-stm32, linux-arm-kernel, linux-kernel

>> +	if (queue < tx_cnt) {
>> +		ec->tx_coalesce_usecs = priv->tx_coal_timer[queue];
>> +		ec->tx_max_coalesced_frames = priv->tx_coal_frames[queue];
>> +	} else {
>> +		ec->tx_coalesce_usecs = -1;
>> +		ec->tx_max_coalesced_frames = -1;
>> +	}
>> +
>> +	if (priv->use_riwt && queue < rx_cnt) {
>> +		ec->rx_max_coalesced_frames = priv->rx_coal_frames[queue];
>> +		ec->rx_coalesce_usecs = stmmac_riwt2usec(priv->rx_riwt[queue],
>> +							 priv);
>> +	} else {
>> +		ec->rx_max_coalesced_frames = -1;
>> +		ec->rx_coalesce_usecs = -1;
>
>Why the use of negative values? why not leave them as 0?

The initial logic was to return a negative value for unsupported TX and RX queues, since they are invalid. No preference here, so we can leave them as zeros.
>
>>  	}
>>
>>  	return 0;
>>  }
>>
>> -static int stmmac_set_coalesce(struct net_device *dev,
>> +static int stmmac_get_coalesce(struct net_device *dev,
>>  			       struct ethtool_coalesce *ec)
>> +{
>> +	return __stmmac_get_coalesce(dev, ec, -1);
>> +}
>> +
>> +static int stmmac_get_per_queue_coalesce(struct net_device *dev, u32 queue,
>> +					 struct ethtool_coalesce *ec)
>> +{
>> +	return __stmmac_get_coalesce(dev, ec, queue);
>> +}
>> +
>> +static int __stmmac_set_coalesce(struct net_device *dev,
>> +				 struct ethtool_coalesce *ec,
>> +				 int queue)
>>  {
>>  	struct stmmac_priv *priv = netdev_priv(dev);
>> -	u32 rx_cnt = priv->plat->rx_queues_to_use;
>> +	bool all_queues = false;
>>  	unsigned int rx_riwt;
>> +	u32 max_cnt;
>> +	u32 rx_cnt;
>> +	u32 tx_cnt;
>> +
>> +	rx_cnt = priv->plat->rx_queues_to_use;
>> +	tx_cnt = priv->plat->tx_queues_to_use;
>> +	max_cnt = max(rx_cnt, tx_cnt);
>> +
>> +	if (queue < 0)
>> +		all_queues = true;
>> +	else if (queue >= max_cnt)
>> +		return -EINVAL;
>> +
>> +	/* Check not supported parameters */
>> +	if (ec->rx_coalesce_usecs_irq ||
>> +	    ec->rx_max_coalesced_frames_irq || ec->tx_coalesce_usecs_irq ||
>> +	    ec->use_adaptive_rx_coalesce || ec->use_adaptive_tx_coalesce ||
>> +	    ec->pkt_rate_low || ec->rx_coalesce_usecs_low ||
>> +	    ec->rx_max_coalesced_frames_low || ec->tx_coalesce_usecs_high ||
>> +	    ec->tx_max_coalesced_frames_low || ec->pkt_rate_high ||
>> +	    ec->tx_coalesce_usecs_low || ec->rx_coalesce_usecs_high ||
>> +	    ec->rx_max_coalesced_frames_high ||
>> +	    ec->tx_max_coalesced_frames_irq ||
>> +	    ec->stats_block_coalesce_usecs ||
>> +	    ec->tx_max_coalesced_frames_high || ec->rate_sample_interval)
>> +		return -EOPNOTSUPP;
>
>This shouldn't be needed now that supported types are expressed in
>dev->ethtool_ops->supported_coalesce_params, no?

Will fix this. Thanks for pointing it out.