* [PATCH] net: hinic: avoid kernel hung in hinic_get_stats64()
@ 2022-06-29 7:28 Qiao Ma
2022-06-30 9:56 ` Paolo Abeni
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Qiao Ma @ 2022-06-29 7:28 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, gustavoars, cai.huoqing,
aviad.krawczyk, zhaochen6
Cc: netdev, linux-kernel
When using a hinic device as a bond slave device and reading the device
stats of the master bond device, the kernel may hang.
The kernel panic call trace is as follows:
Kernel panic - not syncing: softlockup: hung tasks
Call trace:
native_queued_spin_lock_slowpath+0x1ec/0x31c
dev_get_stats+0x60/0xcc
dev_seq_printf_stats+0x40/0x120
dev_seq_show+0x1c/0x40
seq_read_iter+0x3c8/0x4dc
seq_read+0xe0/0x130
proc_reg_read+0xa8/0xe0
vfs_read+0xb0/0x1d4
ksys_read+0x70/0xfc
__arm64_sys_read+0x20/0x30
el0_svc_common+0x88/0x234
do_el0_svc+0x2c/0x90
el0_svc+0x1c/0x30
el0_sync_handler+0xa8/0xb0
el0_sync+0x148/0x180
And the call trace of the task that actually caused the hang is as follows:
__switch_to+124
__schedule+548
schedule+72
schedule_timeout+348
__down_common+188
__down+24
down+104
hinic_get_stats64+44 [hinic]
dev_get_stats+92
bond_get_stats+172 [bonding]
dev_get_stats+92
dev_seq_printf_stats+60
dev_seq_show+24
seq_read_iter+964
seq_read+220
proc_reg_read+164
vfs_read+172
ksys_read+108
__arm64_sys_read+28
el0_svc_common+132
do_el0_svc+40
el0_svc+24
el0_sync_handler+164
el0_sync+324
When getting device stats from a bond, the kernel calls bond_get_stats().
It first takes the spinlock bond->stats_lock and then calls
hinic_get_stats64() to collect the hinic device's stats.
However, hinic_get_stats64() calls `down(&nic_dev->mgmt_lock)` to
protect its critical section, and down() may schedule the current task
out while bond->stats_lock is still held. If the system is under high
pressure, the task cannot be woken up promptly, and the softlockup
detector eventually triggers a panic.
Fixes: edd384f682cc ("net-next/hinic: Add ethtool and stats")
Signed-off-by: Qiao Ma <mqaio@linux.alibaba.com>
---
drivers/net/ethernet/huawei/hinic/hinic_dev.h | 1 +
drivers/net/ethernet/huawei/hinic/hinic_main.c | 7 +++----
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_dev.h b/drivers/net/ethernet/huawei/hinic/hinic_dev.h
index fb3e89141a0d..1fb343d03fd5 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_dev.h
+++ b/drivers/net/ethernet/huawei/hinic/hinic_dev.h
@@ -97,6 +97,7 @@ struct hinic_dev {
struct hinic_txq_stats tx_stats;
struct hinic_rxq_stats rx_stats;
+ spinlock_t stats_lock;
u8 rss_tmpl_idx;
u8 rss_hash_engine;
diff --git a/drivers/net/ethernet/huawei/hinic/hinic_main.c b/drivers/net/ethernet/huawei/hinic/hinic_main.c
index 56a89793f47d..32a3d700ad26 100644
--- a/drivers/net/ethernet/huawei/hinic/hinic_main.c
+++ b/drivers/net/ethernet/huawei/hinic/hinic_main.c
@@ -125,11 +125,13 @@ static void update_nic_stats(struct hinic_dev *nic_dev)
{
int i, num_qps = hinic_hwdev_num_qps(nic_dev->hwdev);
+ spin_lock(&nic_dev->stats_lock);
for (i = 0; i < num_qps; i++)
update_rx_stats(nic_dev, &nic_dev->rxqs[i]);
for (i = 0; i < num_qps; i++)
update_tx_stats(nic_dev, &nic_dev->txqs[i]);
+ spin_unlock(&nic_dev->stats_lock);
}
/**
@@ -859,13 +861,9 @@ static void hinic_get_stats64(struct net_device *netdev,
nic_rx_stats = &nic_dev->rx_stats;
nic_tx_stats = &nic_dev->tx_stats;
- down(&nic_dev->mgmt_lock);
-
if (nic_dev->flags & HINIC_INTF_UP)
update_nic_stats(nic_dev);
- up(&nic_dev->mgmt_lock);
-
stats->rx_bytes = nic_rx_stats->bytes;
stats->rx_packets = nic_rx_stats->pkts;
stats->rx_errors = nic_rx_stats->errors;
@@ -1239,6 +1237,7 @@ static int nic_dev_init(struct pci_dev *pdev)
u64_stats_init(&tx_stats->syncp);
u64_stats_init(&rx_stats->syncp);
+ spin_lock_init(&nic_dev->stats_lock);
nic_dev->vlan_bitmap = devm_bitmap_zalloc(&pdev->dev, VLAN_N_VID,
GFP_KERNEL);
--
1.8.3.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] net: hinic: avoid kernel hung in hinic_get_stats64()
2022-06-29 7:28 [PATCH] net: hinic: avoid kernel hung in hinic_get_stats64() Qiao Ma
@ 2022-06-30 9:56 ` Paolo Abeni
2022-06-30 14:20 ` maqiao
2022-06-30 10:07 ` Paolo Abeni
2022-06-30 10:23 ` Eric Dumazet
2 siblings, 1 reply; 8+ messages in thread
From: Paolo Abeni @ 2022-06-30 9:56 UTC (permalink / raw)
To: Qiao Ma, davem, edumazet, kuba, gustavoars, cai.huoqing,
aviad.krawczyk, zhaochen6
Cc: netdev, linux-kernel
On Wed, 2022-06-29 at 15:28 +0800, Qiao Ma wrote:
> [...]
> Fixes: edd384f682cc ("net-next/hinic: Add ethtool and stats")
> Signed-off-by: Qiao Ma <mqaio@linux.alibaba.com>
Side note: it looks like that after this patch every section protected
by the mgmt_lock is already under rtnl lock protection, so you could
probably remove the hinic specific lock (in a separate, net-next,
patch).
Please double check the above as I skimmed upon that quickly.
Thanks,
Paolo
* Re: [PATCH] net: hinic: avoid kernel hung in hinic_get_stats64()
2022-06-30 9:56 ` Paolo Abeni
@ 2022-06-30 14:20 ` maqiao
0 siblings, 0 replies; 8+ messages in thread
From: maqiao @ 2022-06-30 14:20 UTC (permalink / raw)
To: Paolo Abeni, davem, edumazet, kuba, gustavoars, cai.huoqing,
aviad.krawczyk, zhaochen6
Cc: netdev, linux-kernel
On 2022/6/30 5:56 PM, Paolo Abeni wrote:
> On Wed, 2022-06-29 at 15:28 +0800, Qiao Ma wrote:
>> [...]
>> Fixes: edd384f682cc ("net-next/hinic: Add ethtool and stats")
>> Signed-off-by: Qiao Ma <mqaio@linux.alibaba.com>
>
> Side note: it looks like that after this patch every section protected
> by the mgmt_lock is already under rtnl lock protection, so you could
> probably remove the hinic specific lock (in a separate, net-next,
> patch).
>
> Please double check the above as I skimmed upon that quickly.
Thank you. I need to carefully check that each such section can only be
reached through dev_get_stats() under the rtnl lock.
And I forgot to add the "net-next" prefix to the patch title, forgive me...
>
> Thanks,
>
> Paolo
* Re: [PATCH] net: hinic: avoid kernel hung in hinic_get_stats64()
2022-06-29 7:28 [PATCH] net: hinic: avoid kernel hung in hinic_get_stats64() Qiao Ma
2022-06-30 9:56 ` Paolo Abeni
@ 2022-06-30 10:07 ` Paolo Abeni
2022-06-30 10:23 ` Eric Dumazet
2 siblings, 0 replies; 8+ messages in thread
From: Paolo Abeni @ 2022-06-30 10:07 UTC (permalink / raw)
To: Qiao Ma, davem, edumazet, kuba, gustavoars, cai.huoqing,
aviad.krawczyk, zhaochen6
Cc: netdev, linux-kernel
On Wed, 2022-06-29 at 15:28 +0800, Qiao Ma wrote:
> [...]
> Fixes: edd384f682cc ("net-next/hinic: Add ethtool and stats")
> Signed-off-by: Qiao Ma <mqaio@linux.alibaba.com>
It does not apply cleanly to the net tree (but it does to net-next).
This looks like -net material; please rebase the patch and re-post it,
explicitly targeting the net tree.
Thanks!
Paolo
* Re: [PATCH] net: hinic: avoid kernel hung in hinic_get_stats64()
2022-06-29 7:28 [PATCH] net: hinic: avoid kernel hung in hinic_get_stats64() Qiao Ma
2022-06-30 9:56 ` Paolo Abeni
2022-06-30 10:07 ` Paolo Abeni
@ 2022-06-30 10:23 ` Eric Dumazet
2022-06-30 13:57 ` maqiao
2 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2022-06-30 10:23 UTC (permalink / raw)
To: Qiao Ma
Cc: David Miller, Jakub Kicinski, Paolo Abeni, gustavoars,
cai.huoqing, Aviad Krawczyk, zhaochen6, netdev, LKML
On Wed, Jun 29, 2022 at 9:28 AM Qiao Ma <mqaio@linux.alibaba.com> wrote:
>
> [...]
> Fixes: edd384f682cc ("net-next/hinic: Add ethtool and stats")
> Signed-off-by: Qiao Ma <mqaio@linux.alibaba.com>
> ---
> drivers/net/ethernet/huawei/hinic/hinic_dev.h | 1 +
> drivers/net/ethernet/huawei/hinic/hinic_main.c | 7 +++----
> 2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/huawei/hinic/hinic_dev.h b/drivers/net/ethernet/huawei/hinic/hinic_dev.h
> index fb3e89141a0d..1fb343d03fd5 100644
> --- a/drivers/net/ethernet/huawei/hinic/hinic_dev.h
> +++ b/drivers/net/ethernet/huawei/hinic/hinic_dev.h
> @@ -97,6 +97,7 @@ struct hinic_dev {
>
> struct hinic_txq_stats tx_stats;
> struct hinic_rxq_stats rx_stats;
> + spinlock_t stats_lock;
>
> u8 rss_tmpl_idx;
> u8 rss_hash_engine;
> diff --git a/drivers/net/ethernet/huawei/hinic/hinic_main.c b/drivers/net/ethernet/huawei/hinic/hinic_main.c
> index 56a89793f47d..32a3d700ad26 100644
> --- a/drivers/net/ethernet/huawei/hinic/hinic_main.c
> +++ b/drivers/net/ethernet/huawei/hinic/hinic_main.c
> @@ -125,11 +125,13 @@ static void update_nic_stats(struct hinic_dev *nic_dev)
> {
> int i, num_qps = hinic_hwdev_num_qps(nic_dev->hwdev);
>
> + spin_lock(&nic_dev->stats_lock);
> for (i = 0; i < num_qps; i++)
> update_rx_stats(nic_dev, &nic_dev->rxqs[i]);
>
> for (i = 0; i < num_qps; i++)
> update_tx_stats(nic_dev, &nic_dev->txqs[i]);
> + spin_unlock(&nic_dev->stats_lock);
> }
>
> /**
> @@ -859,13 +861,9 @@ static void hinic_get_stats64(struct net_device *netdev,
> nic_rx_stats = &nic_dev->rx_stats;
> nic_tx_stats = &nic_dev->tx_stats;
>
> - down(&nic_dev->mgmt_lock);
> -
> if (nic_dev->flags & HINIC_INTF_UP)
> update_nic_stats(nic_dev);
>
> - up(&nic_dev->mgmt_lock);
> -
Note: The following is racy, because multiple threads can call
hinic_get_stats64() at the same time.
It needs a loop, see include/linux/u64_stats_sync.h for detail.
> stats->rx_bytes = nic_rx_stats->bytes;
> stats->rx_packets = nic_rx_stats->pkts;
> stats->rx_errors = nic_rx_stats->errors;
Same remark for nic_tx_stats->{bytes|pkts|errors}
> @@ -1239,6 +1237,7 @@ static int nic_dev_init(struct pci_dev *pdev)
>
> u64_stats_init(&tx_stats->syncp);
> u64_stats_init(&rx_stats->syncp);
> + spin_lock_init(&nic_dev->stats_lock);
>
> nic_dev->vlan_bitmap = devm_bitmap_zalloc(&pdev->dev, VLAN_N_VID,
> GFP_KERNEL);
> --
> 1.8.3.1
>
* Re: [PATCH] net: hinic: avoid kernel hung in hinic_get_stats64()
2022-06-30 10:23 ` Eric Dumazet
@ 2022-06-30 13:57 ` maqiao
2022-06-30 13:59 ` Eric Dumazet
0 siblings, 1 reply; 8+ messages in thread
From: maqiao @ 2022-06-30 13:57 UTC (permalink / raw)
To: Eric Dumazet
Cc: David Miller, Jakub Kicinski, Paolo Abeni, gustavoars,
cai.huoqing, Aviad Krawczyk, zhaochen6, netdev, LKML
On 2022/6/30 6:23 PM, Eric Dumazet wrote:
> On Wed, Jun 29, 2022 at 9:28 AM Qiao Ma <mqaio@linux.alibaba.com> wrote:
>> [...]
>
> Note: The following is racy, because multiple threads can call
> hinic_get_stats64() at the same time.
> It needs a loop, see include/linux/u64_stats_sync.h for detail.
Thanks for the reminder. I noticed that nic_tx_stats/nic_rx_stats are
already protected by u64_stats_sync in update_tx/rx_stats(), so it seems
unnecessary to use a spinlock in update_nic_stats().
I will send v2 as soon as possible, thanks.
>
>> stats->rx_bytes = nic_rx_stats->bytes;
>> stats->rx_packets = nic_rx_stats->pkts;
>> stats->rx_errors = nic_rx_stats->errors;
>
> Same remark for nic_tx_stats->{bytes|pkts|errors}
>
>> [...]
* Re: [PATCH] net: hinic: avoid kernel hung in hinic_get_stats64()
2022-06-30 13:57 ` maqiao
@ 2022-06-30 13:59 ` Eric Dumazet
2022-06-30 14:15 ` maqiao
0 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2022-06-30 13:59 UTC (permalink / raw)
To: maqiao
Cc: David Miller, Jakub Kicinski, Paolo Abeni, gustavoars,
cai.huoqing, Aviad Krawczyk, zhaochen6, netdev, LKML
On Thu, Jun 30, 2022 at 3:57 PM maqiao <mqaio@linux.alibaba.com> wrote:
>
>
>
> On 2022/6/30 6:23 PM, Eric Dumazet wrote:
> > Note: The following is racy, because multiple threads can call
> > hinic_get_stats64() at the same time.
> > It needs a loop, see include/linux/u64_stats_sync.h for detail.
> Thanks for reminding, and I noticed that nic_tx_stats/nic_rx_stats has
> been protected by u64_stats_sync in update_t/rx_stats(), it seems that
> it's unnecessary to use spinlock in update_nic_stats().
It is necessary to use the spinlock to protect writers among themselves.
>
> I will send v2 as soon as possible, thanks.
* Re: [PATCH] net: hinic: avoid kernel hung in hinic_get_stats64()
2022-06-30 13:59 ` Eric Dumazet
@ 2022-06-30 14:15 ` maqiao
0 siblings, 0 replies; 8+ messages in thread
From: maqiao @ 2022-06-30 14:15 UTC (permalink / raw)
To: Eric Dumazet
Cc: David Miller, Jakub Kicinski, Paolo Abeni, gustavoars,
cai.huoqing, Aviad Krawczyk, zhaochen6, netdev, LKML
On 2022/6/30 9:59 PM, Eric Dumazet wrote:
> On Thu, Jun 30, 2022 at 3:57 PM maqiao <mqaio@linux.alibaba.com> wrote:
>>
>>
>>
>> On 2022/6/30 6:23 PM, Eric Dumazet wrote:
>
>>> Note: The following is racy, because multiple threads can call
>>> hinic_get_stats64() at the same time.
>>> It needs a loop, see include/linux/u64_stats_sync.h for detail.
>> Thanks for reminding, and I noticed that nic_tx_stats/nic_rx_stats has
>> been protected by u64_stats_sync in update_t/rx_stats(), it seems that
>> it's unnecessary to use spinlock in update_nic_stats().
>
> It is necessary to use the spinlock to protect writers among themselves.
Ohhh, sorry, I was wrong.
I did not realize that a seqlock cannot prevent multiple writers from
entering the critical section...
>
>
>>
>> I will send v2 as soon as possible, thanks.