linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V4 net-next 00/13]  Bug Fixes in ENA driver
@ 2017-02-09 13:21 Netanel Belgazal
  2017-02-09 13:21 ` [PATCH V4 net-next 01/13] net/ena: remove ntuple filter support from device feature list Netanel Belgazal
                   ` (13 more replies)
  0 siblings, 14 replies; 15+ messages in thread
From: Netanel Belgazal @ 2017-02-09 13:21 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet, evgenys

Changes from V3:
* Rebase patchset to master and solve merge conflicts.
* Remove redundant bug fix (fix error handling when probe fails)

Netanel Belgazal (13):
  net/ena: remove ntuple filter support from device feature list
  net/ena: fix queues number calculation
  net/ena: fix ethtool RSS flow configuration
  net/ena: fix RSS default hash configuration
  net/ena: fix NULL dereference when removing the driver after device
    reset failed
  net/ena: refactor ena_get_stats64 to be atomic context safe
  net/ena: fix potential access to freed memory during device reset
  net/ena: use napi_complete_done() return value
  net/ena: use READ_ONCE to access completion descriptors
  net/ena: reduce the severity of ena printouts
  net/ena: change driver's default timeouts
  net/ena: change condition for host attribute configuration
  net/ena: update driver version to 1.1.2

 drivers/net/ethernet/amazon/ena/ena_admin_defs.h |  20 ++-
 drivers/net/ethernet/amazon/ena/ena_com.c        |  41 +++---
 drivers/net/ethernet/amazon/ena/ena_com.h        |   1 +
 drivers/net/ethernet/amazon/ena/ena_eth_com.c    |   8 +-
 drivers/net/ethernet/amazon/ena/ena_netdev.c     | 176 ++++++++++++++++-------
 drivers/net/ethernet/amazon/ena/ena_netdev.h     |   9 +-
 6 files changed, 172 insertions(+), 83 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH V4 net-next 01/13] net/ena: remove ntuple filter support from device feature list
  2017-02-09 13:21 [PATCH V4 net-next 00/13] Bug Fixes in ENA driver Netanel Belgazal
@ 2017-02-09 13:21 ` Netanel Belgazal
  2017-02-09 13:21 ` [PATCH V4 net-next 02/13] net/ena: fix queues number calculation Netanel Belgazal
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Netanel Belgazal @ 2017-02-09 13:21 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet, evgenys

Remove NETIF_F_NTUPLE from netdev->features.
The ENA device driver does not support ntuple filtering.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index aca95b3..d8cd9ca 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -2720,7 +2720,6 @@ static void ena_set_dev_offloads(struct ena_com_dev_get_features_ctx *feat,
 	netdev->features =
 		dev_features |
 		NETIF_F_SG |
-		NETIF_F_NTUPLE |
 		NETIF_F_RXHASH |
 		NETIF_F_HIGHDMA;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH V4 net-next 02/13] net/ena: fix queues number calculation
  2017-02-09 13:21 [PATCH V4 net-next 00/13] Bug Fixes in ENA driver Netanel Belgazal
  2017-02-09 13:21 ` [PATCH V4 net-next 01/13] net/ena: remove ntuple filter support from device feature list Netanel Belgazal
@ 2017-02-09 13:21 ` Netanel Belgazal
  2017-02-09 13:21 ` [PATCH V4 net-next 03/13] net/ena: fix ethtool RSS flow configuration Netanel Belgazal
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Netanel Belgazal @ 2017-02-09 13:21 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet, evgenys

The ENA driver tries to open a queue per vCPU.
To determine how many vCPUs the instance have it uses num_possible_cpus()
while it should have use num_online_cpus() instead.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index d8cd9ca..effe686 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -2658,7 +2658,7 @@ static int ena_calc_io_queue_num(struct pci_dev *pdev,
 		io_sq_num = get_feat_ctx->max_queues.max_sq_num;
 	}
 
-	io_queue_num = min_t(int, num_possible_cpus(), ENA_MAX_NUM_IO_QUEUES);
+	io_queue_num = min_t(int, num_online_cpus(), ENA_MAX_NUM_IO_QUEUES);
 	io_queue_num = min_t(int, io_queue_num, io_sq_num);
 	io_queue_num = min_t(int, io_queue_num,
 			     get_feat_ctx->max_queues.max_cq_num);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH V4 net-next 03/13] net/ena: fix ethtool RSS flow configuration
  2017-02-09 13:21 [PATCH V4 net-next 00/13] Bug Fixes in ENA driver Netanel Belgazal
  2017-02-09 13:21 ` [PATCH V4 net-next 01/13] net/ena: remove ntuple filter support from device feature list Netanel Belgazal
  2017-02-09 13:21 ` [PATCH V4 net-next 02/13] net/ena: fix queues number calculation Netanel Belgazal
@ 2017-02-09 13:21 ` Netanel Belgazal
  2017-02-09 13:21 ` [PATCH V4 net-next 04/13] net/ena: fix RSS default hash configuration Netanel Belgazal
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Netanel Belgazal @ 2017-02-09 13:21 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet, evgenys

ena_flow_data_to_flow_hash and ena_flow_hash_to_flow_type
treat the ena_flow_hash_to_flow_type enum as power of two values.

Change the values of ena_admin_flow_hash_fields to be power of two values.

This bug effect the ethtool set/get rxnfc.
ethtool will report wrong values hash fields for get and will
configure wrong hash fields in set.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_admin_defs.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_admin_defs.h b/drivers/net/ethernet/amazon/ena/ena_admin_defs.h
index a46e749..e1594d6 100644
--- a/drivers/net/ethernet/amazon/ena/ena_admin_defs.h
+++ b/drivers/net/ethernet/amazon/ena/ena_admin_defs.h
@@ -631,22 +631,22 @@ enum ena_admin_flow_hash_proto {
 /* RSS flow hash fields */
 enum ena_admin_flow_hash_fields {
 	/* Ethernet Dest Addr */
-	ENA_ADMIN_RSS_L2_DA	= 0,
+	ENA_ADMIN_RSS_L2_DA	= BIT(0),
 
 	/* Ethernet Src Addr */
-	ENA_ADMIN_RSS_L2_SA	= 1,
+	ENA_ADMIN_RSS_L2_SA	= BIT(1),
 
 	/* ipv4/6 Dest Addr */
-	ENA_ADMIN_RSS_L3_DA	= 2,
+	ENA_ADMIN_RSS_L3_DA	= BIT(2),
 
 	/* ipv4/6 Src Addr */
-	ENA_ADMIN_RSS_L3_SA	= 5,
+	ENA_ADMIN_RSS_L3_SA	= BIT(3),
 
 	/* tcp/udp Dest Port */
-	ENA_ADMIN_RSS_L4_DP	= 6,
+	ENA_ADMIN_RSS_L4_DP	= BIT(4),
 
 	/* tcp/udp Src Port */
-	ENA_ADMIN_RSS_L4_SP	= 7,
+	ENA_ADMIN_RSS_L4_SP	= BIT(5),
 };
 
 struct ena_admin_proto_input {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH V4 net-next 04/13] net/ena: fix RSS default hash configuration
  2017-02-09 13:21 [PATCH V4 net-next 00/13] Bug Fixes in ENA driver Netanel Belgazal
                   ` (2 preceding siblings ...)
  2017-02-09 13:21 ` [PATCH V4 net-next 03/13] net/ena: fix ethtool RSS flow configuration Netanel Belgazal
@ 2017-02-09 13:21 ` Netanel Belgazal
  2017-02-09 13:21 ` [PATCH V4 net-next 05/13] net/ena: fix NULL dereference when removing the driver after device reset failed Netanel Belgazal
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Netanel Belgazal @ 2017-02-09 13:21 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet, evgenys

ENA default hash configures IPv4_frag hash twice instead of
configure non-IP packets.

The bug caused IPv4 fragmented packets to be calculated based on
L2 source and destination address instead of L3 source and destination.
IPv4 packets can reach to the wrong Rx queue.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_com.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index 3066d9c..46aad3a 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -2184,7 +2184,7 @@ int ena_com_set_default_hash_ctrl(struct ena_com_dev *ena_dev)
 	hash_ctrl->selected_fields[ENA_ADMIN_RSS_IP4_FRAG].fields =
 		ENA_ADMIN_RSS_L3_SA | ENA_ADMIN_RSS_L3_DA;
 
-	hash_ctrl->selected_fields[ENA_ADMIN_RSS_IP4_FRAG].fields =
+	hash_ctrl->selected_fields[ENA_ADMIN_RSS_NOT_IP].fields =
 		ENA_ADMIN_RSS_L2_DA | ENA_ADMIN_RSS_L2_SA;
 
 	for (i = 0; i < ENA_ADMIN_RSS_PROTO_NUM; i++) {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH V4 net-next 05/13] net/ena: fix NULL dereference when removing the driver after device reset failed
  2017-02-09 13:21 [PATCH V4 net-next 00/13] Bug Fixes in ENA driver Netanel Belgazal
                   ` (3 preceding siblings ...)
  2017-02-09 13:21 ` [PATCH V4 net-next 04/13] net/ena: fix RSS default hash configuration Netanel Belgazal
@ 2017-02-09 13:21 ` Netanel Belgazal
  2017-02-09 13:21 ` [PATCH V4 net-next 06/13] net/ena: refactor ena_get_stats64 to be atomic context safe Netanel Belgazal
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Netanel Belgazal @ 2017-02-09 13:21 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet, evgenys

If for some reason the device stops responding, and the device reset
failes to recover the device, the mmio register read data structure
will not be reinitialized.

On driver removal, the driver will also try to reset the device, but
this time the mmio data structure will be NULL.

To solve this issue, perform the device reset in the remove function
only if the device is runnig.

Crash log
   54.240382] BUG: unable to handle kernel NULL pointer dereference at           (null)
[   54.244186] IP: [<ffffffffc067de5a>] ena_com_reg_bar_read32+0x8a/0x180 [ena_drv]
[   54.244186] PGD 0
[   54.244186] Oops: 0002 [#1] SMP
[   54.244186] Modules linked in: ena_drv(OE-) snd_hda_codec_generic kvm_intel kvm crct10dif_pclmul ppdev crc32_pclmul ghash_clmulni_intel aesni_intel snd_hda_intel aes_x86_64 snd_hda_controller lrw gf128mul cirrus glue_helper ablk_helper ttm snd_hda_codec drm_kms_helper cryptd snd_hwdep drm snd_pcm pvpanic snd_timer syscopyarea sysfillrect snd parport_pc sysimgblt serio_raw soundcore i2c_piix4 mac_hid lp parport psmouse floppy
[   54.244186] CPU: 5 PID: 1841 Comm: rmmod Tainted: G           OE 3.16.0-031600-generic #201408031935
[   54.244186] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[   54.244186] task: ffff880135852880 ti: ffff8800bb640000 task.ti: ffff8800bb640000
[   54.244186] RIP: 0010:[<ffffffffc067de5a>]  [<ffffffffc067de5a>] ena_com_reg_bar_read32+0x8a/0x180 [ena_drv]
[   54.244186] RSP: 0018:ffff8800bb643d50  EFLAGS: 00010083
[   54.244186] RAX: 000000000000deb0 RBX: 0000000000030d40 RCX: 0000000000000003
[   54.244186] RDX: 0000000000000202 RSI: 0000000000000058 RDI: ffffc90000775104
[   54.244186] RBP: ffff8800bb643d88 R08: 0000000000000000 R09: cf00000000000000
[   54.244186] R10: 0000000fffffffe0 R11: 0000000000000001 R12: 0000000000000000
[   54.244186] R13: ffffc90000765000 R14: ffffc90000775104 R15: 00007fca1fa98090
[   54.244186] FS:  00007fca1f1bd740(0000) GS:ffff88013fd40000(0000) knlGS:0000000000000000
[   54.244186] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   54.244186] CR2: 0000000000000000 CR3: 00000000b9cf6000 CR4: 00000000001406e0
[   54.244186] Stack:
[   54.244186]  0000000000000202 0000005800000286 ffffc90000765000 ffffc90000765000
[   54.244186]  ffff880135f6b000 ffff8800b9360000 00007fca1fa98090 ffff8800bb643db8
[   54.244186]  ffffffffc0680b3d ffff8800b93608c0 ffffc90000765000 ffff880135f6b000
[   54.244186] Call Trace:
[   54.244186]  [<ffffffffc0680b3d>] ena_com_dev_reset+0x1d/0x1b0 [ena_drv]
[   54.244186]  [<ffffffffc0678497>] ena_remove+0xa7/0x130 [ena_drv]
[   54.244186]  [<ffffffff813d4df6>] pci_device_remove+0x46/0xc0
[   54.244186]  [<ffffffff814c3b7f>] __device_release_driver+0x7f/0xf0
[   54.244186]  [<ffffffff814c4738>] driver_detach+0xc8/0xd0
[   54.244186]  [<ffffffff814c3969>] bus_remove_driver+0x59/0xd0
[   54.244186]  [<ffffffff814c4fde>] driver_unregister+0x2e/0x60
[   54.244186]  [<ffffffff810f0a80>] ? show_refcnt+0x40/0x40
[   54.244186]  [<ffffffff813d4ec3>] pci_unregister_driver+0x23/0xa0
[   54.244186]  [<ffffffffc068413f>] ena_cleanup+0x10/0xed1 [ena_drv]
[   54.244186]  [<ffffffff810f3a47>] SyS_delete_module+0x157/0x1e0
[   54.244186]  [<ffffffff81014fb7>] ? do_notify_resume+0xc7/0xd0
[   54.244186]  [<ffffffff81793fad>] system_call_fastpath+0x1a/0x1f
[   54.244186] Code: c3 4d 8d b5 04 01 01 00 4c 89 f7 e8 e1 5a 11 c1 48 89 45 c8 41 0f b7 85 00 01 01 00 8d 48 01 66 2d 52 21 66 41 89 8d 00 01 01 00 <66> 41 89 04 24 0f b7 45 d4 89 45 d0 89 c1 41 0f b7 85 00 01 01
[   54.244186] RIP  [<ffffffffc067de5a>] ena_com_reg_bar_read32+0x8a/0x180 [ena_drv]
[   54.244186]  RSP <ffff8800bb643d50>
[   54.244186] CR2: 0000000000000000
[   54.244186] ---[ end trace 18dd9889b6497810 ]---

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index effe686..d1aa7b6 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -2507,6 +2507,8 @@ static void ena_fw_reset_device(struct work_struct *work)
 err:
 	rtnl_unlock();
 
+	clear_bit(ENA_FLAG_DEVICE_RUNNING, &adapter->flags);
+
 	dev_err(&pdev->dev,
 		"Reset attempt failed. Can not reset the device\n");
 }
@@ -3115,7 +3117,9 @@ static void ena_remove(struct pci_dev *pdev)
 
 	cancel_work_sync(&adapter->resume_io_task);
 
-	ena_com_dev_reset(ena_dev);
+	/* Reset the device only if the device is running. */
+	if (test_bit(ENA_FLAG_DEVICE_RUNNING, &adapter->flags))
+		ena_com_dev_reset(ena_dev);
 
 	ena_free_mgmnt_irq(adapter);
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH V4 net-next 06/13] net/ena: refactor ena_get_stats64 to be atomic context safe
  2017-02-09 13:21 [PATCH V4 net-next 00/13] Bug Fixes in ENA driver Netanel Belgazal
                   ` (4 preceding siblings ...)
  2017-02-09 13:21 ` [PATCH V4 net-next 05/13] net/ena: fix NULL dereference when removing the driver after device reset failed Netanel Belgazal
@ 2017-02-09 13:21 ` Netanel Belgazal
  2017-02-09 13:21 ` [PATCH V4 net-next 07/13] net/ena: fix potential access to freed memory during device reset Netanel Belgazal
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Netanel Belgazal @ 2017-02-09 13:21 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet, evgenys

ndo_get_stat64() can be called from atomic context, but the current
implementation sends an admin command to retrieve the statistics from
the device. This admin command can sleep.

This patch re-factors the implementation of ena_get_stats64() to use
the {rx,tx}bytes/count from the driver's inner counters, and to obtain
the rx drop counter from the asynchronous keep alive (heart bit)
event.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_admin_defs.h |  8 ++++
 drivers/net/ethernet/amazon/ena/ena_netdev.c     | 48 ++++++++++++++++--------
 drivers/net/ethernet/amazon/ena/ena_netdev.h     |  1 +
 3 files changed, 42 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_admin_defs.h b/drivers/net/ethernet/amazon/ena/ena_admin_defs.h
index e1594d6..5b6509d 100644
--- a/drivers/net/ethernet/amazon/ena/ena_admin_defs.h
+++ b/drivers/net/ethernet/amazon/ena/ena_admin_defs.h
@@ -873,6 +873,14 @@ struct ena_admin_aenq_link_change_desc {
 	u32 flags;
 };
 
+struct ena_admin_aenq_keep_alive_desc {
+	struct ena_admin_aenq_common_desc aenq_common_desc;
+
+	u32 rx_drops_low;
+
+	u32 rx_drops_high;
+};
+
 struct ena_admin_ena_mmio_req_read_less_resp {
 	u16 req_id;
 
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index d1aa7b6..54493e1 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -2169,28 +2169,46 @@ static void ena_get_stats64(struct net_device *netdev,
 			    struct rtnl_link_stats64 *stats)
 {
 	struct ena_adapter *adapter = netdev_priv(netdev);
-	struct ena_admin_basic_stats ena_stats;
-	int rc;
+	struct ena_ring *rx_ring, *tx_ring;
+	unsigned int start;
+	u64 rx_drops;
+	int i;
 
 	if (!test_bit(ENA_FLAG_DEV_UP, &adapter->flags))
 		return;
 
-	rc = ena_com_get_dev_basic_stats(adapter->ena_dev, &ena_stats);
-	if (rc)
-		return;
+	for (i = 0; i < adapter->num_queues; i++) {
+		u64 bytes, packets;
+
+		tx_ring = &adapter->tx_ring[i];
 
-	stats->tx_bytes = ((u64)ena_stats.tx_bytes_high << 32) |
-		ena_stats.tx_bytes_low;
-	stats->rx_bytes = ((u64)ena_stats.rx_bytes_high << 32) |
-		ena_stats.rx_bytes_low;
+		do {
+			start = u64_stats_fetch_begin_irq(&tx_ring->syncp);
+			packets = tx_ring->tx_stats.cnt;
+			bytes = tx_ring->tx_stats.bytes;
+		} while (u64_stats_fetch_retry_irq(&tx_ring->syncp, start));
 
-	stats->rx_packets = ((u64)ena_stats.rx_pkts_high << 32) |
-		ena_stats.rx_pkts_low;
-	stats->tx_packets = ((u64)ena_stats.tx_pkts_high << 32) |
-		ena_stats.tx_pkts_low;
+		stats->tx_packets += packets;
+		stats->tx_bytes += bytes;
+
+		rx_ring = &adapter->rx_ring[i];
+
+		do {
+			start = u64_stats_fetch_begin_irq(&rx_ring->syncp);
+			packets = rx_ring->rx_stats.cnt;
+			bytes = rx_ring->rx_stats.bytes;
+		} while (u64_stats_fetch_retry_irq(&rx_ring->syncp, start));
+
+		stats->rx_packets += packets;
+		stats->rx_bytes += bytes;
+	}
+
+	do {
+		start = u64_stats_fetch_begin_irq(&adapter->syncp);
+		rx_drops = adapter->dev_stats.rx_drops;
+	} while (u64_stats_fetch_retry_irq(&adapter->syncp, start));
 
-	stats->rx_dropped = ((u64)ena_stats.rx_drops_high << 32) |
-		ena_stats.rx_drops_low;
+	stats->rx_dropped = rx_drops;
 
 	stats->multicast = 0;
 	stats->collisions = 0;
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.h b/drivers/net/ethernet/amazon/ena/ena_netdev.h
index 69d7e9e..f0ddc11 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.h
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.h
@@ -241,6 +241,7 @@ struct ena_stats_dev {
 	u64 interface_up;
 	u64 interface_down;
 	u64 admin_q_pause;
+	u64 rx_drops;
 };
 
 enum ena_flags_t {
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH V4 net-next 07/13] net/ena: fix potential access to freed memory during device reset
  2017-02-09 13:21 [PATCH V4 net-next 00/13] Bug Fixes in ENA driver Netanel Belgazal
                   ` (5 preceding siblings ...)
  2017-02-09 13:21 ` [PATCH V4 net-next 06/13] net/ena: refactor ena_get_stats64 to be atomic context safe Netanel Belgazal
@ 2017-02-09 13:21 ` Netanel Belgazal
  2017-02-09 13:21 ` [PATCH V4 net-next 08/13] net/ena: use napi_complete_done() return value Netanel Belgazal
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Netanel Belgazal @ 2017-02-09 13:21 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet, evgenys

If the ena driver detects that the device is not behave as expected,
it tries to reset the device.
The reset flow calls ena_down, which will frees all the resources
the driver allocates and then it will reset the device.

This flow can cause memory corruption if the device is still writes
to the driver's memory space.
To overcome this potential race, move the reset before the device
resources are freed.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 56 +++++++++++++++++++++-------
 1 file changed, 43 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 54493e1..8ca1ba3 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -80,14 +80,18 @@ static void ena_tx_timeout(struct net_device *dev)
 {
 	struct ena_adapter *adapter = netdev_priv(dev);
 
+	/* Change the state of the device to trigger reset
+	 * Check that we are not in the middle or a trigger already
+	 */
+
+	if (test_and_set_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))
+		return;
+
 	u64_stats_update_begin(&adapter->syncp);
 	adapter->dev_stats.tx_timeout++;
 	u64_stats_update_end(&adapter->syncp);
 
 	netif_err(adapter, tx_err, dev, "Transmit time out\n");
-
-	/* Change the state of the device to trigger reset */
-	set_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags);
 }
 
 static void update_rx_ring_mtu(struct ena_adapter *adapter, int mtu)
@@ -1109,7 +1113,8 @@ static int ena_io_poll(struct napi_struct *napi, int budget)
 
 	tx_budget = tx_ring->ring_size / ENA_TX_POLL_BUDGET_DIVIDER;
 
-	if (!test_bit(ENA_FLAG_DEV_UP, &tx_ring->adapter->flags)) {
+	if (!test_bit(ENA_FLAG_DEV_UP, &tx_ring->adapter->flags) ||
+	    test_bit(ENA_FLAG_TRIGGER_RESET, &tx_ring->adapter->flags)) {
 		napi_complete_done(napi, 0);
 		return 0;
 	}
@@ -1698,12 +1703,22 @@ static void ena_down(struct ena_adapter *adapter)
 	adapter->dev_stats.interface_down++;
 	u64_stats_update_end(&adapter->syncp);
 
-	/* After this point the napi handler won't enable the tx queue */
-	ena_napi_disable_all(adapter);
 	netif_carrier_off(adapter->netdev);
 	netif_tx_disable(adapter->netdev);
 
+	/* After this point the napi handler won't enable the tx queue */
+	ena_napi_disable_all(adapter);
+
 	/* After destroy the queue there won't be any new interrupts */
+
+	if (test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags)) {
+		int rc;
+
+		rc = ena_com_dev_reset(adapter->ena_dev);
+		if (rc)
+			dev_err(&adapter->pdev->dev, "Device reset failed\n");
+	}
+
 	ena_destroy_all_io_queues(adapter);
 
 	ena_disable_io_intr_sync(adapter);
@@ -2065,6 +2080,14 @@ static void ena_netpoll(struct net_device *netdev)
 	struct ena_adapter *adapter = netdev_priv(netdev);
 	int i;
 
+	/* Dont schedule NAPI if the driver is in the middle of reset
+	 * or netdev is down.
+	 */
+
+	if (!test_bit(ENA_FLAG_DEV_UP, &adapter->flags) ||
+	    test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))
+		return;
+
 	for (i = 0; i < adapter->num_queues; i++)
 		napi_schedule(&adapter->ena_napi[i].napi);
 }
@@ -2449,6 +2472,14 @@ static void ena_fw_reset_device(struct work_struct *work)
 	bool dev_up, wd_state;
 	int rc;
 
+	if (unlikely(!test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))) {
+		dev_err(&pdev->dev,
+			"device reset schedule while reset bit is off\n");
+		return;
+	}
+
+	netif_carrier_off(netdev);
+
 	del_timer_sync(&adapter->timer_service);
 
 	rtnl_lock();
@@ -2462,12 +2493,6 @@ static void ena_fw_reset_device(struct work_struct *work)
 	 */
 	ena_close(netdev);
 
-	rc = ena_com_dev_reset(ena_dev);
-	if (rc) {
-		dev_err(&pdev->dev, "Device reset failed\n");
-		goto err;
-	}
-
 	ena_free_mgmnt_irq(adapter);
 
 	ena_disable_msix(adapter);
@@ -2480,6 +2505,8 @@ static void ena_fw_reset_device(struct work_struct *work)
 
 	ena_com_mmio_reg_read_request_destroy(ena_dev);
 
+	clear_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags);
+
 	/* Finish with the destroy part. Start the init part */
 
 	rc = ena_device_init(ena_dev, adapter->pdev, &get_feat_ctx, &wd_state);
@@ -2545,6 +2572,9 @@ static void check_for_missing_tx_completions(struct ena_adapter *adapter)
 	if (!test_bit(ENA_FLAG_DEV_UP, &adapter->flags))
 		return;
 
+	if (test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))
+		return;
+
 	budget = ENA_MONITORED_TX_QUEUES;
 
 	for (i = adapter->last_monitored_tx_qid; i < adapter->num_queues; i++) {
@@ -2644,7 +2674,7 @@ static void ena_timer_service(unsigned long data)
 	if (host_info)
 		ena_update_host_info(host_info, adapter->netdev);
 
-	if (unlikely(test_and_clear_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))) {
+	if (unlikely(test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))) {
 		netif_err(adapter, drv, adapter->netdev,
 			  "Trigger reset is on\n");
 		ena_dump_stats_to_dmesg(adapter);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH V4 net-next 08/13] net/ena: use napi_complete_done() return value
  2017-02-09 13:21 [PATCH V4 net-next 00/13] Bug Fixes in ENA driver Netanel Belgazal
                   ` (6 preceding siblings ...)
  2017-02-09 13:21 ` [PATCH V4 net-next 07/13] net/ena: fix potential access to freed memory during device reset Netanel Belgazal
@ 2017-02-09 13:21 ` Netanel Belgazal
  2017-02-09 13:21 ` [PATCH V4 net-next 09/13] net/ena: use READ_ONCE to access completion descriptors Netanel Belgazal
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Netanel Belgazal @ 2017-02-09 13:21 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet, evgenys

Do not unamsk interrupts if we are in busy poll mode.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 44 ++++++++++++++++++----------
 1 file changed, 29 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 8ca1ba3..d467a79 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -1122,26 +1122,40 @@ static int ena_io_poll(struct napi_struct *napi, int budget)
 	tx_work_done = ena_clean_tx_irq(tx_ring, tx_budget);
 	rx_work_done = ena_clean_rx_irq(rx_ring, napi, budget);
 
-	if ((budget > rx_work_done) && (tx_budget > tx_work_done)) {
-		napi_complete_done(napi, rx_work_done);
+	/* If the device is about to reset or down, avoid unmask
+	 * the interrupt and return 0 so NAPI won't reschedule
+	 */
+	if (unlikely(!test_bit(ENA_FLAG_DEV_UP, &tx_ring->adapter->flags) ||
+		     test_bit(ENA_FLAG_TRIGGER_RESET, &tx_ring->adapter->flags))) {
+		napi_complete_done(napi, 0);
+		ret = 0;
 
+	} else if ((budget > rx_work_done) && (tx_budget > tx_work_done)) {
 		napi_comp_call = 1;
-		/* Tx and Rx share the same interrupt vector */
-		if (ena_com_get_adaptive_moderation_enabled(rx_ring->ena_dev))
-			ena_adjust_intr_moderation(rx_ring, tx_ring);
 
-		/* Update intr register: rx intr delay, tx intr delay and
-		 * interrupt unmask
+		/* Update numa and unmask the interrupt only when schedule
+		 * from the interrupt context (vs from sk_busy_loop)
 		 */
-		ena_com_update_intr_reg(&intr_reg,
-					rx_ring->smoothed_interval,
-					tx_ring->smoothed_interval,
-					true);
+		if (napi_complete_done(napi, rx_work_done)) {
+			/* Tx and Rx share the same interrupt vector */
+			if (ena_com_get_adaptive_moderation_enabled(rx_ring->ena_dev))
+				ena_adjust_intr_moderation(rx_ring, tx_ring);
+
+			/* Update intr register: rx intr delay,
+			 * tx intr delay and interrupt unmask
+			 */
+			ena_com_update_intr_reg(&intr_reg,
+						rx_ring->smoothed_interval,
+						tx_ring->smoothed_interval,
+						true);
+
+			/* It is a shared MSI-X.
+			 * Tx and Rx CQ have pointer to it.
+			 * So we use one of them to reach the intr reg
+			 */
+			ena_com_unmask_intr(rx_ring->ena_com_io_cq, &intr_reg);
+		}
 
-		/* It is a shared MSI-X. Tx and Rx CQ have pointer to it.
-		 * So we use one of them to reach the intr reg
-		 */
-		ena_com_unmask_intr(rx_ring->ena_com_io_cq, &intr_reg);
 
 		ena_update_ring_numa_node(tx_ring, rx_ring);
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH V4 net-next 09/13] net/ena: use READ_ONCE to access completion descriptors
  2017-02-09 13:21 [PATCH V4 net-next 00/13] Bug Fixes in ENA driver Netanel Belgazal
                   ` (7 preceding siblings ...)
  2017-02-09 13:21 ` [PATCH V4 net-next 08/13] net/ena: use napi_complete_done() return value Netanel Belgazal
@ 2017-02-09 13:21 ` Netanel Belgazal
  2017-02-09 13:21 ` [PATCH V4 net-next 10/13] net/ena: reduce the severity of ena printouts Netanel Belgazal
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Netanel Belgazal @ 2017-02-09 13:21 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet, evgenys

Completion descriptors are accessed from the driver and from the device.
To avoid reading the old value, use READ_ONCE macro.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_com.h     | 1 +
 drivers/net/ethernet/amazon/ena/ena_eth_com.c | 8 ++++----
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.h b/drivers/net/ethernet/amazon/ena/ena_com.h
index 509d7b8..c9b33ee 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.h
+++ b/drivers/net/ethernet/amazon/ena/ena_com.h
@@ -33,6 +33,7 @@
 #ifndef ENA_COM
 #define ENA_COM
 
+#include <linux/compiler.h>
 #include <linux/delay.h>
 #include <linux/dma-mapping.h>
 #include <linux/gfp.h>
diff --git a/drivers/net/ethernet/amazon/ena/ena_eth_com.c b/drivers/net/ethernet/amazon/ena/ena_eth_com.c
index 539c536..f999305 100644
--- a/drivers/net/ethernet/amazon/ena/ena_eth_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_eth_com.c
@@ -45,7 +45,7 @@ static inline struct ena_eth_io_rx_cdesc_base *ena_com_get_next_rx_cdesc(
 	cdesc = (struct ena_eth_io_rx_cdesc_base *)(io_cq->cdesc_addr.virt_addr
 			+ (head_masked * io_cq->cdesc_entry_size_in_bytes));
 
-	desc_phase = (cdesc->status & ENA_ETH_IO_RX_CDESC_BASE_PHASE_MASK) >>
+	desc_phase = (READ_ONCE(cdesc->status) & ENA_ETH_IO_RX_CDESC_BASE_PHASE_MASK) >>
 			ENA_ETH_IO_RX_CDESC_BASE_PHASE_SHIFT;
 
 	if (desc_phase != expected_phase)
@@ -141,7 +141,7 @@ static inline u16 ena_com_cdesc_rx_pkt_get(struct ena_com_io_cq *io_cq,
 
 		ena_com_cq_inc_head(io_cq);
 		count++;
-		last = (cdesc->status & ENA_ETH_IO_RX_CDESC_BASE_LAST_MASK) >>
+		last = (READ_ONCE(cdesc->status) & ENA_ETH_IO_RX_CDESC_BASE_LAST_MASK) >>
 			ENA_ETH_IO_RX_CDESC_BASE_LAST_SHIFT;
 	} while (!last);
 
@@ -489,13 +489,13 @@ int ena_com_tx_comp_req_id_get(struct ena_com_io_cq *io_cq, u16 *req_id)
 	 * expected, it mean that the device still didn't update
 	 * this completion.
 	 */
-	cdesc_phase = cdesc->flags & ENA_ETH_IO_TX_CDESC_PHASE_MASK;
+	cdesc_phase = READ_ONCE(cdesc->flags) & ENA_ETH_IO_TX_CDESC_PHASE_MASK;
 	if (cdesc_phase != expected_phase)
 		return -EAGAIN;
 
 	ena_com_cq_inc_head(io_cq);
 
-	*req_id = cdesc->req_id;
+	*req_id = READ_ONCE(cdesc->req_id);
 
 	return 0;
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH V4 net-next 10/13] net/ena: reduce the severity of ena printouts
  2017-02-09 13:21 [PATCH V4 net-next 00/13] Bug Fixes in ENA driver Netanel Belgazal
                   ` (8 preceding siblings ...)
  2017-02-09 13:21 ` [PATCH V4 net-next 09/13] net/ena: use READ_ONCE to access completion descriptors Netanel Belgazal
@ 2017-02-09 13:21 ` Netanel Belgazal
  2017-02-09 13:21 ` [PATCH V4 net-next 11/13] net/ena: change driver's default timeouts Netanel Belgazal
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Netanel Belgazal @ 2017-02-09 13:21 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet, evgenys

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_com.c    | 27 +++++++++++++++++----------
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 14 +++++++++++---
 2 files changed, 28 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index 46aad3a..5518b1f 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -784,7 +784,7 @@ static int ena_com_get_feature_ex(struct ena_com_dev *ena_dev,
 	int ret;
 
 	if (!ena_com_check_supported_feature_id(ena_dev, feature_id)) {
-		pr_info("Feature %d isn't supported\n", feature_id);
+		pr_debug("Feature %d isn't supported\n", feature_id);
 		return -EPERM;
 	}
 
@@ -1126,7 +1126,13 @@ int ena_com_execute_admin_command(struct ena_com_admin_queue *admin_queue,
 	comp_ctx = ena_com_submit_admin_cmd(admin_queue, cmd, cmd_size,
 					    comp, comp_size);
 	if (unlikely(IS_ERR(comp_ctx))) {
-		pr_err("Failed to submit command [%ld]\n", PTR_ERR(comp_ctx));
+		if (comp_ctx == ERR_PTR(-ENODEV))
+			pr_debug("Failed to submit command [%ld]\n",
+				 PTR_ERR(comp_ctx));
+		else
+			pr_err("Failed to submit command [%ld]\n",
+			       PTR_ERR(comp_ctx));
+
 		return PTR_ERR(comp_ctx);
 	}
 
@@ -1895,7 +1901,7 @@ int ena_com_set_dev_mtu(struct ena_com_dev *ena_dev, int mtu)
 	int ret;
 
 	if (!ena_com_check_supported_feature_id(ena_dev, ENA_ADMIN_MTU)) {
-		pr_info("Feature %d isn't supported\n", ENA_ADMIN_MTU);
+		pr_debug("Feature %d isn't supported\n", ENA_ADMIN_MTU);
 		return -EPERM;
 	}
 
@@ -1948,8 +1954,8 @@ int ena_com_set_hash_function(struct ena_com_dev *ena_dev)
 
 	if (!ena_com_check_supported_feature_id(ena_dev,
 						ENA_ADMIN_RSS_HASH_FUNCTION)) {
-		pr_info("Feature %d isn't supported\n",
-			ENA_ADMIN_RSS_HASH_FUNCTION);
+		pr_debug("Feature %d isn't supported\n",
+			 ENA_ADMIN_RSS_HASH_FUNCTION);
 		return -EPERM;
 	}
 
@@ -2112,7 +2118,8 @@ int ena_com_set_hash_ctrl(struct ena_com_dev *ena_dev)
 
 	if (!ena_com_check_supported_feature_id(ena_dev,
 						ENA_ADMIN_RSS_HASH_INPUT)) {
-		pr_info("Feature %d isn't supported\n", ENA_ADMIN_RSS_HASH_INPUT);
+		pr_debug("Feature %d isn't supported\n",
+			 ENA_ADMIN_RSS_HASH_INPUT);
 		return -EPERM;
 	}
 
@@ -2270,8 +2277,8 @@ int ena_com_indirect_table_set(struct ena_com_dev *ena_dev)
 
 	if (!ena_com_check_supported_feature_id(
 		    ena_dev, ENA_ADMIN_RSS_REDIRECTION_TABLE_CONFIG)) {
-		pr_info("Feature %d isn't supported\n",
-			ENA_ADMIN_RSS_REDIRECTION_TABLE_CONFIG);
+		pr_debug("Feature %d isn't supported\n",
+			 ENA_ADMIN_RSS_REDIRECTION_TABLE_CONFIG);
 		return -EPERM;
 	}
 
@@ -2542,8 +2549,8 @@ int ena_com_init_interrupt_moderation(struct ena_com_dev *ena_dev)
 
 	if (rc) {
 		if (rc == -EPERM) {
-			pr_info("Feature %d isn't supported\n",
-				ENA_ADMIN_INTERRUPT_MODERATION);
+			pr_debug("Feature %d isn't supported\n",
+				 ENA_ADMIN_INTERRUPT_MODERATION);
 			rc = 0;
 		} else {
 			pr_err("Failed to get interrupt moderation admin cmd. rc: %d\n",
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index d467a79..5079366 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -563,6 +563,7 @@ static void ena_free_all_rx_bufs(struct ena_adapter *adapter)
  */
 static void ena_free_tx_bufs(struct ena_ring *tx_ring)
 {
+	bool print_once = true;
 	u32 i;
 
 	for (i = 0; i < tx_ring->ring_size; i++) {
@@ -574,9 +575,16 @@ static void ena_free_tx_bufs(struct ena_ring *tx_ring)
 		if (!tx_info->skb)
 			continue;
 
-		netdev_notice(tx_ring->netdev,
-			      "free uncompleted tx skb qid %d idx 0x%x\n",
-			      tx_ring->qid, i);
+		if (print_once) {
+			netdev_notice(tx_ring->netdev,
+				      "free uncompleted tx skb qid %d idx 0x%x\n",
+				      tx_ring->qid, i);
+			print_once = false;
+		} else {
+			netdev_dbg(tx_ring->netdev,
+				   "free uncompleted tx skb qid %d idx 0x%x\n",
+				   tx_ring->qid, i);
+		}
 
 		ena_buf = tx_info->bufs;
 		dma_unmap_single(tx_ring->dev,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH V4 net-next 11/13] net/ena: change driver's default timeouts
  2017-02-09 13:21 [PATCH V4 net-next 00/13] Bug Fixes in ENA driver Netanel Belgazal
                   ` (9 preceding siblings ...)
  2017-02-09 13:21 ` [PATCH V4 net-next 10/13] net/ena: reduce the severity of ena printouts Netanel Belgazal
@ 2017-02-09 13:21 ` Netanel Belgazal
  2017-02-09 13:21 ` [PATCH V4 net-next 12/13] net/ena: change condition for host attribute configuration Netanel Belgazal
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 15+ messages in thread
From: Netanel Belgazal @ 2017-02-09 13:21 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet, evgenys

The timeouts were too agressive and sometimes cause false alarms.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_com.c    | 4 ++--
 drivers/net/ethernet/amazon/ena/ena_netdev.h | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index 5518b1f..8029e7c 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -36,9 +36,9 @@
 /*****************************************************************************/
 
 /* Timeout in micro-sec */
-#define ADMIN_CMD_TIMEOUT_US (1000000)
+#define ADMIN_CMD_TIMEOUT_US (3000000)
 
-#define ENA_ASYNC_QUEUE_DEPTH 4
+#define ENA_ASYNC_QUEUE_DEPTH 16
 #define ENA_ADMIN_QUEUE_DEPTH 32
 
 #define MIN_ENA_VER (((ENA_COMMON_SPEC_VERSION_MAJOR) << \
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.h b/drivers/net/ethernet/amazon/ena/ena_netdev.h
index f0ddc11..efe0ea1 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.h
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.h
@@ -100,7 +100,7 @@
 /* Number of queues to check for missing queues per timer service */
 #define ENA_MONITORED_TX_QUEUES	4
 /* Max timeout packets before device reset */
-#define MAX_NUM_OF_TIMEOUTED_PACKETS 32
+#define MAX_NUM_OF_TIMEOUTED_PACKETS 128
 
 #define ENA_TX_RING_IDX_NEXT(idx, ring_size) (((idx) + 1) & ((ring_size) - 1))
 
@@ -116,9 +116,9 @@
 #define ENA_IO_IRQ_IDX(q)		(ENA_IO_IRQ_FIRST_IDX + (q))
 
 /* ENA device should send keep alive msg every 1 sec.
- * We wait for 3 sec just to be on the safe side.
+ * We wait for 6 sec just to be on the safe side.
  */
-#define ENA_DEVICE_KALIVE_TIMEOUT	(3 * HZ)
+#define ENA_DEVICE_KALIVE_TIMEOUT	(6 * HZ)
 
 #define ENA_MMIO_DISABLE_REG_READ	BIT(0)
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH V4 net-next 12/13] net/ena: change condition for host attribute configuration
  2017-02-09 13:21 [PATCH V4 net-next 00/13] Bug Fixes in ENA driver Netanel Belgazal
                   ` (10 preceding siblings ...)
  2017-02-09 13:21 ` [PATCH V4 net-next 11/13] net/ena: change driver's default timeouts Netanel Belgazal
@ 2017-02-09 13:21 ` Netanel Belgazal
  2017-02-09 13:21 ` [PATCH V4 net-next 13/13] net/ena: update driver version to 1.1.2 Netanel Belgazal
  2017-02-10  3:27 ` [PATCH V4 net-next 00/13] Bug Fixes in ENA driver David Miller
  13 siblings, 0 replies; 15+ messages in thread
From: Netanel Belgazal @ 2017-02-09 13:21 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet, evgenys

Move the host info config to be the first admin command that is executed.
This change require the driver to remove the 'feature check'
from host info configuration flow.
The check is removed since the supported features bitmask field
is retrieved only after calling ENA_ADMIN_DEVICE_ATTRIBUTES admin command.

If set host info is not supported an error will be returned by the device.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_com.c    | 8 +++-----
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 5 +++--
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index 8029e7c..08d11ce 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -2451,11 +2451,9 @@ int ena_com_set_host_attributes(struct ena_com_dev *ena_dev)
 
 	int ret;
 
-	if (!ena_com_check_supported_feature_id(ena_dev,
-						ENA_ADMIN_HOST_ATTR_CONFIG)) {
-		pr_warn("Set host attribute isn't supported\n");
-		return -EPERM;
-	}
+	/* Host attribute config is called before ena_com_get_dev_attr_feat
+	 * so ena_com can't check if the feature is supported.
+	 */
 
 	memset(&cmd, 0x0, sizeof(cmd));
 	admin_queue = &ena_dev->admin_queue;
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 5079366..d8c920b 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -2414,6 +2414,8 @@ static int ena_device_init(struct ena_com_dev *ena_dev, struct pci_dev *pdev,
 	 */
 	ena_com_set_admin_polling_mode(ena_dev, true);
 
+	ena_config_host_info(ena_dev);
+
 	/* Get Device Attributes*/
 	rc = ena_com_get_dev_attr_feat(ena_dev, get_feat_ctx);
 	if (rc) {
@@ -2438,11 +2440,10 @@ static int ena_device_init(struct ena_com_dev *ena_dev, struct pci_dev *pdev,
 
 	*wd_state = !!(aenq_groups & BIT(ENA_ADMIN_KEEP_ALIVE));
 
-	ena_config_host_info(ena_dev);
-
 	return 0;
 
 err_admin_init:
+	ena_com_delete_host_info(ena_dev);
 	ena_com_admin_destroy(ena_dev);
 err_mmio_read_less:
 	ena_com_mmio_reg_read_request_destroy(ena_dev);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH V4 net-next 13/13] net/ena: update driver version to 1.1.2
  2017-02-09 13:21 [PATCH V4 net-next 00/13] Bug Fixes in ENA driver Netanel Belgazal
                   ` (11 preceding siblings ...)
  2017-02-09 13:21 ` [PATCH V4 net-next 12/13] net/ena: change condition for host attribute configuration Netanel Belgazal
@ 2017-02-09 13:21 ` Netanel Belgazal
  2017-02-10  3:27 ` [PATCH V4 net-next 00/13] Bug Fixes in ENA driver David Miller
  13 siblings, 0 replies; 15+ messages in thread
From: Netanel Belgazal @ 2017-02-09 13:21 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet, evgenys

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.h b/drivers/net/ethernet/amazon/ena/ena_netdev.h
index efe0ea1..ed62d8e 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.h
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.h
@@ -44,7 +44,7 @@
 #include "ena_eth_com.h"
 
 #define DRV_MODULE_VER_MAJOR	1
-#define DRV_MODULE_VER_MINOR	0
+#define DRV_MODULE_VER_MINOR	1
 #define DRV_MODULE_VER_SUBMINOR 2
 
 #define DRV_MODULE_NAME		"ena"
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH V4 net-next 00/13] Bug Fixes in ENA driver
  2017-02-09 13:21 [PATCH V4 net-next 00/13] Bug Fixes in ENA driver Netanel Belgazal
                   ` (12 preceding siblings ...)
  2017-02-09 13:21 ` [PATCH V4 net-next 13/13] net/ena: update driver version to 1.1.2 Netanel Belgazal
@ 2017-02-10  3:27 ` David Miller
  13 siblings, 0 replies; 15+ messages in thread
From: David Miller @ 2017-02-10  3:27 UTC (permalink / raw)
  To: netanel
  Cc: linux-kernel, netdev, dwmw, zorik, alex, saeed, msw, aliguori,
	nafea, eric.dumazet, evgenys

From: Netanel Belgazal <netanel@annapurnalabs.com>
Date: Thu,  9 Feb 2017 15:21:26 +0200

> Changes from V3:
> * Rebase patchset to master and solve merge conflicts.
> * Remove redundant bug fix (fix error handling when probe fails)

Series applied, thank you.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2017-02-10  3:27 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-09 13:21 [PATCH V4 net-next 00/13] Bug Fixes in ENA driver Netanel Belgazal
2017-02-09 13:21 ` [PATCH V4 net-next 01/13] net/ena: remove ntuple filter support from device feature list Netanel Belgazal
2017-02-09 13:21 ` [PATCH V4 net-next 02/13] net/ena: fix queues number calculation Netanel Belgazal
2017-02-09 13:21 ` [PATCH V4 net-next 03/13] net/ena: fix ethtool RSS flow configuration Netanel Belgazal
2017-02-09 13:21 ` [PATCH V4 net-next 04/13] net/ena: fix RSS default hash configuration Netanel Belgazal
2017-02-09 13:21 ` [PATCH V4 net-next 05/13] net/ena: fix NULL dereference when removing the driver after device reset failed Netanel Belgazal
2017-02-09 13:21 ` [PATCH V4 net-next 06/13] net/ena: refactor ena_get_stats64 to be atomic context safe Netanel Belgazal
2017-02-09 13:21 ` [PATCH V4 net-next 07/13] net/ena: fix potential access to freed memory during device reset Netanel Belgazal
2017-02-09 13:21 ` [PATCH V4 net-next 08/13] net/ena: use napi_complete_done() return value Netanel Belgazal
2017-02-09 13:21 ` [PATCH V4 net-next 09/13] net/ena: use READ_ONCE to access completion descriptors Netanel Belgazal
2017-02-09 13:21 ` [PATCH V4 net-next 10/13] net/ena: reduce the severity of ena printouts Netanel Belgazal
2017-02-09 13:21 ` [PATCH V4 net-next 11/13] net/ena: change driver's default timeouts Netanel Belgazal
2017-02-09 13:21 ` [PATCH V4 net-next 12/13] net/ena: change condition for host attribute configuration Netanel Belgazal
2017-02-09 13:21 ` [PATCH V4 net-next 13/13] net/ena: update driver version to 1.1.2 Netanel Belgazal
2017-02-10  3:27 ` [PATCH V4 net-next 00/13] Bug Fixes in ENA driver David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).