linux-kernel.vger.kernel.org archive mirror
* [PATCH V3 net-next 00/14] Bug Fixes in ENA driver.
@ 2017-01-26 22:18 Netanel Belgazal
  2017-01-26 22:18 ` [PATCH V3 net-next 01/14] net/ena: remove ntuple filter support from device feature list Netanel Belgazal
                   ` (14 more replies)
  0 siblings, 15 replies; 18+ messages in thread
From: Netanel Belgazal @ 2017-01-26 22:18 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet

Changes between V3 and V2:
* Fix typos and correct alignment in commit messages.
* Use the napi_complete_done() return value to determine when the napi
handler needs to unmask the interrupts, rather than implementing a
non-standard solution.
* Remove new features from this patchset and leave bug fixes only.
* Give an example of the kernel crashes in the commit message.
* Use BIT(x) instead of using the value explicitly.


Netanel Belgazal (14):
  net/ena: remove ntuple filter support from device feature list
  net/ena: fix error handling when probe fails
  net/ena: fix queues number calculation
  net/ena: fix ethtool RSS flow configuration
  net/ena: fix RSS default hash configuration
  net/ena: fix NULL dereference when removing the driver after device
    reset failed
  net/ena: refactor ena_get_stats64 to be atomic context safe
  net/ena: fix potential access to freed memory during device reset
  net/ena: use napi_complete_done() return value
  net/ena: use READ_ONCE to access completion descriptors
  net/ena: reduce the severity of ena printouts
  net/ena: change driver's default timeouts
  net/ena: change condition for host attribute configuration
  net/ena: update driver version to 1.1.2

 drivers/net/ethernet/amazon/ena/ena_admin_defs.h |  20 ++-
 drivers/net/ethernet/amazon/ena/ena_com.c        |  41 ++---
 drivers/net/ethernet/amazon/ena/ena_com.h        |   1 +
 drivers/net/ethernet/amazon/ena/ena_eth_com.c    |   8 +-
 drivers/net/ethernet/amazon/ena/ena_netdev.c     | 186 ++++++++++++++++-------
 drivers/net/ethernet/amazon/ena/ena_netdev.h     |   9 +-
 6 files changed, 182 insertions(+), 83 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH V3 net-next 01/14] net/ena: remove ntuple filter support from device feature list
  2017-01-26 22:18 [PATCH V3 net-next 00/14] Bug Fixes in ENA driver Netanel Belgazal
@ 2017-01-26 22:18 ` Netanel Belgazal
  2017-01-26 22:18 ` [PATCH V3 net-next 02/14] net/ena: fix error handling when probe fails Netanel Belgazal
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Netanel Belgazal @ 2017-01-26 22:18 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet

Remove NETIF_F_NTUPLE from netdev->features.
The ENA device driver does not support ntuple filtering.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index cc8b13e..7493ea3 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -2722,7 +2722,6 @@ static void ena_set_dev_offloads(struct ena_com_dev_get_features_ctx *feat,
 	netdev->features =
 		dev_features |
 		NETIF_F_SG |
-		NETIF_F_NTUPLE |
 		NETIF_F_RXHASH |
 		NETIF_F_HIGHDMA;
 
-- 
2.7.4

* [PATCH V3 net-next 02/14] net/ena: fix error handling when probe fails
  2017-01-26 22:18 [PATCH V3 net-next 00/14] Bug Fixes in ENA driver Netanel Belgazal
  2017-01-26 22:18 ` [PATCH V3 net-next 01/14] net/ena: remove ntuple filter support from device feature list Netanel Belgazal
@ 2017-01-26 22:18 ` Netanel Belgazal
  2017-01-27 23:33   ` Lino Sanfilippo
  2017-01-26 22:18 ` [PATCH V3 net-next 03/14] net/ena: fix queues number calculation Netanel Belgazal
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 18+ messages in thread
From: Netanel Belgazal @ 2017-01-26 22:18 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet

When the driver fails in probe, it releases all resources,
including the adapter.
In case of probe failure, ena_remove should not try to
free the adapter resources.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 7493ea3..cb60567 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -3046,6 +3046,7 @@ static int ena_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 err_free_region:
 	ena_release_bars(ena_dev, pdev);
 err_free_ena_dev:
+	pci_set_drvdata(pdev, NULL);
 	vfree(ena_dev);
 err_disable_device:
 	pci_disable_device(pdev);
-- 
2.7.4

* [PATCH V3 net-next 03/14] net/ena: fix queues number calculation
  2017-01-26 22:18 [PATCH V3 net-next 00/14] Bug Fixes in ENA driver Netanel Belgazal
  2017-01-26 22:18 ` [PATCH V3 net-next 01/14] net/ena: remove ntuple filter support from device feature list Netanel Belgazal
  2017-01-26 22:18 ` [PATCH V3 net-next 02/14] net/ena: fix error handling when probe fails Netanel Belgazal
@ 2017-01-26 22:18 ` Netanel Belgazal
  2017-01-26 22:18 ` [PATCH V3 net-next 04/14] net/ena: fix ethtool RSS flow configuration Netanel Belgazal
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Netanel Belgazal @ 2017-01-26 22:18 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet

The ENA driver tries to open a queue per vCPU.
To determine how many vCPUs the instance has, it uses num_possible_cpus()
when it should use num_online_cpus() instead.
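
The queue-count selection can be sketched in plain userspace C as below. This is only an illustration: the names calc_io_queue_num and min_int are hypothetical, and the CPU count and device limits are passed in as arguments, whereas the driver uses min_t() and reads the limits from the device feature context.

```c
/* Userspace sketch, not driver code: every name here is hypothetical. */

/* Return the smaller of two ints (stand-in for the kernel's min_t()). */
static int min_int(int a, int b)
{
	return a < b ? a : b;
}

/*
 * Pick the number of IO queues: one per online CPU, capped by the
 * driver limit and by the submission/completion queue counts that
 * the device reports.
 */
static int calc_io_queue_num(int online_cpus, int max_io_queues,
			     int max_sq_num, int max_cq_num)
{
	int n = min_int(online_cpus, max_io_queues);

	n = min_int(n, max_sq_num);
	n = min_int(n, max_cq_num);
	return n;
}
```

In this sketch the CPU count is only one of several caps, so a device that reports fewer queues than there are online CPUs still bounds the result.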

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index cb60567..f409cfd 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -2660,7 +2660,7 @@ static int ena_calc_io_queue_num(struct pci_dev *pdev,
 		io_sq_num = get_feat_ctx->max_queues.max_sq_num;
 	}
 
-	io_queue_num = min_t(int, num_possible_cpus(), ENA_MAX_NUM_IO_QUEUES);
+	io_queue_num = min_t(int, num_online_cpus(), ENA_MAX_NUM_IO_QUEUES);
 	io_queue_num = min_t(int, io_queue_num, io_sq_num);
 	io_queue_num = min_t(int, io_queue_num,
 			     get_feat_ctx->max_queues.max_cq_num);
-- 
2.7.4

* [PATCH V3 net-next 04/14] net/ena: fix ethtool RSS flow configuration
  2017-01-26 22:18 [PATCH V3 net-next 00/14] Bug Fixes in ENA driver Netanel Belgazal
                   ` (2 preceding siblings ...)
  2017-01-26 22:18 ` [PATCH V3 net-next 03/14] net/ena: fix queues number calculation Netanel Belgazal
@ 2017-01-26 22:18 ` Netanel Belgazal
  2017-01-26 22:18 ` [PATCH V3 net-next 05/14] net/ena: fix RSS default hash configuration Netanel Belgazal
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Netanel Belgazal @ 2017-01-26 22:18 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet

ena_flow_data_to_flow_hash and ena_flow_hash_to_flow_type
treat the ena_admin_flow_hash_fields enum values as powers of two.

Change the values of ena_admin_flow_hash_fields to be powers of two.

This bug affects the ethtool set/get rxnfc.
ethtool will report wrong hash fields for get and will
configure wrong hash fields in set.
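
Why ordinal enum values break when used as a bitmask can be shown with a small userspace sketch. The OLD_/NEW_ names and has_field() are hypothetical, and BIT() is redefined locally as a stand-in for the kernel macro; the numeric values are the ones in the diff below.

```c
/* Userspace sketch: BIT() is a local stand-in, all names are hypothetical. */
#define BIT(n) (1UL << (n))

/* Values before the patch: plain ordinals. */
enum old_fields {
	OLD_L2_DA = 0,
	OLD_L2_SA = 1,
	OLD_L3_DA = 2,
	OLD_L3_SA = 5,
	OLD_L4_DP = 6,
	OLD_L4_SP = 7,
};

/* Values after the patch: one distinct bit per field. */
enum new_fields {
	NEW_L2_DA = BIT(0),
	NEW_L2_SA = BIT(1),
	NEW_L3_DA = BIT(2),
	NEW_L3_SA = BIT(3),
	NEW_L4_DP = BIT(4),
	NEW_L4_SP = BIT(5),
};

/* Return nonzero if any bit of `field` is set in `mask`. */
static int has_field(unsigned long mask, unsigned long field)
{
	return (mask & field) != 0;
}
```

With the old values, OR-ing L3_SA (5) with L3_DA (2) gives 7, which is indistinguishable from L4_SP, and L2_DA (0) can never be detected in a mask. With one bit per field, each combination is unambiguous.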

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_admin_defs.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_admin_defs.h b/drivers/net/ethernet/amazon/ena/ena_admin_defs.h
index a46e749..e1594d6 100644
--- a/drivers/net/ethernet/amazon/ena/ena_admin_defs.h
+++ b/drivers/net/ethernet/amazon/ena/ena_admin_defs.h
@@ -631,22 +631,22 @@ enum ena_admin_flow_hash_proto {
 /* RSS flow hash fields */
 enum ena_admin_flow_hash_fields {
 	/* Ethernet Dest Addr */
-	ENA_ADMIN_RSS_L2_DA	= 0,
+	ENA_ADMIN_RSS_L2_DA	= BIT(0),
 
 	/* Ethernet Src Addr */
-	ENA_ADMIN_RSS_L2_SA	= 1,
+	ENA_ADMIN_RSS_L2_SA	= BIT(1),
 
 	/* ipv4/6 Dest Addr */
-	ENA_ADMIN_RSS_L3_DA	= 2,
+	ENA_ADMIN_RSS_L3_DA	= BIT(2),
 
 	/* ipv4/6 Src Addr */
-	ENA_ADMIN_RSS_L3_SA	= 5,
+	ENA_ADMIN_RSS_L3_SA	= BIT(3),
 
 	/* tcp/udp Dest Port */
-	ENA_ADMIN_RSS_L4_DP	= 6,
+	ENA_ADMIN_RSS_L4_DP	= BIT(4),
 
 	/* tcp/udp Src Port */
-	ENA_ADMIN_RSS_L4_SP	= 7,
+	ENA_ADMIN_RSS_L4_SP	= BIT(5),
 };
 
 struct ena_admin_proto_input {
-- 
2.7.4

* [PATCH V3 net-next 05/14] net/ena: fix RSS default hash configuration
  2017-01-26 22:18 [PATCH V3 net-next 00/14] Bug Fixes in ENA driver Netanel Belgazal
                   ` (3 preceding siblings ...)
  2017-01-26 22:18 ` [PATCH V3 net-next 04/14] net/ena: fix ethtool RSS flow configuration Netanel Belgazal
@ 2017-01-26 22:18 ` Netanel Belgazal
  2017-01-26 22:18 ` [PATCH V3 net-next 06/14] net/ena: fix NULL dereference when removing the driver after device reset failed Netanel Belgazal
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Netanel Belgazal @ 2017-01-26 22:18 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet

The ENA default hash configuration sets the IPv4_frag hash twice instead
of configuring the hash for non-IP packets.

The bug caused the hash of IPv4 fragmented packets to be calculated based
on L2 source and destination address instead of L3 source and destination.
IPv4 packets can reach the wrong Rx queue.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_com.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index 3066d9c..46aad3a 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -2184,7 +2184,7 @@ int ena_com_set_default_hash_ctrl(struct ena_com_dev *ena_dev)
 	hash_ctrl->selected_fields[ENA_ADMIN_RSS_IP4_FRAG].fields =
 		ENA_ADMIN_RSS_L3_SA | ENA_ADMIN_RSS_L3_DA;
 
-	hash_ctrl->selected_fields[ENA_ADMIN_RSS_IP4_FRAG].fields =
+	hash_ctrl->selected_fields[ENA_ADMIN_RSS_NOT_IP].fields =
 		ENA_ADMIN_RSS_L2_DA | ENA_ADMIN_RSS_L2_SA;
 
 	for (i = 0; i < ENA_ADMIN_RSS_PROTO_NUM; i++) {
-- 
2.7.4

* [PATCH V3 net-next 06/14] net/ena: fix NULL dereference when removing the driver after device reset failed
  2017-01-26 22:18 [PATCH V3 net-next 00/14] Bug Fixes in ENA driver Netanel Belgazal
                   ` (4 preceding siblings ...)
  2017-01-26 22:18 ` [PATCH V3 net-next 05/14] net/ena: fix RSS default hash configuration Netanel Belgazal
@ 2017-01-26 22:18 ` Netanel Belgazal
  2017-01-26 22:18 ` [PATCH V3 net-next 07/14] net/ena: refactor ena_get_stats64 to be atomic context safe Netanel Belgazal
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Netanel Belgazal @ 2017-01-26 22:18 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet

If for some reason the device stops responding, and the device reset
fails to recover the device, the mmio register read data structure
will not be reinitialized.

On driver removal, the driver will also try to reset the device, but
this time the mmio data structure will be NULL.

To solve this issue, perform the device reset in the remove function
only if the device is running.

Crash log
[   54.240382] BUG: unable to handle kernel NULL pointer dereference at           (null)
[   54.244186] IP: [<ffffffffc067de5a>] ena_com_reg_bar_read32+0x8a/0x180 [ena_drv]
[   54.244186] PGD 0
[   54.244186] Oops: 0002 [#1] SMP
[   54.244186] Modules linked in: ena_drv(OE-) snd_hda_codec_generic kvm_intel kvm crct10dif_pclmul ppdev crc32_pclmul ghash_clmulni_intel aesni_intel snd_hda_intel aes_x86_64 snd_hda_controller lrw gf128mul cirrus glue_helper ablk_helper ttm snd_hda_codec drm_kms_helper cryptd snd_hwdep drm snd_pcm pvpanic snd_timer syscopyarea sysfillrect snd parport_pc sysimgblt serio_raw soundcore i2c_piix4 mac_hid lp parport psmouse floppy
[   54.244186] CPU: 5 PID: 1841 Comm: rmmod Tainted: G           OE 3.16.0-031600-generic #201408031935
[   54.244186] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[   54.244186] task: ffff880135852880 ti: ffff8800bb640000 task.ti: ffff8800bb640000
[   54.244186] RIP: 0010:[<ffffffffc067de5a>]  [<ffffffffc067de5a>] ena_com_reg_bar_read32+0x8a/0x180 [ena_drv]
[   54.244186] RSP: 0018:ffff8800bb643d50  EFLAGS: 00010083
[   54.244186] RAX: 000000000000deb0 RBX: 0000000000030d40 RCX: 0000000000000003
[   54.244186] RDX: 0000000000000202 RSI: 0000000000000058 RDI: ffffc90000775104
[   54.244186] RBP: ffff8800bb643d88 R08: 0000000000000000 R09: cf00000000000000
[   54.244186] R10: 0000000fffffffe0 R11: 0000000000000001 R12: 0000000000000000
[   54.244186] R13: ffffc90000765000 R14: ffffc90000775104 R15: 00007fca1fa98090
[   54.244186] FS:  00007fca1f1bd740(0000) GS:ffff88013fd40000(0000) knlGS:0000000000000000
[   54.244186] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   54.244186] CR2: 0000000000000000 CR3: 00000000b9cf6000 CR4: 00000000001406e0
[   54.244186] Stack:
[   54.244186]  0000000000000202 0000005800000286 ffffc90000765000 ffffc90000765000
[   54.244186]  ffff880135f6b000 ffff8800b9360000 00007fca1fa98090 ffff8800bb643db8
[   54.244186]  ffffffffc0680b3d ffff8800b93608c0 ffffc90000765000 ffff880135f6b000
[   54.244186] Call Trace:
[   54.244186]  [<ffffffffc0680b3d>] ena_com_dev_reset+0x1d/0x1b0 [ena_drv]
[   54.244186]  [<ffffffffc0678497>] ena_remove+0xa7/0x130 [ena_drv]
[   54.244186]  [<ffffffff813d4df6>] pci_device_remove+0x46/0xc0
[   54.244186]  [<ffffffff814c3b7f>] __device_release_driver+0x7f/0xf0
[   54.244186]  [<ffffffff814c4738>] driver_detach+0xc8/0xd0
[   54.244186]  [<ffffffff814c3969>] bus_remove_driver+0x59/0xd0
[   54.244186]  [<ffffffff814c4fde>] driver_unregister+0x2e/0x60
[   54.244186]  [<ffffffff810f0a80>] ? show_refcnt+0x40/0x40
[   54.244186]  [<ffffffff813d4ec3>] pci_unregister_driver+0x23/0xa0
[   54.244186]  [<ffffffffc068413f>] ena_cleanup+0x10/0xed1 [ena_drv]
[   54.244186]  [<ffffffff810f3a47>] SyS_delete_module+0x157/0x1e0
[   54.244186]  [<ffffffff81014fb7>] ? do_notify_resume+0xc7/0xd0
[   54.244186]  [<ffffffff81793fad>] system_call_fastpath+0x1a/0x1f
[   54.244186] Code: c3 4d 8d b5 04 01 01 00 4c 89 f7 e8 e1 5a 11 c1 48 89 45 c8 41 0f b7 85 00 01 01 00 8d 48 01 66 2d 52 21 66 41 89 8d 00 01 01 00 <66> 41 89 04 24 0f b7 45 d4 89 45 d0 89 c1 41 0f b7 85 00 01 01
[   54.244186] RIP  [<ffffffffc067de5a>] ena_com_reg_bar_read32+0x8a/0x180 [ena_drv]
[   54.244186]  RSP <ffff8800bb643d50>
[   54.244186] CR2: 0000000000000000
[   54.244186] ---[ end trace 18dd9889b6497810 ]---

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index f409cfd..639f0aa 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -2509,6 +2509,8 @@ static void ena_fw_reset_device(struct work_struct *work)
 err:
 	rtnl_unlock();
 
+	clear_bit(ENA_FLAG_DEVICE_RUNNING, &adapter->flags);
+
 	dev_err(&pdev->dev,
 		"Reset attempt failed. Can not reset the device\n");
 }
@@ -3118,7 +3120,9 @@ static void ena_remove(struct pci_dev *pdev)
 
 	cancel_work_sync(&adapter->resume_io_task);
 
-	ena_com_dev_reset(ena_dev);
+	/* Reset the device only if the device is running. */
+	if (test_bit(ENA_FLAG_DEVICE_RUNNING, &adapter->flags))
+		ena_com_dev_reset(ena_dev);
 
 	ena_free_mgmnt_irq(adapter);
 
-- 
2.7.4

* [PATCH V3 net-next 07/14] net/ena: refactor ena_get_stats64 to be atomic context safe
  2017-01-26 22:18 [PATCH V3 net-next 00/14] Bug Fixes in ENA driver Netanel Belgazal
                   ` (5 preceding siblings ...)
  2017-01-26 22:18 ` [PATCH V3 net-next 06/14] net/ena: fix NULL dereference when removing the driver after device reset failed Netanel Belgazal
@ 2017-01-26 22:18 ` Netanel Belgazal
  2017-01-26 22:18 ` [PATCH V3 net-next 08/14] net/ena: fix potential access to freed memory during device reset Netanel Belgazal
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Netanel Belgazal @ 2017-01-26 22:18 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet

ndo_get_stats64() can be called from atomic context, but the current
implementation sends an admin command to retrieve the statistics from
the device. This admin command can sleep.

This patch refactors the implementation of ena_get_stats64() to use
the {rx,tx}bytes/count from the driver's inner counters, and to obtain
the rx drop counter from the asynchronous keep alive (heartbeat)
event.
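
The keep alive descriptor added below carries the rx drop counter as two 32-bit words (rx_drops_low and rx_drops_high). Joining them into one 64-bit value can be sketched in userspace C as follows; combine_u32 is a hypothetical helper name, not something the driver defines.

```c
#include <stdint.h>

/* Hypothetical helper: join two 32-bit halves into one 64-bit counter. */
static uint64_t combine_u32(uint32_t high, uint32_t low)
{
	/* Widen before shifting: shifting a 32-bit value by 32 is undefined. */
	return ((uint64_t)high << 32) | low;
}
```

The cast on the high word is the important detail; without it the shift would be performed in 32 bits.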

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_admin_defs.h |  8 ++++
 drivers/net/ethernet/amazon/ena/ena_netdev.c     | 57 +++++++++++++++++-------
 drivers/net/ethernet/amazon/ena/ena_netdev.h     |  1 +
 3 files changed, 51 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_admin_defs.h b/drivers/net/ethernet/amazon/ena/ena_admin_defs.h
index e1594d6..5b6509d 100644
--- a/drivers/net/ethernet/amazon/ena/ena_admin_defs.h
+++ b/drivers/net/ethernet/amazon/ena/ena_admin_defs.h
@@ -873,6 +873,14 @@ struct ena_admin_aenq_link_change_desc {
 	u32 flags;
 };
 
+struct ena_admin_aenq_keep_alive_desc {
+	struct ena_admin_aenq_common_desc aenq_common_desc;
+
+	u32 rx_drops_low;
+
+	u32 rx_drops_high;
+};
+
 struct ena_admin_ena_mmio_req_read_less_resp {
 	u16 req_id;
 
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 639f0aa..ea3c801 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -2169,28 +2169,46 @@ static struct rtnl_link_stats64 *ena_get_stats64(struct net_device *netdev,
 						 struct rtnl_link_stats64 *stats)
 {
 	struct ena_adapter *adapter = netdev_priv(netdev);
-	struct ena_admin_basic_stats ena_stats;
-	int rc;
+	struct ena_ring *rx_ring, *tx_ring;
+	unsigned int start;
+	u64 rx_drops;
+	int i;
 
 	if (!test_bit(ENA_FLAG_DEV_UP, &adapter->flags))
 		return NULL;
 
-	rc = ena_com_get_dev_basic_stats(adapter->ena_dev, &ena_stats);
-	if (rc)
-		return NULL;
+	for (i = 0; i < adapter->num_queues; i++) {
+		u64 bytes, packets;
+
+		tx_ring = &adapter->tx_ring[i];
+
+		do {
+			start = u64_stats_fetch_begin_irq(&tx_ring->syncp);
+			packets = tx_ring->tx_stats.cnt;
+			bytes = tx_ring->tx_stats.bytes;
+		} while (u64_stats_fetch_retry_irq(&tx_ring->syncp, start));
+
+		stats->tx_packets += packets;
+		stats->tx_bytes += bytes;
 
-	stats->tx_bytes = ((u64)ena_stats.tx_bytes_high << 32) |
-		ena_stats.tx_bytes_low;
-	stats->rx_bytes = ((u64)ena_stats.rx_bytes_high << 32) |
-		ena_stats.rx_bytes_low;
+		rx_ring = &adapter->rx_ring[i];
+
+		do {
+			start = u64_stats_fetch_begin_irq(&rx_ring->syncp);
+			packets = rx_ring->rx_stats.cnt;
+			bytes = rx_ring->rx_stats.bytes;
+		} while (u64_stats_fetch_retry_irq(&rx_ring->syncp, start));
 
-	stats->rx_packets = ((u64)ena_stats.rx_pkts_high << 32) |
-		ena_stats.rx_pkts_low;
-	stats->tx_packets = ((u64)ena_stats.tx_pkts_high << 32) |
-		ena_stats.tx_pkts_low;
+		stats->rx_packets += packets;
+		stats->rx_bytes += bytes;
+	}
+
+	do {
+		start = u64_stats_fetch_begin_irq(&adapter->syncp);
+		rx_drops = adapter->dev_stats.rx_drops;
+	} while (u64_stats_fetch_retry_irq(&adapter->syncp, start));
 
-	stats->rx_dropped = ((u64)ena_stats.rx_drops_high << 32) |
-		ena_stats.rx_drops_low;
+	stats->rx_dropped = rx_drops;
 
 	stats->multicast = 0;
 	stats->collisions = 0;
@@ -3213,8 +3231,17 @@ static void ena_keep_alive_wd(void *adapter_data,
 			      struct ena_admin_aenq_entry *aenq_e)
 {
 	struct ena_adapter *adapter = (struct ena_adapter *)adapter_data;
+	struct ena_admin_aenq_keep_alive_desc *desc;
+	u64 rx_drops;
 
+	desc = (struct ena_admin_aenq_keep_alive_desc *)aenq_e;
 	adapter->last_keep_alive_jiffies = jiffies;
+
+	rx_drops = ((u64)desc->rx_drops_high << 32) | desc->rx_drops_low;
+
+	u64_stats_update_begin(&adapter->syncp);
+	adapter->dev_stats.rx_drops = rx_drops;
+	u64_stats_update_end(&adapter->syncp);
 }
 
 static void ena_notification(void *adapter_data,
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.h b/drivers/net/ethernet/amazon/ena/ena_netdev.h
index 69d7e9e..f0ddc11 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.h
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.h
@@ -241,6 +241,7 @@ struct ena_stats_dev {
 	u64 interface_up;
 	u64 interface_down;
 	u64 admin_q_pause;
+	u64 rx_drops;
 };
 
 enum ena_flags_t {
-- 
2.7.4

* [PATCH V3 net-next 08/14] net/ena: fix potential access to freed memory during device reset
  2017-01-26 22:18 [PATCH V3 net-next 00/14] Bug Fixes in ENA driver Netanel Belgazal
                   ` (6 preceding siblings ...)
  2017-01-26 22:18 ` [PATCH V3 net-next 07/14] net/ena: refactor ena_get_stats64 to be atomic context safe Netanel Belgazal
@ 2017-01-26 22:18 ` Netanel Belgazal
  2017-01-26 22:18 ` [PATCH V3 net-next 09/14] net/ena: use napi_complete_done() return value Netanel Belgazal
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Netanel Belgazal @ 2017-01-26 22:18 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet

If the ena driver detects that the device is not behaving as expected,
it tries to reset the device.
The reset flow calls ena_down, which frees all the resources
the driver allocated, and then resets the device.

This flow can cause memory corruption if the device still writes
to the driver's memory space.
To overcome this potential race, move the reset before the device
resources are freed.
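
Part of this change is that the reset trigger is armed with an atomic test-and-set, so only the first caller proceeds. A minimal userspace sketch of that pattern using C11 atomics is shown below; it is not the kernel's test_and_set_bit(), and FLAG_TRIGGER_RESET, test_and_set_flag and request_reset are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bit; the driver uses ENA_FLAG_TRIGGER_RESET. */
#define FLAG_TRIGGER_RESET (1UL << 0)

/* Atomically set `bit` in `flags`; return true if it was already set. */
static bool test_and_set_flag(atomic_ulong *flags, unsigned long bit)
{
	return (atomic_fetch_or(flags, bit) & bit) != 0;
}

/* Return true only for the caller that actually armed the reset. */
static bool request_reset(atomic_ulong *flags)
{
	return !test_and_set_flag(flags, FLAG_TRIGGER_RESET);
}
```

A second request while the flag is still set returns false, which mirrors the early return added to ena_tx_timeout() in the diff below.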

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 56 +++++++++++++++++++++-------
 1 file changed, 43 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index ea3c801..606fb5c 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -80,14 +80,18 @@ static void ena_tx_timeout(struct net_device *dev)
 {
 	struct ena_adapter *adapter = netdev_priv(dev);
 
+	/* Change the state of the device to trigger reset
+	 * Check that we are not in the middle or a trigger already
+	 */
+
+	if (test_and_set_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))
+		return;
+
 	u64_stats_update_begin(&adapter->syncp);
 	adapter->dev_stats.tx_timeout++;
 	u64_stats_update_end(&adapter->syncp);
 
 	netif_err(adapter, tx_err, dev, "Transmit time out\n");
-
-	/* Change the state of the device to trigger reset */
-	set_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags);
 }
 
 static void update_rx_ring_mtu(struct ena_adapter *adapter, int mtu)
@@ -1109,7 +1113,8 @@ static int ena_io_poll(struct napi_struct *napi, int budget)
 
 	tx_budget = tx_ring->ring_size / ENA_TX_POLL_BUDGET_DIVIDER;
 
-	if (!test_bit(ENA_FLAG_DEV_UP, &tx_ring->adapter->flags)) {
+	if (!test_bit(ENA_FLAG_DEV_UP, &tx_ring->adapter->flags) ||
+	    test_bit(ENA_FLAG_TRIGGER_RESET, &tx_ring->adapter->flags)) {
 		napi_complete_done(napi, 0);
 		return 0;
 	}
@@ -1698,12 +1703,22 @@ static void ena_down(struct ena_adapter *adapter)
 	adapter->dev_stats.interface_down++;
 	u64_stats_update_end(&adapter->syncp);
 
-	/* After this point the napi handler won't enable the tx queue */
-	ena_napi_disable_all(adapter);
 	netif_carrier_off(adapter->netdev);
 	netif_tx_disable(adapter->netdev);
 
+	/* After this point the napi handler won't enable the tx queue */
+	ena_napi_disable_all(adapter);
+
 	/* After destroy the queue there won't be any new interrupts */
+
+	if (test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags)) {
+		int rc;
+
+		rc = ena_com_dev_reset(adapter->ena_dev);
+		if (rc)
+			dev_err(&adapter->pdev->dev, "Device reset failed\n");
+	}
+
 	ena_destroy_all_io_queues(adapter);
 
 	ena_disable_io_intr_sync(adapter);
@@ -2065,6 +2080,14 @@ static void ena_netpoll(struct net_device *netdev)
 	struct ena_adapter *adapter = netdev_priv(netdev);
 	int i;
 
+	/* Dont schedule NAPI if the driver is in the middle of reset
+	 * or netdev is down.
+	 */
+
+	if (!test_bit(ENA_FLAG_DEV_UP, &adapter->flags) ||
+	    test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))
+		return;
+
 	for (i = 0; i < adapter->num_queues; i++)
 		napi_schedule(&adapter->ena_napi[i].napi);
 }
@@ -2451,6 +2474,14 @@ static void ena_fw_reset_device(struct work_struct *work)
 	bool dev_up, wd_state;
 	int rc;
 
+	if (unlikely(!test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))) {
+		dev_err(&pdev->dev,
+			"device reset schedule while reset bit is off\n");
+		return;
+	}
+
+	netif_carrier_off(netdev);
+
 	del_timer_sync(&adapter->timer_service);
 
 	rtnl_lock();
@@ -2464,12 +2495,6 @@ static void ena_fw_reset_device(struct work_struct *work)
 	 */
 	ena_close(netdev);
 
-	rc = ena_com_dev_reset(ena_dev);
-	if (rc) {
-		dev_err(&pdev->dev, "Device reset failed\n");
-		goto err;
-	}
-
 	ena_free_mgmnt_irq(adapter);
 
 	ena_disable_msix(adapter);
@@ -2482,6 +2507,8 @@ static void ena_fw_reset_device(struct work_struct *work)
 
 	ena_com_mmio_reg_read_request_destroy(ena_dev);
 
+	clear_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags);
+
 	/* Finish with the destroy part. Start the init part */
 
 	rc = ena_device_init(ena_dev, adapter->pdev, &get_feat_ctx, &wd_state);
@@ -2547,6 +2574,9 @@ static void check_for_missing_tx_completions(struct ena_adapter *adapter)
 	if (!test_bit(ENA_FLAG_DEV_UP, &adapter->flags))
 		return;
 
+	if (test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))
+		return;
+
 	budget = ENA_MONITORED_TX_QUEUES;
 
 	for (i = adapter->last_monitored_tx_qid; i < adapter->num_queues; i++) {
@@ -2646,7 +2676,7 @@ static void ena_timer_service(unsigned long data)
 	if (host_info)
 		ena_update_host_info(host_info, adapter->netdev);
 
-	if (unlikely(test_and_clear_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))) {
+	if (unlikely(test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))) {
 		netif_err(adapter, drv, adapter->netdev,
 			  "Trigger reset is on\n");
 		ena_dump_stats_to_dmesg(adapter);
-- 
2.7.4

* [PATCH V3 net-next 09/14] net/ena: use napi_complete_done() return value
  2017-01-26 22:18 [PATCH V3 net-next 00/14] Bug Fixes in ENA driver Netanel Belgazal
                   ` (7 preceding siblings ...)
  2017-01-26 22:18 ` [PATCH V3 net-next 08/14] net/ena: fix potential access to freed memory during device reset Netanel Belgazal
@ 2017-01-26 22:18 ` Netanel Belgazal
  2017-01-26 22:18 ` [PATCH V3 net-next 10/14] net/ena: use READ_ONCE to access completion descriptors Netanel Belgazal
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Netanel Belgazal @ 2017-01-26 22:18 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet

Do not unmask interrupts if we are in busy poll mode.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 44 ++++++++++++++++++----------
 1 file changed, 29 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 606fb5c..d1e1d9d 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -1122,26 +1122,40 @@ static int ena_io_poll(struct napi_struct *napi, int budget)
 	tx_work_done = ena_clean_tx_irq(tx_ring, tx_budget);
 	rx_work_done = ena_clean_rx_irq(rx_ring, napi, budget);
 
-	if ((budget > rx_work_done) && (tx_budget > tx_work_done)) {
-		napi_complete_done(napi, rx_work_done);
+	/* If the device is about to reset or down, avoid unmask
+	 * the interrupt and return 0 so NAPI won't reschedule
+	 */
+	if (unlikely(!test_bit(ENA_FLAG_DEV_UP, &tx_ring->adapter->flags) ||
+		     test_bit(ENA_FLAG_TRIGGER_RESET, &tx_ring->adapter->flags))) {
+		napi_complete_done(napi, 0);
+		ret = 0;
 
+	} else if ((budget > rx_work_done) && (tx_budget > tx_work_done)) {
 		napi_comp_call = 1;
-		/* Tx and Rx share the same interrupt vector */
-		if (ena_com_get_adaptive_moderation_enabled(rx_ring->ena_dev))
-			ena_adjust_intr_moderation(rx_ring, tx_ring);
 
-		/* Update intr register: rx intr delay, tx intr delay and
-		 * interrupt unmask
+		/* Update numa and unmask the interrupt only when schedule
+		 * from the interrupt context (vs from sk_busy_loop)
 		 */
-		ena_com_update_intr_reg(&intr_reg,
-					rx_ring->smoothed_interval,
-					tx_ring->smoothed_interval,
-					true);
+		if (napi_complete_done(napi, rx_work_done)) {
+			/* Tx and Rx share the same interrupt vector */
+			if (ena_com_get_adaptive_moderation_enabled(rx_ring->ena_dev))
+				ena_adjust_intr_moderation(rx_ring, tx_ring);
+
+			/* Update intr register: rx intr delay,
+			 * tx intr delay and interrupt unmask
+			 */
+			ena_com_update_intr_reg(&intr_reg,
+						rx_ring->smoothed_interval,
+						tx_ring->smoothed_interval,
+						true);
+
+			/* It is a shared MSI-X.
+			 * Tx and Rx CQ have pointer to it.
+			 * So we use one of them to reach the intr reg
+			 */
+			ena_com_unmask_intr(rx_ring->ena_com_io_cq, &intr_reg);
+		}
 
-		/* It is a shared MSI-X. Tx and Rx CQ have pointer to it.
-		 * So we use one of them to reach the intr reg
-		 */
-		ena_com_unmask_intr(rx_ring->ena_com_io_cq, &intr_reg);
 
 		ena_update_ring_numa_node(tx_ring, rx_ring);
 
-- 
2.7.4

* [PATCH V3 net-next 10/14] net/ena: use READ_ONCE to access completion descriptors
  2017-01-26 22:18 [PATCH V3 net-next 00/14] Bug Fixes in ENA driver Netanel Belgazal
                   ` (8 preceding siblings ...)
  2017-01-26 22:18 ` [PATCH V3 net-next 09/14] net/ena: use napi_complete_done() return value Netanel Belgazal
@ 2017-01-26 22:18 ` Netanel Belgazal
  2017-01-26 22:18 ` [PATCH V3 net-next 11/14] net/ena: reduce the severity of ena printouts Netanel Belgazal
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Netanel Belgazal @ 2017-01-26 22:18 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet

Completion descriptors are accessed by both the driver and the device.
To avoid reading a stale value, use the READ_ONCE macro.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_com.h     | 1 +
 drivers/net/ethernet/amazon/ena/ena_eth_com.c | 8 ++++----
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.h b/drivers/net/ethernet/amazon/ena/ena_com.h
index 509d7b8..c9b33ee 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.h
+++ b/drivers/net/ethernet/amazon/ena/ena_com.h
@@ -33,6 +33,7 @@
 #ifndef ENA_COM
 #define ENA_COM
 
+#include <linux/compiler.h>
 #include <linux/delay.h>
 #include <linux/dma-mapping.h>
 #include <linux/gfp.h>
diff --git a/drivers/net/ethernet/amazon/ena/ena_eth_com.c b/drivers/net/ethernet/amazon/ena/ena_eth_com.c
index 539c536..f999305 100644
--- a/drivers/net/ethernet/amazon/ena/ena_eth_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_eth_com.c
@@ -45,7 +45,7 @@ static inline struct ena_eth_io_rx_cdesc_base *ena_com_get_next_rx_cdesc(
 	cdesc = (struct ena_eth_io_rx_cdesc_base *)(io_cq->cdesc_addr.virt_addr
 			+ (head_masked * io_cq->cdesc_entry_size_in_bytes));
 
-	desc_phase = (cdesc->status & ENA_ETH_IO_RX_CDESC_BASE_PHASE_MASK) >>
+	desc_phase = (READ_ONCE(cdesc->status) & ENA_ETH_IO_RX_CDESC_BASE_PHASE_MASK) >>
 			ENA_ETH_IO_RX_CDESC_BASE_PHASE_SHIFT;
 
 	if (desc_phase != expected_phase)
@@ -141,7 +141,7 @@ static inline u16 ena_com_cdesc_rx_pkt_get(struct ena_com_io_cq *io_cq,
 
 		ena_com_cq_inc_head(io_cq);
 		count++;
-		last = (cdesc->status & ENA_ETH_IO_RX_CDESC_BASE_LAST_MASK) >>
+		last = (READ_ONCE(cdesc->status) & ENA_ETH_IO_RX_CDESC_BASE_LAST_MASK) >>
 			ENA_ETH_IO_RX_CDESC_BASE_LAST_SHIFT;
 	} while (!last);
 
@@ -489,13 +489,13 @@ int ena_com_tx_comp_req_id_get(struct ena_com_io_cq *io_cq, u16 *req_id)
 	 * expected, it mean that the device still didn't update
 	 * this completion.
 	 */
-	cdesc_phase = cdesc->flags & ENA_ETH_IO_TX_CDESC_PHASE_MASK;
+	cdesc_phase = READ_ONCE(cdesc->flags) & ENA_ETH_IO_TX_CDESC_PHASE_MASK;
 	if (cdesc_phase != expected_phase)
 		return -EAGAIN;
 
 	ena_com_cq_inc_head(io_cq);
 
-	*req_id = cdesc->req_id;
+	*req_id = READ_ONCE(cdesc->req_id);
 
 	return 0;
 }
-- 
2.7.4
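
As a rough illustration of the phase-bit check touched by this patch, here is a userspace sketch. It is not the driver's code: struct model_tx_cdesc, the mask value and the model_read_once_*() helpers are hypothetical, and the volatile casts only approximate what READ_ONCE does for scalar fields (they force the compiler to emit one real load).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_TX_CDESC_PHASE_MASK 0x1u	/* hypothetical bit layout */

/* Hypothetical completion descriptor; the real layout differs. */
struct model_tx_cdesc {
	uint16_t req_id;
	uint8_t flags;
};

/* Simplified stand-ins for READ_ONCE() on 8-bit and 16-bit fields. */
static uint8_t model_read_once_u8(const uint8_t *p)
{
	return *(const volatile uint8_t *)p;
}

static uint16_t model_read_once_u16(const uint16_t *p)
{
	return *(const volatile uint16_t *)p;
}

static int model_tx_comp_req_id_get(const struct model_tx_cdesc *cdesc,
				    uint8_t expected_phase, uint16_t *req_id)
{
	uint8_t cdesc_phase;

	assert(cdesc != NULL && req_id != NULL);

	cdesc_phase = (uint8_t)(model_read_once_u8(&cdesc->flags) &
				MODEL_TX_CDESC_PHASE_MASK);
	if (cdesc_phase != expected_phase)
		return -EAGAIN;	/* entry not written by the device yet */

	*req_id = model_read_once_u16(&cdesc->req_id);
	return 0;
}
```

Note that a forced load like this constrains the compiler only. It does not by itself order the flags read against the req_id read on the CPU.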

* [PATCH V3 net-next 11/14] net/ena: reduce the severity of ena printouts
  2017-01-26 22:18 [PATCH V3 net-next 00/14] Bug Fixes in ENA driver Netanel Belgazal
                   ` (9 preceding siblings ...)
  2017-01-26 22:18 ` [PATCH V3 net-next 10/14] net/ena: use READ_ONCE to access completion descriptors Netanel Belgazal
@ 2017-01-26 22:18 ` Netanel Belgazal
  2017-01-26 22:18 ` [PATCH V3 net-next 12/14] net/ena: change driver's default timeouts Netanel Belgazal
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Netanel Belgazal @ 2017-01-26 22:18 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_com.c    | 27 +++++++++++++++++----------
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 14 +++++++++++---
 2 files changed, 28 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index 46aad3a..5518b1f 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -784,7 +784,7 @@ static int ena_com_get_feature_ex(struct ena_com_dev *ena_dev,
 	int ret;
 
 	if (!ena_com_check_supported_feature_id(ena_dev, feature_id)) {
-		pr_info("Feature %d isn't supported\n", feature_id);
+		pr_debug("Feature %d isn't supported\n", feature_id);
 		return -EPERM;
 	}
 
@@ -1126,7 +1126,13 @@ int ena_com_execute_admin_command(struct ena_com_admin_queue *admin_queue,
 	comp_ctx = ena_com_submit_admin_cmd(admin_queue, cmd, cmd_size,
 					    comp, comp_size);
 	if (unlikely(IS_ERR(comp_ctx))) {
-		pr_err("Failed to submit command [%ld]\n", PTR_ERR(comp_ctx));
+		if (comp_ctx == ERR_PTR(-ENODEV))
+			pr_debug("Failed to submit command [%ld]\n",
+				 PTR_ERR(comp_ctx));
+		else
+			pr_err("Failed to submit command [%ld]\n",
+			       PTR_ERR(comp_ctx));
+
 		return PTR_ERR(comp_ctx);
 	}
 
@@ -1895,7 +1901,7 @@ int ena_com_set_dev_mtu(struct ena_com_dev *ena_dev, int mtu)
 	int ret;
 
 	if (!ena_com_check_supported_feature_id(ena_dev, ENA_ADMIN_MTU)) {
-		pr_info("Feature %d isn't supported\n", ENA_ADMIN_MTU);
+		pr_debug("Feature %d isn't supported\n", ENA_ADMIN_MTU);
 		return -EPERM;
 	}
 
@@ -1948,8 +1954,8 @@ int ena_com_set_hash_function(struct ena_com_dev *ena_dev)
 
 	if (!ena_com_check_supported_feature_id(ena_dev,
 						ENA_ADMIN_RSS_HASH_FUNCTION)) {
-		pr_info("Feature %d isn't supported\n",
-			ENA_ADMIN_RSS_HASH_FUNCTION);
+		pr_debug("Feature %d isn't supported\n",
+			 ENA_ADMIN_RSS_HASH_FUNCTION);
 		return -EPERM;
 	}
 
@@ -2112,7 +2118,8 @@ int ena_com_set_hash_ctrl(struct ena_com_dev *ena_dev)
 
 	if (!ena_com_check_supported_feature_id(ena_dev,
 						ENA_ADMIN_RSS_HASH_INPUT)) {
-		pr_info("Feature %d isn't supported\n", ENA_ADMIN_RSS_HASH_INPUT);
+		pr_debug("Feature %d isn't supported\n",
+			 ENA_ADMIN_RSS_HASH_INPUT);
 		return -EPERM;
 	}
 
@@ -2270,8 +2277,8 @@ int ena_com_indirect_table_set(struct ena_com_dev *ena_dev)
 
 	if (!ena_com_check_supported_feature_id(
 		    ena_dev, ENA_ADMIN_RSS_REDIRECTION_TABLE_CONFIG)) {
-		pr_info("Feature %d isn't supported\n",
-			ENA_ADMIN_RSS_REDIRECTION_TABLE_CONFIG);
+		pr_debug("Feature %d isn't supported\n",
+			 ENA_ADMIN_RSS_REDIRECTION_TABLE_CONFIG);
 		return -EPERM;
 	}
 
@@ -2542,8 +2549,8 @@ int ena_com_init_interrupt_moderation(struct ena_com_dev *ena_dev)
 
 	if (rc) {
 		if (rc == -EPERM) {
-			pr_info("Feature %d isn't supported\n",
-				ENA_ADMIN_INTERRUPT_MODERATION);
+			pr_debug("Feature %d isn't supported\n",
+				 ENA_ADMIN_INTERRUPT_MODERATION);
 			rc = 0;
 		} else {
 			pr_err("Failed to get interrupt moderation admin cmd. rc: %d\n",
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index d1e1d9d..96048bd 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -563,6 +563,7 @@ static void ena_free_all_rx_bufs(struct ena_adapter *adapter)
  */
 static void ena_free_tx_bufs(struct ena_ring *tx_ring)
 {
+	bool print_once = true;
 	u32 i;
 
 	for (i = 0; i < tx_ring->ring_size; i++) {
@@ -574,9 +575,16 @@ static void ena_free_tx_bufs(struct ena_ring *tx_ring)
 		if (!tx_info->skb)
 			continue;
 
-		netdev_notice(tx_ring->netdev,
-			      "free uncompleted tx skb qid %d idx 0x%x\n",
-			      tx_ring->qid, i);
+		if (print_once) {
+			netdev_notice(tx_ring->netdev,
+				      "free uncompleted tx skb qid %d idx 0x%x\n",
+				      tx_ring->qid, i);
+			print_once = false;
+		} else {
+			netdev_dbg(tx_ring->netdev,
+				   "free uncompleted tx skb qid %d idx 0x%x\n",
+				   tx_ring->qid, i);
+		}
 
 		ena_buf = tx_info->bufs;
 		dma_unmap_single(tx_ring->dev,
-- 
2.7.4
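
The print_once logic added to ena_free_tx_bufs() can be modelled in plain C as follows. This is a sketch under assumptions: model_free_tx_bufs() and struct model_log_counts are hypothetical, a bool array stands in for the ring's tx_info->skb pointers, and fprintf() stands in for netdev_notice().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical counters so the effect of the pattern can be observed. */
struct model_log_counts {
	unsigned int notice;
	unsigned int debug;
};

static struct model_log_counts model_free_tx_bufs(const bool *has_skb,
						  size_t ring_size)
{
	struct model_log_counts counts = { 0, 0 };
	bool print_once = true;
	size_t i;

	assert(has_skb != NULL);

	for (i = 0; i < ring_size; i++) {
		if (!has_skb[i])
			continue;

		if (print_once) {
			/* First uncompleted entry: visible at notice level. */
			fprintf(stderr,
				"free uncompleted tx skb idx 0x%zx\n", i);
			counts.notice++;
			print_once = false;
		} else {
			/* Remaining entries: debug level only in the driver. */
			counts.debug++;
		}
	}

	return counts;
}
```

However many entries are still pending, at most one message per call reaches the notice level.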

* [PATCH V3 net-next 12/14] net/ena: change driver's default timeouts
  2017-01-26 22:18 [PATCH V3 net-next 00/14] Bug Fixes in ENA driver Netanel Belgazal
                   ` (10 preceding siblings ...)
  2017-01-26 22:18 ` [PATCH V3 net-next 11/14] net/ena: reduce the severity of ena printouts Netanel Belgazal
@ 2017-01-26 22:18 ` Netanel Belgazal
  2017-01-26 22:18 ` [PATCH V3 net-next 13/14] net/ena: change condition for host attribute configuration Netanel Belgazal
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 18+ messages in thread
From: Netanel Belgazal @ 2017-01-26 22:18 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet

The timeouts were too aggressive and sometimes caused false alarms.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_com.c    | 4 ++--
 drivers/net/ethernet/amazon/ena/ena_netdev.h | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index 5518b1f..8029e7c 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -36,9 +36,9 @@
 /*****************************************************************************/
 
 /* Timeout in micro-sec */
-#define ADMIN_CMD_TIMEOUT_US (1000000)
+#define ADMIN_CMD_TIMEOUT_US (3000000)
 
-#define ENA_ASYNC_QUEUE_DEPTH 4
+#define ENA_ASYNC_QUEUE_DEPTH 16
 #define ENA_ADMIN_QUEUE_DEPTH 32
 
 #define MIN_ENA_VER (((ENA_COMMON_SPEC_VERSION_MAJOR) << \
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.h b/drivers/net/ethernet/amazon/ena/ena_netdev.h
index f0ddc11..efe0ea1 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.h
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.h
@@ -100,7 +100,7 @@
 /* Number of queues to check for missing queues per timer service */
 #define ENA_MONITORED_TX_QUEUES	4
 /* Max timeout packets before device reset */
-#define MAX_NUM_OF_TIMEOUTED_PACKETS 32
+#define MAX_NUM_OF_TIMEOUTED_PACKETS 128
 
 #define ENA_TX_RING_IDX_NEXT(idx, ring_size) (((idx) + 1) & ((ring_size) - 1))
 
@@ -116,9 +116,9 @@
 #define ENA_IO_IRQ_IDX(q)		(ENA_IO_IRQ_FIRST_IDX + (q))
 
 /* ENA device should send keep alive msg every 1 sec.
- * We wait for 3 sec just to be on the safe side.
+ * We wait for 6 sec just to be on the safe side.
  */
-#define ENA_DEVICE_KALIVE_TIMEOUT	(3 * HZ)
+#define ENA_DEVICE_KALIVE_TIMEOUT	(6 * HZ)
 
 #define ENA_MMIO_DISABLE_REG_READ	BIT(0)
 
-- 
2.7.4
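
To make the effect of the new keep-alive value concrete, here is a small userspace sketch of a tick-based expiry check. The names are hypothetical and MODEL_HZ is an assumption (the kernel's HZ is a build-time setting); the driver itself uses the kernel's jiffies helpers rather than this function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_HZ 250u	/* assumption: ticks per second */
#define MODEL_KALIVE_TIMEOUT_OLD (3u * MODEL_HZ)
#define MODEL_KALIVE_TIMEOUT_NEW (6u * MODEL_HZ)

/* Returns true when more than timeout_ticks have passed since the last
 * keep-alive. Unsigned subtraction keeps the result correct when the
 * tick counter wraps around.
 */
static bool model_keep_alive_expired(uint32_t now, uint32_t last_keep_alive,
				     uint32_t timeout_ticks)
{
	/* The wrap-safe comparison only holds for intervals shorter than
	 * half the counter range.
	 */
	assert(timeout_ticks < UINT32_MAX / 2);

	return (uint32_t)(now - last_keep_alive) > timeout_ticks;
}
```

With these assumed values, a 4 second gap between keep-alive messages (1000 ticks) would have tripped the old limit but stays within the new one.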

* [PATCH V3 net-next 13/14] net/ena: change condition for host attribute configuration
  2017-01-26 22:18 [PATCH V3 net-next 00/14] Bug Fixes in ENA driver Netanel Belgazal
                   ` (11 preceding siblings ...)
  2017-01-26 22:18 ` [PATCH V3 net-next 12/14] net/ena: change driver's default timeouts Netanel Belgazal
@ 2017-01-26 22:18 ` Netanel Belgazal
  2017-01-26 22:18 ` [PATCH V3 net-next 14/14] net/ena: update driver version to 1.1.2 Netanel Belgazal
  2017-01-27 16:07 ` [PATCH V3 net-next 00/14] Bug Fixes in ENA driver David Miller
  14 siblings, 0 replies; 18+ messages in thread
From: Netanel Belgazal @ 2017-01-26 22:18 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet

Move the host info config to be the first admin command that is executed.
This change requires the driver to remove the 'feature check'
from the host info configuration flow.
The check is removed since the supported features bitmask field
is retrieved only after calling the ENA_ADMIN_DEVICE_ATTRIBUTES admin command.

If setting host info is not supported, the device will return an error.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_com.c    | 8 +++-----
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 5 +++--
 2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_com.c b/drivers/net/ethernet/amazon/ena/ena_com.c
index 8029e7c..08d11ce 100644
--- a/drivers/net/ethernet/amazon/ena/ena_com.c
+++ b/drivers/net/ethernet/amazon/ena/ena_com.c
@@ -2451,11 +2451,9 @@ int ena_com_set_host_attributes(struct ena_com_dev *ena_dev)
 
 	int ret;
 
-	if (!ena_com_check_supported_feature_id(ena_dev,
-						ENA_ADMIN_HOST_ATTR_CONFIG)) {
-		pr_warn("Set host attribute isn't supported\n");
-		return -EPERM;
-	}
+	/* Host attribute config is called before ena_com_get_dev_attr_feat
+	 * so ena_com can't check if the feature is supported.
+	 */
 
 	memset(&cmd, 0x0, sizeof(cmd));
 	admin_queue = &ena_dev->admin_queue;
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 96048bd..7b9c80f 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -2416,6 +2416,8 @@ static int ena_device_init(struct ena_com_dev *ena_dev, struct pci_dev *pdev,
 	 */
 	ena_com_set_admin_polling_mode(ena_dev, true);
 
+	ena_config_host_info(ena_dev);
+
 	/* Get Device Attributes*/
 	rc = ena_com_get_dev_attr_feat(ena_dev, get_feat_ctx);
 	if (rc) {
@@ -2440,11 +2442,10 @@ static int ena_device_init(struct ena_com_dev *ena_dev, struct pci_dev *pdev,
 
 	*wd_state = !!(aenq_groups & BIT(ENA_ADMIN_KEEP_ALIVE));
 
-	ena_config_host_info(ena_dev);
-
 	return 0;
 
 err_admin_init:
+	ena_com_delete_host_info(ena_dev);
 	ena_com_admin_destroy(ena_dev);
 err_mmio_read_less:
 	ena_com_mmio_reg_read_request_destroy(ena_dev);
-- 
2.7.4
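
The reordering and the matching unwind step can be summarised with a small userspace model. Everything here is hypothetical (struct model_dev, model_device_init() and the get_attr_fails switch); it only mirrors the order of operations visible in the hunks above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state; tracks what must be undone on failure. */
struct model_dev {
	bool admin_ready;
	bool host_info_set;
};

static int model_device_init(struct model_dev *dev, bool get_attr_fails)
{
	assert(dev != NULL);

	dev->admin_ready = true;	/* admin queue comes up first */
	dev->host_info_set = true;	/* host info: first admin command */

	/* Device attributes are fetched only after host info is set. */
	if (get_attr_fails)
		goto err_admin_init;

	return 0;

err_admin_init:
	/* Undo in reverse order of setup. */
	dev->host_info_set = false;	/* driver: ena_com_delete_host_info() */
	dev->admin_ready = false;	/* driver: ena_com_admin_destroy() */
	return -1;
}
```

Because host info is now configured before any step that can jump to the error label, the error path has to release it as well, which is why the patch adds the delete call there.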

* [PATCH V3 net-next 14/14] net/ena: update driver version to 1.1.2
  2017-01-26 22:18 [PATCH V3 net-next 00/14] Bug Fixes in ENA driver Netanel Belgazal
                   ` (12 preceding siblings ...)
  2017-01-26 22:18 ` [PATCH V3 net-next 13/14] net/ena: change condition for host attribute configuration Netanel Belgazal
@ 2017-01-26 22:18 ` Netanel Belgazal
  2017-01-27 16:07 ` [PATCH V3 net-next 00/14] Bug Fixes in ENA driver David Miller
  14 siblings, 0 replies; 18+ messages in thread
From: Netanel Belgazal @ 2017-01-26 22:18 UTC (permalink / raw)
  To: linux-kernel, davem, netdev
  Cc: Netanel Belgazal, dwmw, zorik, alex, saeed, msw, aliguori, nafea,
	eric.dumazet

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.h b/drivers/net/ethernet/amazon/ena/ena_netdev.h
index efe0ea1..ed62d8e 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.h
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.h
@@ -44,7 +44,7 @@
 #include "ena_eth_com.h"
 
 #define DRV_MODULE_VER_MAJOR	1
-#define DRV_MODULE_VER_MINOR	0
+#define DRV_MODULE_VER_MINOR	1
 #define DRV_MODULE_VER_SUBMINOR 2
 
 #define DRV_MODULE_NAME		"ena"
-- 
2.7.4

* Re: [PATCH V3 net-next 00/14] Bug Fixes in ENA driver.
  2017-01-26 22:18 [PATCH V3 net-next 00/14] Bug Fixes in ENA driver Netanel Belgazal
                   ` (13 preceding siblings ...)
  2017-01-26 22:18 ` [PATCH V3 net-next 14/14] net/ena: update driver version to 1.1.2 Netanel Belgazal
@ 2017-01-27 16:07 ` David Miller
  14 siblings, 0 replies; 18+ messages in thread
From: David Miller @ 2017-01-27 16:07 UTC (permalink / raw)
  To: netanel
  Cc: linux-kernel, netdev, dwmw, zorik, alex, saeed, msw, aliguori,
	nafea, eric.dumazet

From: Netanel Belgazal <netanel@annapurnalabs.com>
Date: Fri, 27 Jan 2017 00:18:02 +0200

> Changes between V3 and V2:
> * Fix typos and correct alignment in commit messages.
> * use napi_complete_done() return value to determine when the napi
> handler needs to unmask the interrupts rather than implementing
> non standard solution.
> * Remove new features from this patchset and leave bug fixes only.
> * Give example in the commit message for kernel crashes.
> * Use BIT(x) instead of use the value explicitly.

This series does not apply cleanly to net-next, please respin.

* Re: [PATCH V3 net-next 02/14] net/ena: fix error handling when probe fails
  2017-01-26 22:18 ` [PATCH V3 net-next 02/14] net/ena: fix error handling when probe fails Netanel Belgazal
@ 2017-01-27 23:33   ` Lino Sanfilippo
  2017-01-31 22:14     ` Netanel Belgazal
  0 siblings, 1 reply; 18+ messages in thread
From: Lino Sanfilippo @ 2017-01-27 23:33 UTC (permalink / raw)
  To: Netanel Belgazal, linux-kernel, davem, netdev
  Cc: dwmw, zorik, alex, saeed, msw, aliguori, nafea, eric.dumazet

Hi,

On 26.01.2017 23:18, Netanel Belgazal wrote:
> When driver fails in probe, it will release all resources,
> including adapter.
> In case of probe failure, ena_remove should not try to
> free the adapter resources.
>
> Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
> ---
>  drivers/net/ethernet/amazon/ena/ena_netdev.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
> index 7493ea3..cb60567 100644
> --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
> +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
> @@ -3046,6 +3046,7 @@ static int ena_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>  err_free_region:
>  	ena_release_bars(ena_dev, pdev);
>  err_free_ena_dev:
> +	pci_set_drvdata(pdev, NULL);
>  	vfree(ena_dev);
>  err_disable_device:
>  	pci_disable_device(pdev);
>

Is this change really a "fix"? remove() should only be called if
probe() has been successful before, otherwise not. Did you experience
something different?

Regards,
Lino

* Re: [PATCH V3 net-next 02/14] net/ena: fix error handling when probe fails
  2017-01-27 23:33   ` Lino Sanfilippo
@ 2017-01-31 22:14     ` Netanel Belgazal
  0 siblings, 0 replies; 18+ messages in thread
From: Netanel Belgazal @ 2017-01-31 22:14 UTC (permalink / raw)
  To: Lino Sanfilippo, linux-kernel, davem, netdev
  Cc: dwmw, zorik, alex, saeed, msw, aliguori, nafea, eric.dumazet

Hi,

You are right. I'll remove this patch.

Regards,

Netanel

On 01/28/2017 01:33 AM, Lino Sanfilippo wrote:
> Hi,
>
> On 26.01.2017 23:18, Netanel Belgazal wrote:
>> When driver fails in probe, it will release all resources,
>> including adapter.
>> In case of probe failure, ena_remove should not try to
>> free the adapter resources.
>>
>> Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
>> ---
>>  drivers/net/ethernet/amazon/ena/ena_netdev.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
>> index 7493ea3..cb60567 100644
>> --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
>> +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
>> @@ -3046,6 +3046,7 @@ static int ena_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>>  err_free_region:
>>      ena_release_bars(ena_dev, pdev);
>>  err_free_ena_dev:
>> +    pci_set_drvdata(pdev, NULL);
>>      vfree(ena_dev);
>>  err_disable_device:
>>      pci_disable_device(pdev);
>>
>
> Is this change really a "fix"? remove() should only be called if
> probe() has been successful before, otherwise not. Did you experience
> something different?
>
> Regards,
> Lino
