* [PATCH v2 00/25] linux/eal: Remove most causes of panic on init
@ 2017-02-08 18:51 Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 01/25] eal: CPU init will no longer panic Aaron Conole
                   ` (26 more replies)
  0 siblings, 27 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

In many cases, it's enough to simply let the application know that the
call to initialize DPDK has failed.  The application can then decide
whether to halt based on the error returned, and it could even retry
initialization after some corrective action by the user or the
application.

Changes ->v2:
- Audited all "RTE_LOG (" calls that were introduced, and converted
  to "RTE_LOG("
- Added some fprintf(stderr, "") lines to indicate errors before logging
  is initialized
- Removed assignments to errno.
- Changed patch 14/25 to reflect EFAULT, and document in 25/25

I kept the rte_errno reflection, since this is control-path code and the
init function returns a sentinel value of -1.

Aaron Conole (25):
  eal: CPU init will no longer panic
  eal: return error instead of panic for cpu init
  eal: No panic on hugepages info init
  eal: do not panic on failed hugepage query
  eal: failure to parse args returns error
  eal-common: introduce a way to query cpu support
  eal: Signal error when CPU isn't supported
  eal: do not panic on memzone initialization fails
  eal: set errno when exiting for already called
  eal: Do not panic on log failures
  eal: Do not panic on pci-probe
  eal: do not panic on vfio failure
  eal: do not panic on memory init
  eal: do not panic on tailq init
  eal: do not panic on alarm init
  eal: convert timer_init not to call panic
  eal: change the private pipe call to reflect errno
  eal: Do not panic on interrupt thread init
  eal: do not error if plugins fail to init
  eal_pci: Continue probing even on failures
  eal: do not panic on failed PCI probe
  eal_common_dev: continue initializing vdevs
  eal: do not panic (or abort) if vdev init fails
  eal: do not panic when bus probe fails
  rte_eal_init: add info about rte_errno codes

 lib/librte_eal/common/eal_common_cpuflags.c        |  13 ++-
 lib/librte_eal/common/eal_common_dev.c             |   5 +-
 lib/librte_eal/common/eal_common_lcore.c           |   7 +-
 lib/librte_eal/common/eal_common_pci.c             |  15 ++-
 lib/librte_eal/common/eal_common_tailqs.c          |   3 +-
 .../common/include/generic/rte_cpuflags.h          |   9 ++
 lib/librte_eal/common/include/rte_eal.h            |  27 ++++-
 lib/librte_eal/linuxapp/eal/eal.c                  | 122 +++++++++++++++------
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c    |   6 +-
 lib/librte_eal/linuxapp/eal/eal_interrupts.c       |   5 +-
 10 files changed, 161 insertions(+), 51 deletions(-)

-- 
2.9.3

^ permalink raw reply	[flat|nested] 159+ messages in thread
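
The behaviour described in the cover letter (init returns -1 and the
caller decides what to do) can be sketched from the application side as
follows.  This is a minimal illustration only: mock_eal_init() and
mock_rte_errno are hypothetical stand-ins for rte_eal_init() and
rte_errno so that the example compiles without DPDK, and the "fallback
datapath" is an assumed application policy, not part of the series.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for rte_errno. */
static int mock_rte_errno;

/* Hypothetical stand-in for rte_eal_init(): fails with the given error
 * code, or succeeds when fail_with is 0. */
static int mock_eal_init(int fail_with)
{
	if (fail_with != 0) {
		mock_rte_errno = fail_with;
		return -1;	/* sentinel value, as in the series */
	}
	return 0;
}

enum datapath { DATAPATH_DPDK, DATAPATH_FALLBACK };

/* Choose a datapath instead of aborting when initialization fails. */
enum datapath choose_datapath(int fail_with)
{
	if (mock_eal_init(fail_with) < 0) {
		fprintf(stderr, "EAL init failed: %s\n",
			strerror(mock_rte_errno));
		return DATAPATH_FALLBACK;
	}
	return DATAPATH_DPDK;
}
```

A caller would invoke choose_datapath() once at startup and branch on
the result; whether a fallback exists at all is up to the application.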

* [PATCH v2 01/25] eal: CPU init will no longer panic
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 02/25] eal: return error instead of panic for cpu init Aaron Conole
                   ` (25 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

After this change, the EAL CPU NUMA node resolution step can no longer
emit an rte_panic.  This aligns with the code in rte_eal_init, which
expects failures to return an error code.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/common/eal_common_lcore.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_lcore.c b/lib/librte_eal/common/eal_common_lcore.c
index 2cd4132..84fa0cb 100644
--- a/lib/librte_eal/common/eal_common_lcore.c
+++ b/lib/librte_eal/common/eal_common_lcore.c
@@ -83,16 +83,17 @@ rte_eal_cpu_init(void)
 		config->lcore_role[lcore_id] = ROLE_RTE;
 		lcore_config[lcore_id].core_id = eal_cpu_core_id(lcore_id);
 		lcore_config[lcore_id].socket_id = eal_cpu_socket_id(lcore_id);
-		if (lcore_config[lcore_id].socket_id >= RTE_MAX_NUMA_NODES)
+		if (lcore_config[lcore_id].socket_id >= RTE_MAX_NUMA_NODES) {
 #ifdef RTE_EAL_ALLOW_INV_SOCKET_ID
 			lcore_config[lcore_id].socket_id = 0;
 #else
-			rte_panic("Socket ID (%u) is greater than "
+			RTE_LOG(ERR, EAL, "Socket ID (%u) is greater than "
 				"RTE_MAX_NUMA_NODES (%d)\n",
 				lcore_config[lcore_id].socket_id,
 				RTE_MAX_NUMA_NODES);
+			return -1;
 #endif
-
+		}
 		RTE_LOG(DEBUG, EAL, "Detected lcore %u as "
 				"core %u on socket %u\n",
 				lcore_id, lcore_config[lcore_id].core_id,
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread
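
The control flow that patch 01/25 introduces (log and return -1 unless
a build option remaps the invalid socket ID to 0) can be shown in
isolation.  The names below are hypothetical: SKETCH_MAX_NUMA_NODES
stands in for RTE_MAX_NUMA_NODES, SKETCH_ALLOW_INV_SOCKET_ID for
RTE_EAL_ALLOW_INV_SOCKET_ID, and fprintf() replaces RTE_LOG().

```c
#include <stdio.h>

/* Hypothetical limit; stands in for RTE_MAX_NUMA_NODES. */
#define SKETCH_MAX_NUMA_NODES 8

/* Returns 0 with *socket_id valid, or -1 if it is out of range and the
 * build does not allow remapping invalid IDs to socket 0. */
int sketch_check_socket(unsigned int *socket_id)
{
	if (*socket_id >= SKETCH_MAX_NUMA_NODES) {
#ifdef SKETCH_ALLOW_INV_SOCKET_ID
		*socket_id = 0;
#else
		fprintf(stderr, "Socket ID (%u) is out of range (max %d)\n",
			*socket_id, SKETCH_MAX_NUMA_NODES);
		return -1;
#endif
	}
	return 0;
}
```

The braces around the #ifdef block matter: without them, only the first
statement after the condition would be guarded once a second statement
(the return) is added.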

* [PATCH v2 02/25] eal: return error instead of panic for cpu init
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 01/25] eal: CPU init will no longer panic Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 03/25] eal: No panic on hugepages info init Aaron Conole
                   ` (24 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

There may be no way to gracefully recover, but the application
should be notified that a failure happened, rather than completely
aborting.  This allows the user to proceed with a "slow-path" type
solution.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index bf6b818..d8e00f5 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -767,8 +767,11 @@ rte_eal_init(int argc, char **argv)
 	/* set log level as early as possible */
 	rte_set_log_level(internal_config.log_level);
 
-	if (rte_eal_cpu_init() < 0)
-		rte_panic("Cannot detect lcores\n");
+	if (rte_eal_cpu_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot detect lcores\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	fctret = eal_parse_args(argc, argv);
 	if (fctret < 0)
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v2 03/25] eal: No panic on hugepages info init
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 01/25] eal: CPU init will no longer panic Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 02/25] eal: return error instead of panic for cpu init Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 04/25] eal: do not panic on failed hugepage query Aaron Conole
                   ` (23 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

When attempting to scan hugepages, signal to eal.c that an error has
occurred, rather than performing a panic.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index 18858e2..4d47eaf 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -283,9 +283,11 @@ eal_hugepage_info_init(void)
 	struct dirent *dirent;
 
 	dir = opendir(sys_dir_path);
-	if (dir == NULL)
-		rte_panic("Cannot open directory %s to read system hugepage "
+	if (dir == NULL) {
+		RTE_LOG(ERR, EAL, "Cannot open directory %s to read system hugepage "
 			  "info\n", sys_dir_path);
+		return -1;
+	}
 
 	for (dirent = readdir(dir); dirent != NULL; dirent = readdir(dir)) {
 		struct hugepage_info *hpi;
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v2 04/25] eal: do not panic on failed hugepage query
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
                   ` (2 preceding siblings ...)
  2017-02-08 18:51 ` [PATCH v2 03/25] eal: No panic on hugepages info init Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 05/25] eal: failure to parse args returns error Aaron Conole
                   ` (22 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

If we fail to acquire hugepage information, simply signal an error to
the application.  This clears the run_once counter, allowing the user or
application to take a corrective action and retry.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index d8e00f5..8daa2be 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -780,8 +780,12 @@ rte_eal_init(int argc, char **argv)
 	if (internal_config.no_hugetlbfs == 0 &&
 			internal_config.process_type != RTE_PROC_SECONDARY &&
 			internal_config.xen_dom0_support == 0 &&
-			eal_hugepage_info_init() < 0)
-		rte_panic("Cannot get hugepage information\n");
+			eal_hugepage_info_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot get hugepage information\n");
+		rte_errno = EACCES;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 
 	if (internal_config.memory == 0 && internal_config.force_sockets == 0) {
 		if (internal_config.no_hugetlbfs)
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread
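
The retry behaviour that patch 04/25 relies on (clear the run-once
guard on a recoverable failure so a later call is not rejected) can be
sketched with C11 atomics.  Everything here is illustrative:
sketch_init(), sketch_errno and the hugepages_ok parameter are
hypothetical, and atomic_flag is used in place of the EAL's
rte_atomic32 helpers.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the EAL's run_once counter and rte_errno. */
static atomic_flag run_once = ATOMIC_FLAG_INIT;
static int sketch_errno;

/* hugepages_ok simulates whether the hugepage query succeeds. */
int sketch_init(bool hugepages_ok)
{
	/* Reject a call while an earlier one still holds the guard. */
	if (atomic_flag_test_and_set(&run_once)) {
		sketch_errno = EALREADY;
		return -1;
	}

	if (!hugepages_ok) {
		sketch_errno = EACCES;
		atomic_flag_clear(&run_once);	/* recoverable: allow a retry */
		return -1;
	}

	return 0;
}
```

If the guard were left set on the failure path, every retry would fail
with the "already called" error regardless of any corrective action.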

* [PATCH v2 05/25] eal: failure to parse args returns error
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
                   ` (3 preceding siblings ...)
  2017-02-08 18:51 ` [PATCH v2 04/25] eal: do not panic on failed hugepage query Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 06/25] eal-common: introduce a way to query cpu support Aaron Conole
                   ` (21 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

It's possible that the application could take a corrective action here,
and either prompt the user for different arguments, or at least perform
better logging.  Exiting this early prevents any useful information
gathering from the application layer.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 8daa2be..a626774 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -774,8 +774,12 @@ rte_eal_init(int argc, char **argv)
 	}
 
 	fctret = eal_parse_args(argc, argv);
-	if (fctret < 0)
-		exit(1);
+	if (fctret < 0) {
+		RTE_LOG(ERR, EAL, "Invalid 'command line' arguments\n");
+		rte_errno = EINVAL;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 
 	if (internal_config.no_hugetlbfs == 0 &&
 			internal_config.process_type != RTE_PROC_SECONDARY &&
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v2 06/25] eal-common: introduce a way to query cpu support
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
                   ` (4 preceding siblings ...)
  2017-02-08 18:51 ` [PATCH v2 05/25] eal: failure to parse args returns error Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 07/25] eal: Signal error when CPU isn't supported Aaron Conole
                   ` (20 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

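This adds a new API, rte_cpu_is_supported(), which reports whether the
running CPU supports the features selected at compile time instead of
exiting.  rte_cpu_check_supported() becomes a wrapper around it.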

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/common/eal_common_cpuflags.c          | 13 +++++++++++--
 lib/librte_eal/common/include/generic/rte_cpuflags.h |  9 +++++++++
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_cpuflags.c b/lib/librte_eal/common/eal_common_cpuflags.c
index b5f76f7..2c2127b 100644
--- a/lib/librte_eal/common/eal_common_cpuflags.c
+++ b/lib/librte_eal/common/eal_common_cpuflags.c
@@ -43,6 +43,13 @@
 void
 rte_cpu_check_supported(void)
 {
+	if (!rte_cpu_is_supported())
+		exit(1);
+}
+
+bool
+rte_cpu_is_supported(void)
+{
 	/* This is generated at compile-time by the build system */
 	static const enum rte_cpu_flag_t compile_time_flags[] = {
 			RTE_COMPILE_TIME_CPUFLAGS
@@ -57,14 +64,16 @@ rte_cpu_check_supported(void)
 			fprintf(stderr,
 				"ERROR: CPU feature flag lookup failed with error %d\n",
 				ret);
-			exit(1);
+			return false;
 		}
 		if (!ret) {
 			fprintf(stderr,
 			        "ERROR: This system does not support \"%s\".\n"
 			        "Please check that RTE_MACHINE is set correctly.\n",
 			        rte_cpu_get_flag_name(compile_time_flags[i]));
-			exit(1);
+			return false;
 		}
 	}
+
+	return true;
 }
diff --git a/lib/librte_eal/common/include/generic/rte_cpuflags.h b/lib/librte_eal/common/include/generic/rte_cpuflags.h
index 71321f3..e4342ad 100644
--- a/lib/librte_eal/common/include/generic/rte_cpuflags.h
+++ b/lib/librte_eal/common/include/generic/rte_cpuflags.h
@@ -40,6 +40,7 @@
  */
 
 #include <errno.h>
+#include <stdbool.h>
 
 /**
  * Enumeration of all CPU features supported
@@ -82,4 +83,12 @@ rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature);
 void
 rte_cpu_check_supported(void);
 
+/**
+ * This function checks that the currently used CPU supports the CPU features
+ * that were specified at compile time. It is called automatically within the
+ * EAL, so does not need to be used by applications.  This version returns a
+ * result so that decisions may be made (for instance, graceful shutdowns).
+ */
+bool
+rte_cpu_is_supported(void);
 #endif /* _RTE_CPUFLAGS_H_ */
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread
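
The refactoring pattern in patch 06/25 (a boolean query that reports
the result, plus a thin wrapper that keeps the old exit-on-failure
behaviour) can be sketched independently of DPDK.  The feature enum and
function names below are hypothetical; the real code iterates over
RTE_COMPILE_TIME_CPUFLAGS and calls rte_cpu_get_flag_enabled().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical feature IDs; the real code uses enum rte_cpu_flag_t. */
enum sketch_feature { FEAT_A, FEAT_B, FEAT_C, FEAT_COUNT };

/* Query form: returns true only if every required feature is available. */
bool sketch_features_supported(const enum sketch_feature *required,
			       size_t n, const bool *available)
{
	for (size_t i = 0; i < n; i++) {
		if (!available[required[i]]) {
			fprintf(stderr,
				"ERROR: feature %d is not supported\n",
				(int)required[i]);
			return false;
		}
	}
	return true;
}

/* Legacy form: same check, but terminates the process on failure. */
void sketch_features_check_supported(const enum sketch_feature *required,
				     size_t n, const bool *available)
{
	if (!sketch_features_supported(required, n, available))
		exit(1);
}
```

Keeping the exiting wrapper means existing callers are unaffected,
while new callers can use the query form and decide for themselves.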

* [PATCH v2 07/25] eal: Signal error when CPU isn't supported
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
                   ` (5 preceding siblings ...)
  2017-02-08 18:51 ` [PATCH v2 06/25] eal-common: introduce a way to query cpu support Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 08/25] eal: do not panic on memzone initialization fails Aaron Conole
                   ` (19 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

The application can now exit gracefully when the CPU is unsupported.
Applications that run non-DPDK datapaths in concert with DPDK datapaths
are no longer forced to exit on unsupported CPUs.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index a626774..88d59a2 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -752,7 +752,11 @@ rte_eal_init(int argc, char **argv)
 	char thread_name[RTE_MAX_THREAD_NAME_LEN];
 
 	/* checks if the machine is adequate */
-	rte_cpu_check_supported();
+	if (!rte_cpu_is_supported()) {
+		fprintf(stderr, "EAL: FATAL - unsupported cpu type.\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	if (!rte_atomic32_test_and_set(&run_once))
 		return -1;
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v2 08/25] eal: do not panic on memzone initialization fails
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
                   ` (6 preceding siblings ...)
  2017-02-08 18:51 ` [PATCH v2 07/25] eal: Signal error when CPU isn't supported Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 09/25] eal: set errno when exiting for already called Aaron Conole
                   ` (18 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

When memzone initialization fails, report the error to the calling
application rather than panic().  Without a good way of detaching /
releasing hugepages, at this point the application will have to restart.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 88d59a2..8f9bce1 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -832,8 +832,11 @@ rte_eal_init(int argc, char **argv)
 	/* the directories are locked during eal_hugepage_info_init */
 	eal_hugedirs_unlock();
 
-	if (rte_eal_memzone_init() < 0)
-		rte_panic("Cannot init memzone\n");
+	if (rte_eal_memzone_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init memzone\n");
+		rte_errno = ENODEV;
+		return -1;
+	}
 
 	if (rte_eal_tailqs_init() < 0)
 		rte_panic("Cannot init tail queues for objects\n");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v2 09/25] eal: set errno when exiting for already called
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
                   ` (7 preceding siblings ...)
  2017-02-08 18:51 ` [PATCH v2 08/25] eal: do not panic on memzone initialization fails Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 10/25] eal: Do not panic on log failures Aaron Conole
                   ` (17 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 8f9bce1..debb083 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -758,8 +758,11 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (!rte_atomic32_test_and_set(&run_once))
+	if (!rte_atomic32_test_and_set(&run_once)) {
+		fprintf(stderr, "EAL: ERROR - already called initialization.\n");
+		rte_errno = EALREADY;
 		return -1;
+	}
 
 	logid = strrchr(argv[0], '/');
 	logid = strdup(logid ? logid + 1: argv[0]);
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v2 10/25] eal: Do not panic on log failures
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
                   ` (8 preceding siblings ...)
  2017-02-08 18:51 ` [PATCH v2 09/25] eal: set errno when exiting for already called Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 11/25] eal: Do not panic on pci-probe Aaron Conole
                   ` (16 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

When log initialization fails, it's generally because the fopencookie
failed.  While this is rare in practice, it could happen, and it is
likely because of memory pressure.  So, flag the error, and allow the
user to retry.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index debb083..1d828bf 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -818,8 +818,13 @@ rte_eal_init(int argc, char **argv)
 
 	rte_config_init();
 
-	if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0)
-		rte_panic("Cannot init logs\n");
+	if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init logging\n");
+		fprintf(stderr, "EAL: ERROR - cannot init logging.\n");
+		rte_errno = EIO;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 
 	if (rte_eal_pci_init() < 0)
 		rte_panic("Cannot init PCI\n");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v2 11/25] eal: Do not panic on pci-probe
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
                   ` (9 preceding siblings ...)
  2017-02-08 18:51 ` [PATCH v2 10/25] eal: Do not panic on log failures Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 12/25] eal: do not panic on vfio failure Aaron Conole
                   ` (15 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

This will usually be an issue because of permissions.  However, it could
also be caused by OOM.  In either case, errno will contain the
underlying cause.  It is safe to re-init the system here, so allow the
application to take corrective action and reinit.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 1d828bf..ab1aeef 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -826,8 +826,12 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_eal_pci_init() < 0)
-		rte_panic("Cannot init PCI\n");
+	if (rte_eal_pci_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init PCI\n");
+		rte_errno = EUNATCH;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 
 #ifdef VFIO_PRESENT
 	if (rte_eal_vfio_setup() < 0)
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v2 12/25] eal: do not panic on vfio failure
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
                   ` (10 preceding siblings ...)
  2017-02-08 18:51 ` [PATCH v2 11/25] eal: Do not panic on pci-probe Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 13/25] eal: do not panic on memory init Aaron Conole
                   ` (14 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index ab1aeef..ec26153 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -834,8 +834,12 @@ rte_eal_init(int argc, char **argv)
 	}
 
 #ifdef VFIO_PRESENT
-	if (rte_eal_vfio_setup() < 0)
-		rte_panic("Cannot init VFIO\n");
+	if (rte_eal_vfio_setup() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init VFIO\n");
+		rte_errno = EAGAIN;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 #endif
 
 	if (rte_eal_memory_init() < 0)
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v2 13/25] eal: do not panic on memory init
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
                   ` (11 preceding siblings ...)
  2017-02-08 18:51 ` [PATCH v2 12/25] eal: do not panic on vfio failure Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 14/25] eal: do not panic on tailq init Aaron Conole
                   ` (13 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

This can only happen when access to hugepages (either as primary or
secondary process) fails (and that is usually permissions).  Since the
manner of failure is not reversible, we cannot allow retry.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index ec26153..f5f0ad4 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -842,8 +842,11 @@ rte_eal_init(int argc, char **argv)
 	}
 #endif
 
-	if (rte_eal_memory_init() < 0)
-		rte_panic("Cannot init memory\n");
+	if (rte_eal_memory_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init memory\n");
+		rte_errno = EACCES;
+		return -1;
+	}
 
 	/* the directories are locked during eal_hugepage_info_init */
 	eal_hugedirs_unlock();
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v2 14/25] eal: do not panic on tailq init
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
                   ` (12 preceding siblings ...)
  2017-02-08 18:51 ` [PATCH v2 13/25] eal: do not panic on memory init Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 15/25] eal: do not panic on alarm init Aaron Conole
                   ` (12 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

There are some theoretical race conditions in the system that _could_
cause early tailq init to fail; however, there is no need to panic the
application.  While it can't continue using DPDK, it can at least
report the failure to the user more clearly.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/common/eal_common_tailqs.c | 3 +--
 lib/librte_eal/linuxapp/eal/eal.c         | 7 +++++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_tailqs.c b/lib/librte_eal/common/eal_common_tailqs.c
index bb08ec8..4f69828 100644
--- a/lib/librte_eal/common/eal_common_tailqs.c
+++ b/lib/librte_eal/common/eal_common_tailqs.c
@@ -188,8 +188,7 @@ rte_eal_tailqs_init(void)
 		if (t->head == NULL) {
 			RTE_LOG(ERR, EAL,
 				"Cannot initialize tailq: %s\n", t->name);
-			/* no need to TAILQ_REMOVE, we are going to panic in
-			 * rte_eal_init() */
+			/* TAILQ_REMOVE not needed, error is already fatal */
 			goto fail;
 		}
 	}
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index f5f0ad4..adcebc8 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -857,8 +857,11 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_eal_tailqs_init() < 0)
-		rte_panic("Cannot init tail queues for objects\n");
+	if (rte_eal_tailqs_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init tail queues for objects\n");
+		rte_errno = EFAULT;
+		return -1;
+	}
 
 	if (rte_eal_alarm_init() < 0)
 		rte_panic("Cannot init interrupt-handling thread\n");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v2 15/25] eal: do not panic on alarm init
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
                   ` (13 preceding siblings ...)
  2017-02-08 18:51 ` [PATCH v2 14/25] eal: do not panic on tailq init Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 16/25] eal: convert timer_init not to call panic Aaron Conole
                   ` (11 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

The rte_eal_alarm_init() call uses the Linux timerfd framework to create
a poll()-able timer using standard POSIX file operations.  This could
fail for a few reasons given in the man pages, but many could be
corrected by the user application.  No need to panic.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index adcebc8..df8c794 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -61,6 +61,7 @@
 #include <rte_launch.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
+#include <rte_errno.h>
 #include <rte_per_lcore.h>
 #include <rte_lcore.h>
 #include <rte_log.h>
@@ -863,8 +864,11 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_eal_alarm_init() < 0)
-		rte_panic("Cannot init interrupt-handling thread\n");
+	if (rte_eal_alarm_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init interrupt-handling thread\n");
+		/* rte_eal_alarm_init sets rte_errno on failure. */
+		return -1;
+	}
 
 	if (rte_eal_timer_init() < 0)
 		rte_panic("Cannot init HPET or TSC timers\n");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v2 16/25] eal: convert timer_init not to call panic
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
                   ` (14 preceding siblings ...)
  2017-02-08 18:51 ` [PATCH v2 15/25] eal: do not panic on alarm init Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 17/25] eal: change the private pipe call to reflect errno Aaron Conole
                   ` (10 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

After code inspection, there is no way for eal_timer_init() to fail.  It
simply returns 0 in all cases.  As such, this test could either go away
or stay here as 'future-proofing'.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index df8c794..a9f1385 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -870,8 +870,11 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_eal_timer_init() < 0)
-		rte_panic("Cannot init HPET or TSC timers\n");
+	if (rte_eal_timer_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init HPET or TSC timers\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	eal_check_mem_on_local_socket();
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v2 17/25] eal: change the private pipe call to reflect errno
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
                   ` (15 preceding siblings ...)
  2017-02-08 18:51 ` [PATCH v2 16/25] eal: convert timer_init not to call panic Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 18/25] eal: Do not panic on interrupt thread init Aaron Conole
                   ` (9 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

There could be some confusion as to why the call failed - this change
will always reflect the value of the error in rte_errno.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal_interrupts.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index b5b3f2b..b1a287c 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -896,13 +896,16 @@ rte_eal_intr_init(void)
 	 * create a pipe which will be waited by epoll and notified to
 	 * rebuild the wait list of epoll.
 	 */
-	if (pipe(intr_pipe.pipefd) < 0)
+	if (pipe(intr_pipe.pipefd) < 0) {
+		rte_errno = errno;
 		return -1;
+	}
 
 	/* create the host thread to wait/handle the interrupt */
 	ret = pthread_create(&intr_thread, NULL,
 			eal_intr_thread_main, NULL);
 	if (ret != 0) {
+		rte_errno = ret;
 		RTE_LOG(ERR, EAL,
 			"Failed to create thread for interrupt handling\n");
 	} else {
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread
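
The two assignments in the patch above follow two different POSIX
conventions: pipe() returns -1 and sets errno, while pthread_create() returns
the error number directly.  A minimal sketch of the first case, assuming a
stand-in variable demo_errno in place of rte_errno and a hypothetical helper
name:

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical stand-in for rte_errno (not the DPDK variable). */
int demo_errno;

/*
 * Create a wake-up pipe.  errno is copied immediately after pipe()
 * fails, before any later library call (such as logging) can
 * overwrite it.
 */
int demo_create_wakeup_pipe(int pipefd[2])
{
	if (pipe(pipefd) < 0) {
		demo_errno = errno;
		fprintf(stderr, "pipe() failed with error %d\n", demo_errno);
		return -1;
	}
	return 0;
}
```

Saving the value first matters because fprintf() and similar calls are
allowed to change errno even when they succeed.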

* [PATCH v2 18/25] eal: Do not panic on interrupt thread init
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
                   ` (16 preceding siblings ...)
  2017-02-08 18:51 ` [PATCH v2 17/25] eal: change the private pipe call to reflect errno Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 19/25] eal: do not error if plugins fail to init Aaron Conole
                   ` (8 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

When initializing the interrupt thread, there are a number of possible
reasons for failure - some of which are correctable by the application.
Do not panic() needlessly, and give the application a chance to reflect
this information to the user.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index a9f1385..14ec2e0 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -889,8 +889,10 @@ rte_eal_init(int argc, char **argv)
 		rte_config.master_lcore, (int)thread_id, cpuset,
 		ret == 0 ? "" : "...");
 
-	if (rte_eal_intr_init() < 0)
-		rte_panic("Cannot init interrupt-handling thread\n");
+	if (rte_eal_intr_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init interrupt-handling thread\n");
+		return -1;
+	}
 
 	if (rte_bus_scan())
 		rte_panic("Cannot scan the buses for devices\n");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v2 19/25] eal: do not error if plugins fail to init
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
                   ` (17 preceding siblings ...)
  2017-02-08 18:51 ` [PATCH v2 18/25] eal: Do not panic on interrupt thread init Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 20/25] eal_pci: Continue probing even on failures Aaron Conole
                   ` (7 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Plugins are useful and important.  However, it seems crazy to abort
everything just because they don't initialize properly.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 14ec2e0..2ca6c20 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -878,8 +878,9 @@ rte_eal_init(int argc, char **argv)
 
 	eal_check_mem_on_local_socket();
 
-	if (eal_plugins_init() < 0)
-		rte_panic("Cannot init plugins\n");
+	if (eal_plugins_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init plugins\n");
+	}
 
 	eal_thread_init_master(rte_config.master_lcore);
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v2 20/25] eal_pci: Continue probing even on failures
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
                   ` (18 preceding siblings ...)
  2017-02-08 18:51 ` [PATCH v2 19/25] eal: do not error if plugins fail to init Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 21/25] eal: do not panic on failed PCI probe Aaron Conole
                   ` (6 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Some devices may be inaccessible for a variety of reasons, or the
PCI-bus may be unavailable causing the whole thing to fail.  Still,
it is better to keep probing the remaining devices.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/common/eal_common_pci.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index 72547bd..9416190 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -69,6 +69,7 @@
 #include <sys/queue.h>
 #include <sys/mman.h>
 
+#include <rte_errno.h>
 #include <rte_interrupts.h>
 #include <rte_log.h>
 #include <rte_pci.h>
@@ -416,6 +417,7 @@ rte_eal_pci_probe(void)
 	struct rte_pci_device *dev = NULL;
 	struct rte_devargs *devargs;
 	int probe_all = 0;
+	int ret_1 = 0;
 	int ret = 0;
 
 	if (rte_eal_devargs_type_count(RTE_DEVTYPE_WHITELISTED_PCI) == 0)
@@ -430,17 +432,20 @@ rte_eal_pci_probe(void)
 
 		/* probe all or only whitelisted devices */
 		if (probe_all)
-			ret = pci_probe_all_drivers(dev);
+			ret_1 = pci_probe_all_drivers(dev);
 		else if (devargs != NULL &&
 			devargs->type == RTE_DEVTYPE_WHITELISTED_PCI)
-			ret = pci_probe_all_drivers(dev);
-		if (ret < 0)
-			rte_exit(EXIT_FAILURE, "Requested device " PCI_PRI_FMT
+			ret_1 = pci_probe_all_drivers(dev);
+		if (ret_1 < 0) {
+			RTE_LOG(ERR, EAL, "Requested device " PCI_PRI_FMT
 				 " cannot be used\n", dev->addr.domain, dev->addr.bus,
 				 dev->addr.devid, dev->addr.function);
+			rte_errno = errno;
+			ret = 1;
+		}
 	}
 
-	return 0;
+	return -ret;
 }
 
 /* dump one device */
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread
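
The "log, remember, keep going" loop from the patch above can be sketched in
isolation.  Everything here is an assumption made for illustration: struct
demo_dev, demo_probe_one() and demo_probe_all() are hypothetical and do not
exist in the EAL.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical device record; 'broken' marks a device whose probe fails. */
struct demo_dev {
	const char *name;
	int broken;
};

/* Hypothetical per-device probe: 0 on success, negative on failure. */
int demo_probe_one(const struct demo_dev *dev)
{
	return dev->broken ? -1 : 0;
}

/*
 * Probe every device.  A failure is logged and remembered, but the loop
 * continues with the next device.  Returns 0 if every probe succeeded
 * and -1 if at least one failed.
 */
int demo_probe_all(const struct demo_dev *devs, size_t n, size_t *probed_ok)
{
	int failed = 0;
	size_t i;

	*probed_ok = 0;
	for (i = 0; i < n; i++) {
		if (demo_probe_one(&devs[i]) < 0) {
			fprintf(stderr, "Requested device %s cannot be used\n",
				devs[i].name);
			failed = 1;
			continue;
		}
		(*probed_ok)++;
	}

	return -failed;
}
```

Keeping the per-device result separate from the accumulated status is what
lets one bad device be reported without masking the others.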

* [PATCH v2 21/25] eal: do not panic on failed PCI probe
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
                   ` (19 preceding siblings ...)
  2017-02-08 18:51 ` [PATCH v2 20/25] eal_pci: Continue probing even on failures Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 22/25] eal_common_dev: continue initializing vdevs Aaron Conole
                   ` (5 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

It may even be possible to simply log the error and continue, letting
the user check the logs and restart the application when something has
failed.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 2ca6c20..a692307 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -939,8 +939,11 @@ rte_eal_init(int argc, char **argv)
 		rte_panic("Cannot probe devices\n");
 
 	/* Probe & Initialize PCI devices */
-	if (rte_eal_pci_probe())
-		rte_panic("Cannot probe PCI\n");
+	if (rte_eal_pci_probe()) {
+		RTE_LOG(ERR, EAL, "Cannot probe PCI\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	if (rte_eal_dev_init() < 0)
 		rte_panic("Cannot init pmd devices\n");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v2 22/25] eal_common_dev: continue initializing vdevs
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
                   ` (20 preceding siblings ...)
  2017-02-08 18:51 ` [PATCH v2 21/25] eal: do not panic on failed PCI probe Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 23/25] eal: do not panic (or abort) if vdev init fails Aaron Conole
                   ` (4 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Even if one vdev should fail, there's no need to prevent further
processing.  Log the error, and reflect it to the higher levels to
decide.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/common/eal_common_dev.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
index 4f3b493..9889997 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -80,6 +80,7 @@ int
 rte_eal_dev_init(void)
 {
 	struct rte_devargs *devargs;
+	int ret = 0;
 
 	/*
 	 * Note that the dev_driver_list is populated here
@@ -97,11 +98,11 @@ rte_eal_dev_init(void)
 					devargs->args)) {
 			RTE_LOG(ERR, EAL, "failed to initialize %s device\n",
 					devargs->virt.drv_name);
-			return -1;
+			ret = -1;
 		}
 	}
 
-	return 0;
+	return ret;
 }
 
 int rte_eal_dev_attach(const char *name, const char *devargs)
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v2 23/25] eal: do not panic (or abort) if vdev init fails
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
                   ` (21 preceding siblings ...)
  2017-02-08 18:51 ` [PATCH v2 22/25] eal_common_dev: continue initializing vdevs Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 24/25] eal: do not panic when bus probe fails Aaron Conole
                   ` (3 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Seems like it's possible to continue.  At least, the error is reflected
properly in the logs.  A user could then go and correct or investigate
the situation.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index a692307..d338e61 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -946,7 +946,7 @@ rte_eal_init(int argc, char **argv)
 	}
 
 	if (rte_eal_dev_init() < 0)
-		rte_panic("Cannot init pmd devices\n");
+		RTE_LOG(ERR, EAL, "Cannot init pmd devices\n");
 
 	rte_eal_mcfg_complete();
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v2 24/25] eal: do not panic when bus probe fails
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
                   ` (22 preceding siblings ...)
  2017-02-08 18:51 ` [PATCH v2 23/25] eal: do not panic (or abort) if vdev init fails Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 18:51 ` [PATCH v2 25/25] rte_eal_init: add info about rte_errno codes Aaron Conole
                   ` (2 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index d338e61..5018796 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -935,8 +935,11 @@ rte_eal_init(int argc, char **argv)
 	rte_eal_mp_wait_lcore();
 
 	/* Probe all the buses and devices/drivers on them */
-	if (rte_bus_probe())
-		rte_panic("Cannot probe devices\n");
+	if (rte_bus_probe()) {
+		RTE_LOG(ERR, EAL, "Cannot probe devices\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	/* Probe & Initialize PCI devices */
 	if (rte_eal_pci_probe()) {
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v2 25/25] rte_eal_init: add info about rte_errno codes
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
                   ` (23 preceding siblings ...)
  2017-02-08 18:51 ` [PATCH v2 24/25] eal: do not panic when bus probe fails Aaron Conole
@ 2017-02-08 18:51 ` Aaron Conole
  2017-02-08 19:11 ` [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 18:51 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

The rte_eal_init function will now pass failure reason hints to the
application.  To help app developers decipher this, add some brief
information about what the codes are indicating.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/common/include/rte_eal.h | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index 03fee50..5f184c9 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -159,7 +159,32 @@ int rte_eal_iopl_init(void);
  *     function call and should not be further interpreted by the
  *     application.  The EAL does not take any ownership of the memory used
  *     for either the argv array, or its members.
- *   - On failure, a negative error value.
+ *   - On failure, -1 and rte_errno is set to a value indicating the cause
+ *     for failure.
+ *
+ *   The error codes returned via rte_errno:
+ *     EACCES indicates a permissions issue.
+ *
+ *     EAGAIN indicates either a bus or system resource was not available,
+ *            try again.
+ *
+ *     EALREADY indicates that the rte_eal_init function has already been
+ *              called, and cannot be called again.
+ *
+ *     EFAULT indicates the tailq configuration name was not found in
+ *            memory configuration.
+ *
+ *     EINVAL indicates invalid parameters were passed as argv/argc.
+ *
+ *     EIO indicates failure to setup the logging handlers.  This is usually
+ *         caused by an out-of-memory condition.
+ *
+ *     ENODEV indicates memory setup issues.
+ *
+ *     ENOTSUP indicates that the EAL cannot initialize on this system.
+ *
+ *     EUNATCH indicates that the PCI bus is either not present, or is not
+ *             readable by the eal.
  */
 int rte_eal_init(int argc, char **argv);
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread
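
An application could turn the codes documented in the patch above into
user-facing hints along these lines.  This is a sketch only:
demo_init_error_hint() is a hypothetical helper, the strings paraphrase the
comment text in the patch, and EUNATCH is guarded because it is not defined
on every platform.

```c
#include <errno.h>

/*
 * Map an error code reported after a failed init call to a short hint.
 * The wording follows the rte_eal_init() comment in the patch.
 */
const char *demo_init_error_hint(int code)
{
	switch (code) {
	case EACCES:
		return "permissions issue";
	case EAGAIN:
		return "bus or system resource unavailable, try again";
	case EALREADY:
		return "init was already called";
	case EFAULT:
		return "tailq configuration name not found";
	case EINVAL:
		return "invalid argc/argv parameters";
	case EIO:
		return "logging handlers could not be set up";
	case ENODEV:
		return "memory setup issue";
	case ENOTSUP:
		return "EAL cannot initialize on this system";
#ifdef EUNATCH
	case EUNATCH:
		return "PCI bus not present or not readable";
#endif
	default:
		return "unknown failure";
	}
}
```

A caller would pass the saved error value to this helper after the init call
returns -1.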

* Re: [PATCH v2 00/25] linux/eal: Remove most causes of panic on init
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
                   ` (24 preceding siblings ...)
  2017-02-08 18:51 ` [PATCH v2 25/25] rte_eal_init: add info about rte_errno codes Aaron Conole
@ 2017-02-08 19:11 ` Aaron Conole
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-08 19:11 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Aaron Conole <aconole@redhat.com> writes:

> In many cases, it's enough to simply let the application know that the
> call to initialize DPDK has failed.  A complete halt can then be
> decided by the application based on error returned (and the app could
> even attempt a possible re-attempt after some corrective action by the
> user or application).
>
> Changes ->v2:
> - Audited all "RTE_LOG (" calls that were introduced, and converted
>   to "RTE_LOG("
> - Added some fprintf(stderr, "") lines to indicate errors before logging
>   is initialized
> - Removed assignments to errno.
> - Changed patch 14/25 to reflect EFAULT, and document in 25/25
>
> I kept the rte_errno reflection, since this is control-path code and the
> init function returns a sentinel value of -1.
>

I got 3 new checkpatch warnings that seem to have missed my local commit
hook.  I'll fix them up and send a new series in-reply to this one.

Sorry for the noise.

^ permalink raw reply	[flat|nested] 159+ messages in thread

* [PATCH v3 00/25] linux/eal: Remove most causes of panic on init
  2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
                   ` (25 preceding siblings ...)
  2017-02-08 19:11 ` [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
@ 2017-02-09 14:29 ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 01/25] eal: CPU init will no longer panic Aaron Conole
                     ` (26 more replies)
  26 siblings, 27 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

In many cases, it's enough to simply let the application know that the
call to initialize DPDK has failed.  A complete halt can then be
decided by the application based on error returned (and the app could
even attempt a possible re-attempt after some corrective action by the
user or application).

Changes ->v2:
- Audited all "RTE_LOG (" calls that were introduced, and converted
  to "RTE_LOG("
- Added some fprintf(stderr, "") lines to indicate errors before logging
  is initialized
- Removed assignments to errno.
- Changed patch 14/25 to reflect EFAULT, and document in 25/25

Changes ->v3:
- Checkpatch issues in patches 3 (spelling mistake), 9 (issue with leading
  spaces), and 19 (braces around single line statement if-condition)

I kept the rte_errno reflection, since this is control-path code and the
init function returns a sentinel value of -1.

Aaron Conole (25):
  eal: CPU init will no longer panic
  eal: return error instead of panic for cpu init
  eal: No panic on hugepages info init
  eal: do not panic on failed hugepage query
  eal: failure to parse args returns error
  eal-common: introduce a way to query cpu support
  eal: Signal error when CPU isn't supported
  eal: do not panic on memzone initialization fails
  eal: set errno when exiting for already called
  eal: Do not panic on log failures
  eal: Do not panic on pci-probe
  eal: do not panic on vfio failure
  eal: do not panic on memory init
  eal: do not panic on tailq init
  eal: do not panic on alarm init
  eal: convert timer_init not to call panic
  eal: change the private pipe call to reflect errno
  eal: Do not panic on interrupt thread init
  eal: do not error if plugins fail to init
  eal_pci: Continue probing even on failures
  eal: do not panic on failed PCI probe
  eal_common_dev: continue initializing vdevs
  eal: do not panic (or abort) if vdev init fails
  eal: do not panic when bus probe fails
  rte_eal_init: add info about rte_errno codes

 lib/librte_eal/common/eal_common_cpuflags.c        |  13 ++-
 lib/librte_eal/common/eal_common_dev.c             |   5 +-
 lib/librte_eal/common/eal_common_lcore.c           |   7 +-
 lib/librte_eal/common/eal_common_pci.c             |  15 ++-
 lib/librte_eal/common/eal_common_tailqs.c          |   3 +-
 .../common/include/generic/rte_cpuflags.h          |   9 ++
 lib/librte_eal/common/include/rte_eal.h            |  27 ++++-
 lib/librte_eal/linuxapp/eal/eal.c                  | 122 +++++++++++++++------
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c    |   6 +-
 lib/librte_eal/linuxapp/eal/eal_interrupts.c       |   5 +-
 10 files changed, 161 insertions(+), 51 deletions(-)

-- 
2.9.3

^ permalink raw reply	[flat|nested] 159+ messages in thread

* [PATCH v3 01/25] eal: CPU init will no longer panic
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 02/25] eal: return error instead of panic for cpu init Aaron Conole
                     ` (25 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

After this change, the EAL CPU NUMA node resolution step can no longer
emit an rte_panic.  This aligns with the code in rte_eal_init, which
expects failures to return an error code.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/common/eal_common_lcore.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_lcore.c b/lib/librte_eal/common/eal_common_lcore.c
index 2cd4132..84fa0cb 100644
--- a/lib/librte_eal/common/eal_common_lcore.c
+++ b/lib/librte_eal/common/eal_common_lcore.c
@@ -83,16 +83,17 @@ rte_eal_cpu_init(void)
 		config->lcore_role[lcore_id] = ROLE_RTE;
 		lcore_config[lcore_id].core_id = eal_cpu_core_id(lcore_id);
 		lcore_config[lcore_id].socket_id = eal_cpu_socket_id(lcore_id);
-		if (lcore_config[lcore_id].socket_id >= RTE_MAX_NUMA_NODES)
+		if (lcore_config[lcore_id].socket_id >= RTE_MAX_NUMA_NODES) {
 #ifdef RTE_EAL_ALLOW_INV_SOCKET_ID
 			lcore_config[lcore_id].socket_id = 0;
 #else
-			rte_panic("Socket ID (%u) is greater than "
+			RTE_LOG(ERR, EAL, "Socket ID (%u) is greater than "
 				"RTE_MAX_NUMA_NODES (%d)\n",
 				lcore_config[lcore_id].socket_id,
 				RTE_MAX_NUMA_NODES);
+			return -1;
 #endif
-
+		}
 		RTE_LOG(DEBUG, EAL, "Detected lcore %u as "
 				"core %u on socket %u\n",
 				lcore_id, lcore_config[lcore_id].core_id,
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v3 02/25] eal: return error instead of panic for cpu init
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 01/25] eal: CPU init will no longer panic Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 03/25] eal: No panic on hugepages info init Aaron Conole
                     ` (24 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

There may be no way to gracefully recover, but the application
should be notified that a failure happened, rather than completely
aborting.  This allows the user to proceed with a "slow-path" type
solution.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index bf6b818..d8e00f5 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -767,8 +767,11 @@ rte_eal_init(int argc, char **argv)
 	/* set log level as early as possible */
 	rte_set_log_level(internal_config.log_level);
 
-	if (rte_eal_cpu_init() < 0)
-		rte_panic("Cannot detect lcores\n");
+	if (rte_eal_cpu_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot detect lcores\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	fctret = eal_parse_args(argc, argv);
 	if (fctret < 0)
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v3 03/25] eal: No panic on hugepages info init
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 01/25] eal: CPU init will no longer panic Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 02/25] eal: return error instead of panic for cpu init Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 04/25] eal: do not panic on failed hugepage query Aaron Conole
                     ` (23 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

When attempting to scan hugepages, signal to the eal.c that an error has
occurred, rather than performing a panic.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index 18858e2..4d47eaf 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -283,9 +283,11 @@ eal_hugepage_info_init(void)
 	struct dirent *dirent;
 
 	dir = opendir(sys_dir_path);
-	if (dir == NULL)
-		rte_panic("Cannot open directory %s to read system hugepage "
+	if (dir == NULL) {
+		RTE_LOG(ERR, EAL, "Cannot open directory %s to read system hugepage "
 			  "info\n", sys_dir_path);
+		return -1;
+	}
 
 	for (dirent = readdir(dir); dirent != NULL; dirent = readdir(dir)) {
 		struct hugepage_info *hpi;
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v3 04/25] eal: do not panic on failed hugepage query
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
                     ` (2 preceding siblings ...)
  2017-02-09 14:29   ` [PATCH v3 03/25] eal: No panic on hugepages info init Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 05/25] eal: failure to parse args returns error Aaron Conole
                     ` (22 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

If we fail to acquire hugepage information, simply signal an error to
the application.  This clears the run_once counter, allowing the user or
application to take a corrective action and retry.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index d8e00f5..8daa2be 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -780,8 +780,12 @@ rte_eal_init(int argc, char **argv)
 	if (internal_config.no_hugetlbfs == 0 &&
 			internal_config.process_type != RTE_PROC_SECONDARY &&
 			internal_config.xen_dom0_support == 0 &&
-			eal_hugepage_info_init() < 0)
-		rte_panic("Cannot get hugepage information\n");
+			eal_hugepage_info_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot get hugepage information\n");
+		rte_errno = EACCES;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 
 	if (internal_config.memory == 0 && internal_config.force_sockets == 0) {
 		if (internal_config.no_hugetlbfs)
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread
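
The effect of clearing run_once on a recoverable failure can be sketched with
C11 atomics.  The names below (demo_errno, demo_run_once,
demo_guarded_init()) are hypothetical, and atomic_flag is used here as an
assumption in place of the rte_atomic32 calls seen in the patch.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for rte_errno (not the DPDK variable). */
int demo_errno;

/* Guard that allows only one successful initialization. */
static atomic_flag demo_run_once = ATOMIC_FLAG_INIT;

/*
 * Guarded init.  atomic_flag_test_and_set() returns the previous value,
 * so "true" means an earlier call already holds the guard.  On a
 * recoverable failure the guard is cleared so the caller may retry.
 */
int demo_guarded_init(int args_ok)
{
	if (atomic_flag_test_and_set(&demo_run_once)) {
		demo_errno = EALREADY;
		return -1;
	}

	if (!args_ok) {
		demo_errno = EINVAL;
		atomic_flag_clear(&demo_run_once);
		return -1;
	}

	return 0;
}
```

Without the clear, a failed first attempt would make every later attempt
fail with "already called", even after the user fixed the underlying
problem.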

* [PATCH v3 05/25] eal: failure to parse args returns error
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
                     ` (3 preceding siblings ...)
  2017-02-09 14:29   ` [PATCH v3 04/25] eal: do not panic on failed hugepage query Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 06/25] eal-common: introduce a way to query cpu support Aaron Conole
                     ` (21 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

It's possible that the application could take a corrective action here,
and either prompt the user for different arguments, or at least perform
better logging.  Exiting this early prevents any useful information
gathering from the application layer.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 8daa2be..a626774 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -774,8 +774,12 @@ rte_eal_init(int argc, char **argv)
 	}
 
 	fctret = eal_parse_args(argc, argv);
-	if (fctret < 0)
-		exit(1);
+	if (fctret < 0) {
+		RTE_LOG(ERR, EAL, "Invalid 'command line' arguments\n");
+		rte_errno = EINVAL;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 
 	if (internal_config.no_hugetlbfs == 0 &&
 			internal_config.process_type != RTE_PROC_SECONDARY &&
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v3 06/25] eal-common: introduce a way to query cpu support
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
                     ` (4 preceding siblings ...)
  2017-02-09 14:29   ` [PATCH v3 05/25] eal: failure to parse args returns error Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 07/25] eal: Signal error when CPU isn't supported Aaron Conole
                     ` (20 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

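This adds a new API, rte_cpu_is_supported(), which reports whether the
running CPU supports the features selected at compile time.  Unlike
rte_cpu_check_supported(), it returns a result instead of exiting.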

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/common/eal_common_cpuflags.c          | 13 +++++++++++--
 lib/librte_eal/common/include/generic/rte_cpuflags.h |  9 +++++++++
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_cpuflags.c b/lib/librte_eal/common/eal_common_cpuflags.c
index b5f76f7..2c2127b 100644
--- a/lib/librte_eal/common/eal_common_cpuflags.c
+++ b/lib/librte_eal/common/eal_common_cpuflags.c
@@ -43,6 +43,13 @@
 void
 rte_cpu_check_supported(void)
 {
+	if (!rte_cpu_is_supported())
+		exit(1);
+}
+
+bool
+rte_cpu_is_supported(void)
+{
 	/* This is generated at compile-time by the build system */
 	static const enum rte_cpu_flag_t compile_time_flags[] = {
 			RTE_COMPILE_TIME_CPUFLAGS
@@ -57,14 +64,16 @@ rte_cpu_check_supported(void)
 			fprintf(stderr,
 				"ERROR: CPU feature flag lookup failed with error %d\n",
 				ret);
-			exit(1);
+			return false;
 		}
 		if (!ret) {
 			fprintf(stderr,
 			        "ERROR: This system does not support \"%s\".\n"
 			        "Please check that RTE_MACHINE is set correctly.\n",
 			        rte_cpu_get_flag_name(compile_time_flags[i]));
-			exit(1);
+			return false;
 		}
 	}
+
+	return true;
 }
diff --git a/lib/librte_eal/common/include/generic/rte_cpuflags.h b/lib/librte_eal/common/include/generic/rte_cpuflags.h
index 71321f3..e4342ad 100644
--- a/lib/librte_eal/common/include/generic/rte_cpuflags.h
+++ b/lib/librte_eal/common/include/generic/rte_cpuflags.h
@@ -40,6 +40,7 @@
  */
 
 #include <errno.h>
+#include <stdbool.h>
 
 /**
  * Enumeration of all CPU features supported
@@ -82,4 +83,12 @@ rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature);
 void
 rte_cpu_check_supported(void);
 
+/**
+ * This function checks that the currently used CPU supports the CPU features
+ * that were specified at compile time. It is called automatically within the
+ * EAL, so does not need to be used by applications.  This version returns a
+ * result so that decisions may be made (for instance, graceful shutdowns).
+ */
+bool
+rte_cpu_is_supported(void);
 #endif /* _RTE_CPUFLAGS_H_ */
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread
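
The split in the patch above, a query function that returns a result plus a
thin wrapper that keeps the old exit-on-failure behaviour, can be sketched as
follows.  The flag bits and the demo_* names are hypothetical; real code
would query the CPU rather than take the available flags as a parameter.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical feature bits used only for this illustration. */
enum {
	DEMO_FLAG_SSE = 1 << 0,
	DEMO_FLAG_AVX = 1 << 1
};

/* Query form: report whether every required bit is available. */
bool demo_cpu_is_supported(unsigned int required, unsigned int available)
{
	unsigned int missing = required & ~available;

	if (missing != 0) {
		fprintf(stderr, "ERROR: missing CPU feature bits 0x%x\n",
			missing);
		return false;
	}

	return true;
}

/* Legacy form: same check, but terminate the process on failure. */
void demo_cpu_check_supported(unsigned int required, unsigned int available)
{
	if (!demo_cpu_is_supported(required, available))
		exit(1);
}
```

Existing callers of the exiting form keep their behaviour, while new callers
can use the query form and choose their own response.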

* [PATCH v3 07/25] eal: Signal error when CPU isn't supported
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
                     ` (5 preceding siblings ...)
  2017-02-09 14:29   ` [PATCH v3 06/25] eal-common: introduce a way to query cpu support Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 08/25] eal: do not panic on memzone initialization fails Aaron Conole
                     ` (19 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

It's now possible to exit the application gracefully.  For applications
that support non-DPDK datapaths working in concert with DPDK datapaths,
an unsupported CPU no longer forces the whole process to exit.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index a626774..88d59a2 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -752,7 +752,11 @@ rte_eal_init(int argc, char **argv)
 	char thread_name[RTE_MAX_THREAD_NAME_LEN];
 
 	/* checks if the machine is adequate */
-	rte_cpu_check_supported();
+	if (!rte_cpu_is_supported()) {
+		fprintf(stderr, "EAL: FATAL - unsupported cpu type.\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	if (!rte_atomic32_test_and_set(&run_once))
 		return -1;
-- 
2.9.3
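
To illustrate the commit message above, here is a minimal sketch of how an
application with a second datapath might react to the new return value.
It is not DPDK code: sketch_eal_init(), sketch_errno and the
sketch_datapath enum are hypothetical stand-ins for rte_eal_init(),
rte_errno and whatever the application really uses.  Only the
"-1 plus ENOTSUP" convention is taken from the patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical datapath choices for an application; not part of DPDK. */
enum sketch_datapath {
	SKETCH_DATAPATH_DPDK,
	SKETCH_DATAPATH_FALLBACK,
	SKETCH_DATAPATH_NONE
};

/* Stand-in for rte_errno (assumption: a plain int is enough here). */
static int sketch_errno;

/* Stand-in for rte_eal_init(): fails with ENOTSUP on an unsupported CPU. */
int sketch_eal_init(bool cpu_supported)
{
	if (!cpu_supported) {
		sketch_errno = ENOTSUP;
		return -1;
	}
	return 0;
}

/* Pick a datapath instead of letting the library terminate the process. */
enum sketch_datapath sketch_choose_datapath(bool cpu_supported)
{
	if (sketch_eal_init(cpu_supported) == 0)
		return SKETCH_DATAPATH_DPDK;

	/* Unsupported CPU: the non-DPDK datapath can still run. */
	if (sketch_errno == ENOTSUP)
		return SKETCH_DATAPATH_FALLBACK;

	return SKETCH_DATAPATH_NONE;
}
```

The point of the design is that the decision to halt moves from the
library to the caller, which knows whether a fallback exists.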

* [PATCH v3 08/25] eal: do not panic on memzone initialization fails
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
                     ` (6 preceding siblings ...)
  2017-02-09 14:29   ` [PATCH v3 07/25] eal: Signal error when CPU isn't supported Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 09/25] eal: set errno when exiting for already called Aaron Conole
                     ` (18 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

When memzone initialization fails, report the error to the calling
application rather than panic().  Without a good way of detaching /
releasing hugepages, at this point the application will have to restart.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 88d59a2..8f9bce1 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -832,8 +832,11 @@ rte_eal_init(int argc, char **argv)
 	/* the directories are locked during eal_hugepage_info_init */
 	eal_hugedirs_unlock();
 
-	if (rte_eal_memzone_init() < 0)
-		rte_panic("Cannot init memzone\n");
+	if (rte_eal_memzone_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init memzone\n");
+		rte_errno = ENODEV;
+		return -1;
+	}
 
 	if (rte_eal_tailqs_init() < 0)
 		rte_panic("Cannot init tail queues for objects\n");
-- 
2.9.3

* [PATCH v3 09/25] eal: set errno when exiting for already called
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
                     ` (7 preceding siblings ...)
  2017-02-09 14:29   ` [PATCH v3 08/25] eal: do not panic on memzone initialization fails Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 10/25] eal: Do not panic on log failures Aaron Conole
                     ` (17 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 8f9bce1..e0dff6e 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -758,8 +758,11 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (!rte_atomic32_test_and_set(&run_once))
+	if (!rte_atomic32_test_and_set(&run_once)) {
+		fprintf(stderr, "EAL: ERROR - already called initialization.\n");
+		rte_errno = EALREADY;
 		return -1;
+	}
 
 	logid = strrchr(argv[0], '/');
 	logid = strdup(logid ? logid + 1: argv[0]);
-- 
2.9.3

* [PATCH v3 10/25] eal: Do not panic on log failures
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
                     ` (8 preceding siblings ...)
  2017-02-09 14:29   ` [PATCH v3 09/25] eal: set errno when exiting for already called Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 11/25] eal: Do not panic on pci-probe Aaron Conole
                     ` (16 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

When log initialization fails, it's generally because the fopencookie()
call failed.  While this is rare in practice, it could happen, and it is
likely because of memory pressure.  So, flag the error, and allow the
user to retry.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index e0dff6e..9bb00d5 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -818,8 +818,13 @@ rte_eal_init(int argc, char **argv)
 
 	rte_config_init();
 
-	if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0)
-		rte_panic("Cannot init logs\n");
+	if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init logging\n");
+		fprintf(stderr, "EAL: ERROR - cannot init logging.\n");
+		rte_errno = EIO;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 
 	if (rte_eal_pci_init() < 0)
 		rte_panic("Cannot init PCI\n");
-- 
2.9.3
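
The "allow the user to retry" part of the change above depends on
clearing the run-once guard before returning.  A minimal sketch of that
pattern, assuming C11 atomics in place of rte_atomic32_t, follows.
sketch_init(), sketch_errno and the log_setup_ok parameter are
hypothetical names used only for illustration.  Note that C11's
atomic_flag_test_and_set() returns the previous value, so the test is
inverted relative to the rte_atomic32_test_and_set() call in the diff.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Guard that lets initialization run at most once at a time. */
static atomic_flag sketch_run_once = ATOMIC_FLAG_INIT;

/* Stand-in for rte_errno (assumption: a plain int is enough here). */
static int sketch_errno;

/*
 * Hypothetical init routine.  log_setup_ok simulates whether the
 * logging setup step succeeded.
 */
int sketch_init(bool log_setup_ok)
{
	/* Returns true if the flag was already set by an earlier call. */
	if (atomic_flag_test_and_set(&sketch_run_once)) {
		sketch_errno = EALREADY;
		return -1;
	}

	if (!log_setup_ok) {
		sketch_errno = EIO;
		/* Recoverable failure: release the guard so a retry works. */
		atomic_flag_clear(&sketch_run_once);
		return -1;
	}

	return 0;
}
```

Failures that are not reversible would skip the clear, so a later call
reports EALREADY rather than re-running a half-finished setup.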

* [PATCH v3 11/25] eal: Do not panic on pci-probe
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
                     ` (9 preceding siblings ...)
  2017-02-09 14:29   ` [PATCH v3 10/25] eal: Do not panic on log failures Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 12/25] eal: do not panic on vfio failure Aaron Conole
                     ` (15 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

This will usually be an issue because of permissions.  However, it could
also be caused by OOM.  In either case, errno will contain the
underlying cause.  It is safe to re-init the system here, so allow the
application to take corrective action and reinit.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 9bb00d5..2a3d2f6 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -826,8 +826,12 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_eal_pci_init() < 0)
-		rte_panic("Cannot init PCI\n");
+	if (rte_eal_pci_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init PCI\n");
+		rte_errno = EUNATCH;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 
 #ifdef VFIO_PRESENT
 	if (rte_eal_vfio_setup() < 0)
-- 
2.9.3

* [PATCH v3 12/25] eal: do not panic on vfio failure
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
                     ` (10 preceding siblings ...)
  2017-02-09 14:29   ` [PATCH v3 11/25] eal: Do not panic on pci-probe Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 13/25] eal: do not panic on memory init Aaron Conole
                     ` (14 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 2a3d2f6..b21d715 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -834,8 +834,12 @@ rte_eal_init(int argc, char **argv)
 	}
 
 #ifdef VFIO_PRESENT
-	if (rte_eal_vfio_setup() < 0)
-		rte_panic("Cannot init VFIO\n");
+	if (rte_eal_vfio_setup() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init VFIO\n");
+		rte_errno = EAGAIN;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 #endif
 
 	if (rte_eal_memory_init() < 0)
-- 
2.9.3

* [PATCH v3 13/25] eal: do not panic on memory init
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
                     ` (11 preceding siblings ...)
  2017-02-09 14:29   ` [PATCH v3 12/25] eal: do not panic on vfio failure Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 14/25] eal: do not panic on tailq init Aaron Conole
                     ` (13 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

This can only happen when access to hugepages (either as primary or
secondary process) fails (and that is usually permissions).  Since the
manner of failure is not reversible, we cannot allow retry.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index b21d715..97ee409 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -842,8 +842,11 @@ rte_eal_init(int argc, char **argv)
 	}
 #endif
 
-	if (rte_eal_memory_init() < 0)
-		rte_panic("Cannot init memory\n");
+	if (rte_eal_memory_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init memory\n");
+		rte_errno = EACCES;
+		return -1;
+	}
 
 	/* the directories are locked during eal_hugepage_info_init */
 	eal_hugedirs_unlock();
-- 
2.9.3

* [PATCH v3 14/25] eal: do not panic on tailq init
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
                     ` (12 preceding siblings ...)
  2017-02-09 14:29   ` [PATCH v3 13/25] eal: do not panic on memory init Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 15/25] eal: do not panic on alarm init Aaron Conole
                     ` (12 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

There are some theoretical race conditions in the system that _could_
cause early tailq init to fail; however, there is no need to panic the
application.  While it can't continue using DPDK, it can give the user
a more useful alert.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/common/eal_common_tailqs.c | 3 +--
 lib/librte_eal/linuxapp/eal/eal.c         | 7 +++++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_tailqs.c b/lib/librte_eal/common/eal_common_tailqs.c
index bb08ec8..4f69828 100644
--- a/lib/librte_eal/common/eal_common_tailqs.c
+++ b/lib/librte_eal/common/eal_common_tailqs.c
@@ -188,8 +188,7 @@ rte_eal_tailqs_init(void)
 		if (t->head == NULL) {
 			RTE_LOG(ERR, EAL,
 				"Cannot initialize tailq: %s\n", t->name);
-			/* no need to TAILQ_REMOVE, we are going to panic in
-			 * rte_eal_init() */
+			/* TAILQ_REMOVE not needed, error is already fatal */
 			goto fail;
 		}
 	}
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 97ee409..9e2daca 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -857,8 +857,11 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_eal_tailqs_init() < 0)
-		rte_panic("Cannot init tail queues for objects\n");
+	if (rte_eal_tailqs_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init tail queues for objects\n");
+		rte_errno = EFAULT;
+		return -1;
+	}
 
 	if (rte_eal_alarm_init() < 0)
 		rte_panic("Cannot init interrupt-handling thread\n");
-- 
2.9.3

* [PATCH v3 15/25] eal: do not panic on alarm init
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
                     ` (13 preceding siblings ...)
  2017-02-09 14:29   ` [PATCH v3 14/25] eal: do not panic on tailq init Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 16/25] eal: convert timer_init not to call panic Aaron Conole
                     ` (11 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

The rte_eal_alarm_init() call uses the Linux timerfd framework to create
a poll()-able timer using standard POSIX file operations.  This could
fail for a few reasons given in the man pages, but many could be
corrected by the user application.  No need to panic.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 9e2daca..5dcd2d5 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -61,6 +61,7 @@
 #include <rte_launch.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
+#include <rte_errno.h>
 #include <rte_per_lcore.h>
 #include <rte_lcore.h>
 #include <rte_log.h>
@@ -863,8 +864,11 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_eal_alarm_init() < 0)
-		rte_panic("Cannot init interrupt-handling thread\n");
+	if (rte_eal_alarm_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init interrupt-handling thread\n");
+		/* rte_eal_alarm_init sets rte_errno on failure. */
+		return -1;
+	}
 
 	if (rte_eal_timer_init() < 0)
 		rte_panic("Cannot init HPET or TSC timers\n");
-- 
2.9.3

* [PATCH v3 16/25] eal: convert timer_init not to call panic
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
                     ` (14 preceding siblings ...)
  2017-02-09 14:29   ` [PATCH v3 15/25] eal: do not panic on alarm init Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 17/25] eal: change the private pipe call to reflect errno Aaron Conole
                     ` (10 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

After code inspection, there is no way for rte_eal_timer_init() to fail.
It simply returns 0 in all cases.  As such, this test could either go
away or stay here as 'future-proofing'.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 5dcd2d5..805f284 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -870,8 +870,11 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_eal_timer_init() < 0)
-		rte_panic("Cannot init HPET or TSC timers\n");
+	if (rte_eal_timer_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init HPET or TSC timers\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	eal_check_mem_on_local_socket();
 
-- 
2.9.3

* [PATCH v3 17/25] eal: change the private pipe call to reflect errno
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
                     ` (15 preceding siblings ...)
  2017-02-09 14:29   ` [PATCH v3 16/25] eal: convert timer_init not to call panic Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 18/25] eal: Do not panic on interrupt thread init Aaron Conole
                     ` (9 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

There could be some confusion as to why the call failed - this change
will always reflect the value of the error in rte_errno.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal_interrupts.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index b5b3f2b..b1a287c 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -896,13 +896,16 @@ rte_eal_intr_init(void)
 	 * create a pipe which will be waited by epoll and notified to
 	 * rebuild the wait list of epoll.
 	 */
-	if (pipe(intr_pipe.pipefd) < 0)
+	if (pipe(intr_pipe.pipefd) < 0) {
+		rte_errno = errno;
 		return -1;
+	}
 
 	/* create the host thread to wait/handle the interrupt */
 	ret = pthread_create(&intr_thread, NULL,
 			eal_intr_thread_main, NULL);
 	if (ret != 0) {
+		rte_errno = ret;
 		RTE_LOG(ERR, EAL,
 			"Failed to create thread for interrupt handling\n");
 	} else {
-- 
2.9.3

* [PATCH v3 18/25] eal: Do not panic on interrupt thread init
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
                     ` (16 preceding siblings ...)
  2017-02-09 14:29   ` [PATCH v3 17/25] eal: change the private pipe call to reflect errno Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 19/25] eal: do not error if plugins fail to init Aaron Conole
                     ` (8 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

When initializing the interrupt thread, there are a number of possible
reasons for failure - some of which are correctable by the application.
Do not panic() needlessly, and give the application a chance to reflect
this information to the user.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 805f284..93b83aa 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -889,8 +889,10 @@ rte_eal_init(int argc, char **argv)
 		rte_config.master_lcore, (int)thread_id, cpuset,
 		ret == 0 ? "" : "...");
 
-	if (rte_eal_intr_init() < 0)
-		rte_panic("Cannot init interrupt-handling thread\n");
+	if (rte_eal_intr_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init interrupt-handling thread\n");
+		return -1;
+	}
 
 	if (rte_bus_scan())
 		rte_panic("Cannot scan the buses for devices\n");
-- 
2.9.3

* [PATCH v3 19/25] eal: do not error if plugins fail to init
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
                     ` (17 preceding siblings ...)
  2017-02-09 14:29   ` [PATCH v3 18/25] eal: Do not panic on interrupt thread init Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 20/25] eal_pci: Continue probing even on failures Aaron Conole
                     ` (7 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Plugins are useful and important.  However, it seems crazy to abort
everything just because they don't initialize properly.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 93b83aa..31a5986 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -879,7 +879,7 @@ rte_eal_init(int argc, char **argv)
 	eal_check_mem_on_local_socket();
 
 	if (eal_plugins_init() < 0)
-		rte_panic("Cannot init plugins\n");
+		RTE_LOG(ERR, EAL, "Cannot init plugins\n");
 
 	eal_thread_init_master(rte_config.master_lcore);
 
-- 
2.9.3

* [PATCH v3 20/25] eal_pci: Continue probing even on failures
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
                     ` (18 preceding siblings ...)
  2017-02-09 14:29   ` [PATCH v3 19/25] eal: do not error if plugins fail to init Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 21/25] eal: do not panic on failed PCI probe Aaron Conole
                     ` (6 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Some devices may be inaccessible for a variety of reasons, or the
PCI-bus may be unavailable causing the whole thing to fail.  Still,
better to continue attempts at probes.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/common/eal_common_pci.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index 72547bd..9416190 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -69,6 +69,7 @@
 #include <sys/queue.h>
 #include <sys/mman.h>
 
+#include <rte_errno.h>
 #include <rte_interrupts.h>
 #include <rte_log.h>
 #include <rte_pci.h>
@@ -416,6 +417,7 @@ rte_eal_pci_probe(void)
 	struct rte_pci_device *dev = NULL;
 	struct rte_devargs *devargs;
 	int probe_all = 0;
+	int ret_1 = 0;
 	int ret = 0;
 
 	if (rte_eal_devargs_type_count(RTE_DEVTYPE_WHITELISTED_PCI) == 0)
@@ -430,17 +432,20 @@ rte_eal_pci_probe(void)
 
 		/* probe all or only whitelisted devices */
 		if (probe_all)
-			ret = pci_probe_all_drivers(dev);
+			ret_1 = pci_probe_all_drivers(dev);
 		else if (devargs != NULL &&
 			devargs->type == RTE_DEVTYPE_WHITELISTED_PCI)
-			ret = pci_probe_all_drivers(dev);
-		if (ret < 0)
-			rte_exit(EXIT_FAILURE, "Requested device " PCI_PRI_FMT
+			ret_1 = pci_probe_all_drivers(dev);
+		if (ret_1 < 0) {
+			RTE_LOG(ERR, EAL, "Requested device " PCI_PRI_FMT
 				 " cannot be used\n", dev->addr.domain, dev->addr.bus,
 				 dev->addr.devid, dev->addr.function);
+			rte_errno = errno;
+			ret = 1;
+		}
 	}
 
-	return 0;
+	return -ret;
 }
 
 /* dump one device */
-- 
2.9.3
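
The "keep probing, report at the end" idea in this patch can be sketched
independently of the PCI code.  Everything below is hypothetical
(sketch_probe_all(), the probe_fn callback, integer device ids); it only
shows the control flow of remembering a failure while still visiting
every device.

```c
#include <stddef.h>

/* Hypothetical per-device probe: returns 0 on success, < 0 on failure. */
typedef int (*probe_fn)(int dev_id);

/*
 * Probe every device, even after a failure.  Returns 0 if all probes
 * succeeded and -1 if at least one failed.  The number of failures is
 * stored in *failed when that pointer is non-NULL.
 */
int sketch_probe_all(const int *dev_ids, size_t n, probe_fn probe,
		     size_t *failed)
{
	size_t nfail = 0;
	int ret = 0;

	for (size_t i = 0; i < n; i++) {
		/* Per-device result is scoped to this iteration only. */
		int dev_ret = probe(dev_ids[i]);

		if (dev_ret < 0) {
			nfail++;
			ret = -1;	/* remember, but keep going */
		}
	}

	if (failed != NULL)
		*failed = nfail;

	return ret;
}

/* Example probe used for illustration: odd ids fail. */
int sketch_probe_even_only(int dev_id)
{
	return (dev_id % 2 == 0) ? 0 : -1;
}
```

Keeping the per-device result inside the loop body means a failure on
one device cannot be carried over to a later device that is skipped.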

* [PATCH v3 21/25] eal: do not panic on failed PCI probe
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
                     ` (19 preceding siblings ...)
  2017-02-09 14:29   ` [PATCH v3 20/25] eal_pci: Continue probing even on failures Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 22/25] eal_common_dev: continue initializing vdevs Aaron Conole
                     ` (5 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

It may even be possible to simply log the error and continue on, letting
the user check the logs and restart the application when things have
failed.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 31a5986..3458d41 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -938,8 +938,11 @@ rte_eal_init(int argc, char **argv)
 		rte_panic("Cannot probe devices\n");
 
 	/* Probe & Initialize PCI devices */
-	if (rte_eal_pci_probe())
-		rte_panic("Cannot probe PCI\n");
+	if (rte_eal_pci_probe()) {
+		RTE_LOG(ERR, EAL, "Cannot probe PCI\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	if (rte_eal_dev_init() < 0)
 		rte_panic("Cannot init pmd devices\n");
-- 
2.9.3

* [PATCH v3 22/25] eal_common_dev: continue initializing vdevs
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
                     ` (20 preceding siblings ...)
  2017-02-09 14:29   ` [PATCH v3 21/25] eal: do not panic on failed PCI probe Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 23/25] eal: do not panic (or abort) if vdev init fails Aaron Conole
                     ` (4 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Even if one vdev should fail, there's no need to prevent further
processing.  Log the error, and reflect it to the higher levels to
decide.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/common/eal_common_dev.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
index 4f3b493..9889997 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -80,6 +80,7 @@ int
 rte_eal_dev_init(void)
 {
 	struct rte_devargs *devargs;
+	int ret = 0;
 
 	/*
 	 * Note that the dev_driver_list is populated here
@@ -97,11 +98,11 @@ rte_eal_dev_init(void)
 					devargs->args)) {
 			RTE_LOG(ERR, EAL, "failed to initialize %s device\n",
 					devargs->virt.drv_name);
-			return -1;
+			ret = -1;
 		}
 	}
 
-	return 0;
+	return ret;
 }
 
 int rte_eal_dev_attach(const char *name, const char *devargs)
-- 
2.9.3

* [PATCH v3 23/25] eal: do not panic (or abort) if vdev init fails
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
                     ` (21 preceding siblings ...)
  2017-02-09 14:29   ` [PATCH v3 22/25] eal_common_dev: continue initializing vdevs Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 24/25] eal: do not panic when bus probe fails Aaron Conole
                     ` (3 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Seems like it's possible to continue.  At least, the error is reflected
properly in the logs.  A user could then go and correct or investigate
the situation.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 3458d41..be03b63 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -945,7 +945,7 @@ rte_eal_init(int argc, char **argv)
 	}
 
 	if (rte_eal_dev_init() < 0)
-		rte_panic("Cannot init pmd devices\n");
+		RTE_LOG(ERR, EAL, "Cannot init pmd devices\n");
 
 	rte_eal_mcfg_complete();
 
-- 
2.9.3

* [PATCH v3 24/25] eal: do not panic when bus probe fails
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
                     ` (22 preceding siblings ...)
  2017-02-09 14:29   ` [PATCH v3 23/25] eal: do not panic (or abort) if vdev init fails Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 14:29   ` [PATCH v3 25/25] rte_eal_init: add info about rte_errno codes Aaron Conole
                     ` (2 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index be03b63..57c41e0 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -934,8 +934,11 @@ rte_eal_init(int argc, char **argv)
 	rte_eal_mp_wait_lcore();
 
 	/* Probe all the buses and devices/drivers on them */
-	if (rte_bus_probe())
-		rte_panic("Cannot probe devices\n");
+	if (rte_bus_probe()) {
+		RTE_LOG(ERR, EAL, "Cannot probe devices\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	/* Probe & Initialize PCI devices */
 	if (rte_eal_pci_probe()) {
-- 
2.9.3

* [PATCH v3 25/25] rte_eal_init: add info about rte_errno codes
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
                     ` (23 preceding siblings ...)
  2017-02-09 14:29   ` [PATCH v3 24/25] eal: do not panic when bus probe fails Aaron Conole
@ 2017-02-09 14:29   ` Aaron Conole
  2017-02-09 22:37     ` Stephen Hemminger
  2017-02-09 22:38   ` [PATCH v3 00/25] linux/eal: Remove most causes of panic on init Stephen Hemminger
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
  26 siblings, 1 reply; 159+ messages in thread
From: Aaron Conole @ 2017-02-09 14:29 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

The rte_eal_init function will now pass failure reason hints to the
application.  To help app developers decipher this, add some brief
information about what the codes are indicating.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/common/include/rte_eal.h | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index 03fee50..5f184c9 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -159,7 +159,32 @@ int rte_eal_iopl_init(void);
  *     function call and should not be further interpreted by the
  *     application.  The EAL does not take any ownership of the memory used
  *     for either the argv array, or its members.
- *   - On failure, a negative error value.
+ *   - On failure, -1 and rte_errno is set to a value indicating the cause
+ *     for failure.
+ *
+ *   The error codes returned via rte_errno:
+ *     EACCES indicates a permissions issue.
+ *
+ *     EAGAIN indicates either a bus or system resource was not available,
+ *            try again.
+ *
+ *     EALREADY indicates that the rte_eal_init function has already been
+ *              called, and cannot be called again.
+ *
+ *     EFAULT indicates the tailq configuration name was not found in
+ *            memory configuration.
+ *
+ *     EINVAL indicates invalid parameters were passed as argv/argc.
+ *
+ *     EIO indicates failure to setup the logging handlers.  This is usually
+ *         caused by an out-of-memory condition.
+ *
+ *     ENODEV indicates memory setup issues.
+ *
+ *     ENOTSUP indicates that the EAL cannot initialize on this system.
+ *
+ *     EUNATCH indicates that the PCI bus is either not present, or is not
+ *             readable by the eal.
  */
 int rte_eal_init(int argc, char **argv);
 
-- 
2.9.3
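
For illustration only, here is a minimal sketch of how an application
might turn the documented codes into operator hints. The helper name
eal_init_hint() is hypothetical and not part of DPDK; the strings
paraphrase the header comment in this patch, and the default branch
assumes that more codes may be returned in future.

```c
#include <errno.h>

/* Hypothetical helper (not a DPDK API): map the rte_errno values listed in
 * the rte_eal_init() comment to a short hint for the operator. */
static const char *
eal_init_hint(int err)
{
	switch (err) {
	case EACCES:   return "permissions issue";
	case EAGAIN:   return "bus or system resource unavailable, try again";
	case EALREADY: return "rte_eal_init was already called";
	case EFAULT:   return "tailq configuration name not found";
	case EINVAL:   return "invalid argc/argv parameters";
	case EIO:      return "logging handler setup failed";
	case ENODEV:   return "memory setup issue";
	case ENOTSUP:  return "EAL cannot initialize on this system";
	case EUNATCH:  return "PCI bus not present or not readable";
	default:       return "unknown failure";	/* codes added later */
	}
}
```

An application would presumably call this with rte_errno after
rte_eal_init() returns -1, then decide whether to retry or shut down.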

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* Re: [PATCH v3 25/25] rte_eal_init: add info about rte_errno codes
  2017-02-09 14:29   ` [PATCH v3 25/25] rte_eal_init: add info about rte_errno codes Aaron Conole
@ 2017-02-09 22:37     ` Stephen Hemminger
  2017-02-14 21:31       ` Aaron Conole
  0 siblings, 1 reply; 159+ messages in thread
From: Stephen Hemminger @ 2017-02-09 22:37 UTC (permalink / raw)
  To: Aaron Conole; +Cc: dev, Bruce Richardson

On Thu,  9 Feb 2017 09:29:53 -0500
Aaron Conole <aconole@redhat.com> wrote:

> + *   The error codes returned via rte_errno:
> + *     EACCES indicates a permissions issue.
> + *
> + *     EAGAIN indicates either a bus or system resource was not available,
> + *            try again.
> + *
> + *     EALREADY indicates that the rte_eal_init function has already been
> + *              called, and cannot be called again.
> + *
> + *     EFAULT indicates the tailq configuration name was not found in
> + *            memory configuration.
> + *
> + *     EINVAL indicates invalid parameters were passed as argv/argc.
> + *
> + *     EIO indicates failure to setup the logging handlers.  This is usually
> + *         caused by an out-of-memory condition.
> + *
> + *     ENODEV indicates memory setup issues.
> + *
> + *     ENOTSUP indicates that the EAL cannot initialize on this system.
> + *
> + *     EUNATCH indicates that the PCI bus is either not present, or is not
> + *             readable by the eal.
>   */

You might want to be less restrictive about wording in the comment.
In future more errors might be returned, and also for out of memory
ENOMEM is better.

^ permalink raw reply	[flat|nested] 159+ messages in thread

* Re: [PATCH v3 00/25] linux/eal: Remove most causes of panic on init
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
                     ` (24 preceding siblings ...)
  2017-02-09 14:29   ` [PATCH v3 25/25] rte_eal_init: add info about rte_errno codes Aaron Conole
@ 2017-02-09 22:38   ` Stephen Hemminger
  2017-02-14 20:50     ` Aaron Conole
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
  26 siblings, 1 reply; 159+ messages in thread
From: Stephen Hemminger @ 2017-02-09 22:38 UTC (permalink / raw)
  To: Aaron Conole; +Cc: dev, Bruce Richardson

On Thu,  9 Feb 2017 09:29:28 -0500
Aaron Conole <aconole@redhat.com> wrote:

> In many cases, it's enough to simply let the application know that the
> call to initialize DPDK has failed.  A complete halt can then be
> decided by the application based on error returned (and the app could
> even attempt a possible re-attempt after some corrective action by the
> user or application).
> 
> Changes ->v2:
> - Audited all "RTE_LOG (" calls that were introduced, and converted
>   to "RTE_LOG("
> - Added some fprintf(stderr, "") lines to indicate errors before logging
>   is initialized
> - Removed assignments to errno.
> - Changed patch 14/25 to reflect EFAULT, and document in 25/25
> 
> Changes ->v3:
> - Checkpatch issues in patches 3 (spelling mistake), 9 (issue with leading
>   spaces), and 19 (braces around single line statement if-condition)
> 
> I kept the rte_errno reflection, since this is control-path code and the
> init function returns a sentinel value of -1.
> 
> Aaron Conole (25):
>   eal: CPU init will no longer panic
>   eal: return error instead of panic for cpu init
>   eal: No panic on hugepages info init
>   eal: do not panic on failed hugepage query
>   eal: failure to parse args returns error
>   eal-common: introduce a way to query cpu support
>   eal: Signal error when CPU isn't supported
>   eal: do not panic on memzone initialization fails
>   eal: set errno when exiting for already called
>   eal: Do not panic on log failures
>   eal: Do not panic on pci-probe
>   eal: do not panic on vfio failure
>   eal: do not panic on memory init
>   eal: do not panic on tailq init
>   eal: do not panic on alarm init
>   eal: convert timer_init not to call panic
>   eal: change the private pipe call to reflect errno
>   eal: Do not panic on interrupt thread init
>   eal: do not error if plugins fail to init
>   eal_pci: Continue probing even on failures
>   eal: do not panic on failed PCI probe
>   eal_common_dev: continue initializing vdevs
>   eal: do not panic (or abort) if vdev init fails
>   eal: do not panic when bus probe fails
>   rte_eal_init: add info about rte_errno codes
> 
>  lib/librte_eal/common/eal_common_cpuflags.c        |  13 ++-
>  lib/librte_eal/common/eal_common_dev.c             |   5 +-
>  lib/librte_eal/common/eal_common_lcore.c           |   7 +-
>  lib/librte_eal/common/eal_common_pci.c             |  15 ++-
>  lib/librte_eal/common/eal_common_tailqs.c          |   3 +-
>  .../common/include/generic/rte_cpuflags.h          |   9 ++
>  lib/librte_eal/common/include/rte_eal.h            |  27 ++++-
>  lib/librte_eal/linuxapp/eal/eal.c                  | 122 +++++++++++++++------
>  lib/librte_eal/linuxapp/eal/eal_hugepage_info.c    |   6 +-
>  lib/librte_eal/linuxapp/eal/eal_interrupts.c       |   5 +-
>  10 files changed, 161 insertions(+), 51 deletions(-)
> 

I worry that some of these early failure messages may never be visible
because the logging system has not been initialized. Might be safer to
just use fprintf(stderr, ...) or define a new wrapper function.

^ permalink raw reply	[flat|nested] 159+ messages in thread

* Re: [PATCH v3 00/25] linux/eal: Remove most causes of panic on init
  2017-02-09 22:38   ` [PATCH v3 00/25] linux/eal: Remove most causes of panic on init Stephen Hemminger
@ 2017-02-14 20:50     ` Aaron Conole
  0 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-14 20:50 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Bruce Richardson

Stephen Hemminger <stephen@networkplumber.org> writes:

> On Thu,  9 Feb 2017 09:29:28 -0500
> Aaron Conole <aconole@redhat.com> wrote:
>
>> In many cases, it's enough to simply let the application know that the
>> call to initialize DPDK has failed.  A complete halt can then be
>> decided by the application based on error returned (and the app could
>> even attempt a possible re-attempt after some corrective action by the
>> user or application).
>> 
>> ...
>> 
>
> I worry that some of these early failure messages may never be visible
> because the logging system has not been initialized. Might be safer to
> just use fprintf(stderr, ...) or define a new wrapper function.

Thanks for the suggestion, Stephen!  I've folded it into my series.

-Aaron

^ permalink raw reply	[flat|nested] 159+ messages in thread

* Re: [PATCH v3 25/25] rte_eal_init: add info about rte_errno codes
  2017-02-09 22:37     ` Stephen Hemminger
@ 2017-02-14 21:31       ` Aaron Conole
  0 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-14 21:31 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Bruce Richardson

Stephen Hemminger <stephen@networkplumber.org> writes:

> On Thu,  9 Feb 2017 09:29:53 -0500
> Aaron Conole <aconole@redhat.com> wrote:
>
>> + *   The error codes returned via rte_errno:
>> + *     EACCES indicates a permissions issue.
>> + *
>> + *     EAGAIN indicates either a bus or system resource was not available,
>> + *            try again.
>> + *
>> + *     EALREADY indicates that the rte_eal_init function has already been
>> + *              called, and cannot be called again.
>> + *
>> + *     EFAULT indicates the tailq configuration name was not found in
>> + *            memory configuration.
>> + *
>> + *     EINVAL indicates invalid parameters were passed as argv/argc.
>> + *
>> + *     EIO indicates failure to setup the logging handlers.  This is usually
>> + *         caused by an out-of-memory condition.
>> + *
>> + *     ENODEV indicates memory setup issues.
>> + *
>> + *     ENOTSUP indicates that the EAL cannot initialize on this system.
>> + *
>> + *     EUNATCH indicates that the PCI bus is either not present, or is not
>> + *             readable by the eal.
>>   */
>
> You might want to be less restrictive about wording in the comment.
> In future more errors might be returned, and also for out of memory
> ENOMEM is better.

Sure thing, I'll switch EIO and ENODEV to ENOMEM, does that make sense?

Also, which message do you refer to?  Is it "The error codes returned
via rte_errno" section?  I assume that adding new error codes will also
bring an update to the eal_init documentation, but perhaps I'm
misunderstanding.

Thanks for your review, Stephen!

^ permalink raw reply	[flat|nested] 159+ messages in thread

* [PATCH v4 00/26] linux/eal: Remove most causes of panic on init
  2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
                     ` (25 preceding siblings ...)
  2017-02-09 22:38   ` [PATCH v3 00/25] linux/eal: Remove most causes of panic on init Stephen Hemminger
@ 2017-02-25 16:02   ` Aaron Conole
  2017-02-25 16:02     ` [PATCH v4 01/26] eal: CPU init will no longer panic Aaron Conole
                       ` (27 more replies)
  26 siblings, 28 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:02 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

In many cases, it's enough to simply let the application know that the
call to initialize DPDK has failed.  A complete halt can then be
decided by the application based on error returned (and the app could
even attempt a possible re-attempt after some corrective action by the
user or application).

Changes ->v2:
- Audited all "RTE_LOG (" calls that were introduced, and converted
  to "RTE_LOG("
- Added some fprintf(stderr, "") lines to indicate errors before logging
  is initialized
- Removed assignments to errno.
- Changed patch 14/25 to reflect EFAULT, and document in 25/25

Changes ->v3:
- Checkpatch issues in patches 3 (spelling mistake), 9 (issue with leading
  spaces), and 19 (braces around single line statement if-condition)

Changes ->v4:
- Error text cleanup.
- Add a new check around rte_bus_scan(), added during the development of
  this series.

I kept the rte_errno reflection, since this is control-path code and the
init function returns a sentinel value of -1.

Aaron Conole (26):
  eal: CPU init will no longer panic
  eal: return error instead of panic for cpu init
  eal: No panic on hugepages info init
  eal: do not panic on failed hugepage query
  eal: failure to parse args returns error
  eal-common: introduce a way to query cpu support
  eal: Signal error when CPU isn't supported
  eal: do not panic on memzone initialization fails
  eal: set errno when exiting for already called
  eal: Do not panic on log failures
  eal: Do not panic on pci-probe
  eal: do not panic on vfio failure
  eal: do not panic on memory init
  eal: do not panic on tailq init
  eal: do not panic on alarm init
  eal: convert timer_init not to call panic
  eal: change the private pipe call to reflect errno
  eal: Do not panic on interrupt thread init
  eal: do not error if plugins fail to init
  eal_pci: Continue probing even on failures
  eal: do not panic on failed PCI probe
  eal_common_dev: continue initializing vdevs
  eal: do not panic (or abort) if vdev init fails
  eal: do not panic when bus probe fails
  eal: do not panic on failed bus scan
  rte_eal_init: add info about rte_errno codes

 lib/librte_eal/common/eal_common_cpuflags.c        |  13 +-
 lib/librte_eal/common/eal_common_dev.c             |   5 +-
 lib/librte_eal/common/eal_common_lcore.c           |   7 +-
 lib/librte_eal/common/eal_common_pci.c             |  15 ++-
 lib/librte_eal/common/eal_common_tailqs.c          |   3 +-
 .../common/include/generic/rte_cpuflags.h          |   9 ++
 lib/librte_eal/common/include/rte_eal.h            |  27 ++++-
 lib/librte_eal/linuxapp/eal/eal.c                  | 131 +++++++++++++++------
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c    |   6 +-
 lib/librte_eal/linuxapp/eal/eal_interrupts.c       |   5 +-
 10 files changed, 169 insertions(+), 52 deletions(-)

-- 
2.9.3

^ permalink raw reply	[flat|nested] 159+ messages in thread

* [PATCH v4 01/26] eal: CPU init will no longer panic
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
@ 2017-02-25 16:02     ` Aaron Conole
  2017-02-25 16:02     ` [PATCH v4 02/26] eal: return error instead of panic for cpu init Aaron Conole
                       ` (26 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:02 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

After this change, the EAL CPU NUMA node resolution step can no longer
emit an rte_panic.  This aligns with the code in rte_eal_init, which
expects failures to return an error code.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/common/eal_common_lcore.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_lcore.c b/lib/librte_eal/common/eal_common_lcore.c
index 2cd4132..84fa0cb 100644
--- a/lib/librte_eal/common/eal_common_lcore.c
+++ b/lib/librte_eal/common/eal_common_lcore.c
@@ -83,16 +83,17 @@ rte_eal_cpu_init(void)
 		config->lcore_role[lcore_id] = ROLE_RTE;
 		lcore_config[lcore_id].core_id = eal_cpu_core_id(lcore_id);
 		lcore_config[lcore_id].socket_id = eal_cpu_socket_id(lcore_id);
-		if (lcore_config[lcore_id].socket_id >= RTE_MAX_NUMA_NODES)
+		if (lcore_config[lcore_id].socket_id >= RTE_MAX_NUMA_NODES) {
 #ifdef RTE_EAL_ALLOW_INV_SOCKET_ID
 			lcore_config[lcore_id].socket_id = 0;
 #else
-			rte_panic("Socket ID (%u) is greater than "
+			RTE_LOG(ERR, EAL, "Socket ID (%u) is greater than "
 				"RTE_MAX_NUMA_NODES (%d)\n",
 				lcore_config[lcore_id].socket_id,
 				RTE_MAX_NUMA_NODES);
+			return -1;
 #endif
-
+		}
 		RTE_LOG(DEBUG, EAL, "Detected lcore %u as "
 				"core %u on socket %u\n",
 				lcore_id, lcore_config[lcore_id].core_id,
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 02/26] eal: return error instead of panic for cpu init
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
  2017-02-25 16:02     ` [PATCH v4 01/26] eal: CPU init will no longer panic Aaron Conole
@ 2017-02-25 16:02     ` Aaron Conole
  2017-02-27 12:58       ` Bruce Richardson
  2017-02-27 13:00       ` Bruce Richardson
  2017-02-25 16:02     ` [PATCH v4 03/26] eal: No panic on hugepages info init Aaron Conole
                       ` (25 subsequent siblings)
  27 siblings, 2 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:02 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

There may be no way to gracefully recover, but the application
should be notified that a failure happened, rather than completely
aborting.  This allows the user to proceed with a "slow-path" type
solution.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index bf6b818..5023d0d 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -740,6 +740,12 @@ static int rte_eal_vfio_setup(void)
 }
 #endif
 
+static void rte_eal_init_alert(const char *msg)
+{
+    fprintf(stderr, "EAL: FATAL: %s\n", msg);
+    RTE_LOG(ERR, EAL, "%s\n", msg);
+}
+
 /* Launch threads, called at application init(). */
 int
 rte_eal_init(int argc, char **argv)
@@ -767,8 +773,11 @@ rte_eal_init(int argc, char **argv)
 	/* set log level as early as possible */
 	rte_set_log_level(internal_config.log_level);
 
-	if (rte_eal_cpu_init() < 0)
-		rte_panic("Cannot detect lcores\n");
+	if (rte_eal_cpu_init() < 0) {
+		rte_eal_init_alert("Cannot detect lcores.");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	fctret = eal_parse_args(argc, argv);
 	if (fctret < 0)
-- 
2.9.3
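
The alert helper above writes to stderr as well as to the log, so the
message stays visible even if logging is not yet initialized. A
standalone sketch of that pattern follows. init_alert() and its stream
parameters are hypothetical stand-ins, and the NULL check on the log
stream is an assumption; the patch itself calls RTE_LOG unconditionally.

```c
#include <stdio.h>

/* Hypothetical stand-in for rte_eal_init_alert(): write the message to an
 * "early" stream (stderr in the patch) and, when one is available, to a
 * log stream.  Returns the number of streams successfully written. */
static int
init_alert(FILE *early, FILE *log, const char *msg)
{
	int written = 0;

	if (early != NULL && fprintf(early, "EAL: FATAL: %s\n", msg) > 0)
		written++;
	if (log != NULL && fprintf(log, "%s\n", msg) > 0)
		written++;
	return written;
}
```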

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 03/26] eal: No panic on hugepages info init
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
  2017-02-25 16:02     ` [PATCH v4 01/26] eal: CPU init will no longer panic Aaron Conole
  2017-02-25 16:02     ` [PATCH v4 02/26] eal: return error instead of panic for cpu init Aaron Conole
@ 2017-02-25 16:02     ` Aaron Conole
  2017-02-25 16:02     ` [PATCH v4 04/26] eal: do not panic on failed hugepage query Aaron Conole
                       ` (24 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:02 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

When attempting to scan hugepages, signal to the eal.c that an error has
occurred, rather than performing a panic.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index 18858e2..4d47eaf 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -283,9 +283,11 @@ eal_hugepage_info_init(void)
 	struct dirent *dirent;
 
 	dir = opendir(sys_dir_path);
-	if (dir == NULL)
-		rte_panic("Cannot open directory %s to read system hugepage "
+	if (dir == NULL) {
+		RTE_LOG(ERR, EAL, "Cannot open directory %s to read system hugepage "
 			  "info\n", sys_dir_path);
+		return -1;
+	}
 
 	for (dirent = readdir(dir); dirent != NULL; dirent = readdir(dir)) {
 		struct hugepage_info *hpi;
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 04/26] eal: do not panic on failed hugepage query
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (2 preceding siblings ...)
  2017-02-25 16:02     ` [PATCH v4 03/26] eal: No panic on hugepages info init Aaron Conole
@ 2017-02-25 16:02     ` Aaron Conole
  2017-02-25 16:02     ` [PATCH v4 05/26] eal: failure to parse args returns error Aaron Conole
                       ` (23 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:02 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

If we fail to acquire hugepage information, simply signal an error to
the application.  This clears the run_once counter, allowing the user or
application to take a corrective action and retry.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 5023d0d..c76ed94 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -786,8 +786,12 @@ rte_eal_init(int argc, char **argv)
 	if (internal_config.no_hugetlbfs == 0 &&
 			internal_config.process_type != RTE_PROC_SECONDARY &&
 			internal_config.xen_dom0_support == 0 &&
-			eal_hugepage_info_init() < 0)
-		rte_panic("Cannot get hugepage information\n");
+			eal_hugepage_info_init() < 0) {
+		rte_eal_init_alert("Cannot get hugepage information.");
+		rte_errno = EACCES;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 
 	if (internal_config.memory == 0 && internal_config.force_sockets == 0) {
 		if (internal_config.no_hugetlbfs)
-- 
2.9.3
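
The clear-and-retry idea can be sketched outside of DPDK with a C11
atomic_flag. Everything here is illustrative: sketch_init() and its
step_ok parameter are hypothetical, and the return values -1 and -2 are
chosen only to tell the two failure cases apart. Note that
atomic_flag_test_and_set() returns the previous value, whereas the
patch's rte_atomic32_test_and_set() returns non-zero on success.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag run_once = ATOMIC_FLAG_INIT;

/* Hypothetical init routine; 'step_ok' stands in for a recoverable step
 * such as the hugepage query. */
static int
sketch_init(bool step_ok)
{
	/* A previous value of true means init already ran or is running. */
	if (atomic_flag_test_and_set(&run_once))
		return -2;

	if (!step_ok) {
		/* Recoverable failure: clear the guard so a retry is allowed. */
		atomic_flag_clear(&run_once);
		return -1;
	}
	return 0;
}
```

A caller can take corrective action after a -1 and call sketch_init()
again; once a call succeeds, further calls are rejected.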

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 05/26] eal: failure to parse args returns error
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (3 preceding siblings ...)
  2017-02-25 16:02     ` [PATCH v4 04/26] eal: do not panic on failed hugepage query Aaron Conole
@ 2017-02-25 16:02     ` Aaron Conole
  2017-02-25 16:02     ` [PATCH v4 06/26] eal-common: introduce a way to query cpu support Aaron Conole
                       ` (22 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:02 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

It's possible that the application could take a corrective action here,
and either prompt the user for different arguments, or at least perform
better logging.  Exiting this early prevents any useful information
gathering from the application layer.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index c76ed94..0f682e1 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -780,8 +780,12 @@ rte_eal_init(int argc, char **argv)
 	}
 
 	fctret = eal_parse_args(argc, argv);
-	if (fctret < 0)
-		exit(1);
+	if (fctret < 0) {
+		rte_eal_init_alert("Invalid 'command line' arguments.");
+		rte_errno = EINVAL;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 
 	if (internal_config.no_hugetlbfs == 0 &&
 			internal_config.process_type != RTE_PROC_SECONDARY &&
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 06/26] eal-common: introduce a way to query cpu support
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (4 preceding siblings ...)
  2017-02-25 16:02     ` [PATCH v4 05/26] eal: failure to parse args returns error Aaron Conole
@ 2017-02-25 16:02     ` Aaron Conole
  2017-02-27 13:48       ` Bruce Richardson
  2017-02-25 16:02     ` [PATCH v4 07/26] eal: Signal error when CPU isn't supported Aaron Conole
                       ` (21 subsequent siblings)
  27 siblings, 1 reply; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:02 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

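This adds a new API, rte_cpu_is_supported(), which checks whether the
running CPU supports the features selected at compile time and returns
the result instead of exiting.  rte_cpu_check_supported() becomes a
wrapper around it that still exits on failure.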

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/common/eal_common_cpuflags.c          | 13 +++++++++++--
 lib/librte_eal/common/include/generic/rte_cpuflags.h |  9 +++++++++
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_cpuflags.c b/lib/librte_eal/common/eal_common_cpuflags.c
index b5f76f7..2c2127b 100644
--- a/lib/librte_eal/common/eal_common_cpuflags.c
+++ b/lib/librte_eal/common/eal_common_cpuflags.c
@@ -43,6 +43,13 @@
 void
 rte_cpu_check_supported(void)
 {
+	if (!rte_cpu_is_supported())
+		exit(1);
+}
+
+bool
+rte_cpu_is_supported(void)
+{
 	/* This is generated at compile-time by the build system */
 	static const enum rte_cpu_flag_t compile_time_flags[] = {
 			RTE_COMPILE_TIME_CPUFLAGS
@@ -57,14 +64,16 @@ rte_cpu_check_supported(void)
 			fprintf(stderr,
 				"ERROR: CPU feature flag lookup failed with error %d\n",
 				ret);
-			exit(1);
+			return false;
 		}
 		if (!ret) {
 			fprintf(stderr,
 			        "ERROR: This system does not support \"%s\".\n"
 			        "Please check that RTE_MACHINE is set correctly.\n",
 			        rte_cpu_get_flag_name(compile_time_flags[i]));
-			exit(1);
+			return false;
 		}
 	}
+
+	return true;
 }
diff --git a/lib/librte_eal/common/include/generic/rte_cpuflags.h b/lib/librte_eal/common/include/generic/rte_cpuflags.h
index 71321f3..e4342ad 100644
--- a/lib/librte_eal/common/include/generic/rte_cpuflags.h
+++ b/lib/librte_eal/common/include/generic/rte_cpuflags.h
@@ -40,6 +40,7 @@
  */
 
 #include <errno.h>
+#include <stdbool.h>
 
 /**
  * Enumeration of all CPU features supported
@@ -82,4 +83,12 @@ rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature);
 void
 rte_cpu_check_supported(void);
 
+/**
+ * This function checks that the currently used CPU supports the CPU features
+ * that were specified at compile time. It is called automatically within the
+ * EAL, so does not need to be used by applications.  This version returns a
+ * result so that decisions may be made (for instance, graceful shutdowns).
+ */
+bool
+rte_cpu_is_supported(void);
 #endif /* _RTE_CPUFLAGS_H_ */
-- 
2.9.3
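
The split between a query that returns a result and a wrapper that
exits can be sketched independently of the EAL. The bit-mask
representation and both function names below are hypothetical and used
only for illustration; DPDK's real check iterates over
RTE_COMPILE_TIME_CPUFLAGS as shown in the diff above.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical query: true when every required feature bit is available. */
static bool
features_supported(unsigned int required, unsigned int available)
{
	unsigned int missing = required & ~available;

	if (missing != 0) {
		fprintf(stderr, "ERROR: missing feature bits 0x%x\n", missing);
		return false;
	}
	return true;
}

/* Legacy-style wrapper: keeps the old exit-on-failure behaviour for
 * callers that still want it. */
static void
features_check_or_exit(unsigned int required, unsigned int available)
{
	if (!features_supported(required, available))
		exit(1);
}
```

Callers that need a graceful shutdown use the boolean query; existing
callers of the wrapper see no change in behaviour.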

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 07/26] eal: Signal error when CPU isn't supported
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (5 preceding siblings ...)
  2017-02-25 16:02     ` [PATCH v4 06/26] eal-common: introduce a way to query cpu support Aaron Conole
@ 2017-02-25 16:02     ` Aaron Conole
  2017-02-25 16:02     ` [PATCH v4 08/26] eal: do not panic on memzone initialization fails Aaron Conole
                       ` (20 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:02 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

It's now possible to gracefully exit the application when the CPU is
unsupported.  Applications which support non-DPDK datapaths working in
concert with DPDK datapaths are no longer forced to exit on unsupported
CPUs.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 0f682e1..0adaaa2 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -758,7 +758,11 @@ rte_eal_init(int argc, char **argv)
 	char thread_name[RTE_MAX_THREAD_NAME_LEN];
 
 	/* checks if the machine is adequate */
-	rte_cpu_check_supported();
+	if (!rte_cpu_is_supported()) {
+		rte_eal_init_alert("unsupported cpu type.");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	if (!rte_atomic32_test_and_set(&run_once))
 		return -1;
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 08/26] eal: do not panic on memzone initialization fails
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (6 preceding siblings ...)
  2017-02-25 16:02     ` [PATCH v4 07/26] eal: Signal error when CPU isn't supported Aaron Conole
@ 2017-02-25 16:02     ` Aaron Conole
  2017-02-25 16:02     ` [PATCH v4 09/26] eal: set errno when exiting for already called Aaron Conole
                       ` (19 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:02 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

When memzone initialization fails, report the error to the calling
application rather than panic().  Without a good way of detaching /
releasing hugepages, at this point the application will have to restart.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 0adaaa2..0005ebc 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -838,8 +838,11 @@ rte_eal_init(int argc, char **argv)
 	/* the directories are locked during eal_hugepage_info_init */
 	eal_hugedirs_unlock();
 
-	if (rte_eal_memzone_init() < 0)
-		rte_panic("Cannot init memzone\n");
+	if (rte_eal_memzone_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init memzone\n");
+		rte_errno = ENODEV;
+		return -1;
+	}
 
 	if (rte_eal_tailqs_init() < 0)
 		rte_panic("Cannot init tail queues for objects\n");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 09/26] eal: set errno when exiting for already called
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (7 preceding siblings ...)
  2017-02-25 16:02     ` [PATCH v4 08/26] eal: do not panic on memzone initialization fails Aaron Conole
@ 2017-02-25 16:02     ` Aaron Conole
  2017-02-25 16:02     ` [PATCH v4 10/26] eal: Do not panic on log failures Aaron Conole
                       ` (18 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:02 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 0005ebc..beb786e 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -764,8 +764,11 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (!rte_atomic32_test_and_set(&run_once))
+	if (!rte_atomic32_test_and_set(&run_once)) {
+		rte_eal_init_alert("already called initialization.");
+		rte_errno = EALREADY;
 		return -1;
+	}
 
 	logid = strrchr(argv[0], '/');
 	logid = strdup(logid ? logid + 1: argv[0]);
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 10/26] eal: Do not panic on log failures
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (8 preceding siblings ...)
  2017-02-25 16:02     ` [PATCH v4 09/26] eal: set errno when exiting for already called Aaron Conole
@ 2017-02-25 16:02     ` Aaron Conole
  2017-02-25 16:02     ` [PATCH v4 11/26] eal: Do not panic on pci-probe Aaron Conole
                       ` (17 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:02 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

When log initialization fails, it's generally because the fopencookie()
call failed.  While this is rare in practice, it could happen, and it is
likely because of memory pressure.  So, flag the error, and allow the
user to retry.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index beb786e..25f8ae8 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -824,8 +824,12 @@ rte_eal_init(int argc, char **argv)
 
 	rte_config_init();
 
-	if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0)
-		rte_panic("Cannot init logs\n");
+	if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0) {
+		rte_eal_init_alert("Cannot init logging.");
+		rte_errno = ENOMEM;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 
 	if (rte_eal_pci_init() < 0)
 		rte_panic("Cannot init PCI\n");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 11/26] eal: Do not panic on pci-probe
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (9 preceding siblings ...)
  2017-02-25 16:02     ` [PATCH v4 10/26] eal: Do not panic on log failures Aaron Conole
@ 2017-02-25 16:02     ` Aaron Conole
  2017-02-25 16:02     ` [PATCH v4 12/26] eal: do not panic on vfio failure Aaron Conole
                       ` (16 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:02 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

This will usually be an issue because of permissions.  However, it could
also be caused by OOM.  In either case, errno will contain the
underlying cause.  It is safe to re-init the system here, so allow the
application to take corrective action and reinit.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 25f8ae8..5534b4b 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -831,8 +831,12 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_eal_pci_init() < 0)
-		rte_panic("Cannot init PCI\n");
+	if (rte_eal_pci_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init PCI\n");
+		rte_errno = EUNATCH;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 
 #ifdef VFIO_PRESENT
 	if (rte_eal_vfio_setup() < 0)
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 12/26] eal: do not panic on vfio failure
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (10 preceding siblings ...)
  2017-02-25 16:02     ` [PATCH v4 11/26] eal: Do not panic on pci-probe Aaron Conole
@ 2017-02-25 16:02     ` Aaron Conole
  2017-02-25 16:02     ` [PATCH v4 13/26] eal: do not panic on memory init Aaron Conole
                       ` (15 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:02 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 5534b4b..0e7e8c8 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -839,8 +839,12 @@ rte_eal_init(int argc, char **argv)
 	}
 
 #ifdef VFIO_PRESENT
-	if (rte_eal_vfio_setup() < 0)
-		rte_panic("Cannot init VFIO\n");
+	if (rte_eal_vfio_setup() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init VFIO\n");
+		rte_errno = EAGAIN;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 #endif
 
 	if (rte_eal_memory_init() < 0)
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 13/26] eal: do not panic on memory init
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (11 preceding siblings ...)
  2017-02-25 16:02     ` [PATCH v4 12/26] eal: do not panic on vfio failure Aaron Conole
@ 2017-02-25 16:02     ` Aaron Conole
  2017-02-25 16:02     ` [PATCH v4 14/26] eal: do not panic on tailq init Aaron Conole
                       ` (14 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:02 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

This can only happen when access to hugepages (either as primary or
secondary process) fails (and that is usually permissions).  Since the
manner of failure is not reversible, we cannot allow retry.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 0e7e8c8..30d6a7d 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -847,8 +847,11 @@ rte_eal_init(int argc, char **argv)
 	}
 #endif
 
-	if (rte_eal_memory_init() < 0)
-		rte_panic("Cannot init memory\n");
+	if (rte_eal_memory_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init memory\n");
+		rte_errno = ENOMEM;
+		return -1;
+	}
 
 	/* the directories are locked during eal_hugepage_info_init */
 	eal_hugedirs_unlock();
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 14/26] eal: do not panic on tailq init
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (12 preceding siblings ...)
  2017-02-25 16:02     ` [PATCH v4 13/26] eal: do not panic on memory init Aaron Conole
@ 2017-02-25 16:02     ` Aaron Conole
  2017-02-25 16:02     ` [PATCH v4 15/26] eal: do not panic on alarm init Aaron Conole
                       ` (13 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:02 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

There are some theoretical race conditions in the system that _could_
cause early tailq init to fail; however, there is no need to panic the
application.  While it can't continue using DPDK, it could give better
alerts to the user.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/common/eal_common_tailqs.c | 3 +--
 lib/librte_eal/linuxapp/eal/eal.c         | 7 +++++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_tailqs.c b/lib/librte_eal/common/eal_common_tailqs.c
index bb08ec8..4f69828 100644
--- a/lib/librte_eal/common/eal_common_tailqs.c
+++ b/lib/librte_eal/common/eal_common_tailqs.c
@@ -188,8 +188,7 @@ rte_eal_tailqs_init(void)
 		if (t->head == NULL) {
 			RTE_LOG(ERR, EAL,
 				"Cannot initialize tailq: %s\n", t->name);
-			/* no need to TAILQ_REMOVE, we are going to panic in
-			 * rte_eal_init() */
+			/* TAILQ_REMOVE not needed, error is already fatal */
 			goto fail;
 		}
 	}
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 30d6a7d..008e804 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -862,8 +862,11 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_eal_tailqs_init() < 0)
-		rte_panic("Cannot init tail queues for objects\n");
+	if (rte_eal_tailqs_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init tail queues for objects\n");
+		rte_errno = EFAULT;
+		return -1;
+	}
 
 	if (rte_eal_alarm_init() < 0)
 		rte_panic("Cannot init interrupt-handling thread\n");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 15/26] eal: do not panic on alarm init
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (13 preceding siblings ...)
  2017-02-25 16:02     ` [PATCH v4 14/26] eal: do not panic on tailq init Aaron Conole
@ 2017-02-25 16:02     ` Aaron Conole
  2017-02-25 16:02     ` [PATCH v4 16/26] eal: convert timer_init not to call panic Aaron Conole
                       ` (12 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:02 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

The rte_eal_alarm_init() call uses the Linux timerfd framework to create a
poll()-able timer using standard POSIX file operations.  This could fail
for a few reasons given in the man pages, but many could be
corrected by the user application.  No need to panic.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 008e804..aa64f69 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -61,6 +61,7 @@
 #include <rte_launch.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
+#include <rte_errno.h>
 #include <rte_per_lcore.h>
 #include <rte_lcore.h>
 #include <rte_log.h>
@@ -868,8 +869,11 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_eal_alarm_init() < 0)
-		rte_panic("Cannot init interrupt-handling thread\n");
+	if (rte_eal_alarm_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init interrupt-handling thread\n");
+		/* rte_eal_alarm_init sets rte_errno on failure. */
+		return -1;
+	}
 
 	if (rte_eal_timer_init() < 0)
 		rte_panic("Cannot init HPET or TSC timers\n");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 16/26] eal: convert timer_init not to call panic
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (14 preceding siblings ...)
  2017-02-25 16:02     ` [PATCH v4 15/26] eal: do not panic on alarm init Aaron Conole
@ 2017-02-25 16:02     ` Aaron Conole
  2017-02-25 16:03     ` [PATCH v4 17/26] eal: change the private pipe call to reflect errno Aaron Conole
                       ` (11 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:02 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

After code inspection, there is no way for rte_eal_timer_init() to fail.  It
simply returns 0 in all cases.  As such, this test could either go away
or stay here as 'future-proofing'.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index aa64f69..bf20245 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -875,8 +875,11 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_eal_timer_init() < 0)
-		rte_panic("Cannot init HPET or TSC timers\n");
+	if (rte_eal_timer_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init HPET or TSC timers\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	eal_check_mem_on_local_socket();
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 17/26] eal: change the private pipe call to reflect errno
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (15 preceding siblings ...)
  2017-02-25 16:02     ` [PATCH v4 16/26] eal: convert timer_init not to call panic Aaron Conole
@ 2017-02-25 16:03     ` Aaron Conole
  2017-02-25 16:03     ` [PATCH v4 18/26] eal: Do not panic on interrupt thread init Aaron Conole
                       ` (10 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:03 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

There could be some confusion as to why the call failed - this change
will always reflect the value of the error in rte_errno.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal_interrupts.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 92a19cb..5bb833e 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -898,13 +898,16 @@ rte_eal_intr_init(void)
 	 * create a pipe which will be waited by epoll and notified to
 	 * rebuild the wait list of epoll.
 	 */
-	if (pipe(intr_pipe.pipefd) < 0)
+	if (pipe(intr_pipe.pipefd) < 0) {
+		rte_errno = errno;
 		return -1;
+	}
 
 	/* create the host thread to wait/handle the interrupt */
 	ret = pthread_create(&intr_thread, NULL,
 			eal_intr_thread_main, NULL);
 	if (ret != 0) {
+		rte_errno = ret;
 		RTE_LOG(ERR, EAL,
 			"Failed to create thread for interrupt handling\n");
 	} else {
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 18/26] eal: Do not panic on interrupt thread init
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (16 preceding siblings ...)
  2017-02-25 16:03     ` [PATCH v4 17/26] eal: change the private pipe call to reflect errno Aaron Conole
@ 2017-02-25 16:03     ` Aaron Conole
  2017-02-25 16:03     ` [PATCH v4 19/26] eal: do not error if plugins fail to init Aaron Conole
                       ` (9 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:03 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

When initializing the interrupt thread, there are a number of possible
reasons for failure - some of which are correctable by the application.
Do not panic() needlessly, and give the application a chance to reflect
this information to the user.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index bf20245..02b075c 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -894,8 +894,10 @@ rte_eal_init(int argc, char **argv)
 		rte_config.master_lcore, (int)thread_id, cpuset,
 		ret == 0 ? "" : "...");
 
-	if (rte_eal_intr_init() < 0)
-		rte_panic("Cannot init interrupt-handling thread\n");
+	if (rte_eal_intr_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init interrupt-handling thread\n");
+		return -1;
+	}
 
 	if (rte_bus_scan())
 		rte_panic("Cannot scan the buses for devices\n");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 19/26] eal: do not error if plugins fail to init
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (17 preceding siblings ...)
  2017-02-25 16:03     ` [PATCH v4 18/26] eal: Do not panic on interrupt thread init Aaron Conole
@ 2017-02-25 16:03     ` Aaron Conole
  2017-02-25 16:03     ` [PATCH v4 20/26] eal_pci: Continue probing even on failures Aaron Conole
                       ` (8 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:03 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Plugins are useful and important.  However, it seems crazy to abort
everything just because they don't initialize properly.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 02b075c..2069b6f 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -884,7 +884,7 @@ rte_eal_init(int argc, char **argv)
 	eal_check_mem_on_local_socket();
 
 	if (eal_plugins_init() < 0)
-		rte_panic("Cannot init plugins\n");
+		RTE_LOG(ERR, EAL, "Cannot init plugins\n");
 
 	eal_thread_init_master(rte_config.master_lcore);
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 20/26] eal_pci: Continue probing even on failures
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (18 preceding siblings ...)
  2017-02-25 16:03     ` [PATCH v4 19/26] eal: do not error if plugins fail to init Aaron Conole
@ 2017-02-25 16:03     ` Aaron Conole
  2017-02-25 16:03     ` [PATCH v4 21/26] eal: do not panic on failed PCI probe Aaron Conole
                       ` (7 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:03 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Some devices may be inaccessible for a variety of reasons, or the
PCI-bus may be unavailable causing the whole thing to fail.  Still,
better to continue attempts at probes.
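
The "keep probing, report at the end" pattern described above can be
sketched as follows.  This is a generic illustration rather than the EAL
code: struct fake_dev, probe_one() and probe_all() are hypothetical
names, and the 'usable' field stands in for a real driver probe result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical device record; 'usable' stands in for a real probe. */
struct fake_dev {
	const char *name;
	int usable;
};

static int probe_one(const struct fake_dev *dev)
{
	return dev->usable ? 0 : -1;
}

/*
 * Probe every device, logging failures but never stopping early.
 * Returns 0 if all probes succeeded, -1 if at least one failed.
 * *n_ok receives the number of devices that probed successfully.
 */
int probe_all(const struct fake_dev *devs, size_t n, size_t *n_ok)
{
	int ret = 0;
	size_t i;

	assert(n_ok != NULL);
	*n_ok = 0;

	for (i = 0; i < n; i++) {
		/* Per-iteration result, so one failure cannot leak into the next. */
		int one = probe_one(&devs[i]);

		if (one < 0) {
			fprintf(stderr, "device %s cannot be used\n",
				devs[i].name);
			ret = -1;
			continue;
		}
		(*n_ok)++;
	}

	return ret;
}
```

Keeping the per-device result separate from the aggregate return value
lets the caller learn that something failed while the remaining devices
still get their chance.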

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/common/eal_common_pci.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index 72547bd..9416190 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -69,6 +69,7 @@
 #include <sys/queue.h>
 #include <sys/mman.h>
 
+#include <rte_errno.h>
 #include <rte_interrupts.h>
 #include <rte_log.h>
 #include <rte_pci.h>
@@ -416,6 +417,7 @@ rte_eal_pci_probe(void)
 	struct rte_pci_device *dev = NULL;
 	struct rte_devargs *devargs;
 	int probe_all = 0;
+	int ret_1 = 0;
 	int ret = 0;
 
 	if (rte_eal_devargs_type_count(RTE_DEVTYPE_WHITELISTED_PCI) == 0)
@@ -430,17 +432,20 @@ rte_eal_pci_probe(void)
 
 		/* probe all or only whitelisted devices */
 		if (probe_all)
-			ret = pci_probe_all_drivers(dev);
+			ret_1 = pci_probe_all_drivers(dev);
 		else if (devargs != NULL &&
 			devargs->type == RTE_DEVTYPE_WHITELISTED_PCI)
-			ret = pci_probe_all_drivers(dev);
-		if (ret < 0)
-			rte_exit(EXIT_FAILURE, "Requested device " PCI_PRI_FMT
+			ret_1 = pci_probe_all_drivers(dev);
+		if (ret_1 < 0) {
+			RTE_LOG(ERR, EAL, "Requested device " PCI_PRI_FMT
 				 " cannot be used\n", dev->addr.domain, dev->addr.bus,
 				 dev->addr.devid, dev->addr.function);
+			rte_errno = errno;
+			ret = 1;
+		}
 	}
 
-	return 0;
+	return -ret;
 }
 
 /* dump one device */
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 21/26] eal: do not panic on failed PCI probe
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (19 preceding siblings ...)
  2017-02-25 16:03     ` [PATCH v4 20/26] eal_pci: Continue probing even on failures Aaron Conole
@ 2017-02-25 16:03     ` Aaron Conole
  2017-02-25 16:03     ` [PATCH v4 22/26] eal_common_dev: continue initializing vdevs Aaron Conole
                       ` (6 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:03 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

It may even be possible to simply log the error and continue, letting
the user check the logs and restart the application when things have failed.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 2069b6f..481b53b 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -943,8 +943,11 @@ rte_eal_init(int argc, char **argv)
 		rte_panic("Cannot probe devices\n");
 
 	/* Probe & Initialize PCI devices */
-	if (rte_eal_pci_probe())
-		rte_panic("Cannot probe PCI\n");
+	if (rte_eal_pci_probe()) {
+		RTE_LOG(ERR, EAL, "Cannot probe PCI\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	if (rte_eal_dev_init() < 0)
 		rte_panic("Cannot init pmd devices\n");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 22/26] eal_common_dev: continue initializing vdevs
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (20 preceding siblings ...)
  2017-02-25 16:03     ` [PATCH v4 21/26] eal: do not panic on failed PCI probe Aaron Conole
@ 2017-02-25 16:03     ` Aaron Conole
  2017-02-25 16:03     ` [PATCH v4 23/26] eal: do not panic (or abort) if vdev init fails Aaron Conole
                       ` (5 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:03 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Even if one vdev should fail, there's no need to prevent further
processing.  Log the error, and reflect it to the higher levels to
decide.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/common/eal_common_dev.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
index 4f3b493..9889997 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -80,6 +80,7 @@ int
 rte_eal_dev_init(void)
 {
 	struct rte_devargs *devargs;
+	int ret = 0;
 
 	/*
 	 * Note that the dev_driver_list is populated here
@@ -97,11 +98,11 @@ rte_eal_dev_init(void)
 					devargs->args)) {
 			RTE_LOG(ERR, EAL, "failed to initialize %s device\n",
 					devargs->virt.drv_name);
-			return -1;
+			ret = -1;
 		}
 	}
 
-	return 0;
+	return ret;
 }
 
 int rte_eal_dev_attach(const char *name, const char *devargs)
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 23/26] eal: do not panic (or abort) if vdev init fails
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (21 preceding siblings ...)
  2017-02-25 16:03     ` [PATCH v4 22/26] eal_common_dev: continue initializing vdevs Aaron Conole
@ 2017-02-25 16:03     ` Aaron Conole
  2017-02-25 16:03     ` [PATCH v4 24/26] eal: do not panic when bus probe fails Aaron Conole
                       ` (4 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:03 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Seems like it's possible to continue.  At least, the error is reflected
properly in the logs.  A user could then go and correct or investigate
the situation.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 481b53b..803cdd2 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -950,7 +950,7 @@ rte_eal_init(int argc, char **argv)
 	}
 
 	if (rte_eal_dev_init() < 0)
-		rte_panic("Cannot init pmd devices\n");
+		RTE_LOG(ERR, EAL, "Cannot init pmd devices\n");
 
 	rte_eal_mcfg_complete();
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 24/26] eal: do not panic when bus probe fails
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (22 preceding siblings ...)
  2017-02-25 16:03     ` [PATCH v4 23/26] eal: do not panic (or abort) if vdev init fails Aaron Conole
@ 2017-02-25 16:03     ` Aaron Conole
  2017-02-25 16:03     ` [PATCH v4 25/26] eal: do not panic on failed bus scan Aaron Conole
                       ` (3 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:03 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 803cdd2..c94ac9b 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -939,8 +939,11 @@ rte_eal_init(int argc, char **argv)
 	rte_eal_mp_wait_lcore();
 
 	/* Probe all the buses and devices/drivers on them */
-	if (rte_bus_probe())
-		rte_panic("Cannot probe devices\n");
+	if (rte_bus_probe()) {
+		RTE_LOG(ERR, EAL, "Cannot probe devices\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	/* Probe & Initialize PCI devices */
 	if (rte_eal_pci_probe()) {
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 25/26] eal: do not panic on failed bus scan
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (23 preceding siblings ...)
  2017-02-25 16:03     ` [PATCH v4 24/26] eal: do not panic when bus probe fails Aaron Conole
@ 2017-02-25 16:03     ` Aaron Conole
  2017-02-25 16:03     ` [PATCH v4 26/26] rte_eal_init: add info about rte_errno codes Aaron Conole
                       ` (2 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:03 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

For now, abort the initialization and return an error.  It's likely that
even aborting the initialization is premature in this case, as it may be
possible to proceed even if one bus or another is not available.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index c94ac9b..41050a0 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -899,8 +899,11 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_bus_scan())
-		rte_panic("Cannot scan the buses for devices\n");
+	if (rte_bus_scan()) {
+		RTE_LOG(ERR, EAL, "Cannot scan the buses for devices\n");
+		rte_errno = ENODEV;
+		return -1;
+	}
 
 	RTE_LCORE_FOREACH_SLAVE(i) {
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v4 26/26] rte_eal_init: add info about rte_errno codes
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (24 preceding siblings ...)
  2017-02-25 16:03     ` [PATCH v4 25/26] eal: do not panic on failed bus scan Aaron Conole
@ 2017-02-25 16:03     ` Aaron Conole
  2017-02-27 13:59     ` [PATCH v4 00/26] linux/eal: Remove most causes of panic on init Bruce Richardson
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-25 16:03 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

The rte_eal_init function will now pass failure reason hints to the
application.  To help app developers decipher this, add some brief
information about what the codes are indicating.
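
One way an application might act on these hints is sketched below.  The
enum and classify_init_error() are hypothetical and not part of the EAL
API, and the grouping of codes into actions is an assumption made for
the example; the documented meaning of each code is the only input.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical application-side reactions to a failed init call. */
enum init_action {
	INIT_RETRY,		/* call the init function again */
	INIT_FIX_INPUT,		/* correct arguments or permissions first */
	INIT_ALREADY_CALLED,	/* init was already called; do not repeat */
	INIT_GIVE_UP		/* restart, or carry on without DPDK */
};

/* Map an rte_errno-style code (a positive errno value) to an action. */
enum init_action classify_init_error(int err)
{
	assert(err > 0);

	switch (err) {
	case EAGAIN:
		return INIT_RETRY;
	case EACCES:
	case EINVAL:
		return INIT_FIX_INPUT;
	case EALREADY:
		return INIT_ALREADY_CALLED;
	default:
		/* ENOMEM, ENODEV, ENOTSUP, EFAULT, EUNATCH, ... */
		return INIT_GIVE_UP;
	}
}
```

A caller would check for a -1 return from the init call, pass the saved
error code to such a helper, and only then decide whether to retry,
prompt the user, or exit.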

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/common/include/rte_eal.h | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index 03fee50..9251244 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -159,7 +159,32 @@ int rte_eal_iopl_init(void);
  *     function call and should not be further interpreted by the
  *     application.  The EAL does not take any ownership of the memory used
  *     for either the argv array, or its members.
- *   - On failure, a negative error value.
+ *   - On failure, -1 and rte_errno is set to a value indicating the cause
+ *     for failure.  In some instances, the application will need to be
+ *     restarted as part of clearing the issue.
+ *
+ *   Error codes returned via rte_errno:
+ *     EACCES indicates a permissions issue.
+ *
+ *     EAGAIN indicates either a bus or system resource was not available,
+ *            setup may be attempted again.
+ *
+ *     EALREADY indicates that the rte_eal_init function has already been
+ *              called, and cannot be called again.
+ *
+ *     EFAULT indicates the tailq configuration name was not found in
+ *            memory configuration.
+ *
+ *     EINVAL indicates invalid parameters were passed as argv/argc.
+ *
+ *     ENOMEM indicates failure likely caused by an out-of-memory condition.
+ *
+ *     ENODEV indicates memory setup issues.
+ *
+ *     ENOTSUP indicates that the EAL cannot initialize on this system.
+ *
+ *     EUNATCH indicates that the PCI bus is either not present, or is not
+ *             readable by the eal.
  */
 int rte_eal_init(int argc, char **argv);
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* Re: [PATCH v4 02/26] eal: return error instead of panic for cpu init
  2017-02-25 16:02     ` [PATCH v4 02/26] eal: return error instead of panic for cpu init Aaron Conole
@ 2017-02-27 12:58       ` Bruce Richardson
  2017-02-27 14:35         ` Aaron Conole
  2017-02-27 13:00       ` Bruce Richardson
  1 sibling, 1 reply; 159+ messages in thread
From: Bruce Richardson @ 2017-02-27 12:58 UTC (permalink / raw)
  To: Aaron Conole; +Cc: dev, Stephen Hemminger

On Sat, Feb 25, 2017 at 11:02:45AM -0500, Aaron Conole wrote:
> There may be no way to gracefully recover, but the application
> should be notified that a failure happened, rather than completely
> aborting.  This allows the user to proceed with a "slow-path" type
> solution.
> 
> Signed-off-by: Aaron Conole <aconole@redhat.com>
> ---
>  lib/librte_eal/linuxapp/eal/eal.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
> index bf6b818..5023d0d 100644
> --- a/lib/librte_eal/linuxapp/eal/eal.c
> +++ b/lib/librte_eal/linuxapp/eal/eal.c
> @@ -740,6 +740,12 @@ static int rte_eal_vfio_setup(void)
>  }
>  #endif
>  
> +static void rte_eal_init_alert(const char *msg)
> +{
> +    fprintf(stderr, "EAL: FATAL: %s\n", msg);
> +    RTE_LOG(ERR, EAL, "%s\n", msg);
> +}
> +
>  /* Launch threads, called at application init(). */
>  int
>  rte_eal_init(int argc, char **argv)
> @@ -767,8 +773,11 @@ rte_eal_init(int argc, char **argv)
>  	/* set log level as early as possible */
>  	rte_set_log_level(internal_config.log_level);
>  
> -	if (rte_eal_cpu_init() < 0)
> -		rte_panic("Cannot detect lcores\n");
> +	if (rte_eal_cpu_init() < 0) {
> +		rte_eal_init_alert("Cannot detect lcores.");
> +		rte_errno = ENOTSUP;
> +		return -1;
> +	}
>  
>  	fctret = eal_parse_args(argc, argv);
>  	if (fctret < 0)
> -- 
eal.c needs to include rte_errno.h after this change, otherwise it won't
compile.

/Bruce

^ permalink raw reply	[flat|nested] 159+ messages in thread

* Re: [PATCH v4 02/26] eal: return error instead of panic for cpu init
  2017-02-25 16:02     ` [PATCH v4 02/26] eal: return error instead of panic for cpu init Aaron Conole
  2017-02-27 12:58       ` Bruce Richardson
@ 2017-02-27 13:00       ` Bruce Richardson
  2017-02-27 14:34         ` Aaron Conole
  1 sibling, 1 reply; 159+ messages in thread
From: Bruce Richardson @ 2017-02-27 13:00 UTC (permalink / raw)
  To: Aaron Conole; +Cc: dev, Stephen Hemminger

On Sat, Feb 25, 2017 at 11:02:45AM -0500, Aaron Conole wrote:
> There may be no way to gracefully recover, but the application
> should be notified that a failure happened, rather than completely
> aborting.  This allows the user to proceed with a "slow-path" type
> solution.
> 
> Signed-off-by: Aaron Conole <aconole@redhat.com>
> ---
>  lib/librte_eal/linuxapp/eal/eal.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
> index bf6b818..5023d0d 100644
> --- a/lib/librte_eal/linuxapp/eal/eal.c
> +++ b/lib/librte_eal/linuxapp/eal/eal.c
> @@ -740,6 +740,12 @@ static int rte_eal_vfio_setup(void)
>  }
>  #endif
>  
> +static void rte_eal_init_alert(const char *msg)
> +{
> +    fprintf(stderr, "EAL: FATAL: %s\n", msg);
> +    RTE_LOG(ERR, EAL, "%s\n", msg);
> +}
Checkpatch flags the use of spaces rather than tabs here.

/Bruce

^ permalink raw reply	[flat|nested] 159+ messages in thread

* Re: [PATCH v4 06/26] eal-common: introduce a way to query cpu support
  2017-02-25 16:02     ` [PATCH v4 06/26] eal-common: introduce a way to query cpu support Aaron Conole
@ 2017-02-27 13:48       ` Bruce Richardson
  2017-02-27 14:33         ` Aaron Conole
  0 siblings, 1 reply; 159+ messages in thread
From: Bruce Richardson @ 2017-02-27 13:48 UTC (permalink / raw)
  To: Aaron Conole; +Cc: dev, Stephen Hemminger

On Sat, Feb 25, 2017 at 11:02:49AM -0500, Aaron Conole wrote:
> This adds a new API to check for the eal cpu versions.
> 
> Signed-off-by: Aaron Conole <aconole@redhat.com>
> ---
>  lib/librte_eal/common/eal_common_cpuflags.c          | 13 +++++++++++--
>  lib/librte_eal/common/include/generic/rte_cpuflags.h |  9 +++++++++
>  2 files changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_eal/common/eal_common_cpuflags.c b/lib/librte_eal/common/eal_common_cpuflags.c
> index b5f76f7..2c2127b 100644
> --- a/lib/librte_eal/common/eal_common_cpuflags.c
> +++ b/lib/librte_eal/common/eal_common_cpuflags.c
> @@ -43,6 +43,13 @@
>  void
>  rte_cpu_check_supported(void)
>  {
> +	if (!rte_cpu_is_supported())
> +		exit(1);
> +}
> +
> +bool
> +rte_cpu_is_supported(void)
> +{
>  	/* This is generated at compile-time by the build system */
>  	static const enum rte_cpu_flag_t compile_time_flags[] = {
>  			RTE_COMPILE_TIME_CPUFLAGS
> @@ -57,14 +64,16 @@ rte_cpu_check_supported(void)
>  			fprintf(stderr,
>  				"ERROR: CPU feature flag lookup failed with error %d\n",
>  				ret);
> -			exit(1);
> +			return false;
>  		}
>  		if (!ret) {
>  			fprintf(stderr,
>  			        "ERROR: This system does not support \"%s\".\n"
>  			        "Please check that RTE_MACHINE is set correctly.\n",
>  			        rte_cpu_get_flag_name(compile_time_flags[i]));
> -			exit(1);
> +			return false;
>  		}
>  	}
> +
> +	return true;
>  }
> diff --git a/lib/librte_eal/common/include/generic/rte_cpuflags.h b/lib/librte_eal/common/include/generic/rte_cpuflags.h
> index 71321f3..e4342ad 100644
> --- a/lib/librte_eal/common/include/generic/rte_cpuflags.h
> +++ b/lib/librte_eal/common/include/generic/rte_cpuflags.h
> @@ -40,6 +40,7 @@
>   */
>  
>  #include <errno.h>
> +#include <stdbool.h>
>  

The addition of this include is causing all sorts of compilation errors
inside the PMDs, as many of them seem to be defining their own bool
types. :-(

For safety's sake, probably best to have the function return int rather
than bool.

/Bruce

^ permalink raw reply	[flat|nested] 159+ messages in thread

* Re: [PATCH v4 00/26] linux/eal: Remove most causes of panic on init
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (25 preceding siblings ...)
  2017-02-25 16:03     ` [PATCH v4 26/26] rte_eal_init: add info about rte_errno codes Aaron Conole
@ 2017-02-27 13:59     ` Bruce Richardson
  2017-02-27 14:34       ` Aaron Conole
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
  27 siblings, 1 reply; 159+ messages in thread
From: Bruce Richardson @ 2017-02-27 13:59 UTC (permalink / raw)
  To: Aaron Conole; +Cc: dev, Stephen Hemminger

On Sat, Feb 25, 2017 at 11:02:43AM -0500, Aaron Conole wrote:
> In many cases, it's enough to simply let the application know that the
> call to initialize DPDK has failed.  A complete halt can then be
> decided by the application based on error returned (and the app could
> even attempt a possible re-attempt after some corrective action by the
> user or application).
>
Spotted a few issues when I tried applying each of the patches and
compile testing them; those I've flagged on the patches that I test
applied. Otherwise, this looks a great set to have.

I assume the equivalent changes to the BSD EAL are "left as an exercise
for the reader"? :-)

	/Bruce

^ permalink raw reply	[flat|nested] 159+ messages in thread

* Re: [PATCH v4 06/26] eal-common: introduce a way to query cpu support
  2017-02-27 13:48       ` Bruce Richardson
@ 2017-02-27 14:33         ` Aaron Conole
  2017-02-27 15:11           ` Bruce Richardson
  0 siblings, 1 reply; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 14:33 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev, Stephen Hemminger

Bruce Richardson <bruce.richardson@intel.com> writes:

> On Sat, Feb 25, 2017 at 11:02:49AM -0500, Aaron Conole wrote:
>> This adds a new API to check for the eal cpu versions.
>> 
>> Signed-off-by: Aaron Conole <aconole@redhat.com>
>> ---
>>  lib/librte_eal/common/eal_common_cpuflags.c          | 13 +++++++++++--
>>  lib/librte_eal/common/include/generic/rte_cpuflags.h |  9 +++++++++
>>  2 files changed, 20 insertions(+), 2 deletions(-)
>> 
>> diff --git a/lib/librte_eal/common/eal_common_cpuflags.c b/lib/librte_eal/common/eal_common_cpuflags.c
>> index b5f76f7..2c2127b 100644
>> --- a/lib/librte_eal/common/eal_common_cpuflags.c
>> +++ b/lib/librte_eal/common/eal_common_cpuflags.c
>> @@ -43,6 +43,13 @@
>>  void
>>  rte_cpu_check_supported(void)
>>  {
>> +	if (!rte_cpu_is_supported())
>> +		exit(1);
>> +}
>> +
>> +bool
>> +rte_cpu_is_supported(void)
>> +{
>>  	/* This is generated at compile-time by the build system */
>>  	static const enum rte_cpu_flag_t compile_time_flags[] = {
>>  			RTE_COMPILE_TIME_CPUFLAGS
>> @@ -57,14 +64,16 @@ rte_cpu_check_supported(void)
>>  			fprintf(stderr,
>>  				"ERROR: CPU feature flag lookup failed with error %d\n",
>>  				ret);
>> -			exit(1);
>> +			return false;
>>  		}
>>  		if (!ret) {
>>  			fprintf(stderr,
>>  			        "ERROR: This system does not support \"%s\".\n"
>>  			        "Please check that RTE_MACHINE is set correctly.\n",
>>  			        rte_cpu_get_flag_name(compile_time_flags[i]));
>> -			exit(1);
>> +			return false;
>>  		}
>>  	}
>> +
>> +	return true;
>>  }
>> diff --git a/lib/librte_eal/common/include/generic/rte_cpuflags.h b/lib/librte_eal/common/include/generic/rte_cpuflags.h
>> index 71321f3..e4342ad 100644
>> --- a/lib/librte_eal/common/include/generic/rte_cpuflags.h
>> +++ b/lib/librte_eal/common/include/generic/rte_cpuflags.h
>> @@ -40,6 +40,7 @@
>>   */
>>  
>>  #include <errno.h>
>> +#include <stdbool.h>
>>  
>
> The addition of this include is causing all sorts of compilation errors
> inside the PMDs, as many of them seem to be defining their own bool
> types. :-(
>
> For safety's sake, probably best to have the function return int rather
> than bool.

Will do - I never saw the issue, but perhaps I was excluding the PMDs
in question.

Thanks for the review, Bruce!

^ permalink raw reply	[flat|nested] 159+ messages in thread

* Re: [PATCH v4 00/26] linux/eal: Remove most causes of panic on init
  2017-02-27 13:59     ` [PATCH v4 00/26] linux/eal: Remove most causes of panic on init Bruce Richardson
@ 2017-02-27 14:34       ` Aaron Conole
  0 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 14:34 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev, Stephen Hemminger

Bruce Richardson <bruce.richardson@intel.com> writes:

> On Sat, Feb 25, 2017 at 11:02:43AM -0500, Aaron Conole wrote:
>> In many cases, it's enough to simply let the application know that the
>> call to initialize DPDK has failed.  A complete halt can then be
>> decided by the application based on error returned (and the app could
>> even attempt a possible re-attempt after some corrective action by the
>> user or application).
>>
> Spotted a few issues when I tried applying each of the patches and
> compile testing them; those I've flagged on the patches that I test
> applied. Otherwise, this looks a great set to have.
>
> I assume the equivalent changes to the BSD EAL are "left as an exercise
> for the reader"? :-)

For now.  In my copious free time, I might have a crack at it, although
I don't have a working BSD DPDK setup.

> 	/Bruce

^ permalink raw reply	[flat|nested] 159+ messages in thread

* Re: [PATCH v4 02/26] eal: return error instead of panic for cpu init
  2017-02-27 13:00       ` Bruce Richardson
@ 2017-02-27 14:34         ` Aaron Conole
  0 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 14:34 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev, Stephen Hemminger

Bruce Richardson <bruce.richardson@intel.com> writes:

> On Sat, Feb 25, 2017 at 11:02:45AM -0500, Aaron Conole wrote:
>> There may be no way to gracefully recover, but the application
>> should be notified that a failure happened, rather than completely
>> aborting.  This allows the user to proceed with a "slow-path" type
>> solution.
>> 
>> Signed-off-by: Aaron Conole <aconole@redhat.com>
>> ---
>>  lib/librte_eal/linuxapp/eal/eal.c | 13 +++++++++++--
>>  1 file changed, 11 insertions(+), 2 deletions(-)
>> 
>> diff --git a/lib/librte_eal/linuxapp/eal/eal.c
>> b/lib/librte_eal/linuxapp/eal/eal.c
>> index bf6b818..5023d0d 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal.c
>> @@ -740,6 +740,12 @@ static int rte_eal_vfio_setup(void)
>>  }
>>  #endif
>>  
>> +static void rte_eal_init_alert(const char *msg)
>> +{
>> +    fprintf(stderr, "EAL: FATAL: %s\n", msg);
>> +    RTE_LOG(ERR, EAL, "%s\n", msg);
>> +}
> Checkpatch flags the use of spaces rather than tabs here.

Yes, I caught it too late.  Sorry, I'll fix it.

> /Bruce

^ permalink raw reply	[flat|nested] 159+ messages in thread

* Re: [PATCH v4 02/26] eal: return error instead of panic for cpu init
  2017-02-27 12:58       ` Bruce Richardson
@ 2017-02-27 14:35         ` Aaron Conole
  0 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 14:35 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev, Stephen Hemminger

Bruce Richardson <bruce.richardson@intel.com> writes:

> On Sat, Feb 25, 2017 at 11:02:45AM -0500, Aaron Conole wrote:
>> There may be no way to gracefully recover, but the application
>> should be notified that a failure happened, rather than completely
>> aborting.  This allows the user to proceed with a "slow-path" type
>> solution.
>> 
>> Signed-off-by: Aaron Conole <aconole@redhat.com>
>> ---
>>  lib/librte_eal/linuxapp/eal/eal.c | 13 +++++++++++--
>>  1 file changed, 11 insertions(+), 2 deletions(-)
>> 
>> diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
>> index bf6b818..5023d0d 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal.c
>> @@ -740,6 +740,12 @@ static int rte_eal_vfio_setup(void)
>>  }
>>  #endif
>>  
>> +static void rte_eal_init_alert(const char *msg)
>> +{
>> +    fprintf(stderr, "EAL: FATAL: %s\n", msg);
>> +    RTE_LOG(ERR, EAL, "%s\n", msg);
>> +}
>> +
>>  /* Launch threads, called at application init(). */
>>  int
>>  rte_eal_init(int argc, char **argv)
>> @@ -767,8 +773,11 @@ rte_eal_init(int argc, char **argv)
>>  	/* set log level as early as possible */
>>  	rte_set_log_level(internal_config.log_level);
>>  
>> -	if (rte_eal_cpu_init() < 0)
>> -		rte_panic("Cannot detect lcores\n");
>> +	if (rte_eal_cpu_init() < 0) {
>> +		rte_eal_init_alert("Cannot detect lcores.");
>> +		rte_errno = ENOTSUP;
>> +		return -1;
>> +	}
>>  
>>  	fctret = eal_parse_args(argc, argv);
>>  	if (fctret < 0)
>> -- 
> eal.c needs to include rte_errno.h after this change, otherwise it won't
> compile.

oops.. I reordered some of my original work, and the rte_errno.h change
was introduced too late.  Thanks for catching this!

> /Bruce

^ permalink raw reply	[flat|nested] 159+ messages in thread

* Re: [PATCH v4 06/26] eal-common: introduce a way to query cpu support
  2017-02-27 14:33         ` Aaron Conole
@ 2017-02-27 15:11           ` Bruce Richardson
  0 siblings, 0 replies; 159+ messages in thread
From: Bruce Richardson @ 2017-02-27 15:11 UTC (permalink / raw)
  To: Aaron Conole; +Cc: dev, Stephen Hemminger

On Mon, Feb 27, 2017 at 09:33:19AM -0500, Aaron Conole wrote:
> Bruce Richardson <bruce.richardson@intel.com> writes:
> 
> > On Sat, Feb 25, 2017 at 11:02:49AM -0500, Aaron Conole wrote:
> >> This adds a new API to check for the eal cpu versions.
> >> 
> >> Signed-off-by: Aaron Conole <aconole@redhat.com>
> >> ---
> >>  lib/librte_eal/common/eal_common_cpuflags.c          | 13 +++++++++++--
> >>  lib/librte_eal/common/include/generic/rte_cpuflags.h |  9 +++++++++
> >>  2 files changed, 20 insertions(+), 2 deletions(-)
> >> 
> >> diff --git a/lib/librte_eal/common/eal_common_cpuflags.c b/lib/librte_eal/common/eal_common_cpuflags.c
> >> index b5f76f7..2c2127b 100644
> >> --- a/lib/librte_eal/common/eal_common_cpuflags.c
> >> +++ b/lib/librte_eal/common/eal_common_cpuflags.c
> >> @@ -43,6 +43,13 @@
> >>  void
> >>  rte_cpu_check_supported(void)
> >>  {
> >> +	if (!rte_cpu_is_supported())
> >> +		exit(1);
> >> +}
> >> +
> >> +bool
> >> +rte_cpu_is_supported(void)
> >> +{
> >>  	/* This is generated at compile-time by the build system */
> >>  	static const enum rte_cpu_flag_t compile_time_flags[] = {
> >>  			RTE_COMPILE_TIME_CPUFLAGS
> >> @@ -57,14 +64,16 @@ rte_cpu_check_supported(void)
> >>  			fprintf(stderr,
> >>  				"ERROR: CPU feature flag lookup failed with error %d\n",
> >>  				ret);
> >> -			exit(1);
> >> +			return false;
> >>  		}
> >>  		if (!ret) {
> >>  			fprintf(stderr,
> >>  			        "ERROR: This system does not support \"%s\".\n"
> >>  			        "Please check that RTE_MACHINE is set correctly.\n",
> >>  			        rte_cpu_get_flag_name(compile_time_flags[i]));
> >> -			exit(1);
> >> +			return false;
> >>  		}
> >>  	}
> >> +
> >> +	return true;
> >>  }
> >> diff --git a/lib/librte_eal/common/include/generic/rte_cpuflags.h b/lib/librte_eal/common/include/generic/rte_cpuflags.h
> >> index 71321f3..e4342ad 100644
> >> --- a/lib/librte_eal/common/include/generic/rte_cpuflags.h
> >> +++ b/lib/librte_eal/common/include/generic/rte_cpuflags.h
> >> @@ -40,6 +40,7 @@
> >>   */
> >>  
> >>  #include <errno.h>
> >> +#include <stdbool.h>
> >>  
> >
> > The addition of this include is causing all sorts of compilation errors
> > inside the PMDs, as many of them seem to be defining their own bool
> > types. :-(
> >
> > For safety's sake, probably best to have the function return int rather
> > than bool.
> 
> Will do - I never saw the issue, but perhaps I was excluding the PMDs
> in question.
> 
> Thanks for the review, Bruce!

No problem.
FYI, the drivers I saw the errors in with this patch are:
* qede
* i40e
* ixgbe
* cxgbe

Regards,
/Bruce

^ permalink raw reply	[flat|nested] 159+ messages in thread

* [PATCH v5 00/26] linux/eal: Remove most causes of panic on init
  2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
                       ` (26 preceding siblings ...)
  2017-02-27 13:59     ` [PATCH v4 00/26] linux/eal: Remove most causes of panic on init Bruce Richardson
@ 2017-02-27 16:17     ` Aaron Conole
  2017-02-27 16:17       ` [PATCH v5 01/26] eal: CPU init will no longer panic Aaron Conole
                         ` (27 more replies)
  27 siblings, 28 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:17 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

In many cases, it's enough to simply let the application know that the
call to initialize DPDK has failed.  A complete halt can then be
decided by the application based on error returned (and the app could
even attempt a possible re-attempt after some corrective action by the
user or application).

Changes ->v2:
- Audited all "RTE_LOG (" calls that were introduced, and converted
  to "RTE_LOG("
- Added some fprintf(stderr, "") lines to indicate errors before logging
  is initialized
- Removed assignments to errno.
- Changed patch 14/25 to reflect EFAULT, and document in 25/25

Changes ->v3:
- Checkpatch issues in patches 3 (spelling mistake), 9 (issue with leading
  spaces), and 19 (braces around single line statement if-condition)

Changes ->v4:
- Error text cleanup.
- Add a new check around rte_bus_scan(), added during the development of
  this series.

Changes ->v5:
- checkpatch.pl cleanup in patch 02/26
- move rte_errno.h include from patch 15 to patch 02
- remove stdbool.h and use int as return type in patch 06/26

I kept the rte_errno reflection, since this is control-path code and the
init function returns a sentinel value of -1.

Aaron Conole (26):
  eal: CPU init will no longer panic
  eal: return error instead of panic for cpu init
  eal: No panic on hugepages info init
  eal: do not panic on failed hugepage query
  eal: failure to parse args returns error
  eal-common: introduce a way to query cpu support
  eal: Signal error when CPU isn't supported
  eal: do not panic on memzone initialization fails
  eal: set errno when exiting for already called
  eal: Do not panic on log failures
  eal: Do not panic on pci-probe
  eal: do not panic on vfio failure
  eal: do not panic on memory init
  eal: do not panic on tailq init
  eal: do not panic on alarm init
  eal: convert timer_init not to call panic
  eal: change the private pipe call to reflect errno
  eal: Do not panic on interrupt thread init
  eal: do not error if plugins fail to init
  eal_pci: Continue probing even on failures
  eal: do not panic on failed PCI probe
  eal_common_dev: continue initializing vdevs
  eal: do not panic (or abort) if vdev init fails
  eal: do not panic when bus probe fails
  eal: do not panic on failed bus scan
  rte_eal_init: add info about rte_errno codes

 lib/librte_eal/common/eal_common_cpuflags.c        |  13 +-
 lib/librte_eal/common/eal_common_dev.c             |   5 +-
 lib/librte_eal/common/eal_common_lcore.c           |   7 +-
 lib/librte_eal/common/eal_common_pci.c             |  15 ++-
 lib/librte_eal/common/eal_common_tailqs.c          |   3 +-
 .../common/include/generic/rte_cpuflags.h          |   8 ++
 lib/librte_eal/common/include/rte_eal.h            |  27 ++++-
 lib/librte_eal/linuxapp/eal/eal.c                  | 131 +++++++++++++++------
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c    |   6 +-
 lib/librte_eal/linuxapp/eal/eal_interrupts.c       |   5 +-
 10 files changed, 168 insertions(+), 52 deletions(-)

-- 
2.9.3

^ permalink raw reply	[flat|nested] 159+ messages in thread
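As an illustration of the cover letter's point, here is a minimal sketch
of an application that reacts to an init failure instead of being halted.
It is not DPDK code: fake_eal_init(), fake_rte_errno and hugepages_ready
are hypothetical stand-ins so the example compiles on its own, and the
stub returns 0 on success purely for simplicity.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for rte_errno. */
static int fake_rte_errno;

/* Hypothetical condition the user or application can correct. */
static int hugepages_ready;

/* Hypothetical stand-in for rte_eal_init(): returns the sentinel -1 and
 * sets the error code until the condition above has been corrected. */
static int fake_eal_init(void)
{
	if (!hugepages_ready) {
		fake_rte_errno = EACCES;
		return -1;
	}
	return 0;
}

/* Application policy: retry after a corrective action when the error
 * looks recoverable, otherwise give up and let the caller decide.
 * Returns 0 once init succeeds, -1 if every attempt failed. */
static int app_init_with_retry(int max_attempts)
{
	for (int attempt = 0; attempt < max_attempts; attempt++) {
		if (fake_eal_init() == 0)
			return 0;
		if (fake_rte_errno != EACCES)
			break; /* not something this sketch knows how to fix */
		fprintf(stderr, "app: init failed (error %d), retrying\n",
			fake_rte_errno);
		hugepages_ready = 1; /* the "corrective action" */
	}
	return -1;
}
```

With a single attempt the function reports failure; with two attempts the
second call succeeds because the corrective action ran in between.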

* [PATCH v5 01/26] eal: CPU init will no longer panic
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
@ 2017-02-27 16:17       ` Aaron Conole
  2017-02-27 16:17       ` [PATCH v5 02/26] eal: return error instead of panic for cpu init Aaron Conole
                         ` (26 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:17 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

After this change, the EAL CPU NUMA node resolution step can no longer
emit an rte_panic.  This aligns with the code in rte_eal_init, which
expects failures to return an error code.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/common/eal_common_lcore.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_lcore.c b/lib/librte_eal/common/eal_common_lcore.c
index 2cd4132..84fa0cb 100644
--- a/lib/librte_eal/common/eal_common_lcore.c
+++ b/lib/librte_eal/common/eal_common_lcore.c
@@ -83,16 +83,17 @@ rte_eal_cpu_init(void)
 		config->lcore_role[lcore_id] = ROLE_RTE;
 		lcore_config[lcore_id].core_id = eal_cpu_core_id(lcore_id);
 		lcore_config[lcore_id].socket_id = eal_cpu_socket_id(lcore_id);
-		if (lcore_config[lcore_id].socket_id >= RTE_MAX_NUMA_NODES)
+		if (lcore_config[lcore_id].socket_id >= RTE_MAX_NUMA_NODES) {
 #ifdef RTE_EAL_ALLOW_INV_SOCKET_ID
 			lcore_config[lcore_id].socket_id = 0;
 #else
-			rte_panic("Socket ID (%u) is greater than "
+			RTE_LOG(ERR, EAL, "Socket ID (%u) is greater than "
 				"RTE_MAX_NUMA_NODES (%d)\n",
 				lcore_config[lcore_id].socket_id,
 				RTE_MAX_NUMA_NODES);
+			return -1;
 #endif
-
+		}
 		RTE_LOG(DEBUG, EAL, "Detected lcore %u as "
 				"core %u on socket %u\n",
 				lcore_id, lcore_config[lcore_id].core_id,
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread
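The change in this patch can be sketched in isolation as a validation
loop that logs and returns -1 instead of aborting. The names below
(MAX_NUMA_NODES, check_socket_ids) are hypothetical and only mirror the
shape of the check; they are not the EAL's own symbols.

```c
#include <stdio.h>

/* Hypothetical stand-in for RTE_MAX_NUMA_NODES. */
#define MAX_NUMA_NODES 8

/* Validate a list of detected socket IDs.
 * Returns 0 if all are in range, -1 on the first out-of-range ID, so the
 * caller can turn the failure into an error code rather than a panic. */
static int check_socket_ids(const unsigned *socket_ids, unsigned n)
{
	for (unsigned i = 0; i < n; i++) {
		if (socket_ids[i] >= MAX_NUMA_NODES) {
			fprintf(stderr,
				"Socket ID (%u) is out of range (limit %d)\n",
				socket_ids[i], MAX_NUMA_NODES);
			return -1;
		}
	}
	return 0;
}
```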

* [PATCH v5 02/26] eal: return error instead of panic for cpu init
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
  2017-02-27 16:17       ` [PATCH v5 01/26] eal: CPU init will no longer panic Aaron Conole
@ 2017-02-27 16:17       ` Aaron Conole
  2017-02-27 16:17       ` [PATCH v5 03/26] eal: No panic on hugepages info init Aaron Conole
                         ` (25 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:17 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

There may be no way to gracefully recover, but the application
should be notified that a failure happened, rather than completely
aborting.  This allows the user to proceed with a "slow-path" type
solution.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index bf6b818..81692e7 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -61,6 +61,7 @@
 #include <rte_launch.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
+#include <rte_errno.h>
 #include <rte_per_lcore.h>
 #include <rte_lcore.h>
 #include <rte_log.h>
@@ -740,6 +741,12 @@ static int rte_eal_vfio_setup(void)
 }
 #endif
 
+static void rte_eal_init_alert(const char *msg)
+{
+	fprintf(stderr, "EAL: FATAL: %s\n", msg);
+	RTE_LOG(ERR, EAL, "%s\n", msg);
+}
+
 /* Launch threads, called at application init(). */
 int
 rte_eal_init(int argc, char **argv)
@@ -767,8 +774,11 @@ rte_eal_init(int argc, char **argv)
 	/* set log level as early as possible */
 	rte_set_log_level(internal_config.log_level);
 
-	if (rte_eal_cpu_init() < 0)
-		rte_panic("Cannot detect lcores\n");
+	if (rte_eal_cpu_init() < 0) {
+		rte_eal_init_alert("Cannot detect lcores.");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	fctret = eal_parse_args(argc, argv);
 	if (fctret < 0)
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v5 03/26] eal: No panic on hugepages info init
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
  2017-02-27 16:17       ` [PATCH v5 01/26] eal: CPU init will no longer panic Aaron Conole
  2017-02-27 16:17       ` [PATCH v5 02/26] eal: return error instead of panic for cpu init Aaron Conole
@ 2017-02-27 16:17       ` Aaron Conole
  2017-02-28 14:25         ` Bruce Richardson
  2017-02-27 16:17       ` [PATCH v5 04/26] eal: do not panic on failed hugepage query Aaron Conole
                         ` (24 subsequent siblings)
  27 siblings, 1 reply; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:17 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

When attempting to scan hugepages, signal to eal.c that an error has
occurred, rather than performing a panic.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index 18858e2..4d47eaf 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -283,9 +283,11 @@ eal_hugepage_info_init(void)
 	struct dirent *dirent;
 
 	dir = opendir(sys_dir_path);
-	if (dir == NULL)
-		rte_panic("Cannot open directory %s to read system hugepage "
+	if (dir == NULL) {
+		RTE_LOG(ERR, EAL, "Cannot open directory %s to read system hugepage "
 			  "info\n", sys_dir_path);
+		return -1;
+	}
 
 	for (dirent = readdir(dir); dirent != NULL; dirent = readdir(dir)) {
 		struct hugepage_info *hpi;
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread
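A minimal sketch of the same pattern outside the EAL: a directory scan
that reports a failed opendir() to its caller through the return value.
The function name and the prefix parameter are hypothetical; only
opendir(), readdir() and closedir() are real POSIX calls.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Count the entries in dir_path whose names start with prefix.
 * Returns the count, or -1 if the directory cannot be opened, leaving
 * the decision about what to do next to the caller. */
static int count_prefixed_entries(const char *dir_path, const char *prefix)
{
	DIR *dir = opendir(dir_path);
	struct dirent *ent;
	size_t len = strlen(prefix);
	int count = 0;

	if (dir == NULL) {
		fprintf(stderr, "Cannot open directory %s\n", dir_path);
		return -1;
	}

	for (ent = readdir(dir); ent != NULL; ent = readdir(dir)) {
		if (strncmp(ent->d_name, prefix, len) == 0)
			count++;
	}

	closedir(dir);
	return count;
}
```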

* [PATCH v5 04/26] eal: do not panic on failed hugepage query
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (2 preceding siblings ...)
  2017-02-27 16:17       ` [PATCH v5 03/26] eal: No panic on hugepages info init Aaron Conole
@ 2017-02-27 16:17       ` Aaron Conole
  2017-02-27 16:17       ` [PATCH v5 05/26] eal: failure to parse args returns error Aaron Conole
                         ` (23 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:17 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

If we fail to acquire hugepage information, simply signal an error to
the application.  This clears the run_once counter, allowing the user or
application to take a corrective action and retry.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 81692e7..12bd941 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -787,8 +787,12 @@ rte_eal_init(int argc, char **argv)
 	if (internal_config.no_hugetlbfs == 0 &&
 			internal_config.process_type != RTE_PROC_SECONDARY &&
 			internal_config.xen_dom0_support == 0 &&
-			eal_hugepage_info_init() < 0)
-		rte_panic("Cannot get hugepage information\n");
+			eal_hugepage_info_init() < 0) {
+		rte_eal_init_alert("Cannot get hugepage information.");
+		rte_errno = EACCES;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 
 	if (internal_config.memory == 0 && internal_config.force_sockets == 0) {
 		if (internal_config.no_hugetlbfs)
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread
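The retry behaviour described here depends on clearing the run-once guard
before returning. A minimal sketch using C11 atomics rather than the
rte_atomic32 API follows; init_once(), sketch_errno and the step_fails
parameter are hypothetical. Note that atomic_flag_test_and_set() returns
the previous value, so the sense of the test is inverted relative to the
`!rte_atomic32_test_and_set(&run_once)` check in the patch.

```c
#include <errno.h>
#include <stdatomic.h>

/* Guard that records whether initialization has been entered. */
static atomic_flag run_once = ATOMIC_FLAG_INIT;

/* Hypothetical stand-in for rte_errno. */
static int sketch_errno;

/* step_fails simulates a recoverable failure in one init step.
 * Returns 0 on success, -1 on failure with sketch_errno set. */
static int init_once(int step_fails)
{
	/* A previous call already claimed the guard. */
	if (atomic_flag_test_and_set(&run_once)) {
		sketch_errno = EALREADY;
		return -1;
	}

	if (step_fails) {
		sketch_errno = EACCES;
		/* Release the guard so a later call may retry. */
		atomic_flag_clear(&run_once);
		return -1;
	}

	return 0;
}
```

A failed first call leaves the guard clear, so the second call can
succeed; a third call then fails because init has already completed.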

* [PATCH v5 05/26] eal: failure to parse args returns error
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (3 preceding siblings ...)
  2017-02-27 16:17       ` [PATCH v5 04/26] eal: do not panic on failed hugepage query Aaron Conole
@ 2017-02-27 16:17       ` Aaron Conole
  2017-02-27 16:17       ` [PATCH v5 06/26] eal-common: introduce a way to query cpu support Aaron Conole
                         ` (22 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:17 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

It's possible that the application could take a corrective action here,
and either prompt the user for different arguments, or at least perform
better logging.  Exiting this early prevents the application layer from
gathering any useful information.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 12bd941..f7511ab 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -781,8 +781,12 @@ rte_eal_init(int argc, char **argv)
 	}
 
 	fctret = eal_parse_args(argc, argv);
-	if (fctret < 0)
-		exit(1);
+	if (fctret < 0) {
+		rte_eal_init_alert("Invalid 'command line' arguments.");
+		rte_errno = EINVAL;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 
 	if (internal_config.no_hugetlbfs == 0 &&
 			internal_config.process_type != RTE_PROC_SECONDARY &&
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v5 06/26] eal-common: introduce a way to query cpu support
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (4 preceding siblings ...)
  2017-02-27 16:17       ` [PATCH v5 05/26] eal: failure to parse args returns error Aaron Conole
@ 2017-02-27 16:17       ` Aaron Conole
  2017-02-27 16:17       ` [PATCH v5 07/26] eal: Signal error when CPU isn't supported Aaron Conole
                         ` (21 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:17 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

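This adds a new API, rte_cpu_is_supported(), which checks whether the
running CPU supports the features selected at compile time and returns
the result instead of exiting.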

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/common/eal_common_cpuflags.c          | 13 +++++++++++--
 lib/librte_eal/common/include/generic/rte_cpuflags.h |  8 ++++++++
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_cpuflags.c b/lib/librte_eal/common/eal_common_cpuflags.c
index b5f76f7..9a2d080 100644
--- a/lib/librte_eal/common/eal_common_cpuflags.c
+++ b/lib/librte_eal/common/eal_common_cpuflags.c
@@ -43,6 +43,13 @@
 void
 rte_cpu_check_supported(void)
 {
+	if (!rte_cpu_is_supported())
+		exit(1);
+}
+
+int
+rte_cpu_is_supported(void)
+{
 	/* This is generated at compile-time by the build system */
 	static const enum rte_cpu_flag_t compile_time_flags[] = {
 			RTE_COMPILE_TIME_CPUFLAGS
@@ -57,14 +64,16 @@ rte_cpu_check_supported(void)
 			fprintf(stderr,
 				"ERROR: CPU feature flag lookup failed with error %d\n",
 				ret);
-			exit(1);
+			return 0;
 		}
 		if (!ret) {
 			fprintf(stderr,
 			        "ERROR: This system does not support \"%s\".\n"
 			        "Please check that RTE_MACHINE is set correctly.\n",
 			        rte_cpu_get_flag_name(compile_time_flags[i]));
-			exit(1);
+			return 0;
 		}
 	}
+
+	return 1;
 }
diff --git a/lib/librte_eal/common/include/generic/rte_cpuflags.h b/lib/librte_eal/common/include/generic/rte_cpuflags.h
index 71321f3..8d27031 100644
--- a/lib/librte_eal/common/include/generic/rte_cpuflags.h
+++ b/lib/librte_eal/common/include/generic/rte_cpuflags.h
@@ -82,4 +82,12 @@ rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature);
 void
 rte_cpu_check_supported(void);
 
+/**
+ * This function checks that the currently used CPU supports the CPU features
+ * that were specified at compile time. It is called automatically within the
+ * EAL, so does not need to be used by applications.  This version returns a
+ * result so that decisions may be made (for instance, graceful shutdowns).
+ */
+int
+rte_cpu_is_supported(void);
 #endif /* _RTE_CPUFLAGS_H_ */
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread
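The split between a predicate that returns a result and a wrapper that
exits can be sketched as follows. The int return type follows the v5
changelog (no stdbool.h in the header). The feature enum, the bitmask
argument and the function names are hypothetical; the real code queries
CPU flags through the EAL instead.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical feature identifiers; each maps to one bit of a mask. */
enum feature { FEAT_A, FEAT_B, FEAT_C };

/* Returns 1 if every required feature is present in cpu_mask, else 0.
 * Plain int keeps the declaration usable without <stdbool.h>. */
static int cpu_is_supported(unsigned cpu_mask,
			    const enum feature *required, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (((cpu_mask >> required[i]) & 1u) == 0) {
			fprintf(stderr,
				"ERROR: required feature %d is missing\n",
				(int)required[i]);
			return 0;
		}
	}
	return 1;
}

/* Legacy-style wrapper: keeps the exit-on-failure behaviour for callers
 * that still want it, built on top of the predicate. */
static void cpu_check_supported(unsigned cpu_mask,
				const enum feature *required, size_t n)
{
	if (!cpu_is_supported(cpu_mask, required, n))
		exit(1);
}
```

Keeping the exiting wrapper separate lets existing callers stay unchanged
while new callers choose their own failure policy.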

* [PATCH v5 07/26] eal: Signal error when CPU isn't supported
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (5 preceding siblings ...)
  2017-02-27 16:17       ` [PATCH v5 06/26] eal-common: introduce a way to query cpu support Aaron Conole
@ 2017-02-27 16:17       ` Aaron Conole
  2017-02-27 16:17       ` [PATCH v5 08/26] eal: do not panic on memzone initialization fails Aaron Conole
                         ` (20 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:17 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Applications can now exit gracefully when the CPU is unsupported.
Applications that run non-DPDK datapaths alongside DPDK datapaths are
no longer forced to exit on an unsupported CPU.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index f7511ab..a671ed4 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -759,7 +759,11 @@ rte_eal_init(int argc, char **argv)
 	char thread_name[RTE_MAX_THREAD_NAME_LEN];
 
 	/* checks if the machine is adequate */
-	rte_cpu_check_supported();
+	if (!rte_cpu_is_supported()) {
+		rte_eal_init_alert("unsupported cpu type.");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	if (!rte_atomic32_test_and_set(&run_once))
 		return -1;
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread
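For an application with both kinds of datapath, the benefit can be
sketched as a fallback decision made on the init result. Everything here
is hypothetical (the enum, the stub init functions and
select_datapath()); it only shows the control flow that returning -1
makes possible.

```c
#include <stdio.h>

enum datapath { DATAPATH_KERNEL, DATAPATH_DPDK };

/* Hypothetical init stubs: one succeeds, one fails the way an
 * unsupported CPU would (sentinel -1). */
static int init_ok(void) { return 0; }
static int init_unsupported_cpu(void) { return -1; }

/* Try the fast path; if its init fails, fall back to the other
 * datapath instead of terminating the process. */
static enum datapath select_datapath(int (*init_fn)(void))
{
	if (init_fn() < 0) {
		fprintf(stderr,
			"app: fast path unavailable, using fallback\n");
		return DATAPATH_KERNEL;
	}
	return DATAPATH_DPDK;
}
```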

* [PATCH v5 08/26] eal: do not panic on memzone initialization fails
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (6 preceding siblings ...)
  2017-02-27 16:17       ` [PATCH v5 07/26] eal: Signal error when CPU isn't supported Aaron Conole
@ 2017-02-27 16:17       ` Aaron Conole
  2017-02-28 14:27         ` Bruce Richardson
  2017-02-27 16:17       ` [PATCH v5 09/26] eal: set errno when exiting for already called Aaron Conole
                         ` (19 subsequent siblings)
  27 siblings, 1 reply; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:17 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

When memzone initialization fails, report the error to the calling
application rather than panic().  Without a good way of detaching /
releasing hugepages, at this point the application will have to restart.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index a671ed4..1e54ca1 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -839,8 +839,11 @@ rte_eal_init(int argc, char **argv)
 	/* the directories are locked during eal_hugepage_info_init */
 	eal_hugedirs_unlock();
 
-	if (rte_eal_memzone_init() < 0)
-		rte_panic("Cannot init memzone\n");
+	if (rte_eal_memzone_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init memzone\n");
+		rte_errno = ENODEV;
+		return -1;
+	}
 
 	if (rte_eal_tailqs_init() < 0)
 		rte_panic("Cannot init tail queues for objects\n");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v5 09/26] eal: set errno when exiting for already called
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (7 preceding siblings ...)
  2017-02-27 16:17       ` [PATCH v5 08/26] eal: do not panic on memzone initialization fails Aaron Conole
@ 2017-02-27 16:17       ` Aaron Conole
  2017-02-27 16:17       ` [PATCH v5 10/26] eal: Do not panic on log failures Aaron Conole
                         ` (18 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:17 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 1e54ca1..4b6c7b8 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -765,8 +765,11 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (!rte_atomic32_test_and_set(&run_once))
+	if (!rte_atomic32_test_and_set(&run_once)) {
+		rte_eal_init_alert("already called initialization.");
+		rte_errno = EALREADY;
 		return -1;
+	}
 
 	logid = strrchr(argv[0], '/');
 	logid = strdup(logid ? logid + 1: argv[0]);
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v5 10/26] eal: Do not panic on log failures
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (8 preceding siblings ...)
  2017-02-27 16:17       ` [PATCH v5 09/26] eal: set errno when exiting for already called Aaron Conole
@ 2017-02-27 16:17       ` Aaron Conole
  2017-02-27 16:17       ` [PATCH v5 11/26] eal: Do not panic on pci-probe Aaron Conole
                         ` (17 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:17 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

When log initialization fails, it's generally because the fopencookie()
call failed.  While this is rare in practice, it could happen, and it is
likely because of memory pressure.  So, flag the error, and allow the
user to retry.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
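Note (not part of the patch): a minimal sketch of the run-once guard and
the retry path this change opens up.  It assumes C11 atomic_flag as a
stand-in for rte_atomic32_test_and_set() / rte_atomic32_clear(); the names
init_sketch, logging_ok and sketch_errno are hypothetical and are not DPDK
APIs.  Note that atomic_flag_test_and_set() returns the previous flag
value, so the test reads the opposite way from the rte_atomic32 call in
the diff below.

```c
#include <errno.h>
#include <stdatomic.h>

/* Stand-ins for run_once and rte_errno; both names are hypothetical. */
static atomic_flag run_once = ATOMIC_FLAG_INIT;
int sketch_errno;

/*
 * Hypothetical init routine.  logging_ok simulates whether the logging
 * setup step succeeds on this attempt.
 */
int init_sketch(int logging_ok)
{
	/* atomic_flag_test_and_set() returns the previous value, so a
	 * true result means an earlier call already claimed the flag. */
	if (atomic_flag_test_and_set(&run_once)) {
		sketch_errno = EALREADY;
		return -1;
	}

	if (!logging_ok) {
		/* Recoverable failure: release the guard so that the
		 * caller may try again after corrective action. */
		sketch_errno = ENOMEM;
		atomic_flag_clear(&run_once);
		return -1;
	}

	return 0;
}
```

With the guard released on the recoverable path, a second call can
succeed; without the clear, the second call would fail with EALREADY even
though nothing was ever initialized.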
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 4b6c7b8..46bbaa7 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -825,8 +825,12 @@ rte_eal_init(int argc, char **argv)
 
 	rte_config_init();
 
-	if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0)
-		rte_panic("Cannot init logs\n");
+	if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0) {
+		rte_eal_init_alert("Cannot init logging.");
+		rte_errno = ENOMEM;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 
 	if (rte_eal_pci_init() < 0)
 		rte_panic("Cannot init PCI\n");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v5 11/26] eal: Do not panic on pci-probe
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (9 preceding siblings ...)
  2017-02-27 16:17       ` [PATCH v5 10/26] eal: Do not panic on log failures Aaron Conole
@ 2017-02-27 16:17       ` Aaron Conole
  2017-02-27 16:17       ` [PATCH v5 12/26] eal: do not panic on vfio failure Aaron Conole
                         ` (16 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:17 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

This will usually be an issue because of permissions.  However, it could
also be caused by OOM.  In either case, errno will contain the
underlying cause.  It is safe to re-init the system here, so allow the
application to take corrective action and reinit.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 46bbaa7..c9f8c11 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -832,8 +832,12 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_eal_pci_init() < 0)
-		rte_panic("Cannot init PCI\n");
+	if (rte_eal_pci_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init PCI\n");
+		rte_errno = EUNATCH;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 
 #ifdef VFIO_PRESENT
 	if (rte_eal_vfio_setup() < 0)
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v5 12/26] eal: do not panic on vfio failure
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (10 preceding siblings ...)
  2017-02-27 16:17       ` [PATCH v5 11/26] eal: Do not panic on pci-probe Aaron Conole
@ 2017-02-27 16:17       ` Aaron Conole
  2017-02-27 16:17       ` [PATCH v5 13/26] eal: do not panic on memory init Aaron Conole
                         ` (15 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:17 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index c9f8c11..2e7faa8 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -840,8 +840,12 @@ rte_eal_init(int argc, char **argv)
 	}
 
 #ifdef VFIO_PRESENT
-	if (rte_eal_vfio_setup() < 0)
-		rte_panic("Cannot init VFIO\n");
+	if (rte_eal_vfio_setup() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init VFIO\n");
+		rte_errno = EAGAIN;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 #endif
 
 	if (rte_eal_memory_init() < 0)
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v5 13/26] eal: do not panic on memory init
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (11 preceding siblings ...)
  2017-02-27 16:17       ` [PATCH v5 12/26] eal: do not panic on vfio failure Aaron Conole
@ 2017-02-27 16:17       ` Aaron Conole
  2017-02-27 16:17       ` [PATCH v5 14/26] eal: do not panic on tailq init Aaron Conole
                         ` (14 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:17 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

This can only happen when access to hugepages (either as primary or
secondary process) fails (and that is usually permissions).  Since the
manner of failure is not reversible, we cannot allow retry.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 2e7faa8..799f5f6 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -848,8 +848,11 @@ rte_eal_init(int argc, char **argv)
 	}
 #endif
 
-	if (rte_eal_memory_init() < 0)
-		rte_panic("Cannot init memory\n");
+	if (rte_eal_memory_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init memory\n");
+		rte_errno = ENOMEM;
+		return -1;
+	}
 
 	/* the directories are locked during eal_hugepage_info_init */
 	eal_hugedirs_unlock();
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v5 14/26] eal: do not panic on tailq init
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (12 preceding siblings ...)
  2017-02-27 16:17       ` [PATCH v5 13/26] eal: do not panic on memory init Aaron Conole
@ 2017-02-27 16:17       ` Aaron Conole
  2017-02-27 16:18       ` [PATCH v5 15/26] eal: do not panic on alarm init Aaron Conole
                         ` (13 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:17 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

There are some theoretical race conditions in the system that _could_
cause early tailq init to fail; however, there is no need to panic the
application.  While it can't continue using DPDK, it could give better
alerts to the user.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/common/eal_common_tailqs.c | 3 +--
 lib/librte_eal/linuxapp/eal/eal.c         | 7 +++++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_tailqs.c b/lib/librte_eal/common/eal_common_tailqs.c
index bb08ec8..4f69828 100644
--- a/lib/librte_eal/common/eal_common_tailqs.c
+++ b/lib/librte_eal/common/eal_common_tailqs.c
@@ -188,8 +188,7 @@ rte_eal_tailqs_init(void)
 		if (t->head == NULL) {
 			RTE_LOG(ERR, EAL,
 				"Cannot initialize tailq: %s\n", t->name);
-			/* no need to TAILQ_REMOVE, we are going to panic in
-			 * rte_eal_init() */
+			/* TAILQ_REMOVE not needed, error is already fatal */
 			goto fail;
 		}
 	}
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 799f5f6..e0767c0 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -863,8 +863,11 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_eal_tailqs_init() < 0)
-		rte_panic("Cannot init tail queues for objects\n");
+	if (rte_eal_tailqs_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init tail queues for objects\n");
+		rte_errno = EFAULT;
+		return -1;
+	}
 
 	if (rte_eal_alarm_init() < 0)
 		rte_panic("Cannot init interrupt-handling thread\n");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v5 15/26] eal: do not panic on alarm init
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (13 preceding siblings ...)
  2017-02-27 16:17       ` [PATCH v5 14/26] eal: do not panic on tailq init Aaron Conole
@ 2017-02-27 16:18       ` Aaron Conole
  2017-02-27 16:18       ` [PATCH v5 16/26] eal: convert timer_init not to call panic Aaron Conole
                         ` (12 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:18 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

The rte_eal_alarm_init() call uses the Linux timerfd framework to create a
poll()-able timer using standard POSIX file operations.  This could fail
for a few reasons given in the man pages, but many could be
corrected by the user application.  No need to panic.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index e0767c0..23811bf 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -869,8 +869,11 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_eal_alarm_init() < 0)
-		rte_panic("Cannot init interrupt-handling thread\n");
+	if (rte_eal_alarm_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init interrupt-handling thread\n");
+		/* rte_eal_alarm_init sets rte_errno on failure. */
+		return -1;
+	}
 
 	if (rte_eal_timer_init() < 0)
 		rte_panic("Cannot init HPET or TSC timers\n");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v5 16/26] eal: convert timer_init not to call panic
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (14 preceding siblings ...)
  2017-02-27 16:18       ` [PATCH v5 15/26] eal: do not panic on alarm init Aaron Conole
@ 2017-02-27 16:18       ` Aaron Conole
  2017-02-27 16:18       ` [PATCH v5 17/26] eal: change the private pipe call to reflect errno Aaron Conole
                         ` (11 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:18 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

After code inspection, there is no way for rte_eal_timer_init() to fail.
It simply returns 0 in all cases.  As such, this test could either go away
or stay here as 'future-proofing'.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 23811bf..81085d5 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -875,8 +875,11 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_eal_timer_init() < 0)
-		rte_panic("Cannot init HPET or TSC timers\n");
+	if (rte_eal_timer_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init HPET or TSC timers\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	eal_check_mem_on_local_socket();
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v5 17/26] eal: change the private pipe call to reflect errno
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (15 preceding siblings ...)
  2017-02-27 16:18       ` [PATCH v5 16/26] eal: convert timer_init not to call panic Aaron Conole
@ 2017-02-27 16:18       ` Aaron Conole
  2017-02-27 16:18       ` [PATCH v5 18/26] eal: Do not panic on interrupt thread init Aaron Conole
                         ` (10 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:18 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

There could be some confusion as to why the call failed - this change
will always reflect the value of the error in rte_errno.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal_interrupts.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 92a19cb..5bb833e 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -898,13 +898,16 @@ rte_eal_intr_init(void)
 	 * create a pipe which will be waited by epoll and notified to
 	 * rebuild the wait list of epoll.
 	 */
-	if (pipe(intr_pipe.pipefd) < 0)
+	if (pipe(intr_pipe.pipefd) < 0) {
+		rte_errno = errno;
 		return -1;
+	}
 
 	/* create the host thread to wait/handle the interrupt */
 	ret = pthread_create(&intr_thread, NULL,
 			eal_intr_thread_main, NULL);
 	if (ret != 0) {
+		rte_errno = ret;
 		RTE_LOG(ERR, EAL,
 			"Failed to create thread for interrupt handling\n");
 	} else {
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v5 18/26] eal: Do not panic on interrupt thread init
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (16 preceding siblings ...)
  2017-02-27 16:18       ` [PATCH v5 17/26] eal: change the private pipe call to reflect errno Aaron Conole
@ 2017-02-27 16:18       ` Aaron Conole
  2017-02-27 16:18       ` [PATCH v5 19/26] eal: do not error if plugins fail to init Aaron Conole
                         ` (9 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:18 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

When initializing the interrupt thread, there are a number of possible
reasons for failure - some of which are correctable by the application.
Do not panic() needlessly, and give the application a chance to reflect
this information to the user.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 81085d5..b4ae845 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -894,8 +894,10 @@ rte_eal_init(int argc, char **argv)
 		rte_config.master_lcore, (int)thread_id, cpuset,
 		ret == 0 ? "" : "...");
 
-	if (rte_eal_intr_init() < 0)
-		rte_panic("Cannot init interrupt-handling thread\n");
+	if (rte_eal_intr_init() < 0) {
+		RTE_LOG(ERR, EAL, "Cannot init interrupt-handling thread\n");
+		return -1;
+	}
 
 	if (rte_bus_scan())
 		rte_panic("Cannot scan the buses for devices\n");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v5 19/26] eal: do not error if plugins fail to init
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (17 preceding siblings ...)
  2017-02-27 16:18       ` [PATCH v5 18/26] eal: Do not panic on interrupt thread init Aaron Conole
@ 2017-02-27 16:18       ` Aaron Conole
  2017-02-27 16:18       ` [PATCH v5 20/26] eal_pci: Continue probing even on failures Aaron Conole
                         ` (8 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:18 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Plugins are useful and important.  However, it seems crazy to abort
everything just because they don't initialize properly.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index b4ae845..291fa53 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -884,7 +884,7 @@ rte_eal_init(int argc, char **argv)
 	eal_check_mem_on_local_socket();
 
 	if (eal_plugins_init() < 0)
-		rte_panic("Cannot init plugins\n");
+		RTE_LOG(ERR, EAL, "Cannot init plugins\n");
 
 	eal_thread_init_master(rte_config.master_lcore);
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v5 20/26] eal_pci: Continue probing even on failures
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (18 preceding siblings ...)
  2017-02-27 16:18       ` [PATCH v5 19/26] eal: do not error if plugins fail to init Aaron Conole
@ 2017-02-27 16:18       ` Aaron Conole
  2017-02-27 16:18       ` [PATCH v5 21/26] eal: do not panic on failed PCI probe Aaron Conole
                         ` (7 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:18 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Some devices may be inaccessible for a variety of reasons, or the
PCI bus may be unavailable, causing the whole probe to fail.  Still, it
is better to keep probing the remaining devices.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
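Note (not part of the patch): a minimal sketch of the "record the failure
but keep going" loop that this change moves to.  The names
probe_all_sketch, probe_fn, probe_even_only and probe_calls are
hypothetical; the real code walks the PCI device list and calls
pci_probe_all_drivers(), which is not modelled here.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-device probe: returns < 0 on failure. */
typedef int (*probe_fn)(int dev_id);

/*
 * Probe every device, logging failures instead of stopping at the
 * first one.  Returns 0 if all probes succeeded, -1 otherwise.
 */
int probe_all_sketch(const int *dev_ids, size_t n, probe_fn probe)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (probe(dev_ids[i]) < 0) {
			fprintf(stderr, "device %d cannot be used\n",
				dev_ids[i]);
			ret = -1;	/* remember the failure, keep going */
		}
	}

	return ret;
}

/* Example probe that rejects odd ids and counts how often it ran. */
int probe_calls;

int probe_even_only(int dev_id)
{
	probe_calls++;
	return (dev_id % 2 == 0) ? 0 : -1;
}
```

Keeping the per-device result separate from the accumulated result is
what prevents a later successful probe from masking an earlier failure.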
 lib/librte_eal/common/eal_common_pci.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index 72547bd..9416190 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -69,6 +69,7 @@
 #include <sys/queue.h>
 #include <sys/mman.h>
 
+#include <rte_errno.h>
 #include <rte_interrupts.h>
 #include <rte_log.h>
 #include <rte_pci.h>
@@ -416,6 +417,7 @@ rte_eal_pci_probe(void)
 	struct rte_pci_device *dev = NULL;
 	struct rte_devargs *devargs;
 	int probe_all = 0;
+	int ret_1 = 0;
 	int ret = 0;
 
 	if (rte_eal_devargs_type_count(RTE_DEVTYPE_WHITELISTED_PCI) == 0)
@@ -430,17 +432,20 @@ rte_eal_pci_probe(void)
 
 		/* probe all or only whitelisted devices */
 		if (probe_all)
-			ret = pci_probe_all_drivers(dev);
+			ret_1 = pci_probe_all_drivers(dev);
 		else if (devargs != NULL &&
 			devargs->type == RTE_DEVTYPE_WHITELISTED_PCI)
-			ret = pci_probe_all_drivers(dev);
-		if (ret < 0)
-			rte_exit(EXIT_FAILURE, "Requested device " PCI_PRI_FMT
+			ret_1 = pci_probe_all_drivers(dev);
+		if (ret_1 < 0) {
+			RTE_LOG(ERR, EAL, "Requested device " PCI_PRI_FMT
 				 " cannot be used\n", dev->addr.domain, dev->addr.bus,
 				 dev->addr.devid, dev->addr.function);
+			rte_errno = errno;
+			ret = 1;
+		}
 	}
 
-	return 0;
+	return -ret;
 }
 
 /* dump one device */
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v5 21/26] eal: do not panic on failed PCI probe
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (19 preceding siblings ...)
  2017-02-27 16:18       ` [PATCH v5 20/26] eal_pci: Continue probing even on failures Aaron Conole
@ 2017-02-27 16:18       ` Aaron Conole
  2017-02-27 16:18       ` [PATCH v5 22/26] eal_common_dev: continue initializing vdevs Aaron Conole
                         ` (6 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:18 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

It may even be possible to simply log the error and continue, letting
the user check the logs and restart the application when something has
failed.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 291fa53..fbc1dcc 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -943,8 +943,11 @@ rte_eal_init(int argc, char **argv)
 		rte_panic("Cannot probe devices\n");
 
 	/* Probe & Initialize PCI devices */
-	if (rte_eal_pci_probe())
-		rte_panic("Cannot probe PCI\n");
+	if (rte_eal_pci_probe()) {
+		RTE_LOG(ERR, EAL, "Cannot probe PCI\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	if (rte_eal_dev_init() < 0)
 		rte_panic("Cannot init pmd devices\n");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v5 22/26] eal_common_dev: continue initializing vdevs
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (20 preceding siblings ...)
  2017-02-27 16:18       ` [PATCH v5 21/26] eal: do not panic on failed PCI probe Aaron Conole
@ 2017-02-27 16:18       ` Aaron Conole
  2017-02-27 16:18       ` [PATCH v5 23/26] eal: do not panic (or abort) if vdev init fails Aaron Conole
                         ` (5 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:18 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Even if one vdev should fail, there's no need to prevent further
processing.  Log the error, and reflect it to the higher levels to
decide.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/common/eal_common_dev.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
index 4f3b493..9889997 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -80,6 +80,7 @@ int
 rte_eal_dev_init(void)
 {
 	struct rte_devargs *devargs;
+	int ret = 0;
 
 	/*
 	 * Note that the dev_driver_list is populated here
@@ -97,11 +98,11 @@ rte_eal_dev_init(void)
 					devargs->args)) {
 			RTE_LOG(ERR, EAL, "failed to initialize %s device\n",
 					devargs->virt.drv_name);
-			return -1;
+			ret = -1;
 		}
 	}
 
-	return 0;
+	return ret;
 }
 
 int rte_eal_dev_attach(const char *name, const char *devargs)
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v5 23/26] eal: do not panic (or abort) if vdev init fails
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (21 preceding siblings ...)
  2017-02-27 16:18       ` [PATCH v5 22/26] eal_common_dev: continue initializing vdevs Aaron Conole
@ 2017-02-27 16:18       ` Aaron Conole
  2017-02-27 16:18       ` [PATCH v5 24/26] eal: do not panic when bus probe fails Aaron Conole
                         ` (4 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:18 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Seems like it's possible to continue.  At least, the error is reflected
properly in the logs.  A user could then go and correct or investigate
the situation.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index fbc1dcc..77a1950 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -950,7 +950,7 @@ rte_eal_init(int argc, char **argv)
 	}
 
 	if (rte_eal_dev_init() < 0)
-		rte_panic("Cannot init pmd devices\n");
+		RTE_LOG(ERR, EAL, "Cannot init pmd devices\n");
 
 	rte_eal_mcfg_complete();
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v5 24/26] eal: do not panic when bus probe fails
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (22 preceding siblings ...)
  2017-02-27 16:18       ` [PATCH v5 23/26] eal: do not panic (or abort) if vdev init fails Aaron Conole
@ 2017-02-27 16:18       ` Aaron Conole
  2017-02-27 16:18       ` [PATCH v5 25/26] eal: do not panic on failed bus scan Aaron Conole
                         ` (3 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:18 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 77a1950..361256f 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -939,8 +939,11 @@ rte_eal_init(int argc, char **argv)
 	rte_eal_mp_wait_lcore();
 
 	/* Probe all the buses and devices/drivers on them */
-	if (rte_bus_probe())
-		rte_panic("Cannot probe devices\n");
+	if (rte_bus_probe()) {
+		RTE_LOG(ERR, EAL, "Cannot probe devices\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	/* Probe & Initialize PCI devices */
 	if (rte_eal_pci_probe()) {
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v5 25/26] eal: do not panic on failed bus scan
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (23 preceding siblings ...)
  2017-02-27 16:18       ` [PATCH v5 24/26] eal: do not panic when bus probe fails Aaron Conole
@ 2017-02-27 16:18       ` Aaron Conole
  2017-02-27 16:18       ` [PATCH v5 26/26] rte_eal_init: add info about rte_errno codes Aaron Conole
                         ` (2 subsequent siblings)
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:18 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

For now, do an abort.  It's likely that even aborting the initialization
is premature in this case, as it may be possible to proceed even if one
bus or another is not available.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 361256f..77f0d24 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -899,8 +899,11 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_bus_scan())
-		rte_panic("Cannot scan the buses for devices\n");
+	if (rte_bus_scan()) {
+		RTE_LOG(ERR, EAL, "Cannot scan the buses for devices\n");
+		rte_errno = ENODEV;
+		return -1;
+	}
 
 	RTE_LCORE_FOREACH_SLAVE(i) {
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v5 26/26] rte_eal_init: add info about rte_errno codes
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (24 preceding siblings ...)
  2017-02-27 16:18       ` [PATCH v5 25/26] eal: do not panic on failed bus scan Aaron Conole
@ 2017-02-27 16:18       ` Aaron Conole
  2017-02-28 14:45       ` [PATCH v5 00/26] linux/eal: Remove most causes of panic on init Bruce Richardson
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
  27 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-27 16:18 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

The rte_eal_init function will now pass failure reason hints to the
application.  To help app developers decipher this, add some brief
information about what the codes are indicating.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
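Note (not part of the patch): a minimal sketch of how an application
might act on the codes documented below.  The enum and the function
init_action_for are hypothetical, and the policy is only one possible
choice: it retries solely on EAGAIN, which is the one code documented as
"setup may be attempted again", and conservatively treats the rest as
fatal.

```c
#include <errno.h>

enum init_action {
	INIT_ACTION_RETRY,	/* try rte_eal_init() again later */
	INIT_ACTION_IGNORE,	/* EAL was already initialized */
	INIT_ACTION_FAIL	/* report the error and give up */
};

/* Hypothetical application policy keyed on the rte_errno value. */
enum init_action init_action_for(int err)
{
	switch (err) {
	case EAGAIN:
		/* Documented as "setup may be attempted again". */
		return INIT_ACTION_RETRY;
	case EALREADY:
		return INIT_ACTION_IGNORE;
	default:
		/* EACCES, EFAULT, EINVAL, ENOMEM, ENODEV, ENOTSUP,
		 * EUNATCH: conservatively treated as fatal here. */
		return INIT_ACTION_FAIL;
	}
}
```

An application would pass rte_errno to such a helper after
rte_eal_init() returns -1, and decide from the result whether to retry,
carry on, or exit.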
 lib/librte_eal/common/include/rte_eal.h | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index 03fee50..9251244 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -159,7 +159,32 @@ int rte_eal_iopl_init(void);
  *     function call and should not be further interpreted by the
  *     application.  The EAL does not take any ownership of the memory used
  *     for either the argv array, or its members.
- *   - On failure, a negative error value.
+ *   - On failure, -1 and rte_errno is set to a value indicating the cause
+ *     for failure.  In some instances, the application will need to be
+ *     restarted as part of clearing the issue.
+ *
+ *   Error codes returned via rte_errno:
+ *     EACCES indicates a permissions issue.
+ *
+ *     EAGAIN indicates either a bus or system resource was not available,
+ *            setup may be attempted again.
+ *
+ *     EALREADY indicates that the rte_eal_init function has already been
+ *              called, and cannot be called again.
+ *
+ *     EFAULT indicates the tailq configuration name was not found in
+ *            memory configuration.
+ *
+ *     EINVAL indicates invalid parameters were passed as argv/argc.
+ *
+ *     ENOMEM indicates failure likely caused by an out-of-memory condition.
+ *
+ *     ENODEV indicates memory setup issues.
+ *
+ *     ENOTSUP indicates that the EAL cannot initialize on this system.
+ *
+ *     EUNATCH indicates that the PCI bus is either not present, or is not
+ *             readable by the eal.
  */
 int rte_eal_init(int argc, char **argv);
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* Re: [PATCH v5 03/26] eal: No panic on hugepages info init
  2017-02-27 16:17       ` [PATCH v5 03/26] eal: No panic on hugepages info init Aaron Conole
@ 2017-02-28 14:25         ` Bruce Richardson
  2017-02-28 14:48           ` Aaron Conole
  0 siblings, 1 reply; 159+ messages in thread
From: Bruce Richardson @ 2017-02-28 14:25 UTC (permalink / raw)
  To: Aaron Conole; +Cc: dev, Stephen Hemminger

On Mon, Feb 27, 2017 at 11:17:48AM -0500, Aaron Conole wrote:
> When attempting to scan hugepages, signal to the eal.c that an error has
> occurred, rather than performing a panic.
> 
> Signed-off-by: Aaron Conole <aconole@redhat.com>
> ---
>  lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
> index 18858e2..4d47eaf 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
> @@ -283,9 +283,11 @@ eal_hugepage_info_init(void)
>  	struct dirent *dirent;
>  
>  	dir = opendir(sys_dir_path);
> -	if (dir == NULL)
> -		rte_panic("Cannot open directory %s to read system hugepage "
> +	if (dir == NULL) {
> +		RTE_LOG(ERR, EAL, "Cannot open directory %s to read system hugepage "
>  			  "info\n", sys_dir_path);
> +		return -1;
> +	}

Minor nit.
The error message should go on a line on its own, without any breaks to
make it easy to "grep". This should also eliminate the checkpatch
complaint about it being too long.

/Bruce

^ permalink raw reply	[flat|nested] 159+ messages in thread

* Re: [PATCH v5 08/26] eal: do not panic on memzone initialization fails
  2017-02-27 16:17       ` [PATCH v5 08/26] eal: do not panic on memzone initialization fails Aaron Conole
@ 2017-02-28 14:27         ` Bruce Richardson
  2017-02-28 14:46           ` Aaron Conole
  0 siblings, 1 reply; 159+ messages in thread
From: Bruce Richardson @ 2017-02-28 14:27 UTC (permalink / raw)
  To: Aaron Conole; +Cc: dev, Stephen Hemminger

On Mon, Feb 27, 2017 at 11:17:53AM -0500, Aaron Conole wrote:
> When memzone initialization fails, report the error to the calling
> application rather than panic().  Without a good way of detaching /
> releasing hugepages, at this point the application will have to restart.
> 
> Signed-off-by: Aaron Conole <aconole@redhat.com>
> ---
>  lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
> index a671ed4..1e54ca1 100644
> --- a/lib/librte_eal/linuxapp/eal/eal.c
> +++ b/lib/librte_eal/linuxapp/eal/eal.c
> @@ -839,8 +839,11 @@ rte_eal_init(int argc, char **argv)
>  	/* the directories are locked during eal_hugepage_info_init */
>  	eal_hugedirs_unlock();
>  
> -	if (rte_eal_memzone_init() < 0)
> -		rte_panic("Cannot init memzone\n");
> +	if (rte_eal_memzone_init() < 0) {
> +		RTE_LOG(ERR, EAL, "Cannot init memzone\n");

Any particular reason why not "rte_eal_init_alert" as with the other
cases?

/Bruce

* Re: [PATCH v5 00/26] linux/eal: Remove most causes of panic on init
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (25 preceding siblings ...)
  2017-02-27 16:18       ` [PATCH v5 26/26] rte_eal_init: add info about rte_errno codes Aaron Conole
@ 2017-02-28 14:45       ` Bruce Richardson
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
  27 siblings, 0 replies; 159+ messages in thread
From: Bruce Richardson @ 2017-02-28 14:45 UTC (permalink / raw)
  To: Aaron Conole; +Cc: dev, Stephen Hemminger

On Mon, Feb 27, 2017 at 11:17:45AM -0500, Aaron Conole wrote:
> In many cases, it's enough to simply let the application know that the
> call to initialize DPDK has failed.  A complete halt can then be
> decided by the application based on error returned (and the app could
> even attempt a possible re-attempt after some corrective action by the
> user or application).
> 
Set looks pretty good. Just a couple of minor issues, apart from the one
checkpatch issue I flagged:

* seems to be some inconsistency between using the "rte_eal_init_alert"
  function and RTE_LOG. If there is some reason why some cases use
  the function and others don't I think that might need to be called out
  somewhere.
* check-git-log flags a number of minor errors with commit titles and
  messages - most common is commit titles starting with a capital
  letter.

Otherwise:
Series-Acked-by: Bruce Richardson <bruce.richardson@intel.com>

* Re: [PATCH v5 08/26] eal: do not panic on memzone initialization fails
  2017-02-28 14:27         ` Bruce Richardson
@ 2017-02-28 14:46           ` Aaron Conole
  0 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 14:46 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev, Stephen Hemminger

Bruce Richardson <bruce.richardson@intel.com> writes:

> On Mon, Feb 27, 2017 at 11:17:53AM -0500, Aaron Conole wrote:
>> When memzone initialization fails, report the error to the calling
>> application rather than panic().  Without a good way of detaching /
>> releasing hugepages, at this point the application will have to restart.
>> 
>> Signed-off-by: Aaron Conole <aconole@redhat.com>
>> ---
>>  lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>> 
>> diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
>> index a671ed4..1e54ca1 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal.c
>> @@ -839,8 +839,11 @@ rte_eal_init(int argc, char **argv)
>>  	/* the directories are locked during eal_hugepage_info_init */
>>  	eal_hugedirs_unlock();
>>  
>> -	if (rte_eal_memzone_init() < 0)
>> -		rte_panic("Cannot init memzone\n");
>> +	if (rte_eal_memzone_init() < 0) {
>> +		RTE_LOG(ERR, EAL, "Cannot init memzone\n");
>
> Any particular reason why not "rte_eal_init_alert" as with the other
> cases?

I only used rte_eal_init_alert() for cases which occur before logging
happens, but I am not opposed to switching it for all the cases.  I'll
swap them all for v6.

Thanks again for the review, Bruce!

-Aaron

* Re: [PATCH v5 03/26] eal: No panic on hugepages info init
  2017-02-28 14:25         ` Bruce Richardson
@ 2017-02-28 14:48           ` Aaron Conole
  0 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 14:48 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev, Stephen Hemminger

Bruce Richardson <bruce.richardson@intel.com> writes:

> On Mon, Feb 27, 2017 at 11:17:48AM -0500, Aaron Conole wrote:
>> When attempting to scan hugepages, signal to the eal.c that an error has
>> occurred, rather than performing a panic.
>> 
>> Signed-off-by: Aaron Conole <aconole@redhat.com>
>> ---
>>  lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 6 ++++--
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>> 
>> diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
>> index 18858e2..4d47eaf 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
>> @@ -283,9 +283,11 @@ eal_hugepage_info_init(void)
>>  	struct dirent *dirent;
>>  
>>  	dir = opendir(sys_dir_path);
>> -	if (dir == NULL)
>> -		rte_panic("Cannot open directory %s to read system hugepage "
>> +	if (dir == NULL) {
>> +		RTE_LOG(ERR, EAL, "Cannot open directory %s to read system hugepage "
>>  			  "info\n", sys_dir_path);
>> +		return -1;
>> +	}
>
> Minor nit.
> The error message should go on a line on its own, without any breaks to
> make it easy to "grep". This should also eliminate the checkpatch
> complaint about it being too long.

Yes, will fix it for v6.

-Aaron

* [PATCH v6 00/26] linux/eal: Remove most causes of panic on init
  2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
                         ` (26 preceding siblings ...)
  2017-02-28 14:45       ` [PATCH v5 00/26] linux/eal: Remove most causes of panic on init Bruce Richardson
@ 2017-02-28 18:52       ` Aaron Conole
  2017-02-28 18:52         ` [PATCH v6 01/26] eal: cpu init will no longer panic Aaron Conole
                           ` (26 more replies)
  27 siblings, 27 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:52 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

From: Aaron Conole <aconole@bytheb.org>

In many cases, it's enough to simply let the application know that the
call to initialize DPDK has failed.  The application can then decide
whether to halt completely, based on the error returned (and it could
even retry initialization after some corrective action by the user or
application).

Changes ->v2:
- Audited all "RTE_LOG (" calls that were introduced, and converted
  to "RTE_LOG("
- Added some fprintf(stderr, "") lines to indicate errors before logging
  is initialized
- Removed assignments to errno.
- Changed patch 14/25 to reflect EFAULT, and document in 25/25

Changes ->v3:
- Checkpatch issues in patches 3 (spelling mistake), 9 (issue with leading
  spaces), and 19 (braces around single line statement if-condition)

Changes ->v4:
- Error text cleanup.
- Add a new check around rte_bus_scan(), added during the development of
  this series.

Changes ->v5:
- checkpatch.pl cleanup in patch 02/26
- move rte_errno.h include from patch 15 to patch 02
- remove stdbool.h and use int as return type in patch 06/26

Changes ->v6:
- convert all of the RTE_LOG() calls made during initialization to
  rte_eal_init_alert()
- run through check-git-log and checkpatches
- add Bruce's ack to the series

I kept the rte_errno reflection, since this is control-path code and the
init function returns a sentinel value of -1.

Aaron Conole (26):
  eal: cpu init will no longer panic
  eal: return error instead of panic for cpu init
  eal: do not panic on hugepage info init
  eal: do not panic on failed hugepage query
  eal: do not panic if parsing args returns error
  eal-common: introduce a way to query cpu support
  eal: do not panic when CPU isn't supported
  eal: do not panic on memzone initialization fails
  eal: set errno when exiting for already called
  eal: do not panic on log failures
  eal: do not panic on PCI-probe
  eal: do not panic on vfio failure
  eal: do not panic on memory init
  eal: do not panic on tailq init
  eal: do not panic on alarm init
  eal: convert timer init not to call panic
  eal: change the private pipe call to reflect errno
  eal: do not panic on interrupt thread init
  eal: do not error if plugins fail to init
  eal_pci: continue probing even on failures
  eal: do not panic on failed PCI-probe
  eal_common_dev: continue initializing vdevs
  eal: do not panic (or abort) if vdev init fails
  eal: do not panic when bus probe fails
  eal: do not panic on failed bus scan
  rte_eal_init: add info about various error codes

 lib/librte_eal/common/eal_common_cpuflags.c        |  13 +-
 lib/librte_eal/common/eal_common_dev.c             |   5 +-
 lib/librte_eal/common/eal_common_lcore.c           |   7 +-
 lib/librte_eal/common/eal_common_pci.c             |  15 ++-
 lib/librte_eal/common/eal_common_tailqs.c          |   3 +-
 .../common/include/generic/rte_cpuflags.h          |   8 ++
 lib/librte_eal/common/include/rte_eal.h            |  27 ++++-
 lib/librte_eal/linuxapp/eal/eal.c                  | 131 +++++++++++++++------
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c    |   9 +-
 lib/librte_eal/linuxapp/eal/eal_interrupts.c       |   5 +-
 10 files changed, 170 insertions(+), 53 deletions(-)

-- 
2.9.3
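
For readers following the series from the application side, here is a
rough sketch of how a caller might consume an init function that
returns -1 and sets an error code instead of panicking.  It is not code
from the series: fake_eal_init(), fake_rte_errno and struct fake_eal
are hypothetical stand-ins for rte_eal_init() and rte_errno so that the
sketch builds without DPDK, and the "do not retry on ENOTSUP or
EALREADY" policy is an assumption made for illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the EAL state, so the sketch builds
 * without DPDK. */
struct fake_eal {
	int fail_count;	/* number of calls that should still fail */
	int fail_errno;	/* error code reported by a failing call */
	int calls;	/* how many times init was attempted */
};

/* Hypothetical stand-in for rte_errno. */
static int fake_rte_errno;

/* Hypothetical stand-in for rte_eal_init(): -1 on failure. */
static int
fake_eal_init(struct fake_eal *eal)
{
	eal->calls++;
	if (eal->fail_count > 0) {
		eal->fail_count--;
		fake_rte_errno = eal->fail_errno;
		return -1;
	}
	return 0;
}

enum datapath { DATAPATH_DPDK, DATAPATH_FALLBACK };

/* Try to bring up the fast path; fall back instead of aborting. */
static enum datapath
choose_datapath(struct fake_eal *eal, int max_attempts)
{
	int attempt;

	for (attempt = 1; attempt <= max_attempts; attempt++) {
		if (fake_eal_init(eal) >= 0)
			return DATAPATH_DPDK;

		fprintf(stderr, "init attempt %d failed: %s\n",
			attempt, strerror(fake_rte_errno));

		/* Assumed policy: a retry will not change these two. */
		if (fake_rte_errno == ENOTSUP ||
		    fake_rte_errno == EALREADY)
			break;
	}
	return DATAPATH_FALLBACK;
}
```

A real application would perform its corrective action (fixing
permissions, prompting for new arguments, and so on) between attempts
rather than retrying blindly.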

* [PATCH v6 01/26] eal: cpu init will no longer panic
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
@ 2017-02-28 18:52         ` Aaron Conole
  2017-02-28 18:52         ` [PATCH v6 02/26] eal: return error instead of panic for cpu init Aaron Conole
                           ` (25 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:52 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

After this change, the EAL CPU NUMA node resolution step can no longer
emit an rte_panic.  This aligns with the code in rte_eal_init, which
expects failures to return an error code.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/common/eal_common_lcore.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_lcore.c b/lib/librte_eal/common/eal_common_lcore.c
index 2cd4132..84fa0cb 100644
--- a/lib/librte_eal/common/eal_common_lcore.c
+++ b/lib/librte_eal/common/eal_common_lcore.c
@@ -83,16 +83,17 @@ rte_eal_cpu_init(void)
 		config->lcore_role[lcore_id] = ROLE_RTE;
 		lcore_config[lcore_id].core_id = eal_cpu_core_id(lcore_id);
 		lcore_config[lcore_id].socket_id = eal_cpu_socket_id(lcore_id);
-		if (lcore_config[lcore_id].socket_id >= RTE_MAX_NUMA_NODES)
+		if (lcore_config[lcore_id].socket_id >= RTE_MAX_NUMA_NODES) {
 #ifdef RTE_EAL_ALLOW_INV_SOCKET_ID
 			lcore_config[lcore_id].socket_id = 0;
 #else
-			rte_panic("Socket ID (%u) is greater than "
+			RTE_LOG(ERR, EAL, "Socket ID (%u) is greater than "
 				"RTE_MAX_NUMA_NODES (%d)\n",
 				lcore_config[lcore_id].socket_id,
 				RTE_MAX_NUMA_NODES);
+			return -1;
 #endif
-
+		}
 		RTE_LOG(DEBUG, EAL, "Detected lcore %u as "
 				"core %u on socket %u\n",
 				lcore_id, lcore_config[lcore_id].core_id,
-- 
2.9.3

* [PATCH v6 02/26] eal: return error instead of panic for cpu init
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
  2017-02-28 18:52         ` [PATCH v6 01/26] eal: cpu init will no longer panic Aaron Conole
@ 2017-02-28 18:52         ` Aaron Conole
  2017-02-28 18:52         ` [PATCH v6 03/26] eal: do not panic on hugepage info init Aaron Conole
                           ` (24 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:52 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

There may be no way to gracefully recover, but the application
should be notified that a failure happened, rather than completely
aborting.  This allows the user to proceed with a "slow-path" type
solution.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index bf6b818..81692e7 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -61,6 +61,7 @@
 #include <rte_launch.h>
 #include <rte_eal.h>
 #include <rte_eal_memconfig.h>
+#include <rte_errno.h>
 #include <rte_per_lcore.h>
 #include <rte_lcore.h>
 #include <rte_log.h>
@@ -740,6 +741,12 @@ static int rte_eal_vfio_setup(void)
 }
 #endif
 
+static void rte_eal_init_alert(const char *msg)
+{
+	fprintf(stderr, "EAL: FATAL: %s\n", msg);
+	RTE_LOG(ERR, EAL, "%s\n", msg);
+}
+
 /* Launch threads, called at application init(). */
 int
 rte_eal_init(int argc, char **argv)
@@ -767,8 +774,11 @@ rte_eal_init(int argc, char **argv)
 	/* set log level as early as possible */
 	rte_set_log_level(internal_config.log_level);
 
-	if (rte_eal_cpu_init() < 0)
-		rte_panic("Cannot detect lcores\n");
+	if (rte_eal_cpu_init() < 0) {
+		rte_eal_init_alert("Cannot detect lcores.");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	fctret = eal_parse_args(argc, argv);
 	if (fctret < 0)
-- 
2.9.3
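
The helper added in this patch takes the bare message and appends the
newline itself, so callers pass the text without a trailing "\n".  A
small standalone sketch of that shape follows.  The names
format_init_alert() and emit_init_alert() are hypothetical and only
mirror the stderr half of rte_eal_init_alert(); the RTE_LOG() half is
left out so that the sketch builds without DPDK.

```c
#include <stdio.h>

/* Hypothetical helper: the caller passes the bare message and the
 * helper adds the prefix and the trailing newline.  Returns the
 * snprintf() result. */
static int
format_init_alert(char *buf, size_t len, const char *msg)
{
	return snprintf(buf, len, "EAL: FATAL: %s\n", msg);
}

/* Emit the alert on a stream that works before logging is set up. */
static void
emit_init_alert(FILE *stream, const char *msg)
{
	char line[256];

	format_init_alert(line, sizeof(line), msg);
	fputs(line, stream);
}
```

Calling emit_init_alert(stderr, "Cannot detect lcores.") writes one
line; passing a message that already ends in "\n" would produce an
extra blank line.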

* [PATCH v6 03/26] eal: do not panic on hugepage info init
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
  2017-02-28 18:52         ` [PATCH v6 01/26] eal: cpu init will no longer panic Aaron Conole
  2017-02-28 18:52         ` [PATCH v6 02/26] eal: return error instead of panic for cpu init Aaron Conole
@ 2017-02-28 18:52         ` Aaron Conole
  2017-02-28 18:52         ` [PATCH v6 04/26] eal: do not panic on failed hugepage query Aaron Conole
                           ` (23 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:52 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

When attempting to scan hugepages, signal to the eal that an error has
occurred, rather than performing a panic.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index 18858e2..7a21e8f 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -283,9 +283,12 @@ eal_hugepage_info_init(void)
 	struct dirent *dirent;
 
 	dir = opendir(sys_dir_path);
-	if (dir == NULL)
-		rte_panic("Cannot open directory %s to read system hugepage "
-			  "info\n", sys_dir_path);
+	if (dir == NULL) {
+		RTE_LOG(ERR, EAL,
+			"Cannot open directory %s to read system hugepage info\n",
+			sys_dir_path);
+		return -1;
+	}
 
 	for (dirent = readdir(dir); dirent != NULL; dirent = readdir(dir)) {
 		struct hugepage_info *hpi;
-- 
2.9.3
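
The pattern in this patch (log, then return -1 when opendir() fails)
can be tried in isolation with a sketch like the one here.  It is only
an illustration: count_prefixed_entries() is a hypothetical function,
not part of the EAL, and it uses fprintf() in place of RTE_LOG() so
that it builds without DPDK.

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Count the entries in `path` whose names start with `prefix`.
 * Returns -1 if the directory cannot be opened, instead of
 * aborting the process. */
static int
count_prefixed_entries(const char *path, const char *prefix)
{
	struct dirent *dirent;
	size_t prefix_len = strlen(prefix);
	int count = 0;
	DIR *dir;

	dir = opendir(path);
	if (dir == NULL) {
		/* keep the format string on one line so it can be grepped */
		fprintf(stderr, "Cannot open directory %s: %s\n", path, strerror(errno));
		return -1;
	}

	for (dirent = readdir(dir); dirent != NULL; dirent = readdir(dir)) {
		if (strncmp(dirent->d_name, prefix, prefix_len) == 0)
			count++;
	}

	closedir(dir);
	return count;
}
```

The caller then decides what a negative return means, which is the
point of the change: the library reports, the application chooses.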

* [PATCH v6 04/26] eal: do not panic on failed hugepage query
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
                           ` (2 preceding siblings ...)
  2017-02-28 18:52         ` [PATCH v6 03/26] eal: do not panic on hugepage info init Aaron Conole
@ 2017-02-28 18:52         ` Aaron Conole
  2017-02-28 18:52         ` [PATCH v6 05/26] eal: do not panic if parsing args returns error Aaron Conole
                           ` (22 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:52 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

If we fail to acquire hugepage information, simply signal an error to
the application.  This clears the run_once counter, allowing the user or
application to take a corrective action and retry.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 81692e7..12bd941 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -787,8 +787,12 @@ rte_eal_init(int argc, char **argv)
 	if (internal_config.no_hugetlbfs == 0 &&
 			internal_config.process_type != RTE_PROC_SECONDARY &&
 			internal_config.xen_dom0_support == 0 &&
-			eal_hugepage_info_init() < 0)
-		rte_panic("Cannot get hugepage information\n");
+			eal_hugepage_info_init() < 0) {
+		rte_eal_init_alert("Cannot get hugepage information.");
+		rte_errno = EACCES;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 
 	if (internal_config.memory == 0 && internal_config.force_sockets == 0) {
 		if (internal_config.no_hugetlbfs)
-- 
2.9.3
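
The "clear run_once on a recoverable failure" idea in this patch can be
sketched with C11 atomics.  This is an illustration under assumptions,
not the EAL code: init_once(), step_fail(), step_ok() and sketch_errno
are hypothetical names, and atomic_flag replaces rte_atomic32_t so
that the sketch builds without DPDK.

```c
#include <errno.h>
#include <stdatomic.h>

static atomic_flag run_once = ATOMIC_FLAG_INIT;
static int sketch_errno;	/* hypothetical stand-in for rte_errno */

/* Two toy init steps: one fails recoverably, one succeeds. */
static int step_fail(void) { return -1; }
static int step_ok(void) { return 0; }

static int
init_once(int (*step)(void))
{
	/* atomic_flag_test_and_set() returns the previous value, so
	 * "true" means init already ran.  rte_atomic32_test_and_set()
	 * has the opposite sense: it returns non-zero on success. */
	if (atomic_flag_test_and_set(&run_once)) {
		sketch_errno = EALREADY;
		return -1;
	}

	if (step() < 0) {
		sketch_errno = EINVAL;
		/* recoverable: allow the caller to try again later */
		atomic_flag_clear(&run_once);
		return -1;
	}

	return 0;
}
```

A failing step leaves the guard clear, so a later call can succeed;
once a call has succeeded, further calls report EALREADY.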

* [PATCH v6 05/26] eal: do not panic if parsing args returns error
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
                           ` (3 preceding siblings ...)
  2017-02-28 18:52         ` [PATCH v6 04/26] eal: do not panic on failed hugepage query Aaron Conole
@ 2017-02-28 18:52         ` Aaron Conole
  2017-02-28 18:52         ` [PATCH v6 06/26] eal-common: introduce a way to query cpu support Aaron Conole
                           ` (21 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:52 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

It's possible that the application could take a corrective action here,
and either prompt the user for different arguments, or at least perform
better logging.  Exiting this early prevents the application layer from
gathering any useful information.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 12bd941..f7511ab 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -781,8 +781,12 @@ rte_eal_init(int argc, char **argv)
 	}
 
 	fctret = eal_parse_args(argc, argv);
-	if (fctret < 0)
-		exit(1);
+	if (fctret < 0) {
+		rte_eal_init_alert("Invalid 'command line' arguments.");
+		rte_errno = EINVAL;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 
 	if (internal_config.no_hugetlbfs == 0 &&
 			internal_config.process_type != RTE_PROC_SECONDARY &&
-- 
2.9.3

* [PATCH v6 06/26] eal-common: introduce a way to query cpu support
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
                           ` (4 preceding siblings ...)
  2017-02-28 18:52         ` [PATCH v6 05/26] eal: do not panic if parsing args returns error Aaron Conole
@ 2017-02-28 18:52         ` Aaron Conole
  2017-03-08 21:45           ` Thomas Monjalon
  2017-02-28 18:52         ` [PATCH v6 07/26] eal: do not panic when CPU isn't supported Aaron Conole
                           ` (20 subsequent siblings)
  26 siblings, 1 reply; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:52 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

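This adds a new API, rte_cpu_is_supported(), which reports whether the
running CPU supports the features selected at compile time.  Unlike
rte_cpu_check_supported(), it returns a result instead of calling
exit().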

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/common/eal_common_cpuflags.c          | 13 +++++++++++--
 lib/librte_eal/common/include/generic/rte_cpuflags.h |  8 ++++++++
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_cpuflags.c b/lib/librte_eal/common/eal_common_cpuflags.c
index b5f76f7..9a2d080 100644
--- a/lib/librte_eal/common/eal_common_cpuflags.c
+++ b/lib/librte_eal/common/eal_common_cpuflags.c
@@ -43,6 +43,13 @@
 void
 rte_cpu_check_supported(void)
 {
+	if (!rte_cpu_is_supported())
+		exit(1);
+}
+
+int
+rte_cpu_is_supported(void)
+{
 	/* This is generated at compile-time by the build system */
 	static const enum rte_cpu_flag_t compile_time_flags[] = {
 			RTE_COMPILE_TIME_CPUFLAGS
@@ -57,14 +64,16 @@ rte_cpu_check_supported(void)
 			fprintf(stderr,
 				"ERROR: CPU feature flag lookup failed with error %d\n",
 				ret);
-			exit(1);
+			return 0;
 		}
 		if (!ret) {
 			fprintf(stderr,
 			        "ERROR: This system does not support \"%s\".\n"
 			        "Please check that RTE_MACHINE is set correctly.\n",
 			        rte_cpu_get_flag_name(compile_time_flags[i]));
-			exit(1);
+			return 0;
 		}
 	}
+
+	return 1;
 }
diff --git a/lib/librte_eal/common/include/generic/rte_cpuflags.h b/lib/librte_eal/common/include/generic/rte_cpuflags.h
index 71321f3..8d27031 100644
--- a/lib/librte_eal/common/include/generic/rte_cpuflags.h
+++ b/lib/librte_eal/common/include/generic/rte_cpuflags.h
@@ -82,4 +82,12 @@ rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature);
 void
 rte_cpu_check_supported(void);
 
+/**
+ * This function checks that the currently used CPU supports the CPU features
+ * that were specified at compile time. It is called automatically within the
+ * EAL, so does not need to be used by applications.  This version returns a
+ * result so that decisions may be made (for instance, graceful shutdowns).
+ */
+int
+rte_cpu_is_supported(void);
 #endif /* _RTE_CPUFLAGS_H_ */
-- 
2.9.3
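
The split in this patch, a query function that returns a result plus a
thin wrapper that keeps the old exit-on-failure behaviour, can be
sketched standalone as follows.  All names here are hypothetical:
toy_probe() stands in for rte_cpu_get_flag_enabled() and the integer
flags stand in for enum rte_cpu_flag_t, so the sketch builds without
DPDK.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Probe callback: 1 if the flag is present, 0 if absent, negative
 * on lookup error. */
typedef int (*flag_probe_fn)(int flag);

/* Toy probe for the sketch: even flags are present, odd flags are
 * absent, negative flags are invalid. */
static int
toy_probe(int flag)
{
	if (flag < 0)
		return -1;
	return (flag % 2) == 0;
}

/* Query form: returns 1 if every required flag is present, else 0. */
static int
features_supported(const int *required, size_t n, flag_probe_fn probe)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int ret = probe(required[i]);

		if (ret < 0) {
			fprintf(stderr, "ERROR: flag lookup failed with error %d\n", ret);
			return 0;
		}
		if (!ret) {
			fprintf(stderr, "ERROR: flag %d is not supported\n", required[i]);
			return 0;
		}
	}

	return 1;
}

/* Legacy form: same check, but exits on failure. */
static void
features_check_or_exit(const int *required, size_t n, flag_probe_fn probe)
{
	if (!features_supported(required, n, probe))
		exit(1);
}
```

Keeping the exiting wrapper means existing callers see no change,
while new callers can use the query form and shut down gracefully.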

* [PATCH v6 07/26] eal: do not panic when CPU isn't supported
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
                           ` (5 preceding siblings ...)
  2017-02-28 18:52         ` [PATCH v6 06/26] eal-common: introduce a way to query cpu support Aaron Conole
@ 2017-02-28 18:52         ` Aaron Conole
  2017-02-28 18:52         ` [PATCH v6 08/26] eal: do not panic on memzone initialization fails Aaron Conole
                           ` (19 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:52 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

It's now possible to gracefully exit the application.  Applications
which support non-DPDK datapaths working in concert with DPDK datapaths
are no longer forced to exit on unsupported CPUs.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index f7511ab..a671ed4 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -759,7 +759,11 @@ rte_eal_init(int argc, char **argv)
 	char thread_name[RTE_MAX_THREAD_NAME_LEN];
 
 	/* checks if the machine is adequate */
-	rte_cpu_check_supported();
+	if (!rte_cpu_is_supported()) {
+		rte_eal_init_alert("unsupported cpu type.");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	if (!rte_atomic32_test_and_set(&run_once))
 		return -1;
-- 
2.9.3

* [PATCH v6 08/26] eal: do not panic on memzone initialization fails
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
                           ` (6 preceding siblings ...)
  2017-02-28 18:52         ` [PATCH v6 07/26] eal: do not panic when CPU isn't supported Aaron Conole
@ 2017-02-28 18:52         ` Aaron Conole
  2017-02-28 18:52         ` [PATCH v6 09/26] eal: set errno when exiting for already called Aaron Conole
                           ` (18 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:52 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

When memzone initialization fails, report the error to the calling
application rather than panic().  Without a good way of detaching /
releasing hugepages, at this point the application will have to restart.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index a671ed4..5a92b28 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -839,8 +839,11 @@ rte_eal_init(int argc, char **argv)
 	/* the directories are locked during eal_hugepage_info_init */
 	eal_hugedirs_unlock();
 
-	if (rte_eal_memzone_init() < 0)
-		rte_panic("Cannot init memzone\n");
+	if (rte_eal_memzone_init() < 0) {
+		rte_eal_init_alert("Cannot init memzone\n");
+		rte_errno = ENODEV;
+		return -1;
+	}
 
 	if (rte_eal_tailqs_init() < 0)
 		rte_panic("Cannot init tail queues for objects\n");
-- 
2.9.3

* [PATCH v6 09/26] eal: set errno when exiting for already called
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
                           ` (7 preceding siblings ...)
  2017-02-28 18:52         ` [PATCH v6 08/26] eal: do not panic on memzone initialization fails Aaron Conole
@ 2017-02-28 18:52         ` Aaron Conole
  2017-02-28 18:52         ` [PATCH v6 10/26] eal: do not panic on log failures Aaron Conole
                           ` (17 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:52 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 5a92b28..564cac3 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -765,8 +765,11 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (!rte_atomic32_test_and_set(&run_once))
+	if (!rte_atomic32_test_and_set(&run_once)) {
+		rte_eal_init_alert("already called initialization.");
+		rte_errno = EALREADY;
 		return -1;
+	}
 
 	logid = strrchr(argv[0], '/');
 	logid = strdup(logid ? logid + 1: argv[0]);
-- 
2.9.3

* [PATCH v6 10/26] eal: do not panic on log failures
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
                           ` (8 preceding siblings ...)
  2017-02-28 18:52         ` [PATCH v6 09/26] eal: set errno when exiting for already called Aaron Conole
@ 2017-02-28 18:52         ` Aaron Conole
  2017-02-28 18:53         ` [PATCH v6 11/26] eal: do not panic on PCI-probe Aaron Conole
                           ` (16 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:52 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

When log initialization fails, it's generally because the fopencookie()
call failed.  While this is rare in practice, it could happen, and the
most likely cause is memory pressure.  So, flag the error, and allow the
user to retry.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 564cac3..e1740a6 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -825,8 +825,12 @@ rte_eal_init(int argc, char **argv)
 
 	rte_config_init();
 
-	if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0)
-		rte_panic("Cannot init logs\n");
+	if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0) {
+		rte_eal_init_alert("Cannot init logging.");
+		rte_errno = ENOMEM;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 
 	if (rte_eal_pci_init() < 0)
 		rte_panic("Cannot init PCI\n");
-- 
2.9.3

* [PATCH v6 11/26] eal: do not panic on PCI-probe
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
                           ` (9 preceding siblings ...)
  2017-02-28 18:52         ` [PATCH v6 10/26] eal: do not panic on log failures Aaron Conole
@ 2017-02-28 18:53         ` Aaron Conole
  2017-02-28 18:53         ` [PATCH v6 12/26] eal: do not panic on vfio failure Aaron Conole
                           ` (15 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:53 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

This will usually be an issue because of permissions.  However, it could
also be caused by OOM.  In either case, errno will contain the
underlying cause.  It is safe to re-init the system here, so allow the
application to take corrective action and reinit.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index e1740a6..d5ef7b5 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -832,8 +832,12 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_eal_pci_init() < 0)
-		rte_panic("Cannot init PCI\n");
+	if (rte_eal_pci_init() < 0) {
+		rte_eal_init_alert("Cannot init PCI\n");
+		rte_errno = EUNATCH;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 
 #ifdef VFIO_PRESENT
 	if (rte_eal_vfio_setup() < 0)
-- 
2.9.3

* [PATCH v6 12/26] eal: do not panic on vfio failure
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
                           ` (10 preceding siblings ...)
  2017-02-28 18:53         ` [PATCH v6 11/26] eal: do not panic on PCI-probe Aaron Conole
@ 2017-02-28 18:53         ` Aaron Conole
  2017-02-28 18:53         ` [PATCH v6 13/26] eal: do not panic on memory init Aaron Conole
                           ` (14 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:53 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index d5ef7b5..10eefd3 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -840,8 +840,12 @@ rte_eal_init(int argc, char **argv)
 	}
 
 #ifdef VFIO_PRESENT
-	if (rte_eal_vfio_setup() < 0)
-		rte_panic("Cannot init VFIO\n");
+	if (rte_eal_vfio_setup() < 0) {
+		rte_eal_init_alert("Cannot init VFIO\n");
+		rte_errno = EAGAIN;
+		rte_atomic32_clear(&run_once);
+		return -1;
+	}
 #endif
 
 	if (rte_eal_memory_init() < 0)
-- 
2.9.3

* [PATCH v6 13/26] eal: do not panic on memory init
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
                           ` (11 preceding siblings ...)
  2017-02-28 18:53         ` [PATCH v6 12/26] eal: do not panic on vfio failure Aaron Conole
@ 2017-02-28 18:53         ` Aaron Conole
  2017-02-28 18:53         ` [PATCH v6 14/26] eal: do not panic on tailq init Aaron Conole
                           ` (13 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:53 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

This can only happen when access to hugepages (either as primary or
secondary process) fails (and that is usually permissions).  Since the
manner of failure is not reversible, we cannot allow retry.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 10eefd3..ae0beed 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -848,8 +848,11 @@ rte_eal_init(int argc, char **argv)
 	}
 #endif
 
-	if (rte_eal_memory_init() < 0)
-		rte_panic("Cannot init memory\n");
+	if (rte_eal_memory_init() < 0) {
+		rte_eal_init_alert("Cannot init memory\n");
+		rte_errno = ENOMEM;
+		return -1;
+	}
 
 	/* the directories are locked during eal_hugepage_info_init */
 	eal_hugedirs_unlock();
-- 
2.9.3
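
The difference between this patch and the VFIO one (12/26) is whether the
run-once guard is cleared before returning.  A minimal sketch of that
pattern follows, assuming a C11 atomic_flag in place of the EAL's
rte_atomic32 guard.  The names sketch_init, init_started, last_error,
stage_ok and stage_fail are hypothetical and exist only for illustration;
they are not DPDK symbols.

```c
#include <errno.h>
#include <stdatomic.h>

/* Hypothetical stand-in for an EAL init stage; not a DPDK type. */
typedef int (*init_stage_fn)(void);

static atomic_flag init_started = ATOMIC_FLAG_INIT;
static int last_error; /* plays the role of rte_errno in this sketch */

/* Trivial stages used to exercise the sketch. */
static int stage_ok(void) { return 0; }
static int stage_fail(void) { return -1; }

/*
 * Run one recoverable stage followed by one unrecoverable stage.
 * Returns 0 on success, -1 on failure with last_error set.
 */
static int
sketch_init(init_stage_fn recoverable, init_stage_fn unrecoverable)
{
	/* Only a caller that finds the guard clear may proceed. */
	if (atomic_flag_test_and_set(&init_started)) {
		last_error = EALREADY;
		return -1;
	}

	if (recoverable() < 0) {
		last_error = EAGAIN;
		/* Re-arm the guard: the caller may fix things and retry. */
		atomic_flag_clear(&init_started);
		return -1;
	}

	if (unrecoverable() < 0) {
		last_error = ENOMEM;
		/* Guard stays set: any later call reports EALREADY. */
		return -1;
	}

	return 0;
}
```

With this shape, a failure in the recoverable stage leaves the caller free
to call sketch_init() again, while a failure in the unrecoverable stage
makes every subsequent call fail with EALREADY.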

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v6 14/26] eal: do not panic on tailq init
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
                           ` (12 preceding siblings ...)
  2017-02-28 18:53         ` [PATCH v6 13/26] eal: do not panic on memory init Aaron Conole
@ 2017-02-28 18:53         ` Aaron Conole
  2017-02-28 18:53         ` [PATCH v6 15/26] eal: do not panic on alarm init Aaron Conole
                           ` (12 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:53 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

There are some theoretical race conditions in the system that _could_
cause early tailq init to fail; however, there is no need to panic the
application.  While it can't continue using DPDK, it could give the user
a better alert.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/common/eal_common_tailqs.c | 3 +--
 lib/librte_eal/linuxapp/eal/eal.c         | 7 +++++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_tailqs.c b/lib/librte_eal/common/eal_common_tailqs.c
index bb08ec8..4f69828 100644
--- a/lib/librte_eal/common/eal_common_tailqs.c
+++ b/lib/librte_eal/common/eal_common_tailqs.c
@@ -188,8 +188,7 @@ rte_eal_tailqs_init(void)
 		if (t->head == NULL) {
 			RTE_LOG(ERR, EAL,
 				"Cannot initialize tailq: %s\n", t->name);
-			/* no need to TAILQ_REMOVE, we are going to panic in
-			 * rte_eal_init() */
+			/* TAILQ_REMOVE not needed, error is already fatal */
 			goto fail;
 		}
 	}
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index ae0beed..aa10192 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -863,8 +863,11 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_eal_tailqs_init() < 0)
-		rte_panic("Cannot init tail queues for objects\n");
+	if (rte_eal_tailqs_init() < 0) {
+		rte_eal_init_alert("Cannot init tail queues for objects\n");
+		rte_errno = EFAULT;
+		return -1;
+	}
 
 	if (rte_eal_alarm_init() < 0)
 		rte_panic("Cannot init interrupt-handling thread\n");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v6 15/26] eal: do not panic on alarm init
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
                           ` (13 preceding siblings ...)
  2017-02-28 18:53         ` [PATCH v6 14/26] eal: do not panic on tailq init Aaron Conole
@ 2017-02-28 18:53         ` Aaron Conole
  2017-02-28 18:53         ` [PATCH v6 16/26] eal: convert timer init not to call panic Aaron Conole
                           ` (11 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:53 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

The rte_eal_alarm_init() call uses the Linux timerfd framework to create a
poll()-able timer using standard POSIX file operations.  This could fail
for a few reasons given in the man pages, but many could be
corrected by the user application.  No need to panic.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index aa10192..7aa6b6e 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -869,8 +869,11 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_eal_alarm_init() < 0)
-		rte_panic("Cannot init interrupt-handling thread\n");
+	if (rte_eal_alarm_init() < 0) {
+		rte_eal_init_alert("Cannot init interrupt-handling thread\n");
+		/* rte_eal_alarm_init sets rte_errno on failure. */
+		return -1;
+	}
 
 	if (rte_eal_timer_init() < 0)
 		rte_panic("Cannot init HPET or TSC timers\n");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v6 16/26] eal: convert timer init not to call panic
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
                           ` (14 preceding siblings ...)
  2017-02-28 18:53         ` [PATCH v6 15/26] eal: do not panic on alarm init Aaron Conole
@ 2017-02-28 18:53         ` Aaron Conole
  2017-02-28 18:53         ` [PATCH v6 17/26] eal: change the private pipe call to reflect errno Aaron Conole
                           ` (10 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:53 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

After code inspection, there is no way for rte_eal_timer_init() to fail.  It
simply returns 0 in all cases.  As such, this test could either go away
or stay here as 'future-proofing'.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 7aa6b6e..73a7d38 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -875,8 +875,11 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_eal_timer_init() < 0)
-		rte_panic("Cannot init HPET or TSC timers\n");
+	if (rte_eal_timer_init() < 0) {
+		rte_eal_init_alert("Cannot init HPET or TSC timers\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	eal_check_mem_on_local_socket();
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v6 17/26] eal: change the private pipe call to reflect errno
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
                           ` (15 preceding siblings ...)
  2017-02-28 18:53         ` [PATCH v6 16/26] eal: convert timer init not to call panic Aaron Conole
@ 2017-02-28 18:53         ` Aaron Conole
  2017-02-28 18:53         ` [PATCH v6 18/26] eal: do not panic on interrupt thread init Aaron Conole
                           ` (9 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:53 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

There could be some confusion as to why the call failed - this change
will always reflect the value of the error in rte_errno.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal_interrupts.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_interrupts.c b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
index 92a19cb..5bb833e 100644
--- a/lib/librte_eal/linuxapp/eal/eal_interrupts.c
+++ b/lib/librte_eal/linuxapp/eal/eal_interrupts.c
@@ -898,13 +898,16 @@ rte_eal_intr_init(void)
 	 * create a pipe which will be waited by epoll and notified to
 	 * rebuild the wait list of epoll.
 	 */
-	if (pipe(intr_pipe.pipefd) < 0)
+	if (pipe(intr_pipe.pipefd) < 0) {
+		rte_errno = errno;
 		return -1;
+	}
 
 	/* create the host thread to wait/handle the interrupt */
 	ret = pthread_create(&intr_thread, NULL,
 			eal_intr_thread_main, NULL);
 	if (ret != 0) {
+		rte_errno = ret;
 		RTE_LOG(ERR, EAL,
 			"Failed to create thread for interrupt handling\n");
 	} else {
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v6 18/26] eal: do not panic on interrupt thread init
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
                           ` (16 preceding siblings ...)
  2017-02-28 18:53         ` [PATCH v6 17/26] eal: change the private pipe call to reflect errno Aaron Conole
@ 2017-02-28 18:53         ` Aaron Conole
  2017-02-28 18:53         ` [PATCH v6 19/26] eal: do not error if plugins fail to init Aaron Conole
                           ` (8 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:53 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

When initializing the interrupt thread, there are a number of possible
reasons for failure - some of which are correctable by the application.
Do not panic() needlessly, and give the application a chance to reflect
this information to the user.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 73a7d38..d174ad4 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -894,8 +894,10 @@ rte_eal_init(int argc, char **argv)
 		rte_config.master_lcore, (int)thread_id, cpuset,
 		ret == 0 ? "" : "...");
 
-	if (rte_eal_intr_init() < 0)
-		rte_panic("Cannot init interrupt-handling thread\n");
+	if (rte_eal_intr_init() < 0) {
+		rte_eal_init_alert("Cannot init interrupt-handling thread\n");
+		return -1;
+	}
 
 	if (rte_bus_scan())
 		rte_panic("Cannot scan the buses for devices\n");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v6 19/26] eal: do not error if plugins fail to init
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
                           ` (17 preceding siblings ...)
  2017-02-28 18:53         ` [PATCH v6 18/26] eal: do not panic on interrupt thread init Aaron Conole
@ 2017-02-28 18:53         ` Aaron Conole
  2017-02-28 18:53         ` [PATCH v6 20/26] eal_pci: continue probing even on failures Aaron Conole
                           ` (7 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:53 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Plugins are useful and important.  However, it seems crazy to abort
everything just because they don't initialize properly.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index d174ad4..0ba9766 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -884,7 +884,7 @@ rte_eal_init(int argc, char **argv)
 	eal_check_mem_on_local_socket();
 
 	if (eal_plugins_init() < 0)
-		rte_panic("Cannot init plugins\n");
+		rte_eal_init_alert("Cannot init plugins\n");
 
 	eal_thread_init_master(rte_config.master_lcore);
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v6 20/26] eal_pci: continue probing even on failures
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
                           ` (18 preceding siblings ...)
  2017-02-28 18:53         ` [PATCH v6 19/26] eal: do not error if plugins fail to init Aaron Conole
@ 2017-02-28 18:53         ` Aaron Conole
  2017-03-08 22:04           ` Thomas Monjalon
  2017-02-28 18:53         ` [PATCH v6 21/26] eal: do not panic on failed PCI-probe Aaron Conole
                           ` (6 subsequent siblings)
  26 siblings, 1 reply; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:53 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Some devices may be inaccessible for a variety of reasons, or the
PCI-bus may be unavailable causing the whole thing to fail.  Still,
better to continue attempts at probes.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/common/eal_common_pci.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index 72547bd..9416190 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -69,6 +69,7 @@
 #include <sys/queue.h>
 #include <sys/mman.h>
 
+#include <rte_errno.h>
 #include <rte_interrupts.h>
 #include <rte_log.h>
 #include <rte_pci.h>
@@ -416,6 +417,7 @@ rte_eal_pci_probe(void)
 	struct rte_pci_device *dev = NULL;
 	struct rte_devargs *devargs;
 	int probe_all = 0;
+	int ret_1 = 0;
 	int ret = 0;
 
 	if (rte_eal_devargs_type_count(RTE_DEVTYPE_WHITELISTED_PCI) == 0)
@@ -430,17 +432,20 @@ rte_eal_pci_probe(void)
 
 		/* probe all or only whitelisted devices */
 		if (probe_all)
-			ret = pci_probe_all_drivers(dev);
+			ret_1 = pci_probe_all_drivers(dev);
 		else if (devargs != NULL &&
 			devargs->type == RTE_DEVTYPE_WHITELISTED_PCI)
-			ret = pci_probe_all_drivers(dev);
-		if (ret < 0)
-			rte_exit(EXIT_FAILURE, "Requested device " PCI_PRI_FMT
+			ret_1 = pci_probe_all_drivers(dev);
+		if (ret_1 < 0) {
+			RTE_LOG(ERR, EAL, "Requested device " PCI_PRI_FMT
 				 " cannot be used\n", dev->addr.domain, dev->addr.bus,
 				 dev->addr.devid, dev->addr.function);
+			rte_errno = errno;
+			ret = 1;
+		}
 	}
 
-	return 0;
+	return -ret;
 }
 
 /* dump one device */
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v6 21/26] eal: do not panic on failed PCI-probe
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
                           ` (19 preceding siblings ...)
  2017-02-28 18:53         ` [PATCH v6 20/26] eal_pci: continue probing even on failures Aaron Conole
@ 2017-02-28 18:53         ` Aaron Conole
  2017-02-28 18:53         ` [PATCH v6 22/26] eal_common_dev: continue initializing vdevs Aaron Conole
                           ` (5 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:53 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Since PCI isn't necessarily required, it may be possible to simply log
the error and continue on letting the user check the logs and restart
the application when things have failed.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 0ba9766..b13e1dd 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -943,8 +943,11 @@ rte_eal_init(int argc, char **argv)
 		rte_panic("Cannot probe devices\n");
 
 	/* Probe & Initialize PCI devices */
-	if (rte_eal_pci_probe())
-		rte_panic("Cannot probe PCI\n");
+	if (rte_eal_pci_probe()) {
+		rte_eal_init_alert("Cannot probe PCI\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	if (rte_eal_dev_init() < 0)
 		rte_panic("Cannot init pmd devices\n");
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v6 22/26] eal_common_dev: continue initializing vdevs
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
                           ` (20 preceding siblings ...)
  2017-02-28 18:53         ` [PATCH v6 21/26] eal: do not panic on failed PCI-probe Aaron Conole
@ 2017-02-28 18:53         ` Aaron Conole
  2017-02-28 18:53         ` [PATCH v6 23/26] eal: do not panic (or abort) if vdev init fails Aaron Conole
                           ` (4 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:53 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Even if one vdev should fail, there's no need to prevent further
processing.  Log the error, and reflect it to the higher levels to
decide.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/common/eal_common_dev.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
index 4f3b493..9889997 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -80,6 +80,7 @@ int
 rte_eal_dev_init(void)
 {
 	struct rte_devargs *devargs;
+	int ret = 0;
 
 	/*
 	 * Note that the dev_driver_list is populated here
@@ -97,11 +98,11 @@ rte_eal_dev_init(void)
 					devargs->args)) {
 			RTE_LOG(ERR, EAL, "failed to initialize %s device\n",
 					devargs->virt.drv_name);
-			return -1;
+			ret = -1;
 		}
 	}
 
-	return 0;
+	return ret;
 }
 
 int rte_eal_dev_attach(const char *name, const char *devargs)
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v6 23/26] eal: do not panic (or abort) if vdev init fails
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
                           ` (21 preceding siblings ...)
  2017-02-28 18:53         ` [PATCH v6 22/26] eal_common_dev: continue initializing vdevs Aaron Conole
@ 2017-02-28 18:53         ` Aaron Conole
  2017-02-28 18:53         ` [PATCH v6 24/26] eal: do not panic when bus probe fails Aaron Conole
                           ` (3 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:53 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Seems like it's possible to continue.  At least, the error is reflected
properly in the logs.  A user could then go and correct or investigate
the situation.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index b13e1dd..ddc50f2 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -950,7 +950,7 @@ rte_eal_init(int argc, char **argv)
 	}
 
 	if (rte_eal_dev_init() < 0)
-		rte_panic("Cannot init pmd devices\n");
+		rte_eal_init_alert("Cannot init pmd devices\n");
 
 	rte_eal_mcfg_complete();
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v6 24/26] eal: do not panic when bus probe fails
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
                           ` (22 preceding siblings ...)
  2017-02-28 18:53         ` [PATCH v6 23/26] eal: do not panic (or abort) if vdev init fails Aaron Conole
@ 2017-02-28 18:53         ` Aaron Conole
  2017-02-28 18:53         ` [PATCH v6 25/26] eal: do not panic on failed bus scan Aaron Conole
                           ` (2 subsequent siblings)
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:53 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index ddc50f2..8274196 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -939,8 +939,11 @@ rte_eal_init(int argc, char **argv)
 	rte_eal_mp_wait_lcore();
 
 	/* Probe all the buses and devices/drivers on them */
-	if (rte_bus_probe())
-		rte_panic("Cannot probe devices\n");
+	if (rte_bus_probe()) {
+		rte_eal_init_alert("Cannot probe devices\n");
+		rte_errno = ENOTSUP;
+		return -1;
+	}
 
 	/* Probe & Initialize PCI devices */
 	if (rte_eal_pci_probe()) {
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v6 25/26] eal: do not panic on failed bus scan
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
                           ` (23 preceding siblings ...)
  2017-02-28 18:53         ` [PATCH v6 24/26] eal: do not panic when bus probe fails Aaron Conole
@ 2017-02-28 18:53         ` Aaron Conole
  2017-02-28 18:53         ` [PATCH v6 26/26] rte_eal_init: add info about various error codes Aaron Conole
  2017-03-08 21:58         ` [PATCH v6 00/26] linux/eal: Remove most causes of panic on init Thomas Monjalon
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:53 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

For now, do an abort.  It's likely that even aborting the initialization
is premature in this case, as it may be possible to proceed even if one
bus or another is not available.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/linuxapp/eal/eal.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 8274196..6d6b825 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -899,8 +899,11 @@ rte_eal_init(int argc, char **argv)
 		return -1;
 	}
 
-	if (rte_bus_scan())
-		rte_panic("Cannot scan the buses for devices\n");
+	if (rte_bus_scan()) {
+		rte_eal_init_alert("Cannot scan the buses for devices\n");
+		rte_errno = ENODEV;
+		return -1;
+	}
 
 	RTE_LCORE_FOREACH_SLAVE(i) {
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* [PATCH v6 26/26] rte_eal_init: add info about various error codes
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
                           ` (24 preceding siblings ...)
  2017-02-28 18:53         ` [PATCH v6 25/26] eal: do not panic on failed bus scan Aaron Conole
@ 2017-02-28 18:53         ` Aaron Conole
  2017-03-08 21:58         ` [PATCH v6 00/26] linux/eal: Remove most causes of panic on init Thomas Monjalon
  26 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-02-28 18:53 UTC (permalink / raw)
  To: dev; +Cc: Stephen Hemminger, Bruce Richardson

The rte_eal_init function will now pass failure reason hints to the
application.  To help app developers decipher this, add some brief
information about what the codes are indicating.

Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 lib/librte_eal/common/include/rte_eal.h | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/include/rte_eal.h b/lib/librte_eal/common/include/rte_eal.h
index 03fee50..9251244 100644
--- a/lib/librte_eal/common/include/rte_eal.h
+++ b/lib/librte_eal/common/include/rte_eal.h
@@ -159,7 +159,32 @@ int rte_eal_iopl_init(void);
  *     function call and should not be further interpreted by the
  *     application.  The EAL does not take any ownership of the memory used
  *     for either the argv array, or its members.
- *   - On failure, a negative error value.
+ *   - On failure, -1 and rte_errno is set to a value indicating the cause
+ *     for failure.  In some instances, the application will need to be
+ *     restarted as part of clearing the issue.
+ *
+ *   Error codes returned via rte_errno:
+ *     EACCES indicates a permissions issue.
+ *
+ *     EAGAIN indicates either a bus or system resource was not available,
+ *            setup may be attempted again.
+ *
+ *     EALREADY indicates that the rte_eal_init function has already been
+ *              called, and cannot be called again.
+ *
+ *     EFAULT indicates the tailq configuration name was not found in
+ *            memory configuration.
+ *
+ *     EINVAL indicates invalid parameters were passed as argv/argc.
+ *
+ *     ENOMEM indicates failure likely caused by an out-of-memory condition.
+ *
+ *     ENODEV indicates memory setup issues.
+ *
+ *     ENOTSUP indicates that the EAL cannot initialize on this system.
+ *
+ *     EUNATCH indicates that the PCI bus is either not present, or is not
+ *             readable by the eal.
  */
 int rte_eal_init(int argc, char **argv);
 
-- 
2.9.3
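
As a rough illustration of how an application might act on the codes
documented above, here is a minimal sketch.  The enum init_action and the
function classify_init_error() are hypothetical, and the mapping from
error code to action is only one possible application policy, not
something the EAL prescribes.

```c
#include <errno.h>

/* Hypothetical application-side policy; not part of the DPDK API. */
enum init_action {
	INIT_RETRY,      /* setup may be attempted again */
	INIT_FIX_CONFIG, /* correct arguments or permissions, then restart */
	INIT_GIVE_UP     /* treated as not recoverable in this sketch */
};

/* Map a value read from rte_errno after a failed init to an action. */
static enum init_action
classify_init_error(int err)
{
	switch (err) {
	case EAGAIN:
		/* Documented as "setup may be attempted again". */
		return INIT_RETRY;
	case EACCES:
	case EINVAL:
		/* Permissions or argv/argc problems: the user can fix these. */
		return INIT_FIX_CONFIG;
	default:
		/* EALREADY, EFAULT, ENOMEM, ENODEV, ENOTSUP, EUNATCH, ... */
		return INIT_GIVE_UP;
	}
}
```

An application would typically call rte_eal_init(argc, argv), and on a
return value of -1 pass rte_errno to a helper such as this one before
deciding whether to retry, print a hint, or exit.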

^ permalink raw reply related	[flat|nested] 159+ messages in thread

* Re: [PATCH v6 06/26] eal-common: introduce a way to query cpu support
  2017-02-28 18:52         ` [PATCH v6 06/26] eal-common: introduce a way to query cpu support Aaron Conole
@ 2017-03-08 21:45           ` Thomas Monjalon
  0 siblings, 0 replies; 159+ messages in thread
From: Thomas Monjalon @ 2017-03-08 21:45 UTC (permalink / raw)
  To: Aaron Conole; +Cc: dev, Stephen Hemminger, Bruce Richardson

2017-02-28 13:52, Aaron Conole:
> +/**
> + * This function checks that the currently used CPU supports the CPU features
> + * that were specified at compile time. It is called automatically within the
> + * EAL, so does not need to be used by applications.  This version returns a
> + * result so that decisions may be made (for instance, graceful shutdowns).
> + */
> +int
> +rte_cpu_is_supported(void);
>  #endif /* _RTE_CPUFLAGS_H_ */

A blank line is missing.

^ permalink raw reply	[flat|nested] 159+ messages in thread

* Re: [PATCH v6 00/26] linux/eal: Remove most causes of panic on init
  2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
                           ` (25 preceding siblings ...)
  2017-02-28 18:53         ` [PATCH v6 26/26] rte_eal_init: add info about various error codes Aaron Conole
@ 2017-03-08 21:58         ` Thomas Monjalon
  2017-03-09  9:11           ` Bruce Richardson
  26 siblings, 1 reply; 159+ messages in thread
From: Thomas Monjalon @ 2017-03-08 21:58 UTC (permalink / raw)
  To: Aaron Conole; +Cc: dev, Stephen Hemminger, Bruce Richardson

Hi,

Thanks for the work.
I think it needs to be completed to have the same behaviour on bsdapp.

As another version is required, I add some small comments about
the formatting.

I think you should use the form "do not panic on <step>" for most
of the commits.

Some commits may be squashed (see below):

2017-02-28 13:52, Aaron Conole:
> Aaron Conole (26):
>   eal: cpu init will no longer panic
>   eal: return error instead of panic for cpu init
squashed?

>   eal: do not panic on hugepage info init
>   eal: do not panic on failed hugepage query
squashed?

>   eal: do not panic if parsing args returns error
>   eal-common: introduce a way to query cpu support
>   eal: do not panic when CPU isn't supported
squashed?

>   eal: do not panic on memzone initialization fails
>   eal: set errno when exiting for already called
>   eal: do not panic on log failures
>   eal: do not panic on PCI-probe
It is not really the probe here

>   eal: do not panic on vfio failure
>   eal: do not panic on memory init
>   eal: do not panic on tailq init
>   eal: do not panic on alarm init
>   eal: convert timer init not to call panic
>   eal: change the private pipe call to reflect errno
>   eal: do not panic on interrupt thread init
>   eal: do not error if plugins fail to init
>   eal_pci: continue probing even on failures
>   eal: do not panic on failed PCI-probe
squashed?

>   eal_common_dev: continue initializing vdevs
>   eal: do not panic (or abort) if vdev init fails
squashed?

>   eal: do not panic when bus probe fails
>   eal: do not panic on failed bus scan
>   rte_eal_init: add info about various error codes

^ permalink raw reply	[flat|nested] 159+ messages in thread

* Re: [PATCH v6 20/26] eal_pci: continue probing even on failures
  2017-02-28 18:53         ` [PATCH v6 20/26] eal_pci: continue probing even on failures Aaron Conole
@ 2017-03-08 22:04           ` Thomas Monjalon
  0 siblings, 0 replies; 159+ messages in thread
From: Thomas Monjalon @ 2017-03-08 22:04 UTC (permalink / raw)
  To: Aaron Conole; +Cc: dev, Stephen Hemminger, Bruce Richardson

2017-02-28 13:53, Aaron Conole:
> +	int ret_1 = 0;

You do not need to add a new variable.

>  	int ret = 0;
>  
>  	if (rte_eal_devargs_type_count(RTE_DEVTYPE_WHITELISTED_PCI) == 0)
> @@ -430,17 +432,20 @@ rte_eal_pci_probe(void)
>  
>  		/* probe all or only whitelisted devices */
>  		if (probe_all)
> -			ret = pci_probe_all_drivers(dev);
> +			ret_1 = pci_probe_all_drivers(dev);
>  		else if (devargs != NULL &&
>  			devargs->type == RTE_DEVTYPE_WHITELISTED_PCI)
> -			ret = pci_probe_all_drivers(dev);
> -		if (ret < 0)
> -			rte_exit(EXIT_FAILURE, "Requested device " PCI_PRI_FMT
> +			ret_1 = pci_probe_all_drivers(dev);
> +		if (ret_1 < 0) {
> +			RTE_LOG(ERR, EAL, "Requested device " PCI_PRI_FMT
>  				 " cannot be used\n", dev->addr.domain, dev->addr.bus,
>  				 dev->addr.devid, dev->addr.function);
> +			rte_errno = errno;
> +			ret = 1;
> +		}
>  	}
>  
> -	return 0;
> +	return -ret;

It may be more explicit to use only one variable ret and filter
the positive values:
	ret < 0 ? -1 : 0
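
For illustration, the "keep probing, report failure at the end" idea can
be sketched on its own as below.  Everything here is hypothetical
(probe_fn, probe_all_devices, demo_probe); the assumption that a positive
return means "no matching driver" rather than an error is taken from the
remark above about filtering positive values, and this is not the actual
rte_eal_pci_probe() code.

```c
#include <stddef.h>

/* Hypothetical probe callback: <0 error, 0 success, >0 no matching driver. */
typedef int (*probe_fn)(int dev_id);

/*
 * Probe every device, even if some probes fail.
 * Returns 0 if no probe reported an error, -1 otherwise.
 * If n_failed is not NULL, it receives the number of failed probes.
 */
static int
probe_all_devices(const int *dev_ids, size_t n, probe_fn probe,
		  size_t *n_failed)
{
	size_t failed = 0;

	for (size_t i = 0; i < n; i++) {
		int ret = probe(dev_ids[i]);

		/* Positive values are not errors; only count negatives. */
		if (ret < 0)
			failed++;
	}

	if (n_failed != NULL)
		*n_failed = failed;
	return failed > 0 ? -1 : 0;
}

/* Demonstration callback: id 0 has no driver, odd ids fail, others succeed. */
static int
demo_probe(int dev_id)
{
	if (dev_id == 0)
		return 1;
	return (dev_id % 2) ? -1 : 0;
}
```

Counting failures separately from the per-device return value avoids a
later successful probe overwriting an earlier error.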

^ permalink raw reply	[flat|nested] 159+ messages in thread

* Re: [PATCH v6 00/26] linux/eal: Remove most causes of panic on init
  2017-03-08 21:58         ` [PATCH v6 00/26] linux/eal: Remove most causes of panic on init Thomas Monjalon
@ 2017-03-09  9:11           ` Bruce Richardson
  2017-03-09  9:26             ` Thomas Monjalon
  0 siblings, 1 reply; 159+ messages in thread
From: Bruce Richardson @ 2017-03-09  9:11 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: Aaron Conole, dev, Stephen Hemminger

On Wed, Mar 08, 2017 at 10:58:27PM +0100, Thomas Monjalon wrote:
> Hi,
> 
> Thanks for the work.
> I think it needs to be completed to have the same behaviour on bsdapp.

Ideally, yes, but I also don't think the lack of BSD changes should
block the inclusion of this set. In terms of application writers, the
apps don't need to be written differently for BSD compared to Linux
because of this change. All that is different is that the BSD version
will panic rather than return the error code. 

Regards,
/Bruce

> 

^ permalink raw reply	[flat|nested] 159+ messages in thread

* Re: [PATCH v6 00/26] linux/eal: Remove most causes of panic on init
  2017-03-09  9:11           ` Bruce Richardson
@ 2017-03-09  9:26             ` Thomas Monjalon
  2017-03-09  9:38               ` Richardson, Bruce
  0 siblings, 1 reply; 159+ messages in thread
From: Thomas Monjalon @ 2017-03-09  9:26 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: Aaron Conole, dev, Stephen Hemminger

2017-03-09 09:11, Bruce Richardson:
> On Wed, Mar 08, 2017 at 10:58:27PM +0100, Thomas Monjalon wrote:
> > Hi,
> > 
> > Thanks for the work.
> > I think it needs to be completed to have the same behaviour on bsdapp.
> 
> Ideally, yes, but I also don't think the lack of BSD changes should
> block the inclusion of this set. In terms of application writers, the
> apps don't need to be written differently for BSD compared to Linux
> because of this change. All that is different is that the BSD version
> will panic rather than return the error code. 

So you do not have any issue about having a different behaviour
on Linux and BSD?
You are the bsdapp maintainer, so it is your call.

* Re: [PATCH v6 00/26] linux/eal: Remove most causes of panic on init
  2017-03-09  9:26             ` Thomas Monjalon
@ 2017-03-09  9:38               ` Richardson, Bruce
  2017-03-10 18:34                 ` Aaron Conole
  0 siblings, 1 reply; 159+ messages in thread
From: Richardson, Bruce @ 2017-03-09  9:38 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: Aaron Conole, dev, Stephen Hemminger



> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Thursday, March 9, 2017 9:26 AM
> To: Richardson, Bruce <bruce.richardson@intel.com>
> Cc: Aaron Conole <aconole@redhat.com>; dev@dpdk.org; Stephen Hemminger
> <stephen@networkplumber.org>
> Subject: Re: [dpdk-dev] [PATCH v6 00/26] linux/eal: Remove most causes of
> panic on init
> 
> 2017-03-09 09:11, Bruce Richardson:
> > On Wed, Mar 08, 2017 at 10:58:27PM +0100, Thomas Monjalon wrote:
> > > Hi,
> > >
> > > Thanks for the work.
> > > I think it needs to be completed to have the same behaviour on bsdapp.
> >
> > Ideally, yes, but I also don't think the lack of BSD changes should
> > block the inclusion of this set. In terms of application writers, the
> > apps don't need to be written differently for BSD compared to Linux
> > because of this change. All that is different is that the BSD version
> > will panic rather than return the error code.
> 
> So you do not have any issue about having a different behaviour on Linux
> and BSD?
> You are the bsdapp maintainer, so it is your call.

I would infinitely prefer to have the same behavior. However, so long
as this does not require a user to change their app to be different on
BSD, I don't think lack of BSD support should block improving Linux.

Aaron - will you be able to do equivalent changes to BSD within the
17.05 timeframe?

/Bruce

* Re: [PATCH v6 00/26] linux/eal: Remove most causes of panic on init
  2017-03-09  9:38               ` Richardson, Bruce
@ 2017-03-10 18:34                 ` Aaron Conole
  0 siblings, 0 replies; 159+ messages in thread
From: Aaron Conole @ 2017-03-10 18:34 UTC (permalink / raw)
  To: Richardson, Bruce; +Cc: Thomas Monjalon, dev, Stephen Hemminger, Flavio Leitner

"Richardson, Bruce" <bruce.richardson@intel.com> writes:

>> -----Original Message-----
>> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
>> Sent: Thursday, March 9, 2017 9:26 AM
>> To: Richardson, Bruce <bruce.richardson@intel.com>
>> Cc: Aaron Conole <aconole@redhat.com>; dev@dpdk.org; Stephen Hemminger
>> <stephen@networkplumber.org>
>> Subject: Re: [dpdk-dev] [PATCH v6 00/26] linux/eal: Remove most causes of
>> panic on init
>> 
>> 2017-03-09 09:11, Bruce Richardson:
>> > On Wed, Mar 08, 2017 at 10:58:27PM +0100, Thomas Monjalon wrote:
>> > > Hi,
>> > >
>> > > Thanks for the work.
>> > > I think it needs to be completed to have the same behaviour on bsdapp.
>> >
>> > Ideally, yes, but I also don't think the lack of BSD changes should
>> > block the inclusion of this set. In terms of application writers, the
>> > apps don't need to be written differently for BSD compared to Linux
>> > because of this change. All that is different is that the BSD version
>> > will panic rather than return the error code.
>> 
>> So you do not have any issue about having a different behaviour on Linux
>> and BSD?
>> You are the bsdapp maintainer, so it is your call.
>
> I would infinitely prefer to have the same behavior. However, so long
> as this does not require a user to change their app to be different on
> BSD, I don't think lack of BSD support should block improving Linux.
>
> Aaron - will you be able to do equivalent changes to BSD within the 17.05 timeframe?

I'm going to spend some time next week looking into it.  If it looks
like it's akin to s/linuxapp/bsdapp/ and everything works, then I'll do
it.  If I get hung up on anything, I'll probably update with my status.

-Aaron

Thread overview: 159+ messages
2017-02-08 18:51 [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
2017-02-08 18:51 ` [PATCH v2 01/25] eal: CPU init will no longer panic Aaron Conole
2017-02-08 18:51 ` [PATCH v2 02/25] eal: return error instead of panic for cpu init Aaron Conole
2017-02-08 18:51 ` [PATCH v2 03/25] eal: No panic on hugepages info init Aaron Conole
2017-02-08 18:51 ` [PATCH v2 04/25] eal: do not panic on failed hugepage query Aaron Conole
2017-02-08 18:51 ` [PATCH v2 05/25] eal: failure to parse args returns error Aaron Conole
2017-02-08 18:51 ` [PATCH v2 06/25] eal-common: introduce a way to query cpu support Aaron Conole
2017-02-08 18:51 ` [PATCH v2 07/25] eal: Signal error when CPU isn't supported Aaron Conole
2017-02-08 18:51 ` [PATCH v2 08/25] eal: do not panic on memzone initialization fails Aaron Conole
2017-02-08 18:51 ` [PATCH v2 09/25] eal: set errno when exiting for already called Aaron Conole
2017-02-08 18:51 ` [PATCH v2 10/25] eal: Do not panic on log failures Aaron Conole
2017-02-08 18:51 ` [PATCH v2 11/25] eal: Do not panic on pci-probe Aaron Conole
2017-02-08 18:51 ` [PATCH v2 12/25] eal: do not panic on vfio failure Aaron Conole
2017-02-08 18:51 ` [PATCH v2 13/25] eal: do not panic on memory init Aaron Conole
2017-02-08 18:51 ` [PATCH v2 14/25] eal: do not panic on tailq init Aaron Conole
2017-02-08 18:51 ` [PATCH v2 15/25] eal: do not panic on alarm init Aaron Conole
2017-02-08 18:51 ` [PATCH v2 16/25] eal: convert timer_init not to call panic Aaron Conole
2017-02-08 18:51 ` [PATCH v2 17/25] eal: change the private pipe call to reflect errno Aaron Conole
2017-02-08 18:51 ` [PATCH v2 18/25] eal: Do not panic on interrupt thread init Aaron Conole
2017-02-08 18:51 ` [PATCH v2 19/25] eal: do not error if plugins fail to init Aaron Conole
2017-02-08 18:51 ` [PATCH v2 20/25] eal_pci: Continue probing even on failures Aaron Conole
2017-02-08 18:51 ` [PATCH v2 21/25] eal: do not panic on failed PCI probe Aaron Conole
2017-02-08 18:51 ` [PATCH v2 22/25] eal_common_dev: continue initializing vdevs Aaron Conole
2017-02-08 18:51 ` [PATCH v2 23/25] eal: do not panic (or abort) if vdev init fails Aaron Conole
2017-02-08 18:51 ` [PATCH v2 24/25] eal: do not panic when bus probe fails Aaron Conole
2017-02-08 18:51 ` [PATCH v2 25/25] rte_eal_init: add info about rte_errno codes Aaron Conole
2017-02-08 19:11 ` [PATCH v2 00/25] linux/eal: Remove most causes of panic on init Aaron Conole
2017-02-09 14:29 ` [PATCH v3 " Aaron Conole
2017-02-09 14:29   ` [PATCH v3 01/25] eal: CPU init will no longer panic Aaron Conole
2017-02-09 14:29   ` [PATCH v3 02/25] eal: return error instead of panic for cpu init Aaron Conole
2017-02-09 14:29   ` [PATCH v3 03/25] eal: No panic on hugepages info init Aaron Conole
2017-02-09 14:29   ` [PATCH v3 04/25] eal: do not panic on failed hugepage query Aaron Conole
2017-02-09 14:29   ` [PATCH v3 05/25] eal: failure to parse args returns error Aaron Conole
2017-02-09 14:29   ` [PATCH v3 06/25] eal-common: introduce a way to query cpu support Aaron Conole
2017-02-09 14:29   ` [PATCH v3 07/25] eal: Signal error when CPU isn't supported Aaron Conole
2017-02-09 14:29   ` [PATCH v3 08/25] eal: do not panic on memzone initialization fails Aaron Conole
2017-02-09 14:29   ` [PATCH v3 09/25] eal: set errno when exiting for already called Aaron Conole
2017-02-09 14:29   ` [PATCH v3 10/25] eal: Do not panic on log failures Aaron Conole
2017-02-09 14:29   ` [PATCH v3 11/25] eal: Do not panic on pci-probe Aaron Conole
2017-02-09 14:29   ` [PATCH v3 12/25] eal: do not panic on vfio failure Aaron Conole
2017-02-09 14:29   ` [PATCH v3 13/25] eal: do not panic on memory init Aaron Conole
2017-02-09 14:29   ` [PATCH v3 14/25] eal: do not panic on tailq init Aaron Conole
2017-02-09 14:29   ` [PATCH v3 15/25] eal: do not panic on alarm init Aaron Conole
2017-02-09 14:29   ` [PATCH v3 16/25] eal: convert timer_init not to call panic Aaron Conole
2017-02-09 14:29   ` [PATCH v3 17/25] eal: change the private pipe call to reflect errno Aaron Conole
2017-02-09 14:29   ` [PATCH v3 18/25] eal: Do not panic on interrupt thread init Aaron Conole
2017-02-09 14:29   ` [PATCH v3 19/25] eal: do not error if plugins fail to init Aaron Conole
2017-02-09 14:29   ` [PATCH v3 20/25] eal_pci: Continue probing even on failures Aaron Conole
2017-02-09 14:29   ` [PATCH v3 21/25] eal: do not panic on failed PCI probe Aaron Conole
2017-02-09 14:29   ` [PATCH v3 22/25] eal_common_dev: continue initializing vdevs Aaron Conole
2017-02-09 14:29   ` [PATCH v3 23/25] eal: do not panic (or abort) if vdev init fails Aaron Conole
2017-02-09 14:29   ` [PATCH v3 24/25] eal: do not panic when bus probe fails Aaron Conole
2017-02-09 14:29   ` [PATCH v3 25/25] rte_eal_init: add info about rte_errno codes Aaron Conole
2017-02-09 22:37     ` Stephen Hemminger
2017-02-14 21:31       ` Aaron Conole
2017-02-09 22:38   ` [PATCH v3 00/25] linux/eal: Remove most causes of panic on init Stephen Hemminger
2017-02-14 20:50     ` Aaron Conole
2017-02-25 16:02   ` [PATCH v4 00/26] " Aaron Conole
2017-02-25 16:02     ` [PATCH v4 01/26] eal: CPU init will no longer panic Aaron Conole
2017-02-25 16:02     ` [PATCH v4 02/26] eal: return error instead of panic for cpu init Aaron Conole
2017-02-27 12:58       ` Bruce Richardson
2017-02-27 14:35         ` Aaron Conole
2017-02-27 13:00       ` Bruce Richardson
2017-02-27 14:34         ` Aaron Conole
2017-02-25 16:02     ` [PATCH v4 03/26] eal: No panic on hugepages info init Aaron Conole
2017-02-25 16:02     ` [PATCH v4 04/26] eal: do not panic on failed hugepage query Aaron Conole
2017-02-25 16:02     ` [PATCH v4 05/26] eal: failure to parse args returns error Aaron Conole
2017-02-25 16:02     ` [PATCH v4 06/26] eal-common: introduce a way to query cpu support Aaron Conole
2017-02-27 13:48       ` Bruce Richardson
2017-02-27 14:33         ` Aaron Conole
2017-02-27 15:11           ` Bruce Richardson
2017-02-25 16:02     ` [PATCH v4 07/26] eal: Signal error when CPU isn't supported Aaron Conole
2017-02-25 16:02     ` [PATCH v4 08/26] eal: do not panic on memzone initialization fails Aaron Conole
2017-02-25 16:02     ` [PATCH v4 09/26] eal: set errno when exiting for already called Aaron Conole
2017-02-25 16:02     ` [PATCH v4 10/26] eal: Do not panic on log failures Aaron Conole
2017-02-25 16:02     ` [PATCH v4 11/26] eal: Do not panic on pci-probe Aaron Conole
2017-02-25 16:02     ` [PATCH v4 12/26] eal: do not panic on vfio failure Aaron Conole
2017-02-25 16:02     ` [PATCH v4 13/26] eal: do not panic on memory init Aaron Conole
2017-02-25 16:02     ` [PATCH v4 14/26] eal: do not panic on tailq init Aaron Conole
2017-02-25 16:02     ` [PATCH v4 15/26] eal: do not panic on alarm init Aaron Conole
2017-02-25 16:02     ` [PATCH v4 16/26] eal: convert timer_init not to call panic Aaron Conole
2017-02-25 16:03     ` [PATCH v4 17/26] eal: change the private pipe call to reflect errno Aaron Conole
2017-02-25 16:03     ` [PATCH v4 18/26] eal: Do not panic on interrupt thread init Aaron Conole
2017-02-25 16:03     ` [PATCH v4 19/26] eal: do not error if plugins fail to init Aaron Conole
2017-02-25 16:03     ` [PATCH v4 20/26] eal_pci: Continue probing even on failures Aaron Conole
2017-02-25 16:03     ` [PATCH v4 21/26] eal: do not panic on failed PCI probe Aaron Conole
2017-02-25 16:03     ` [PATCH v4 22/26] eal_common_dev: continue initializing vdevs Aaron Conole
2017-02-25 16:03     ` [PATCH v4 23/26] eal: do not panic (or abort) if vdev init fails Aaron Conole
2017-02-25 16:03     ` [PATCH v4 24/26] eal: do not panic when bus probe fails Aaron Conole
2017-02-25 16:03     ` [PATCH v4 25/26] eal: do not panic on failed bus scan Aaron Conole
2017-02-25 16:03     ` [PATCH v4 26/26] rte_eal_init: add info about rte_errno codes Aaron Conole
2017-02-27 13:59     ` [PATCH v4 00/26] linux/eal: Remove most causes of panic on init Bruce Richardson
2017-02-27 14:34       ` Aaron Conole
2017-02-27 16:17     ` [PATCH v5 " Aaron Conole
2017-02-27 16:17       ` [PATCH v5 01/26] eal: CPU init will no longer panic Aaron Conole
2017-02-27 16:17       ` [PATCH v5 02/26] eal: return error instead of panic for cpu init Aaron Conole
2017-02-27 16:17       ` [PATCH v5 03/26] eal: No panic on hugepages info init Aaron Conole
2017-02-28 14:25         ` Bruce Richardson
2017-02-28 14:48           ` Aaron Conole
2017-02-27 16:17       ` [PATCH v5 04/26] eal: do not panic on failed hugepage query Aaron Conole
2017-02-27 16:17       ` [PATCH v5 05/26] eal: failure to parse args returns error Aaron Conole
2017-02-27 16:17       ` [PATCH v5 06/26] eal-common: introduce a way to query cpu support Aaron Conole
2017-02-27 16:17       ` [PATCH v5 07/26] eal: Signal error when CPU isn't supported Aaron Conole
2017-02-27 16:17       ` [PATCH v5 08/26] eal: do not panic on memzone initialization fails Aaron Conole
2017-02-28 14:27         ` Bruce Richardson
2017-02-28 14:46           ` Aaron Conole
2017-02-27 16:17       ` [PATCH v5 09/26] eal: set errno when exiting for already called Aaron Conole
2017-02-27 16:17       ` [PATCH v5 10/26] eal: Do not panic on log failures Aaron Conole
2017-02-27 16:17       ` [PATCH v5 11/26] eal: Do not panic on pci-probe Aaron Conole
2017-02-27 16:17       ` [PATCH v5 12/26] eal: do not panic on vfio failure Aaron Conole
2017-02-27 16:17       ` [PATCH v5 13/26] eal: do not panic on memory init Aaron Conole
2017-02-27 16:17       ` [PATCH v5 14/26] eal: do not panic on tailq init Aaron Conole
2017-02-27 16:18       ` [PATCH v5 15/26] eal: do not panic on alarm init Aaron Conole
2017-02-27 16:18       ` [PATCH v5 16/26] eal: convert timer_init not to call panic Aaron Conole
2017-02-27 16:18       ` [PATCH v5 17/26] eal: change the private pipe call to reflect errno Aaron Conole
2017-02-27 16:18       ` [PATCH v5 18/26] eal: Do not panic on interrupt thread init Aaron Conole
2017-02-27 16:18       ` [PATCH v5 19/26] eal: do not error if plugins fail to init Aaron Conole
2017-02-27 16:18       ` [PATCH v5 20/26] eal_pci: Continue probing even on failures Aaron Conole
2017-02-27 16:18       ` [PATCH v5 21/26] eal: do not panic on failed PCI probe Aaron Conole
2017-02-27 16:18       ` [PATCH v5 22/26] eal_common_dev: continue initializing vdevs Aaron Conole
2017-02-27 16:18       ` [PATCH v5 23/26] eal: do not panic (or abort) if vdev init fails Aaron Conole
2017-02-27 16:18       ` [PATCH v5 24/26] eal: do not panic when bus probe fails Aaron Conole
2017-02-27 16:18       ` [PATCH v5 25/26] eal: do not panic on failed bus scan Aaron Conole
2017-02-27 16:18       ` [PATCH v5 26/26] rte_eal_init: add info about rte_errno codes Aaron Conole
2017-02-28 14:45       ` [PATCH v5 00/26] linux/eal: Remove most causes of panic on init Bruce Richardson
2017-02-28 18:52       ` [PATCH v6 " Aaron Conole
2017-02-28 18:52         ` [PATCH v6 01/26] eal: cpu init will no longer panic Aaron Conole
2017-02-28 18:52         ` [PATCH v6 02/26] eal: return error instead of panic for cpu init Aaron Conole
2017-02-28 18:52         ` [PATCH v6 03/26] eal: do not panic on hugepage info init Aaron Conole
2017-02-28 18:52         ` [PATCH v6 04/26] eal: do not panic on failed hugepage query Aaron Conole
2017-02-28 18:52         ` [PATCH v6 05/26] eal: do not panic if parsing args returns error Aaron Conole
2017-02-28 18:52         ` [PATCH v6 06/26] eal-common: introduce a way to query cpu support Aaron Conole
2017-03-08 21:45           ` Thomas Monjalon
2017-02-28 18:52         ` [PATCH v6 07/26] eal: do not panic when CPU isn't supported Aaron Conole
2017-02-28 18:52         ` [PATCH v6 08/26] eal: do not panic on memzone initialization fails Aaron Conole
2017-02-28 18:52         ` [PATCH v6 09/26] eal: set errno when exiting for already called Aaron Conole
2017-02-28 18:52         ` [PATCH v6 10/26] eal: do not panic on log failures Aaron Conole
2017-02-28 18:53         ` [PATCH v6 11/26] eal: do not panic on PCI-probe Aaron Conole
2017-02-28 18:53         ` [PATCH v6 12/26] eal: do not panic on vfio failure Aaron Conole
2017-02-28 18:53         ` [PATCH v6 13/26] eal: do not panic on memory init Aaron Conole
2017-02-28 18:53         ` [PATCH v6 14/26] eal: do not panic on tailq init Aaron Conole
2017-02-28 18:53         ` [PATCH v6 15/26] eal: do not panic on alarm init Aaron Conole
2017-02-28 18:53         ` [PATCH v6 16/26] eal: convert timer init not to call panic Aaron Conole
2017-02-28 18:53         ` [PATCH v6 17/26] eal: change the private pipe call to reflect errno Aaron Conole
2017-02-28 18:53         ` [PATCH v6 18/26] eal: do not panic on interrupt thread init Aaron Conole
2017-02-28 18:53         ` [PATCH v6 19/26] eal: do not error if plugins fail to init Aaron Conole
2017-02-28 18:53         ` [PATCH v6 20/26] eal_pci: continue probing even on failures Aaron Conole
2017-03-08 22:04           ` Thomas Monjalon
2017-02-28 18:53         ` [PATCH v6 21/26] eal: do not panic on failed PCI-probe Aaron Conole
2017-02-28 18:53         ` [PATCH v6 22/26] eal_common_dev: continue initializing vdevs Aaron Conole
2017-02-28 18:53         ` [PATCH v6 23/26] eal: do not panic (or abort) if vdev init fails Aaron Conole
2017-02-28 18:53         ` [PATCH v6 24/26] eal: do not panic when bus probe fails Aaron Conole
2017-02-28 18:53         ` [PATCH v6 25/26] eal: do not panic on failed bus scan Aaron Conole
2017-02-28 18:53         ` [PATCH v6 26/26] rte_eal_init: add info about various error codes Aaron Conole
2017-03-08 21:58         ` [PATCH v6 00/26] linux/eal: Remove most causes of panic on init Thomas Monjalon
2017-03-09  9:11           ` Bruce Richardson
2017-03-09  9:26             ` Thomas Monjalon
2017-03-09  9:38               ` Richardson, Bruce
2017-03-10 18:34                 ` Aaron Conole
