All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 0/6] Restructure EAL device model for bus support
@ 2016-11-17  5:29 Shreyansh Jain
  2016-11-17  5:30 ` [RFC PATCH 1/6] eal: define container macro Shreyansh Jain
                   ` (7 more replies)
  0 siblings, 8 replies; 21+ messages in thread
From: Shreyansh Jain @ 2016-11-17  5:29 UTC (permalink / raw)
  To: david.marchand; +Cc: dev, Shreyansh Jain

DPDK has been inherently a PCI inclined framework. Because of this, the
design of device tree (or list) within DPDK is also PCI inclined. A non-PCI
device doesn't have a way of being expressed without using hooks started from
EAL to PMD.

With this cover letter, some patches are presented which try to break this
strict linkage of EAL with PCI devices. Aim is to generalize the device
hierarchy on the lines of how Linux handles it:

        device A1
          |
  +==.===='==============.============+ Bus A.
     |                    `--> driver A11     \
  device A2                `-> driver A12      \______
                                                |CPU |
                                                /`````
        device A1                              /
          |                                   /
  +==.===='==============.============+ Bus A`
     |                    `--> driver B11
  device A2                `-> driver B12

Simply put:
 - a bus is connect to CPU (or core)
 - devices are conneted to Bus
 - drivers are running instances which manage one or more devices
 - bus is responsible for identifying devices (and interrupt propogation)
 - driver is responsible for initializing the device

(*Reusing text from email [1])
In context of DPDK EAL:
 - a generic bus (not a driver, not a device). I don't know how to categorize
   a bus. It is certainly not a device, and then handler for a bus (physical)
   can be considered a 'bus driver'. So, just 'rte_bus'.
 - there is a bus for each physical implementation (or virtual). So, a rte_bus
   Object for PCI, VDEV, ABC, DEF and so on.
 - Buses are registered just like a PMD - RTE_PMD_BUS_REGISTER()
 - Each registered bus is part of a doubly list.
   -- Each device inherits rte_bus
   -- Each driver inherits rte_bus
   -- Device and Drivers lists are part of rte_bus
 - eth_driver is no more required - it was just a holder for PMDs to register
   themselves. It can be replaced with rte_xxx_driver and corresponding init/
   uninit moved to rte_driver
 - rte_eth_dev modified to disassociate itself from rte_pci_device and connect
   to generic rte_device

Once again, improvising from [1]:

                                  __ rte_bus_list
                                 /
                     +----------'---+
                     |rte_bus       |
                     | driver_list------> List of rte_bus specific
                     | device_list----    devices
                     | scan         | `-> List of rte_bus associated
                     | match        |     drivers
                     | dump         |
                     | ..some refcnt| (#)
                     +--|------|----+
              _________/        \_________
    +--------/----+                     +-\---------------+
    |rte_device   |                     |rte_driver       |
    | rte_bus     |                     | rte_bus         |
    | rte_driver  |(#)                  | init            |
    |             |                     | uninit          |
    |  devargs    |                     | dev_private_size|
    +---||--------+                     | drv_flags       |(#)
        ||                              | intr_handle(2*) |(#)
        | \                             +----------\\\----+
        |  \_____________                           \\\
        |                \                          |||
 +------|---------+ +----|----------+               |||
 |rte_pci_device  | |rte_xxx_device | (4*)          |||
 | PCI specific   | | xxx device    |               |||
 | info (mem,)    | | specific fns  |              / | \
 +----------------+ +---------------+             /  |  \
                            _____________________/  /    \
                           /                    ___/      \
            +-------------'--+    +------------'---+    +--'------------+
            |rte_pci_driver  |    |rte_vdev_driver |    |rte_xxx_driver |
            | PCI id table,  |    | <probably,     |    | ....          |
            | other driver   |    |  nothing>      |    +---------------+
            | data           |    +----------------+
            +----------------+
    
    (1*) Problem is that probe functions have different arguments. So,
         generalizing them might be some rework in the respective device
         layers
    (2*) Interrupt handling for each driver type might be different. I am not
         sure how to generalize that either. This is grey area for me.
    (3*) Probably exposing a bitmask for device capabilities. Nothing similar
         exists now to relate it. Don't know if that is useful. Allowing
         applications to question a device about what it supports and what it
         doesn't - making it more flexible at application layer (but more code
         as well.)
    (4*) Even vdev would be an instantiated as a device. It is not being done
         currently.
    (#)  Items which are still pending

With this cover letter, some patches have been posted. These are _only_ for
discussion purpose. They are not complete and neither compilable.
All the patches, except 0001, have sufficient context about what the changes
are and rationale for same. Obviously, code is best answer.

=== Patch description: ===

Patch 0001: Introduce container_of macro. Originally a patch from Jan.

Patch 0002: introduce changes to rte_device/rte_driver for rte_bus, and
            rte_bus definition itself.

Patch 0003: Add a new layer for 'bus driver' with linuxapp PCI as an example

Patch 0004: Changes with respect to rte_bus APIs and impact on eal_common_pci

Patch 0005: Change to rte_eal_init (of linuxapp only, for now) for supporting
            bus->scan. Probe is still being done old way, but in a new wrapper

Patch 0006: eth_driver removal and corresponding changes in ixgbe_ethdev, as
            an example. Includes changes to rte_ethdev to remove most possible
            PCI references. But, work still remains.

=== Pending Items/Questions: ===

 - Interrupt and notification handling. How to allow drivers to be notified
   of presence/plugging of a device.
 - Placement of bus driver/handling code. librte_bus, bus/, drivers/bus?
 -- Also from a pespective of a external library and whether symbols would be
    available in that.
 -- and secondary processes
 - VDEV bus is missing from current set.
 - Locking of list for supporting hotplugging. Or, at the least safe add/
   remove
 - PMDINFOGEN support or lack of it.
 - Is there ever a case where rte_eth_dev needs to be resolved from
   rte_pci_device? I couldn't find any such use and neither a use-case for it.
 - There should be a way for Bus APIs to return a generic list handle so that
   EAL doesn't need to bother about bus->driver_list like dereferencing. This
   is untidy as well as less portable (in terms of code movement, not arch).
 - Are more helper hooks required for a bus?
 -- I can think of scan, match, dump, find, plug (device), unplug (device),
    associate (driver), disassociate (driver). But, most of the work is
    already being done by lower instances (rte_device/driver etc).

Further:
 - In next few days I will make all necessary changes on the lines mentioned
   in the patches. This would include changing the drivers/* and librte_eal/*
 - As an when review comments float in and agreement reached, I will keep
   changing the model
 - There are grey areas like interrupt, notification, locking of bus/list
   which require more discussion. I will try and post a rfc for those as well
   or if someone can help me on those - great
 - Change would include PCI bus and VDEV bus handling. A new bus (NXP's FSLMC)
   would also be layered over this series to verify the model of 'bus
   registration'. This is also part of 17.02 roadmap.

[1] http://dpdk.org/ml/archives/dev/2016-November/050186.html

Jan Viktorin (1):
  eal: define container macro

Shreyansh Jain (5):
  eal: introduce bus-device-driver structure
  bus: add bus driver layer
  eal/common: handle bus abstraction for device/driver objects
  eal: supporting bus model in init process
  eal: removing eth_driver

 bus/Makefile                               |  36 +++
 bus/pci/Makefile                           |  37 +++
 bus/pci/linuxapp/pci_bus.c                 | 418 +++++++++++++++++++++++++++++
 bus/pci/linuxapp/pci_bus.h                 |  55 ++++
 drivers/net/ixgbe/ixgbe_ethdev.c           |  49 ++--
 lib/librte_eal/common/eal_common_bus.c     | 188 +++++++++++++
 lib/librte_eal/common/eal_common_dev.c     |  31 ++-
 lib/librte_eal/common/eal_common_pci.c     | 226 +++++++++-------
 lib/librte_eal/common/include/rte_bus.h    | 243 +++++++++++++++++
 lib/librte_eal/common/include/rte_common.h |  18 ++
 lib/librte_eal/common/include/rte_dev.h    |  36 +--
 lib/librte_eal/common/include/rte_pci.h    |  11 +-
 lib/librte_eal/linuxapp/eal/Makefile       |   1 +
 lib/librte_eal/linuxapp/eal/eal.c          |  51 +++-
 lib/librte_eal/linuxapp/eal/eal_pci.c      | 298 --------------------
 lib/librte_ether/rte_ethdev.c              |  36 ++-
 lib/librte_ether/rte_ethdev.h              |   6 +-
 17 files changed, 1262 insertions(+), 478 deletions(-)
 create mode 100644 bus/Makefile
 create mode 100644 bus/pci/Makefile
 create mode 100644 bus/pci/linuxapp/pci_bus.c
 create mode 100644 bus/pci/linuxapp/pci_bus.h
 create mode 100644 lib/librte_eal/common/eal_common_bus.c
 create mode 100644 lib/librte_eal/common/include/rte_bus.h

-- 
2.7.4

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC PATCH 1/6] eal: define container macro
  2016-11-17  5:29 [RFC PATCH 0/6] Restructure EAL device model for bus support Shreyansh Jain
@ 2016-11-17  5:30 ` Shreyansh Jain
  2016-11-17 12:06   ` Jan Blunck
  2016-11-17  5:30 ` [RFC PATCH 2/6] eal: introduce bus-device-driver structure Shreyansh Jain
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Shreyansh Jain @ 2016-11-17  5:30 UTC (permalink / raw)
  To: david.marchand; +Cc: dev, Jan Viktorin, Shreyansh Jain

From: Jan Viktorin <viktorin@rehivetech.com>

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
---
 lib/librte_eal/common/include/rte_common.h | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/lib/librte_eal/common/include/rte_common.h b/lib/librte_eal/common/include/rte_common.h
index db5ac91..8152bd9 100644
--- a/lib/librte_eal/common/include/rte_common.h
+++ b/lib/librte_eal/common/include/rte_common.h
@@ -331,6 +331,24 @@ rte_bsf32(uint32_t v)
 #define offsetof(TYPE, MEMBER)  __builtin_offsetof (TYPE, MEMBER)
 #endif
 
+/**
+ * Return pointer to the wrapping struct instance.
+ * Example:
+ *
+ *  struct wrapper {
+ *      ...
+ *      struct child c;
+ *      ...
+ *  };
+ *
+ *  struct child *x = obtain(...);
+ *  struct wrapper *w = container_of(x, struct wrapper, c);
+ */
+#ifndef container_of
+#define container_of(p, type, member) \
+	((type *) (((char *) (p)) - offsetof(type, member)))
+#endif
+
 #define _RTE_STR(x) #x
 /** Take a macro value and get a string version of it */
 #define RTE_STR(x) _RTE_STR(x)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH 2/6] eal: introduce bus-device-driver structure
  2016-11-17  5:29 [RFC PATCH 0/6] Restructure EAL device model for bus support Shreyansh Jain
  2016-11-17  5:30 ` [RFC PATCH 1/6] eal: define container macro Shreyansh Jain
@ 2016-11-17  5:30 ` Shreyansh Jain
  2016-11-17 11:19   ` Jan Blunck
  2016-11-17  5:30 ` [RFC PATCH 3/6] bus: add bus driver layer Shreyansh Jain
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Shreyansh Jain @ 2016-11-17  5:30 UTC (permalink / raw)
  To: david.marchand; +Cc: dev, Shreyansh Jain

A device is connected to a bus and services by a driver associated with
the bus. It is responsibility of the bus to identify the devices (scan)
and then assign each device to a matching driver.

A PMD would allocate a rte_xxx_driver and rte_xxx_device.
rte_xxx_driver has rte_driver and rte_bus embedded. Similarly, rte_xxx_device
has rte_device and rte_bus embedded.

When a ethernet or crypto device (rte_eth_dev, rte_cryptodev) is allocated,
it contains a reference of rte_device and rte_driver.
Each ethernet device implementation would use container_of for finding the
enclosing structure of rte_xxx_*.

                            +-------------------+
 +--------------+           |rte_pci_device     |
 |rte_eth_dev   |           |+-----------------+|
 |+------------+|   .-------->rte_device       ||
 ||rte_device*-----'        |+-----------------+|
 |+------------+|           ||rte_bus          ||
 |              |           |+-----------------+|
 /              /           +-------------------+

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
---
 lib/librte_eal/common/include/rte_bus.h | 243 ++++++++++++++++++++++++++++++++
 lib/librte_eal/common/include/rte_dev.h |  36 ++---
 2 files changed, 261 insertions(+), 18 deletions(-)
 create mode 100644 lib/librte_eal/common/include/rte_bus.h

diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
new file mode 100644
index 0000000..dc3aeb8
--- /dev/null
+++ b/lib/librte_eal/common/include/rte_bus.h
@@ -0,0 +1,243 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 NXP
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of NXP nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_BUS_H_
+#define _RTE_BUS_H_
+
+/**
+ * @file
+ *
+ * RTE PMD Bus Abstraction interfaces
+ *
+ * This file exposes APIs and Interfaces for Bus Abstraction over the devices
+ * drivers in EAL.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdio.h>
+#include <sys/queue.h>
+
+#include <rte_log.h>
+#include <rte_dev.h>
+
+
+/** Double linked list of buses */
+TAILQ_HEAD(rte_bus_list, rte_bus);
+
+/**
+ * Bus specific scan for devices attached on the bus.
+ * For each bus object, the scan would be reponsible for finding devices and
+ * adding them to its private device list.
+ *
+ * Successful detection of a device results in rte_device object which is
+ * embedded within the respective device type (rte_pci_device, for example).
+ * Thereafter, PCI specific bus would need to perform
+ * container_of(rte_pci_device) to obtain PCI device object.
+ *
+ * Scan failure of a bus is not treated as exit criteria for application. Scan
+ * for all other buses would still continue.
+ *
+ * @param void
+ * @return
+ *	0 for successful scan
+ *	!0 (<0) for unsuccessful scan with error value
+ */
+typedef int (* bus_scan_t)(void);
+
+/**
+ * Bus specific match for devices and drivers which can service them.
+ * For each scanned device, during probe the match would link the devices with
+ * drivers which can service the device.
+ *
+ * It is the work of each bus handler to obtain the specific device object
+ * using container_of (or typecasting, as a less preferred way).
+ *
+ * @param drv
+ *	Driver object attached to the bus
+ * @param dev
+ *	Device object which is being probed.
+ * @return
+ *	0 for successful match
+ *	!0 for unsuccessful match
+ */
+typedef int (* bus_match_t)(struct rte_driver *drv, struct rte_device *dev);
+
+/**
+ * Dump the devices on the bus.
+ * Each bus type can define its own definition of information to dump.
+ *
+ * @param bus
+ *	Handle for bus, device from which are to be dumped.
+ * @param f
+ *	Handle to output device or file.
+ * @return void
+ */
+typedef void (* bus_dump_t)(struct rte_bus *bus, FILE *f);
+
+/**
+ * Search for a specific device in device list of the bus
+ * This would rely on the bus specific addressing. Each implementation would
+ * extract its specific device type and perform address compare.
+ *
+ * @param dev
+ *	device handle to search for.
+ * @return
+ *	rte_device handle for matched device, or NULL
+ */
+typedef struct rte_device * (* bus_device_get_t)(struct rte_device *dev);
+
+struct rte_bus {
+	TAILQ_ENTRY(rte_bus) next;   /**< Next bus object in linked list */
+	struct rte_driver_list driver_list; /**< List of all drivers of bus */
+	struct rte_device_list device_list; /**< List of all devices on bus */
+	const char *name;            /**< Name of the bus */
+	/* Mandatory hooks */
+	bus_scan_t *scan;            /**< Hook for scanning for devices */
+	bus_match_t *match;          /**< Hook for matching device & driver */
+	/* Optional hooks */
+	bus_dump_t *dump_dev;        /**< Hook for dumping devices on bus */
+	bus_device_get_t *find_dev;  /**< Search for a device on bus */
+};
+
+/** @internal
+ * Add a device to a bus.
+ *
+ * @param bus
+ *	Bus on which device is to be added
+ * @param dev
+ *	Device handle
+ * @return
+ *	None
+ */
+void
+rte_eal_bus_add_device(struct rte_bus *bus, struct rte_device *dev);
+
+/** @internal
+ * Remove a device from its bus.
+ *
+ * @param dev
+ *	Device handle to remove
+ * @return
+ *	None
+ */
+void
+rte_eal_bus_remove_device(struct rte_device *dev);
+
+/** @internal
+ * Associate a driver with a bus.
+ *
+ * @param bus
+ *	Bus on which driver is to be added
+ * @param dev
+ *	Driver handle
+ * @return
+ *	None
+ */
+void
+rte_eal_bus_add_driver(struct rte_bus *bus, struct rte_driver *drv);
+
+/** @internal
+ * Disassociate a driver from its bus.
+ *
+ * @param dev
+ *	Driver handle to remove
+ * @return
+ *	None
+ */
+void
+rte_eal_bus_remove_driver(struct rte_driver *drv);
+
+/**
+ * Register a Bus handler.
+ *
+ * @param driver
+ *   A pointer to a rte_bus structure describing the bus
+ *   to be registered.
+ */
+void rte_eal_bus_register(struct rte_bus *bus);
+
+/**
+ * Unregister a Bus handler.
+ *
+ * @param driver
+ *   A pointer to a rte_bus structure describing the bus
+ *   to be unregistered.
+ */
+void rte_eal_bus_unregister(struct rte_bus *bus);
+
+/**
+ * Obtain handle for bus given its name.
+ *
+ * @param bus_name
+ *	Name of the bus handle to search
+ * @return
+ *	Pointer to Bus object if name matches any registered bus object
+ *	NULL, if no matching bus found
+ */
+struct rte_bus * rte_eal_get_bus(const char *bus_name);
+
+/**
+ * Register a device driver.
+ *
+ * @param driver
+ *   A pointer to a rte_dev structure describing the driver
+ *   to be registered.
+ */
+void rte_eal_driver_register(struct rte_driver *driver);
+
+/**
+ * Unregister a device driver.
+ *
+ * @param driver
+ *   A pointer to a rte_dev structure describing the driver
+ *   to be unregistered.
+ */
+void rte_eal_driver_unregister(struct rte_driver *driver);
+
+/** Helper for Bus registration */
+#define RTE_PMD_REGISTER_BUS(nm, bus) \
+RTE_INIT(businitfn_ ##nm); \
+static void businitfn_ ##nm(void) \
+{\
+	(bus).name = RTE_STR(nm);\
+	rte_eal_bus_register(&bus); \
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_BUS_H */
diff --git a/lib/librte_eal/common/include/rte_dev.h b/lib/librte_eal/common/include/rte_dev.h
index 8840380..b08bab5 100644
--- a/lib/librte_eal/common/include/rte_dev.h
+++ b/lib/librte_eal/common/include/rte_dev.h
@@ -116,12 +116,14 @@ TAILQ_HEAD(rte_device_list, rte_device);
 
 /* Forward declaration */
 struct rte_driver;
+struct rte_bus;
 
 /**
  * A structure describing a generic device.
  */
 struct rte_device {
 	TAILQ_ENTRY(rte_device) next; /**< Next device */
+	struct rte_bus *bus;          /**< Bus on which device is placed */
 	struct rte_driver *driver;    /**< Associated driver */
 	int numa_node;                /**< NUMA node connection */
 	struct rte_devargs *devargs;  /**< Device user arguments */
@@ -144,31 +146,29 @@ void rte_eal_device_insert(struct rte_device *dev);
 void rte_eal_device_remove(struct rte_device *dev);
 
 /**
- * A structure describing a device driver.
+ * @internal
+ * TODO
  */
-struct rte_driver {
-	TAILQ_ENTRY(rte_driver) next;  /**< Next in list. */
-	const char *name;                   /**< Driver name. */
-	const char *alias;              /**< Driver alias. */
-};
+typedef int (*driver_init_t)(struct rte_device *eth_dev);
 
 /**
- * Register a device driver.
- *
- * @param driver
- *   A pointer to a rte_dev structure describing the driver
- *   to be registered.
+ * @internal
+ * TODO
  */
-void rte_eal_driver_register(struct rte_driver *driver);
+typedef int (*driver_uninit_t)(struct rte_device *eth_dev);
 
 /**
- * Unregister a device driver.
- *
- * @param driver
- *   A pointer to a rte_dev structure describing the driver
- *   to be unregistered.
+ * A structure describing a device driver.
  */
-void rte_eal_driver_unregister(struct rte_driver *driver);
+struct rte_driver {
+	TAILQ_ENTRY(rte_driver) next;  /**< Next in list. */
+	struct rte_bus *bus;           /**< Bus which drivers services */
+	const char *name;              /**< Driver name. */
+	const char *alias;             /**< Driver alias. */
+	driver_init_t *init;           /**< Driver initialization */
+	driver_uninit_t *uninit;       /**< Driver uninitialization */
+	unsigned int dev_private_size; /**< Size of device private data ??*/
+};
 
 /**
  * Initalize all the registered drivers in this process
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH 3/6] bus: add bus driver layer
  2016-11-17  5:29 [RFC PATCH 0/6] Restructure EAL device model for bus support Shreyansh Jain
  2016-11-17  5:30 ` [RFC PATCH 1/6] eal: define container macro Shreyansh Jain
  2016-11-17  5:30 ` [RFC PATCH 2/6] eal: introduce bus-device-driver structure Shreyansh Jain
@ 2016-11-17  5:30 ` Shreyansh Jain
  2016-11-17  5:30 ` [RFC PATCH 4/6] eal/common: handle bus abstraction for device/driver objects Shreyansh Jain
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Shreyansh Jain @ 2016-11-17  5:30 UTC (permalink / raw)
  To: david.marchand; +Cc: dev, Shreyansh Jain

A bus is managed using a 'bus driver'. This patch introduces a sample
PCI bus driver which essentially is the scan and match implementation for
PCI.

There are multiple possible alternatives to where such a driver can be
kept within the DPDK code:
1. librte_bus
 - For each type of bus, there would a file like 'eal_xxx_bus' which would
   then be bound with the eal library.
 - Within this way, another possibility was librte_bus_xxx.

2. Within the drivers/* folder:
 - drivers/bus, parallel to drivers/net and drivers/crypto
 - This way, each bus implmentation would be within the drivers/* area and
   a new implementation would only mean adding a new set of files and
   corresponding changes to the Makefiles.

3. Problem with (3) is that the naming is misleading. drivers/net and
   drivers/crypto are essentially two functionalities rather than drivers.
   Putting driver/bus parallel to these would be misleading in literal
   terms.
   Another possibility is to keep the drivers parallel to 'drivers' folder
   in root of DPDK.
   This is the implementation preferred in this patch.
   A new bus would mean adding a new folder structure within 'bus' -
   including distinction between linuxapp and bsdapp.

In all the three cases, the bus drivers would be instantiated using a
constructor similar to RTE_PMD_REGISTER_XXX. OR, specifically in case of
(2), it would be a priority based constructor.
(__attribute__((contructor(XX))) to assure that buses are loaded before
the drivers are loaded.

Further, as of now the 'pci_bus.c' only shows the scan and dump hooks.
rte_bus also includes hooks for 'dump'ing all devices on the bus and
'find_dev' for finding a device given its rte_device. Each of these
'optional' implementation are more like helpers but have implementation
which is specific to the bus. Hooks for finding a device can be used for
hotplugging (searching before adding a new device, removing existing).
Similarly, 'dump' is a way to represent device information in bus specific
way.

Missing/Grey areas:
 - A lot of PCI specific code is still in EAL - for example mapping. These
   can be moved into bus. Mapping of memory should be a bus property.
 - PMDINFOGEN symbol support for bus - this patch doesn't take care of
   that
 - there would be multiple lists for each bus. Would those be shared
   across various libraries, internal or external, which can be associated
   with DPDK? Should a tailq registration like method be used?

Pending work:
 - The integration of bus/* with librte_eal is not complete. There might
   be symbol issues.
 - Notification support for drivers over a bus doesn't exist yet. Probably
   that needs to be integrated with interrupt handling - no work on this
   has been done yet.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
---
 bus/Makefile               |  36 ++++
 bus/pci/Makefile           |  37 ++++
 bus/pci/linuxapp/pci_bus.c | 418 +++++++++++++++++++++++++++++++++++++++++++++
 bus/pci/linuxapp/pci_bus.h |  55 ++++++
 4 files changed, 546 insertions(+)
 create mode 100644 bus/Makefile
 create mode 100644 bus/pci/Makefile
 create mode 100644 bus/pci/linuxapp/pci_bus.c
 create mode 100644 bus/pci/linuxapp/pci_bus.h

diff --git a/bus/Makefile b/bus/Makefile
new file mode 100644
index 0000000..cfa548c
--- /dev/null
+++ b/bus/Makefile
@@ -0,0 +1,36 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2016 NXP.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of NXP nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+DIRS-y += pci
+
+include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/bus/pci/Makefile b/bus/pci/Makefile
new file mode 100644
index 0000000..fb6cc44
--- /dev/null
+++ b/bus/pci/Makefile
@@ -0,0 +1,37 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of Intel Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+DIRS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += linuxapp
+DIRS-$(CONFIG_RTE_EXEC_ENV_BSDAPP) += bsdapp
+
+include $(RTE_SDK)/mk/rte.subdir.mk
diff --git a/bus/pci/linuxapp/pci_bus.c b/bus/pci/linuxapp/pci_bus.c
new file mode 100644
index 0000000..b61cb0a
--- /dev/null
+++ b/bus/pci/linuxapp/pci_bus.c
@@ -0,0 +1,418 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 NXP.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of NXP nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <string.h>
+#include <dirent.h>
+
+#include <rte_log.h>
+#include <rte_bus.h>
+#include <rte_pci.h>
+#include <rte_eal_memconfig.h>
+#include <rte_malloc.h>
+#include <rte_devargs.h>
+#include <rte_memcpy.h>
+
+#include "eal_filesystem.h"
+#include "eal_private.h"
+#include "eal_pci_init.h"
+
+
+
+/**
+ * Default PCI matching function called duing probe for drivers
+ *
+ * @param drv
+ *	The PCI driver handle
+ * @param dev
+ *	The PCI device handle
+ * @return
+ *	0 for successful match
+ *	!0 for unsuccessful match
+ */
+int rte_eal_pci_match_default(struct rte_pci_driver *drv,
+			      struct rte_pci_device *dev);
+
+
+/*
+ * split up a pci address into its constituent parts.
+ */
+static int
+parse_pci_addr_format(const char *buf, int bufsize, uint16_t *domain,
+		      uint8_t *bus, uint8_t *devid, uint8_t *function)
+{
+	/* first split on ':' */
+	union splitaddr {
+		struct {
+			char *domain;
+			char *bus;
+			char *devid;
+			char *function;
+		};
+		char *str[PCI_FMT_NVAL];
+		/* last element-separator is "." not ":" */
+	} splitaddr;
+
+	char *buf_copy = strndup(buf, bufsize);
+	if (buf_copy == NULL)
+		return -1;
+
+	if (rte_strsplit(buf_copy, bufsize, splitaddr.str, PCI_FMT_NVAL, ':')
+			!= PCI_FMT_NVAL - 1)
+		goto error;
+	/* final split is on '.' between devid and function */
+	splitaddr.function = strchr(splitaddr.devid,'.');
+	if (splitaddr.function == NULL)
+		goto error;
+	*splitaddr.function++ = '\0';
+
+	/* now convert to int values */
+	errno = 0;
+	*domain = (uint16_t)strtoul(splitaddr.domain, NULL, 16);
+	*bus = (uint8_t)strtoul(splitaddr.bus, NULL, 16);
+	*devid = (uint8_t)strtoul(splitaddr.devid, NULL, 16);
+	*function = (uint8_t)strtoul(splitaddr.function, NULL, 10);
+	if (errno != 0)
+		goto error;
+
+	free(buf_copy); /* free the copy made with strdup */
+	return 0;
+error:
+	free(buf_copy);
+	return -1;
+}
+
+/* parse the "resource" sysfs file */
+static int
+pci_parse_sysfs_resource(const char *filename,
+			 struct rte_pci_device *dev)
+{
+	FILE *f;
+	char buf[BUFSIZ];
+	int i;
+	uint64_t phys_addr, end_addr, flags;
+
+	f = fopen(filename, "r");
+	if (f == NULL) {
+		RTE_LOG(ERR, EAL, "Cannot open sysfs resource\n");
+		return -1;
+	}
+
+	for (i = 0; i<PCI_MAX_RESOURCE; i++) {
+
+		if (fgets(buf, sizeof(buf), f) == NULL) {
+			RTE_LOG(ERR, EAL,
+				"%s(): cannot read resource\n", __func__);
+			goto error;
+		}
+		if (pci_parse_one_sysfs_resource(buf, sizeof(buf), &phys_addr,
+				&end_addr, &flags) < 0)
+			goto error;
+
+		if (flags & IORESOURCE_MEM) {
+			dev->mem_resource[i].phys_addr = phys_addr;
+			dev->mem_resource[i].len = end_addr - phys_addr + 1;
+			/* not mapped for now */
+			dev->mem_resource[i].addr = NULL;
+		}
+	}
+	fclose(f);
+	return 0;
+
+error:
+	fclose(f);
+	return -1;
+}
+
+/* Scan one pci sysfs entry, and fill the devices list from it. */
+static int
+pci_scan_one(const char *dirname, uint16_t domain, uint8_t bus,uint8_t devid,
+	     uint8_t function)
+{
+	char filename[PATH_MAX];
+	unsigned long tmp;
+	struct rte_pci_device *pci_dev;
+	struct rte_bus *pci_bus;
+	struct rte_device *dev;
+	char driver[PATH_MAX];
+	int ret;
+
+	/* Get the PCI bus for device; It it doesn't exist, can't continue */
+	pci_bus = rte_eal_get_bus(PCI_BUS_NAME);
+	if (!pci_bus) {
+		RTE_LOG(ERR, EAL, "Unable to get PCI bus.\n");
+		return -1;
+	}
+
+	pci_dev = malloc(sizeof(*pci_dev));
+	if (pci_dev == NULL)
+		return -1;
+
+	memset(pci_dev, 0, sizeof(*pci_dev));
+	pci_dev->addr.domain = domain;
+	pci_dev->addr.bus = bus;
+	pci_dev->addr.devid = devid;
+	pci_dev->addr.function = function;
+
+	/* Set Bus */
+	pci_dev->device->bus = pci_bus;
+
+	/* get vendor id */
+	snprintf(filename, sizeof(filename), "%s/vendor", dirname);
+	if (eal_parse_sysfs_value(filename, &tmp) < 0) {
+		free(pci_dev);
+		return -1;
+	}
+	pci_dev->id.vendor_id = (uint16_t)tmp;
+
+	/* get device id */
+	snprintf(filename, sizeof(filename), "%s/device", dirname);
+	if (eal_parse_sysfs_value(filename, &tmp) < 0) {
+		free(pci_dev);
+		return -1;
+	}
+	pci_dev->id.device_id = (uint16_t)tmp;
+
+	/* get subsystem_vendor id */
+	snprintf(filename, sizeof(filename), "%s/subsystem_vendor",
+		 dirname);
+	if (eal_parse_sysfs_value(filename, &tmp) < 0) {
+		free(pci_dev);
+		return -1;
+	}
+	pci_dev->id.subsystem_vendor_id = (uint16_t)tmp;
+
+	/* get subsystem_device id */
+	snprintf(filename, sizeof(filename), "%s/subsystem_device",
+		 dirname);
+	if (eal_parse_sysfs_value(filename, &tmp) < 0) {
+		free(pci_dev);
+		return -1;
+	}
+	pci_dev->id.subsystem_device_id = (uint16_t)tmp;
+
+	/* get class_id */
+	snprintf(filename, sizeof(filename), "%s/class",
+		 dirname);
+	if (eal_parse_sysfs_value(filename, &tmp) < 0) {
+		free(pci_dev);
+		return -1;
+	}
+	/* the least 24 bits are valid: class, subclass, program interface */
+	pci_dev->id.class_id = (uint32_t)tmp & RTE_CLASS_ANY_ID;
+
+	/* get max_vfs */
+	pci_dev->max_vfs = 0;
+	snprintf(filename, sizeof(filename), "%s/max_vfs", dirname);
+	if (!access(filename, F_OK) &&
+	    eal_parse_sysfs_value(filename, &tmp) == 0)
+		pci_dev->max_vfs = (uint16_t)tmp;
+	else {
+		/* for non igb_uio driver, need kernel version >= 3.8 */
+		snprintf(filename, sizeof(filename),
+			 "%s/sriov_numvfs", dirname);
+		if (!access(filename, F_OK) &&
+		    eal_parse_sysfs_value(filename, &tmp) == 0)
+			pci_dev->max_vfs = (uint16_t)tmp;
+	}
+
+	/* get numa node */
+	snprintf(filename, sizeof(filename), "%s/numa_node",
+		 dirname);
+	if (access(filename, R_OK) != 0) {
+		/* if no NUMA support, set default to 0 */
+		pci_dev->device.numa_node = 0;
+	} else {
+		if (eal_parse_sysfs_value(filename, &tmp) < 0) {
+			free(pci_dev);
+			return -1;
+		}
+		pci_dev->device.numa_node = tmp;
+	}
+
+	/* parse resources */
+	snprintf(filename, sizeof(filename), "%s/resource", dirname);
+	if (pci_parse_sysfs_resource(filename, pci_dev) < 0) {
+		RTE_LOG(ERR, EAL, "%s(): cannot parse resource\n", __func__);
+		free(pci_dev);
+		return -1;
+	}
+
+	/* parse driver */
+	snprintf(filename, sizeof(filename), "%s/driver", dirname);
+	ret = pci_get_kernel_driver_by_path(filename, driver);
+	if (ret < 0) {
+		RTE_LOG(ERR, EAL, "Fail to get kernel driver\n");
+		free(pci_dev);
+		return -1;
+	}
+
+	if (!ret) {
+		if (!strcmp(driver, "vfio-pci"))
+			pci_dev->kdrv = RTE_KDRV_VFIO;
+		else if (!strcmp(driver, "igb_uio"))
+			pci_dev->kdrv = RTE_KDRV_IGB_UIO;
+		else if (!strcmp(driver, "uio_pci_generic"))
+			pci_dev->kdrv = RTE_KDRV_UIO_GENERIC;
+		else
+			pci_dev->kdrv = RTE_KDRV_UNKNOWN;
+	} else
+		pci_dev->kdrv = RTE_KDRV_NONE;
+
+	/* device is valid, add in list (sorted) */
+	if (TAILQ_EMPTY(&pci_bus->device_list)) {
+		rte_eal_bus_add_device(pci_bus, dev->device);
+	} else {
+		struct rte_device *dev2 = NULL;
+		struct rte_pci_device *pci_dev2;
+		int ret;
+
+		TAILQ_FOREACH(dev2, &pci_bus->device_list, next) {
+			pci_dev2 = container_of(dev2, struct rte_pci_device,
+						device);
+
+			ret = rte_eal_compare_pci_addr(&pci_dev->addr,
+						       &pci_dev2->addr);
+			if (ret > 0)
+				continue;
+
+			if (ret < 0) {
+				/* TODO Pending: 'before' handler for
+				 * bus->device_list
+				 */
+				rte_eal_bus_add_device(pci_bus,
+						       pci_dev2->device);
+			} else { /* already registered */
+				pci_dev2->kdrv = pci_dev->kdrv;
+				pci_dev2->max_vfs = pci_dev->max_vfs;
+				memmove(pci_dev2->mem_resource,
+					pci_dev->mem_resource,
+					sizeof(pci_dev->mem_resource));
+				free(pci_dev);
+			}
+			return 0;
+		}
+		rte_eal_device_insert(&pci_dev->device);
+		TAILQ_INSERT_TAIL(&pci_device_list, pci_dev, next);
+	}
+
+	return 0;
+}
+
+/* Default PCI match implementation */
+int
+pci_bus_match(struct rte_pci_driver *drv,
+			  struct rte_pci_device *dev)
+{
+	int ret = 1;
+	const struct rte_pci_id *id_table;
+
+	for (id_table = drv->id_table; id_table->vendor_id != 0; id_table++) {
+
+		/* check if device's identifiers match the driver's ones */
+		if (id_table->vendor_id != dev->id.vendor_id &&
+				id_table->vendor_id != PCI_ANY_ID)
+			continue;
+		if (id_table->device_id != dev->id.device_id &&
+				id_table->device_id != PCI_ANY_ID)
+			continue;
+		if (id_table->subsystem_vendor_id !=
+		    dev->id.subsystem_vendor_id &&
+		    id_table->subsystem_vendor_id != PCI_ANY_ID)
+			continue;
+		if (id_table->subsystem_device_id !=
+		    dev->id.subsystem_device_id &&
+		    id_table->subsystem_device_id != PCI_ANY_ID)
+			continue;
+		if (id_table->class_id != dev->id.class_id &&
+				id_table->class_id != RTE_CLASS_ANY_ID)
+			continue;
+		ret = 0;
+		break;
+	}
+
+	/* Returning a >0 value as <0 is considered error by caller */
+	return ret;
+}
+
+/*
+ * Scan the content of the PCI bus, and the devices in the devices
+ * list
+ */
+int
+pci_bus_scan(void)
+{
+	struct dirent *e;
+	DIR *dir;
+	char dirname[PATH_MAX];
+	uint16_t domain;
+	uint8_t bus, devid, function;
+
+	/* for debug purposes, PCI can be disabled */
+	if (internal_config.no_pci)
+		return 0;
+
+	dir = opendir(pci_get_sysfs_path());
+	if (dir == NULL) {
+		RTE_LOG(ERR, EAL, "%s(): opendir failed: %s\n",
+			__func__, strerror(errno));
+		return -1;
+	}
+
+	while ((e = readdir(dir)) != NULL) {
+		if (e->d_name[0] == '.')
+			continue;
+
+		if (parse_pci_addr_format(e->d_name, sizeof(e->d_name),
+				&domain, &bus, &devid, &function) != 0)
+			continue;
+
+		snprintf(dirname, sizeof(dirname), "%s/%s",
+				pci_get_sysfs_path(), e->d_name);
+		if (pci_scan_one(dirname, domain, bus, devid, function) < 0)
+			goto error;
+	}
+	closedir(dir);
+	return 0;
+
+error:
+	closedir(dir);
+	return -1;
+}
+
+struct rte_bus pci_bus = {
+	.scan = pci_bus_scan,
+	.match = pci_bus_match,
+	.dump = NULL,
+};
+
+RTE_PMD_REGISTER_BUS("pci_bus", pci_bus);
diff --git a/bus/pci/linuxapp/pci_bus.h b/bus/pci/linuxapp/pci_bus.h
new file mode 100644
index 0000000..580fa36
--- /dev/null
+++ b/bus/pci/linuxapp/pci_bus.h
@@ -0,0 +1,55 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 NXP
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of NXP nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/**
+ * Default PCI matching function called duing probe for drivers
+ *
+ * @param drv
+ *	The PCI driver handle
+ * @param dev
+ *	The PCI device handle
+ * @return
+ *	0 for successful match
+ *	!0 for unsuccessful match
+ */
+int bus_pci_match(struct rte_pci_driver *drv,
+		  struct rte_pci_device *dev);
+
+/**
+* Scan the content of the PCI bus, and the devices in the devices
+* list
+*
+* @return
+*  0 on success, negative on error
+*/
+int bus_pci_scan(void);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH 4/6] eal/common: handle bus abstraction for device/driver objects
  2016-11-17  5:29 [RFC PATCH 0/6] Restructure EAL device model for bus support Shreyansh Jain
                   ` (2 preceding siblings ...)
  2016-11-17  5:30 ` [RFC PATCH 3/6] bus: add bus driver layer Shreyansh Jain
@ 2016-11-17  5:30 ` Shreyansh Jain
  2016-11-17  5:30 ` [RFC PATCH 5/6] eal: supporting bus model in init process Shreyansh Jain
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Shreyansh Jain @ 2016-11-17  5:30 UTC (permalink / raw)
  To: david.marchand; +Cc: dev, Shreyansh Jain

Primary changes done by this patch are based on:
 - Devices belong to the bus hence the device list is bus instance
   specific
 - Similarly, drivers belong to the bus and thus they too are enclosed
   within the bus instance.
 - All device insertion and driver registration should proceed through
   bus APIs. A new file, eal_common_bus.c has been added for that.

Exiting driver registration and device insert/remove APIs have been
modified to work with bus on which device/driver belong. On the same
lines, the PCI common functions have been modified to work with bus rather
than device/driver directly.

rte_eal_pci_scan is no longer an exposed API. It is part of the bus
implementation. Though, probe continues to be part of the common PCI
operations.

Probe has been split into match and probe. Match has been moved to bus/*
code and is a hook now. EAL code would be modified to handle this hook.

Missing/Grey area:
 - Some API like inserting a device at a particular position in the device
   list are missing. Same for driver. These are needed for cases where
   device update is done rather than addition.
 - Probe is a property of a driver but it should be initiated from a bus,
   for example when added a new device (hotplugging). At present
   rte_driver has the probe hook. This should be wrapped around some API
   at the bus level so that bus can search through multiple drivers
   associated with it for calling probe.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
---
 lib/librte_eal/common/eal_common_bus.c  | 188 ++++++++++++++++++++++++++
 lib/librte_eal/common/eal_common_dev.c  |  31 +++--
 lib/librte_eal/common/eal_common_pci.c  | 226 +++++++++++++++++++-------------
 lib/librte_eal/common/include/rte_pci.h |  11 +-
 4 files changed, 342 insertions(+), 114 deletions(-)
 create mode 100644 lib/librte_eal/common/eal_common_bus.c

diff --git a/lib/librte_eal/common/eal_common_bus.c b/lib/librte_eal/common/eal_common_bus.c
new file mode 100644
index 0000000..3de1ac7
--- /dev/null
+++ b/lib/librte_eal/common/eal_common_bus.c
@@ -0,0 +1,188 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2016 NXP
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of NXP nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdio.h>
+#include <string.h>
+#include <inttypes.h>
+#include <sys/queue.h>
+
+#include <rte_dev.h>
+#include <rte_devargs.h>
+#include <rte_debug.h>
+#include <rte_devargs.h>
+#include <rte_log.h>
+
+#include "eal_private.h"
+
+/** @internal
+ * Add a device to a bus.
+ */
+void
+rte_eal_bus_add_device(struct rte_bus *bus, struct rte_device *dev)
+{
+	/* XXX all the additions can be address ordered ?
+	 * for example, calling rte_eal_compare_pci_addr and getting <=
+	 * and performing insert a specific location
+	 */
+	RTE_VERIFY(bus);
+	RTE_VERIFY(dev);
+
+	TAILQ_INSERT_TAIL(&bus->device_list, dev, next);
+}
+
+/** @internal
+ * Remove a device from its bus.
+ */
+void
+rte_eal_bus_remove_device(struct rte_device *dev)
+{
+	struct rte_bus *bus;
+	RTE_VERIFY(bus);
+	RTE_VERIFY(dev);
+
+	bus = dev->bus;
+	TAILQ_REMOVE(&bus->device_list, dev, next);
+}
+
+/** @internal
+ * Associate a driver with a bus.
+ */
+void
+rte_eal_bus_add_driver(struct rte_bus *bus, struct rte_driver *drv)
+{
+	RTE_VERIFY(bus);
+	RTE_VERIFY(drv);
+
+	TAILQ_INSERT_TAIL(&bus->driver_list, drv, next);
+}
+
+/** @internal
+ * Disassociate a driver from bus.
+ */
+void
+rte_eal_bus_remove_driver(struct rte_driver *drv)
+{
+	struct rte_bus *bus;
+	RTE_VERIFY(bus);
+	RTE_VERIFY(drv);
+
+	bus = drv->bus;
+	TAILQ_REMOVE(&bus->driver_list, dev, next);
+}
+
+/**
+ * Scan all the associated buses
+ */
+int
+rte_eal_bus_scan(void)
+{
+	int ret = 0;
+	struct rte_bus *bus;
+
+	if (TAILQ_EMPTY(&rte_bus_list)) {
+		return 0;
+	}
+
+	TAILQ_FOREACH(bus, &rte_bus_list, next) {
+		ret = bus->scan();
+		if (ret) {
+			RTE_LOG(ERR, EAL, "Scan for [%s] bus failed.\n",
+				bus->name);
+			continue; /* not a reason to break other bus scan */
+		}
+	}
+
+	/* This function essentially never returns an error - in case of error
+	 * no devices (or limited devices) are available to the application
+	 * which then can fail/report error.
+	 */
+	return 0;
+}
+
+/**
+ * Get the bus handle using its name
+ */
+struct rte_bus *
+rte_eal_get_bus(const char *bus_name)
+{
+	struct rte_bus *bus;
+
+	RTE_VERIFY(bus_name);
+
+	TAILQ_FOREACH(bus, &rte_bus_list, next) {
+		RTE_VERIFY(bus->name);
+
+		if (!strncmp(bus_name, bus->name)) {
+			RTE_LOG(DEBUG, EAL, "Returning Bus object %p\n", bus);
+			return bus;
+		}
+	}
+
+	/* Unable to find bus requested */
+	return NULL;
+}
+
+/* register a bus */
+void
+rte_eal_bus_register(struct rte_bus *bus)
+{
+	/* 3 conditions must meet:
+	 * 1. scan hook should be defined.
+	 * 2. match hook should be defined.
+	 * 3. Name should be a valid string. (valid?)
+	 */
+	RTE_VERIFY(bus);
+	RTE_VERIFY(bus->scan);
+	RTE_VERIFY(bus->probe);
+	RTE_VERIFY(bus->name && strlen(bus->name));
+
+	/* Initialize the driver and device list associated with the bus */
+	TAILQ_HEAD_INITIALIZER(&(bus->driver_list));
+	TAILQ_HEAD_INITIALIZER(&(bus->device_list));
+
+	TAILQ_INSERT_TAIL(&rte_bus_list, bus, next);
+	RTE_LOG(DEBUG, EAL, "Registered [%s] bus.\n", bus->name);
+}
+
+/* unregister a bus */
+void
+rte_eal_bus_unregister(struct rte_bus *bus)
+{
+	/* All devices and drivers associated with the bus should have been
+	 * 'device->uninit' and 'driver->remove()' already.
+	 */
+	RTE_VERIFY(TAILQ_EMPTY(&(bus->driver_list)));
+	RTE_VERIFY(TAILQ_EMPTY(&(bus->device_list)));
+
+	TAILQ_REMOVE(&rte_bus_list, bus, next);
+}
diff --git a/lib/librte_eal/common/eal_common_dev.c b/lib/librte_eal/common/eal_common_dev.c
index 4f3b493..bb8d266 100644
--- a/lib/librte_eal/common/eal_common_dev.c
+++ b/lib/librte_eal/common/eal_common_dev.c
@@ -45,35 +45,44 @@
 
 #include "eal_private.h"
 
-/** Global list of device drivers. */
-static struct rte_driver_list dev_driver_list =
-	TAILQ_HEAD_INITIALIZER(dev_driver_list);
-/** Global list of device drivers. */
-static struct rte_device_list dev_device_list =
-	TAILQ_HEAD_INITIALIZER(dev_device_list);
-
 /* register a driver */
 void
 rte_eal_driver_register(struct rte_driver *driver)
 {
-	TAILQ_INSERT_TAIL(&dev_driver_list, driver, next);
+	struct rte_bus *bus;
+
+	RTE_VERIFY(driver && driver->bus);
+	bus = driver->bus;
+	TAILQ_INSERT_TAIL(&(bus->driver_list), driver, next);
 }
 
 /* unregister a driver */
 void
 rte_eal_driver_unregister(struct rte_driver *driver)
 {
-	TAILQ_REMOVE(&dev_driver_list, driver, next);
+	struct rte_bus;
+
+	RTE_VERIFY(driver && driver->bus);
+	bus = driver->bus;
+	TAILQ_REMOVE(&(bus->driver_list), driver, next);
 }
 
 void rte_eal_device_insert(struct rte_device *dev)
 {
-	TAILQ_INSERT_TAIL(&dev_device_list, dev, next);
+	struct rte_bus *bus;
+
+	RTE_VERIFY(dev && dev->bus);
+	bus = dev->bus;
+	TAILQ_INSERT_TAIL(&(bus->device_list), dev, next);
 }
 
 void rte_eal_device_remove(struct rte_device *dev)
 {
-	TAILQ_REMOVE(&dev_device_list, dev, next);
+	struct rte_bus *bus;
+
+	RTE_VERIFY(dev && dev->bus);
+	bus = dev->bus;
+	TAILQ_REMOVE(&(bus->device_list), dev, next);
 }
 
 int
diff --git a/lib/librte_eal/common/eal_common_pci.c b/lib/librte_eal/common/eal_common_pci.c
index 971ad20..7a6d258 100644
--- a/lib/librte_eal/common/eal_common_pci.c
+++ b/lib/librte_eal/common/eal_common_pci.c
@@ -82,12 +82,13 @@
 
 #include "eal_private.h"
 
-struct pci_driver_list pci_driver_list =
-	TAILQ_HEAD_INITIALIZER(pci_driver_list);
-struct pci_device_list pci_device_list =
-	TAILQ_HEAD_INITIALIZER(pci_device_list);
-
 #define SYSFS_PCI_DEVICES "/sys/bus/pci/devices"
+#define PCI_BUS_NAME "pci_bus"
+
+static struct rte_bus_list pci_bus_list =
+	TAILQ_HEAD_INITIALIZER(pci_bus_list);
+
+struct rte_bus pci_bus;
 
 const char *pci_get_sysfs_path(void)
 {
@@ -160,64 +161,40 @@ static int
 rte_eal_pci_probe_one_driver(struct rte_pci_driver *dr, struct rte_pci_device *dev)
 {
 	int ret;
-	const struct rte_pci_id *id_table;
-
-	for (id_table = dr->id_table; id_table->vendor_id != 0; id_table++) {
-
-		/* check if device's identifiers match the driver's ones */
-		if (id_table->vendor_id != dev->id.vendor_id &&
-				id_table->vendor_id != PCI_ANY_ID)
-			continue;
-		if (id_table->device_id != dev->id.device_id &&
-				id_table->device_id != PCI_ANY_ID)
-			continue;
-		if (id_table->subsystem_vendor_id != dev->id.subsystem_vendor_id &&
-				id_table->subsystem_vendor_id != PCI_ANY_ID)
-			continue;
-		if (id_table->subsystem_device_id != dev->id.subsystem_device_id &&
-				id_table->subsystem_device_id != PCI_ANY_ID)
-			continue;
-		if (id_table->class_id != dev->id.class_id &&
-				id_table->class_id != RTE_CLASS_ANY_ID)
-			continue;
-
-		struct rte_pci_addr *loc = &dev->addr;
-
-		RTE_LOG(INFO, EAL, "PCI device "PCI_PRI_FMT" on NUMA socket %i\n",
-				loc->domain, loc->bus, loc->devid, loc->function,
-				dev->device.numa_node);
-
-		/* no initialization when blacklisted, return without error */
-		if (dev->device.devargs != NULL &&
-			dev->device.devargs->type ==
-				RTE_DEVTYPE_BLACKLISTED_PCI) {
-			RTE_LOG(INFO, EAL, "  Device is blacklisted, not initializing\n");
-			return 1;
-		}
-
-		RTE_LOG(INFO, EAL, "  probe driver: %x:%x %s\n", dev->id.vendor_id,
-				dev->id.device_id, dr->driver.name);
+	struct rte_pci_addr *loc = &dev->addr;
+
+	RTE_LOG(INFO, EAL, "PCI device "PCI_PRI_FMT" on NUMA socket %i\n",
+			loc->domain, loc->bus, loc->devid, loc->function,
+			dev->device.numa_node);
+
+	/* no initialization when blacklisted, return without error */
+	if (dev->device.devargs != NULL &&
+		dev->device.devargs->type ==
+			RTE_DEVTYPE_BLACKLISTED_PCI) {
+		RTE_LOG(INFO, EAL, "  Device is blacklisted, not initializing\n");
+		return 1;
+	}
 
-		if (dr->drv_flags & RTE_PCI_DRV_NEED_MAPPING) {
-			/* map resources for devices that use igb_uio */
-			ret = rte_eal_pci_map_device(dev);
-			if (ret != 0)
-				return ret;
-		} else if (dr->drv_flags & RTE_PCI_DRV_FORCE_UNBIND &&
-				rte_eal_process_type() == RTE_PROC_PRIMARY) {
-			/* unbind current driver */
-			if (pci_unbind_kernel_driver(dev) < 0)
-				return -1;
-		}
+	RTE_LOG(INFO, EAL, "  probe driver: %x:%x %s\n", dev->id.vendor_id,
+			dev->id.device_id, dr->driver.name);
+
+	if (dr->drv_flags & RTE_PCI_DRV_NEED_MAPPING) {
+		/* map resources for devices that use igb_uio */
+		ret = rte_eal_pci_map_device(dev);
+		if (ret != 0)
+			return ret;
+	} else if (dr->drv_flags & RTE_PCI_DRV_FORCE_UNBIND &&
+			rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		/* unbind current driver */
+		if (pci_unbind_kernel_driver(dev) < 0)
+			return -1;
+	}
 
-		/* reference driver structure */
-		dev->driver = dr;
+	/* reference driver structure */
+	dev->driver = dr;
 
-		/* call the driver probe() function */
-		return dr->probe(dr, dev);
-	}
-	/* return positive value if driver doesn't support this device */
-	return 1;
+	/* call the driver probe() function */
+	return dr->probe(dr, dev);
 }
 
 /*
@@ -284,6 +261,7 @@ static int
 pci_probe_all_drivers(struct rte_pci_device *dev)
 {
 	struct rte_pci_driver *dr = NULL;
+	struct rte_bus *pci_bus;
 	int rc = 0;
 
 	if (dev == NULL)
@@ -293,15 +271,21 @@ pci_probe_all_drivers(struct rte_pci_device *dev)
 	if (dev->driver != NULL)
 		return 0;
 
-	TAILQ_FOREACH(dr, &pci_driver_list, next) {
-		rc = rte_eal_pci_probe_one_driver(dr, dev);
-		if (rc < 0)
-			/* negative value is an error */
-			return -1;
-		if (rc > 0)
-			/* positive value means driver doesn't support it */
-			continue;
-		return 0;
+	TAILQ_FOREACH(dr, &(pci_bus->driver_list), next) {
+		rc = pci_bus->match(dr, dev);
+		if (!rc) {
+			rc = rte_eal_pci_probe_one_driver(dr, dev);
+			if (rc < 0)
+				/* negative value is an error */
+				return -1;
+			if (rc > 0)
+				/* positive value means driver doesn't support
+				 * it
+				 */
+				continue;
+			return 0;
+		}
+		/* Else, continue to next driver */
 	}
 	return 1;
 }
@@ -314,13 +298,19 @@ pci_probe_all_drivers(struct rte_pci_device *dev)
 static int
 pci_detach_all_drivers(struct rte_pci_device *dev)
 {
-	struct rte_pci_driver *dr = NULL;
+	struct rte_pci_driver *pci_drv= NULL;
+	struct rte_driver *drv = NULL;
+	struct rte_bus *pci_bus;
 	int rc = 0;
 
 	if (dev == NULL)
 		return -1;
 
-	TAILQ_FOREACH(dr, &pci_driver_list, next) {
+	RTE_VERIFY(dev->device->bus);
+	pci_bus = dev->device->bus;
+
+	TAILQ_FOREACH(drv, &pci_bus->driver_list, next) {
+		pci_drv = container_of(drv, struct rte_pci_driver, driver);
 		rc = rte_eal_pci_detach_dev(dr, dev);
 		if (rc < 0)
 			/* negative value is an error */
@@ -340,19 +330,28 @@ pci_detach_all_drivers(struct rte_pci_device *dev)
 int
 rte_eal_pci_probe_one(const struct rte_pci_addr *addr)
 {
-	struct rte_pci_device *dev = NULL;
+	struct rte_bus *pci_bus = NULL;
+	struct rte_device *dev = NULL;
+	struct rte_pci_device *pci_dev = NULL;
 	int ret = 0;
 
 	if (addr == NULL)
 		return -1;
 
+	pci_bus = rte_eal_get_bus(PCI_BUS_NAME);
+	if (!pci_bus) {
+		RTE_LOG(DEBUG, EAL, "Unable to find PCI bus\n");
+		return -1;
+	}
+
 	/* update current pci device in global list, kernel bindings might have
 	 * changed since last time we looked at it.
 	 */
 	if (pci_update_device(addr) < 0)
 		goto err_return;
 
-	TAILQ_FOREACH(dev, &pci_device_list, next) {
+	TAILQ_FOREACH(dev, &pci_bus->device_list, next) {
+		pci_dev = container_of(dev, struct rte_pci_device, device);
 		if (rte_eal_compare_pci_addr(&dev->addr, addr))
 			continue;
 
@@ -376,22 +375,31 @@ err_return:
 int
 rte_eal_pci_detach(const struct rte_pci_addr *addr)
 {
-	struct rte_pci_device *dev = NULL;
+	struct rte_pci_device *pci_dev = NULL;
+	struct rte_bus *pci_bus = NULL;
+	struct rte_device *dev = NULL;
 	int ret = 0;
 
 	if (addr == NULL)
 		return -1;
 
-	TAILQ_FOREACH(dev, &pci_device_list, next) {
-		if (rte_eal_compare_pci_addr(&dev->addr, addr))
+	pci_bus = rte_eal_get_bus(PCI_BUS_NAME);
+	if (!pci_bus) {
+		RTE_LOG(DEBUG, EAL, "Unable to find PCI bus\n");
+		return -1;
+	}
+
+	TAILQ_FOREACH(dev, &(bus->device_list), next) {
+		pci_dev = container_of(dev, struct rte_pci_device, device);
+		if (rte_eal_compare_pci_addr(&pci_dev->addr, addr))
 			continue;
 
-		ret = pci_detach_all_drivers(dev);
+		ret = pci_detach_all_drivers(pci_dev);
 		if (ret < 0)
 			goto err_return;
 
-		TAILQ_REMOVE(&pci_device_list, dev, next);
-		free(dev);
+		TAILQ_REMOVE(&pci_bus->device_list, dev, next);
+		free(pci_dev);
 		return 0;
 	}
 	return -1;
@@ -411,31 +419,42 @@ err_return:
 int
 rte_eal_pci_probe(void)
 {
-	struct rte_pci_device *dev = NULL;
+	struct rte_device *dev = NULL;
+	struct rte_bus *pci_bus;
+	struct rte_pci_device *pci_dev;
 	struct rte_devargs *devargs;
 	int probe_all = 0;
 	int ret = 0;
 
+	pci_bus = rte_eal_get_bus(PCI_BUS_NAME);
+	if (!pci_bus) {
+		RTE_LOG(INFO, EAL, "(%s): No such bus exists\n", PCI_BUS_NAME);
+		/* Cannot continue ahead and this is not an error */
+		return;
+	}
+
 	if (rte_eal_devargs_type_count(RTE_DEVTYPE_WHITELISTED_PCI) == 0)
 		probe_all = 1;
 
-	TAILQ_FOREACH(dev, &pci_device_list, next) {
+	TAILQ_FOREACH(dev, &(pci_bus->device_list), next) {
+		pci_dev = container_of(dev, struct rte_pci_device, device);
 
 		/* set devargs in PCI structure */
-		devargs = pci_devargs_lookup(dev);
+		devargs = pci_devargs_lookup(pci_dev);
 		if (devargs != NULL)
-			dev->device.devargs = devargs;
+			pci_dev->device.devargs = devargs;
 
 		/* probe all or only whitelisted devices */
 		if (probe_all)
-			ret = pci_probe_all_drivers(dev);
+			ret = pci_probe_all_drivers(pci_dev);`
 		else if (devargs != NULL &&
 			devargs->type == RTE_DEVTYPE_WHITELISTED_PCI)
-			ret = pci_probe_all_drivers(dev);
+			ret = pci_probe_all_drivers(pci_dev);
 		if (ret < 0)
 			rte_exit(EXIT_FAILURE, "Requested device " PCI_PRI_FMT
-				 " cannot be used\n", dev->addr.domain, dev->addr.bus,
-				 dev->addr.devid, dev->addr.function);
+				 " cannot be used\n", pci_dev->addr.domain,
+				 pci_dev->addr.bus, pci_dev->addr.devid,
+				 pci_dev->addr.function);
 	}
 
 	return 0;
@@ -465,10 +484,19 @@ pci_dump_one_device(FILE *f, struct rte_pci_device *dev)
 void
 rte_eal_pci_dump(FILE *f)
 {
-	struct rte_pci_device *dev = NULL;
+	struct rte_device *dev = NULL;
+	struct rte_pci_device *pci_dev = NULL;
+	struct rte_bus *bus;
 
-	TAILQ_FOREACH(dev, &pci_device_list, next) {
-		pci_dump_one_device(f, dev);
+	bus = rte_eal_get_bus(PCI_BUS_NAME);
+	if (!bus) {
+		RTE_LOG(INFO, EAL, "(%s): No such bus exists\n", PCI_BUS_NAME);
+		return;
+	}
+
+	TAILQ_FOREACH(dev, &(bus->device_list), next) {
+		pci_dev = container_of(dev, struct rte_pci_device, device);
+		pci_dump_one_device(f, pci_dev);
 	}
 }
 
@@ -476,7 +504,16 @@ rte_eal_pci_dump(FILE *f)
 void
 rte_eal_pci_register(struct rte_pci_driver *driver)
 {
-	TAILQ_INSERT_TAIL(&pci_driver_list, driver, next);
+	struct rte_bus *pci_bus;
+
+	pci_bus = rte_eal_get_bus(PCI_BUS_NAME);
+	if (!pci_bus) {
+		RTE_LOG(INFO, EAL, "(%s) No such bus exists\n", PCI_BUS_NAME);
+		return;
+		/* TODO: How to return error from here? */
+	}
+
+	driver->driver->bus = pci_bus;
 	rte_eal_driver_register(&driver->driver);
 }
 
@@ -484,6 +521,9 @@ rte_eal_pci_register(struct rte_pci_driver *driver)
 void
 rte_eal_pci_unregister(struct rte_pci_driver *driver)
 {
-	rte_eal_driver_unregister(&driver->driver);
-	TAILQ_REMOVE(&pci_driver_list, driver, next);
+	struct rte_bus *pci_bus;
+
+	pci_bus = driver->driver->bus;
+	rte_eal_driver_unregister(&pci_bus, &driver->driver);
+	driver->driver->bus = NULL;
 }
diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 9ce8847..cffc449 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -110,7 +110,7 @@ const char *pci_get_sysfs_path(void);
 
 /** Maximum number of PCI resources. */
 #define PCI_MAX_RESOURCE 6
-
+/
 /**
  * A structure describing an ID for a PCI driver. Each driver provides a
  * table of these IDs for each device that it supports.
@@ -360,15 +360,6 @@ rte_eal_compare_pci_addr(const struct rte_pci_addr *addr,
 }
 
 /**
- * Scan the content of the PCI bus, and the devices in the devices
- * list
- *
- * @return
- *  0 on success, negative on error
- */
-int rte_eal_pci_scan(void);
-
-/**
  * Probe the PCI bus for registered drivers.
  *
  * Scan the content of the PCI bus, and call the probe() function for
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH 5/6] eal: supporting bus model in init process
  2016-11-17  5:29 [RFC PATCH 0/6] Restructure EAL device model for bus support Shreyansh Jain
                   ` (3 preceding siblings ...)
  2016-11-17  5:30 ` [RFC PATCH 4/6] eal/common: handle bus abstraction for device/driver objects Shreyansh Jain
@ 2016-11-17  5:30 ` Shreyansh Jain
  2016-11-17  5:30 ` [RFC PATCH 6/6] eal: removing eth_driver Shreyansh Jain
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Shreyansh Jain @ 2016-11-17  5:30 UTC (permalink / raw)
  To: david.marchand; +Cc: dev, Shreyansh Jain

This patch makes necessary changes to the EAL for handling bus scan and
probe. Most of the scan function has been moved to bus/* implementation,
(currently only for pci, linuxapp). There are still some functions which
exists in the EAL (bind, unbind module, mmap etc).

Missing/Grey area;
 - Should all the operations for a PCI device, whether binding to a driver
   or mmap, be moved to PCI bus code base? All of that is not relevant for
   a bus, but then having multiple implementation areas for a common
   sub-system (PCI, here) is also not a nice thing.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
---
 lib/librte_eal/linuxapp/eal/Makefile  |   1 +
 lib/librte_eal/linuxapp/eal/eal.c     |  51 +++++-
 lib/librte_eal/linuxapp/eal/eal_pci.c | 298 ----------------------------------
 3 files changed, 44 insertions(+), 306 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile
index 4e206f0..124dceb 100644
--- a/lib/librte_eal/linuxapp/eal/Makefile
+++ b/lib/librte_eal/linuxapp/eal/Makefile
@@ -77,6 +77,7 @@ SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_timer.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_memzone.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_log.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_launch.c
+SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_bus.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_vdev.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_pci.c
 SRCS-$(CONFIG_RTE_EXEC_ENV_LINUXAPP) += eal_common_pci_uio.c
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 2075282..332d1f4 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -739,6 +739,44 @@ static int rte_eal_vfio_setup(void)
 }
 #endif
 
+static int
+rte_eal_scan(void)
+{
+	int ret = 0;
+
+	/* For now, as vdev replacement is not complete, continue with a call
+	 * rte_bus_list scan
+	 */
+	rte_eal_bus_scan();
+
+	ret = rte_eal_dev_init();
+	if (ret) {
+		RTE_LOG(ERR, EAL, "Cannot init pmd devices\n");
+	}
+
+out:
+	return ret;
+}
+
+static int
+rte_eal_probe(void)
+{
+	int ret = 0;
+
+	/* Probe & Initialize PCI devices */
+	ret = rte_eal_pci_probe();
+	if (ret) {
+		RTE_LOG(ERR, EAL, "Cannot probe PCI\n");
+		goto out;
+	}
+
+	/* ToDo: vdev scan already does the probe - it should be moved here */
+
+	/* Last successful bus probe would set ret = 0 */
+out:
+	return ret;
+}
+
 /* Launch threads, called at application init(). */
 int
 rte_eal_init(int argc, char **argv)
@@ -802,9 +840,6 @@ rte_eal_init(int argc, char **argv)
 	if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0)
 		rte_panic("Cannot init logs\n");
 
-	if (rte_eal_pci_init() < 0)
-		rte_panic("Cannot init PCI\n");
-
 #ifdef VFIO_PRESENT
 	if (rte_eal_vfio_setup() < 0)
 		rte_panic("Cannot init VFIO\n");
@@ -828,6 +863,9 @@ rte_eal_init(int argc, char **argv)
 	if (rte_eal_timer_init() < 0)
 		rte_panic("Cannot init HPET or TSC timers\n");
 
+	if (rte_eal_scan() < 0)
+		rte_panic("Cannot scan for devices\n");
+
 	eal_check_mem_on_local_socket();
 
 	if (eal_plugins_init() < 0)
@@ -841,9 +879,6 @@ rte_eal_init(int argc, char **argv)
 		rte_config.master_lcore, (int)thread_id, cpuset,
 		ret == 0 ? "" : "...");
 
-	if (rte_eal_dev_init() < 0)
-		rte_panic("Cannot init pmd devices\n");
-
 	if (rte_eal_intr_init() < 0)
 		rte_panic("Cannot init interrupt-handling thread\n");
 
@@ -884,8 +919,8 @@ rte_eal_init(int argc, char **argv)
 	rte_eal_mp_wait_lcore();
 
 	/* Probe & Initialize PCI devices */
-	if (rte_eal_pci_probe())
-		rte_panic("Cannot probe PCI\n");
+	if (rte_eal_probe())
+		rte_panic("Cannot complete probe\n");
 
 	rte_eal_mcfg_complete();
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 876ba38..e3916ab 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -224,202 +224,6 @@ pci_parse_one_sysfs_resource(char *line, size_t len, uint64_t *phys_addr,
 	return 0;
 }
 
-/* parse the "resource" sysfs file */
-static int
-pci_parse_sysfs_resource(const char *filename, struct rte_pci_device *dev)
-{
-	FILE *f;
-	char buf[BUFSIZ];
-	int i;
-	uint64_t phys_addr, end_addr, flags;
-
-	f = fopen(filename, "r");
-	if (f == NULL) {
-		RTE_LOG(ERR, EAL, "Cannot open sysfs resource\n");
-		return -1;
-	}
-
-	for (i = 0; i<PCI_MAX_RESOURCE; i++) {
-
-		if (fgets(buf, sizeof(buf), f) == NULL) {
-			RTE_LOG(ERR, EAL,
-				"%s(): cannot read resource\n", __func__);
-			goto error;
-		}
-		if (pci_parse_one_sysfs_resource(buf, sizeof(buf), &phys_addr,
-				&end_addr, &flags) < 0)
-			goto error;
-
-		if (flags & IORESOURCE_MEM) {
-			dev->mem_resource[i].phys_addr = phys_addr;
-			dev->mem_resource[i].len = end_addr - phys_addr + 1;
-			/* not mapped for now */
-			dev->mem_resource[i].addr = NULL;
-		}
-	}
-	fclose(f);
-	return 0;
-
-error:
-	fclose(f);
-	return -1;
-}
-
-/* Scan one pci sysfs entry, and fill the devices list from it. */
-static int
-pci_scan_one(const char *dirname, uint16_t domain, uint8_t bus,
-	     uint8_t devid, uint8_t function)
-{
-	char filename[PATH_MAX];
-	unsigned long tmp;
-	struct rte_pci_device *dev;
-	char driver[PATH_MAX];
-	int ret;
-
-	dev = malloc(sizeof(*dev));
-	if (dev == NULL)
-		return -1;
-
-	memset(dev, 0, sizeof(*dev));
-	dev->addr.domain = domain;
-	dev->addr.bus = bus;
-	dev->addr.devid = devid;
-	dev->addr.function = function;
-
-	/* get vendor id */
-	snprintf(filename, sizeof(filename), "%s/vendor", dirname);
-	if (eal_parse_sysfs_value(filename, &tmp) < 0) {
-		free(dev);
-		return -1;
-	}
-	dev->id.vendor_id = (uint16_t)tmp;
-
-	/* get device id */
-	snprintf(filename, sizeof(filename), "%s/device", dirname);
-	if (eal_parse_sysfs_value(filename, &tmp) < 0) {
-		free(dev);
-		return -1;
-	}
-	dev->id.device_id = (uint16_t)tmp;
-
-	/* get subsystem_vendor id */
-	snprintf(filename, sizeof(filename), "%s/subsystem_vendor",
-		 dirname);
-	if (eal_parse_sysfs_value(filename, &tmp) < 0) {
-		free(dev);
-		return -1;
-	}
-	dev->id.subsystem_vendor_id = (uint16_t)tmp;
-
-	/* get subsystem_device id */
-	snprintf(filename, sizeof(filename), "%s/subsystem_device",
-		 dirname);
-	if (eal_parse_sysfs_value(filename, &tmp) < 0) {
-		free(dev);
-		return -1;
-	}
-	dev->id.subsystem_device_id = (uint16_t)tmp;
-
-	/* get class_id */
-	snprintf(filename, sizeof(filename), "%s/class",
-		 dirname);
-	if (eal_parse_sysfs_value(filename, &tmp) < 0) {
-		free(dev);
-		return -1;
-	}
-	/* the least 24 bits are valid: class, subclass, program interface */
-	dev->id.class_id = (uint32_t)tmp & RTE_CLASS_ANY_ID;
-
-	/* get max_vfs */
-	dev->max_vfs = 0;
-	snprintf(filename, sizeof(filename), "%s/max_vfs", dirname);
-	if (!access(filename, F_OK) &&
-	    eal_parse_sysfs_value(filename, &tmp) == 0)
-		dev->max_vfs = (uint16_t)tmp;
-	else {
-		/* for non igb_uio driver, need kernel version >= 3.8 */
-		snprintf(filename, sizeof(filename),
-			 "%s/sriov_numvfs", dirname);
-		if (!access(filename, F_OK) &&
-		    eal_parse_sysfs_value(filename, &tmp) == 0)
-			dev->max_vfs = (uint16_t)tmp;
-	}
-
-	/* get numa node */
-	snprintf(filename, sizeof(filename), "%s/numa_node",
-		 dirname);
-	if (access(filename, R_OK) != 0) {
-		/* if no NUMA support, set default to 0 */
-		dev->device.numa_node = 0;
-	} else {
-		if (eal_parse_sysfs_value(filename, &tmp) < 0) {
-			free(dev);
-			return -1;
-		}
-		dev->device.numa_node = tmp;
-	}
-
-	/* parse resources */
-	snprintf(filename, sizeof(filename), "%s/resource", dirname);
-	if (pci_parse_sysfs_resource(filename, dev) < 0) {
-		RTE_LOG(ERR, EAL, "%s(): cannot parse resource\n", __func__);
-		free(dev);
-		return -1;
-	}
-
-	/* parse driver */
-	snprintf(filename, sizeof(filename), "%s/driver", dirname);
-	ret = pci_get_kernel_driver_by_path(filename, driver);
-	if (ret < 0) {
-		RTE_LOG(ERR, EAL, "Fail to get kernel driver\n");
-		free(dev);
-		return -1;
-	}
-
-	if (!ret) {
-		if (!strcmp(driver, "vfio-pci"))
-			dev->kdrv = RTE_KDRV_VFIO;
-		else if (!strcmp(driver, "igb_uio"))
-			dev->kdrv = RTE_KDRV_IGB_UIO;
-		else if (!strcmp(driver, "uio_pci_generic"))
-			dev->kdrv = RTE_KDRV_UIO_GENERIC;
-		else
-			dev->kdrv = RTE_KDRV_UNKNOWN;
-	} else
-		dev->kdrv = RTE_KDRV_NONE;
-
-	/* device is valid, add in list (sorted) */
-	if (TAILQ_EMPTY(&pci_device_list)) {
-		rte_eal_device_insert(&dev->device);
-		TAILQ_INSERT_TAIL(&pci_device_list, dev, next);
-	} else {
-		struct rte_pci_device *dev2;
-		int ret;
-
-		TAILQ_FOREACH(dev2, &pci_device_list, next) {
-			ret = rte_eal_compare_pci_addr(&dev->addr, &dev2->addr);
-			if (ret > 0)
-				continue;
-
-			if (ret < 0) {
-				TAILQ_INSERT_BEFORE(dev2, dev, next);
-				rte_eal_device_insert(&dev->device);
-			} else { /* already registered */
-				dev2->kdrv = dev->kdrv;
-				dev2->max_vfs = dev->max_vfs;
-				memmove(dev2->mem_resource, dev->mem_resource,
-					sizeof(dev->mem_resource));
-				free(dev);
-			}
-			return 0;
-		}
-		rte_eal_device_insert(&dev->device);
-		TAILQ_INSERT_TAIL(&pci_device_list, dev, next);
-	}
-
-	return 0;
-}
-
 int
 pci_update_device(const struct rte_pci_addr *addr)
 {
@@ -433,93 +237,7 @@ pci_update_device(const struct rte_pci_addr *addr)
 				addr->function);
 }
 
-/*
- * split up a pci address into its constituent parts.
- */
-static int
-parse_pci_addr_format(const char *buf, int bufsize, uint16_t *domain,
-		uint8_t *bus, uint8_t *devid, uint8_t *function)
-{
-	/* first split on ':' */
-	union splitaddr {
-		struct {
-			char *domain;
-			char *bus;
-			char *devid;
-			char *function;
-		};
-		char *str[PCI_FMT_NVAL]; /* last element-separator is "." not ":" */
-	} splitaddr;
-
-	char *buf_copy = strndup(buf, bufsize);
-	if (buf_copy == NULL)
-		return -1;
-
-	if (rte_strsplit(buf_copy, bufsize, splitaddr.str, PCI_FMT_NVAL, ':')
-			!= PCI_FMT_NVAL - 1)
-		goto error;
-	/* final split is on '.' between devid and function */
-	splitaddr.function = strchr(splitaddr.devid,'.');
-	if (splitaddr.function == NULL)
-		goto error;
-	*splitaddr.function++ = '\0';
-
-	/* now convert to int values */
-	errno = 0;
-	*domain = (uint16_t)strtoul(splitaddr.domain, NULL, 16);
-	*bus = (uint8_t)strtoul(splitaddr.bus, NULL, 16);
-	*devid = (uint8_t)strtoul(splitaddr.devid, NULL, 16);
-	*function = (uint8_t)strtoul(splitaddr.function, NULL, 10);
-	if (errno != 0)
-		goto error;
-
-	free(buf_copy); /* free the copy made with strdup */
-	return 0;
-error:
-	free(buf_copy);
-	return -1;
-}
-
-/*
- * Scan the content of the PCI bus, and the devices in the devices
- * list
- */
-int
-rte_eal_pci_scan(void)
-{
-	struct dirent *e;
-	DIR *dir;
-	char dirname[PATH_MAX];
-	uint16_t domain;
-	uint8_t bus, devid, function;
-
-	dir = opendir(pci_get_sysfs_path());
-	if (dir == NULL) {
-		RTE_LOG(ERR, EAL, "%s(): opendir failed: %s\n",
-			__func__, strerror(errno));
-		return -1;
-	}
 
-	while ((e = readdir(dir)) != NULL) {
-		if (e->d_name[0] == '.')
-			continue;
-
-		if (parse_pci_addr_format(e->d_name, sizeof(e->d_name), &domain,
-				&bus, &devid, &function) != 0)
-			continue;
-
-		snprintf(dirname, sizeof(dirname), "%s/%s",
-				pci_get_sysfs_path(), e->d_name);
-		if (pci_scan_one(dirname, domain, bus, devid, function) < 0)
-			goto error;
-	}
-	closedir(dir);
-	return 0;
-
-error:
-	closedir(dir);
-	return -1;
-}
 
 /* Read PCI config space. */
 int rte_eal_pci_read_config(const struct rte_pci_device *device,
@@ -754,19 +472,3 @@ rte_eal_pci_ioport_unmap(struct rte_pci_ioport *p)
 
 	return ret;
 }
-
-/* Init the PCI EAL subsystem */
-int
-rte_eal_pci_init(void)
-{
-	/* for debug purposes, PCI can be disabled */
-	if (internal_config.no_pci)
-		return 0;
-
-	if (rte_eal_pci_scan() < 0) {
-		RTE_LOG(ERR, EAL, "%s(): Cannot scan PCI bus\n", __func__);
-		return -1;
-	}
-
-	return 0;
-}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH 6/6] eal: removing eth_driver
  2016-11-17  5:29 [RFC PATCH 0/6] Restructure EAL device model for bus support Shreyansh Jain
                   ` (4 preceding siblings ...)
  2016-11-17  5:30 ` [RFC PATCH 5/6] eal: supporting bus model in init process Shreyansh Jain
@ 2016-11-17  5:30 ` Shreyansh Jain
  2016-11-17 12:53   ` Jan Blunck
  2016-11-17 11:55 ` [RFC PATCH 0/6] Restructure EAL device model for bus support Jan Blunck
  2016-11-20 15:30 ` David Marchand
  7 siblings, 1 reply; 21+ messages in thread
From: Shreyansh Jain @ 2016-11-17  5:30 UTC (permalink / raw)
  To: david.marchand; +Cc: dev, Shreyansh Jain

This patch demonstrates how eth_driver can be replaced with appropriate
changes for rte_xxx_driver from the PMD itself. It uses ixgbe_ethernet as
an example.

A large set of changes exists in the rte_ethdev.c - primarily because too
much PCI centric code (names, assumption of rte_pci_device) still exists
in it. Most, except symbol naming, has been changed in this patch.

This proposes that:
 - PMD would declare the rte_xxx_driver. In case of ixgbe, it would be
   rte_pci_driver.
 - Probe and remove continue to exists in rte_pci_driver. But, the
   rte_driver has new hooks for init and uninit. The rationale is that
   once a ethernet or cryto device is created, the rte_driver->init would
   be responsible for initializing the device.
   -- Eth_dev -> rte_driver -> rte_pci_driver
                          |    `-> probe/remove
                          `--> init/uninit
 - necessary changes in the rte_eth_dev have also been done so that it
   refers to the rte_device and rte_driver rather than rte_xxx_*. This
   would imply, ethernet device is 'linked' to a rte_device/rte_driver
   which in turn is a rte_xxx_device/rte_xxx_driver type.
   - for all operations related to extraction relvant xxx type,
     container_of would have to be used.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
---
 drivers/net/ixgbe/ixgbe_ethdev.c | 49 +++++++++++++++++++++-------------------
 lib/librte_ether/rte_ethdev.c    | 36 +++++++++++++++++------------
 lib/librte_ether/rte_ethdev.h    |  6 ++---
 3 files changed, 51 insertions(+), 40 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index edc9b22..acead31 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -1419,7 +1419,7 @@ eth_ixgbevf_dev_init(struct rte_eth_dev *eth_dev)
 		return 0;
 	}
 
-	pci_dev = eth_dev->pci_dev;
+	pci_dev = container_of(eth_dev->device, struct rte_pci_device, device);
 
 	rte_eth_copy_pci_info(eth_dev, pci_dev);
 
@@ -1532,7 +1532,9 @@ static int
 eth_ixgbevf_dev_uninit(struct rte_eth_dev *eth_dev)
 {
 	struct ixgbe_hw *hw;
-	struct rte_pci_device *pci_dev = eth_dev->pci_dev;
+	struct rte_pci_device *pci_dev;
+
+	pci_dev = container_of(eth_dev->device, struct rte_pci_device, device);
 
 	PMD_INIT_FUNC_TRACE();
 
@@ -1562,32 +1564,33 @@ eth_ixgbevf_dev_uninit(struct rte_eth_dev *eth_dev)
 	return 0;
 }
 
-static struct eth_driver rte_ixgbe_pmd = {
-	.pci_drv = {
-		.id_table = pci_id_ixgbe_map,
-		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
-			RTE_PCI_DRV_DETACHABLE,
-		.probe = rte_eth_dev_pci_probe,
-		.remove = rte_eth_dev_pci_remove,
+static struct rte_pci_driver rte_ixgbe_pci_driver = {
+	.id_table = pci_id_ixgbe_map,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
+		     RTE_PCI_DRV_DETACHABLE,
+	.probe = rte_eth_dev_pci_probe,
+	.remove = rte_eth_dev_pci_remove,
+	.driver = {
+		.driver_init_t= eth_ixgbe_dev_init,
+		.driver_uninit_t= eth_ixgbe_dev_uninit,
+		.dev_private_size = sizeof(struct ixgbe_adapter),
 	},
-	.eth_dev_init = eth_ixgbe_dev_init,
-	.eth_dev_uninit = eth_ixgbe_dev_uninit,
-	.dev_private_size = sizeof(struct ixgbe_adapter),
 };
 
 /*
  * virtual function driver struct
  */
-static struct eth_driver rte_ixgbevf_pmd = {
-	.pci_drv = {
-		.id_table = pci_id_ixgbevf_map,
-		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
-		.probe = rte_eth_dev_pci_probe,
-		.remove = rte_eth_dev_pci_remove,
+static struct rte_pci_driver rte_ixgbevf_pci_driver = {
+	.id_table = pci_id_ixgbevf_map,
+	.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
+	.probe = rte_eth_dev_pci_probe,
+	.remove = rte_eth_dev_pci_remove,
+	.driver = {
+		/* rte_driver hooks */
+		.init = eth_ixgbevf_dev_init,
+		.uninit = eth_ixgbevf_dev_uninit,
+		.dev_private_size = sizeof(struct ixgbe_adapter),
 	},
-	.eth_dev_init = eth_ixgbevf_dev_init,
-	.eth_dev_uninit = eth_ixgbevf_dev_uninit,
-	.dev_private_size = sizeof(struct ixgbe_adapter),
 };
 
 static int
@@ -7592,7 +7595,7 @@ ixgbevf_dev_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
 	ixgbevf_dev_interrupt_action(dev);
 }
 
-RTE_PMD_REGISTER_PCI(net_ixgbe, rte_ixgbe_pmd.pci_drv);
+RTE_PMD_REGISTER_PCI(net_ixgbe, rte_ixgbe_pci_driver);
 RTE_PMD_REGISTER_PCI_TABLE(net_ixgbe, pci_id_ixgbe_map);
-RTE_PMD_REGISTER_PCI(net_ixgbe_vf, rte_ixgbevf_pmd.pci_drv);
+RTE_PMD_REGISTER_PCI(net_ixgbe_vf, rte_ixgbevf_pci_driver);
 RTE_PMD_REGISTER_PCI_TABLE(net_ixgbe_vf, pci_id_ixgbevf_map);
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index fde8112..3535ff4 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -235,13 +235,13 @@ int
 rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
 		      struct rte_pci_device *pci_dev)
 {
-	struct eth_driver    *eth_drv;
+	struct rte_driver *drv;
 	struct rte_eth_dev *eth_dev;
 	char ethdev_name[RTE_ETH_NAME_MAX_LEN];
 
 	int diag;
 
-	eth_drv = (struct eth_driver *)pci_drv;
+	drv = pci_drv->driver;
 
 	rte_eal_pci_device_name(&pci_dev->addr, ethdev_name,
 			sizeof(ethdev_name));
@@ -252,13 +252,13 @@ rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
 
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
 		eth_dev->data->dev_private = rte_zmalloc("ethdev private structure",
-				  eth_drv->dev_private_size,
+				  drv->dev_private_size,
 				  RTE_CACHE_LINE_SIZE);
 		if (eth_dev->data->dev_private == NULL)
 			rte_panic("Cannot allocate memzone for private port data\n");
 	}
-	eth_dev->pci_dev = pci_dev;
-	eth_dev->driver = eth_drv;
+	eth_dev->device = pci_dev->device;
+	eth_dev->driver = drv;
 	eth_dev->data->rx_mbuf_alloc_failed = 0;
 
 	/* init user callbacks */
@@ -270,7 +270,7 @@ rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
 	eth_dev->data->mtu = ETHER_MTU;
 
 	/* Invoke PMD device initialization function */
-	diag = (*eth_drv->eth_dev_init)(eth_dev);
+	diag = (*drv->init)(eth_dev);
 	if (diag == 0)
 		return 0;
 
@@ -287,7 +287,7 @@ rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
 int
 rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev)
 {
-	const struct eth_driver *eth_drv;
+	const struct rte_driver *drv;
 	struct rte_eth_dev *eth_dev;
 	char ethdev_name[RTE_ETH_NAME_MAX_LEN];
 	int ret;
@@ -302,11 +302,11 @@ rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev)
 	if (eth_dev == NULL)
 		return -ENODEV;
 
-	eth_drv = (const struct eth_driver *)pci_dev->driver;
+	drv = pci_dev->driver;
 
 	/* Invoke PMD device uninit function */
-	if (*eth_drv->eth_dev_uninit) {
-		ret = (*eth_drv->eth_dev_uninit)(eth_dev);
+	if (*drv->uninit) {
+		ret = (*drv->uninit)(eth_dev);
 		if (ret)
 			return ret;
 	}
@@ -317,7 +317,7 @@ rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev)
 	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
 		rte_free(eth_dev->data->dev_private);
 
-	eth_dev->pci_dev = NULL;
+	eth_dev->device = NULL;
 	eth_dev->driver = NULL;
 	eth_dev->data = NULL;
 
@@ -1556,7 +1556,7 @@ rte_eth_dev_info_get(uint8_t port_id, struct rte_eth_dev_info *dev_info)
 
 	RTE_FUNC_PTR_OR_RET(*dev->dev_ops->dev_infos_get);
 	(*dev->dev_ops->dev_infos_get)(dev, dev_info);
-	dev_info->pci_dev = dev->pci_dev;
+	dev_info->device = dev->device;
 	dev_info->driver_name = dev->data->drv_name;
 	dev_info->nb_rx_queues = dev->data->nb_rx_queues;
 	dev_info->nb_tx_queues = dev->data->nb_tx_queues;
@@ -2537,6 +2537,7 @@ rte_eth_dev_rx_intr_ctl(uint8_t port_id, int epfd, int op, void *data)
 {
 	uint32_t vec;
 	struct rte_eth_dev *dev;
+	struct rte_pci_device *pci_dev;
 	struct rte_intr_handle *intr_handle;
 	uint16_t qid;
 	int rc;
@@ -2544,6 +2545,10 @@ rte_eth_dev_rx_intr_ctl(uint8_t port_id, int epfd, int op, void *data)
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
 
 	dev = &rte_eth_devices[port_id];
+	/* TODO intr_handle is currently in rte_pci_device;
+	 * Below is incorrect until that time
+	 */
+	pci_dev = container_of(dev->device, struct rte_pci_device, device);
 	intr_handle = &dev->pci_dev->intr_handle;
 	if (!intr_handle->intr_vec) {
 		RTE_PMD_DEBUG_TRACE("RX Intr vector unset\n");
@@ -2572,7 +2577,7 @@ rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char *ring_name,
 	const struct rte_memzone *mz;
 
 	snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d",
-		 dev->driver->pci_drv.driver.name, ring_name,
+		 dev->driver->name, ring_name,
 		 dev->data->port_id, queue_id);
 
 	mz = rte_memzone_lookup(z_name);
@@ -2593,6 +2598,7 @@ rte_eth_dev_rx_intr_ctl_q(uint8_t port_id, uint16_t queue_id,
 {
 	uint32_t vec;
 	struct rte_eth_dev *dev;
+	struct rte_pci_device *pci_dev;
 	struct rte_intr_handle *intr_handle;
 	int rc;
 
@@ -2604,7 +2610,9 @@ rte_eth_dev_rx_intr_ctl_q(uint8_t port_id, uint16_t queue_id,
 		return -EINVAL;
 	}
 
-	intr_handle = &dev->pci_dev->intr_handle;
+	/* TODO; Until intr_handle is available in rte_device, below is incorrect */
+	pci_dev = container_of(dev->device, struct rte_pci_device, device);
+	intr_handle = &pci_dev->intr_handle;
 	if (!intr_handle->intr_vec) {
 		RTE_PMD_DEBUG_TRACE("RX Intr vector unset\n");
 		return -EPERM;
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 38641e8..2b1d826 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -876,7 +876,7 @@ struct rte_eth_conf {
  * Ethernet device information
  */
 struct rte_eth_dev_info {
-	struct rte_pci_device *pci_dev; /**< Device PCI information. */
+	struct rte_device *device; /**< Device PCI information. */
 	const char *driver_name; /**< Device Driver name. */
 	unsigned int if_index; /**< Index to bound host interface, or 0 if none.
 		Use if_indextoname() to translate into an interface name. */
@@ -1623,9 +1623,9 @@ struct rte_eth_dev {
 	eth_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive function. */
 	eth_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit function. */
 	struct rte_eth_dev_data *data;  /**< Pointer to device data */
-	const struct eth_driver *driver;/**< Driver for this device */
+	const struct rte_driver *driver;/**< Driver for this device */
 	const struct eth_dev_ops *dev_ops; /**< Functions exported by PMD */
-	struct rte_pci_device *pci_dev; /**< PCI info. supplied by probing */
+	struct rte_device *device; /**< Device instance */
 	/** User application callbacks for NIC interrupts */
 	struct rte_eth_dev_cb_list link_intr_cbs;
 	/**
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 2/6] eal: introduce bus-device-driver structure
  2016-11-17  5:30 ` [RFC PATCH 2/6] eal: introduce bus-device-driver structure Shreyansh Jain
@ 2016-11-17 11:19   ` Jan Blunck
  2016-11-17 13:00     ` Shreyansh Jain
  0 siblings, 1 reply; 21+ messages in thread
From: Jan Blunck @ 2016-11-17 11:19 UTC (permalink / raw)
  To: Shreyansh Jain; +Cc: David Marchand, dev

On Thu, Nov 17, 2016 at 6:30 AM, Shreyansh Jain <shreyansh.jain@nxp.com> wrote:
> A device is connected to a bus and services by a driver associated with
> the bus. It is responsibility of the bus to identify the devices (scan)
> and then assign each device to a matching driver.
>
> A PMD would allocate a rte_xxx_driver and rte_xxx_device.
> rte_xxx_driver has rte_driver and rte_bus embedded. Similarly, rte_xxx_device
> has rte_device and rte_bus embedded.

I don't think so: the rte_xxx_device embeds the generic rte_device and
references a the rte_bus
that it is attached to.

> When a ethernet or crypto device (rte_eth_dev, rte_cryptodev) is allocated,
> it contains a reference of rte_device and rte_driver.
> Each ethernet device implementation would use container_of for finding the
> enclosing structure of rte_xxx_*.
>
>                             +-------------------+
>  +--------------+           |rte_pci_device     |
>  |rte_eth_dev   |           |+-----------------+|
>  |+------------+|   .-------->rte_device       ||
>  ||rte_device*-----'        |+-----------------+|
>  |+------------+|           ||rte_bus          ||
>  |              |           |+-----------------+|
>  /              /           +-------------------+
>
> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
> ---
>  lib/librte_eal/common/include/rte_bus.h | 243 ++++++++++++++++++++++++++++++++
>  lib/librte_eal/common/include/rte_dev.h |  36 ++---
>  2 files changed, 261 insertions(+), 18 deletions(-)
>  create mode 100644 lib/librte_eal/common/include/rte_bus.h
>
> diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
> new file mode 100644
> index 0000000..dc3aeb8
> --- /dev/null
> +++ b/lib/librte_eal/common/include/rte_bus.h
> @@ -0,0 +1,243 @@
> +/*-
> + *   BSD LICENSE
> + *
> + *   Copyright(c) 2016 NXP
> + *   All rights reserved.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of NXP nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> + */
> +
> +#ifndef _RTE_BUS_H_
> +#define _RTE_BUS_H_
> +
> +/**
> + * @file
> + *
> + * RTE PMD Bus Abstraction interfaces
> + *
> + * This file exposes APIs and Interfaces for Bus Abstraction over the devices
> + * drivers in EAL.
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <stdio.h>
> +#include <sys/queue.h>
> +
> +#include <rte_log.h>
> +#include <rte_dev.h>
> +
> +
> +/** Double linked list of buses */
> +TAILQ_HEAD(rte_bus_list, rte_bus);
> +
> +/**
> + * Bus specific scan for devices attached on the bus.
> + * For each bus object, the scan would be reponsible for finding devices and
> + * adding them to its private device list.
> + *
> + * Successful detection of a device results in rte_device object which is
> + * embedded within the respective device type (rte_pci_device, for example).
> + * Thereafter, PCI specific bus would need to perform
> + * container_of(rte_pci_device) to obtain PCI device object.
> + *
> + * Scan failure of a bus is not treated as exit criteria for application. Scan
> + * for all other buses would still continue.
> + *
> + * @param void
> + * @return
> + *     0 for successful scan
> + *     !0 (<0) for unsuccessful scan with error value
> + */
> +typedef int (* bus_scan_t)(void);
> +
> +/**
> + * Bus specific match for devices and drivers which can service them.
> + * For each scanned device, during probe the match would link the devices with
> + * drivers which can service the device.
> + *
> + * It is the work of each bus handler to obtain the specific device object
> + * using container_of (or typecasting, as a less preferred way).
> + *
> + * @param drv
> + *     Driver object attached to the bus
> + * @param dev
> + *     Device object which is being probed.
> + * @return
> + *     0 for successful match
> + *     !0 for unsuccessful match
> + */
> +typedef int (* bus_match_t)(struct rte_driver *drv, struct rte_device *dev);

Do you think this should do match & probe?

I believe it is better to separate this into two functions to match()
and probe(). The
result of matching tells if the driver would want to claim the device
in general. But
probe()ing the device should only happen if the device isn't claimed
by a driver yet and
is not blacklisted.

> +
> +/**
> + * Dump the devices on the bus.
> + * Each bus type can define its own definition of information to dump.
> + *
> + * @param bus
> + *     Handle for bus, device from which are to be dumped.
> + * @param f
> + *     Handle to output device or file.
> + * @return void
> + */
> +typedef void (* bus_dump_t)(struct rte_bus *bus, FILE *f);
> +
> +/**
> + * Search for a specific device in device list of the bus
> + * This would rely on the bus specific addressing. Each implementation would
> + * extract its specific device type and perform address compare.
> + *
> + * @param dev
> + *     device handle to search for.
> + * @return
> + *     rte_device handle for matched device, or NULL
> + */
> +typedef struct rte_device * (* bus_device_get_t)(struct rte_device *dev);

>From my experience it is better to delegate this to a helper:

int bus_for_each_dev(struct rte_bus *bus, struct rte_device *start,
int (*match)(struct rte_device *dev, void *data), void *data);

Based on that you can easily implement all kinds of functions like
bus_find_dev(), bus_find_dev_by_name(), bus_dump(), ...

> +
> +struct rte_bus {
> +       TAILQ_ENTRY(rte_bus) next;   /**< Next bus object in linked list */
> +       struct rte_driver_list driver_list; /**< List of all drivers of bus */

TAILQ_HEAD?

> +       struct rte_device_list device_list; /**< List of all devices on bus */

TAILQ_HEAD?

> +       const char *name;            /**< Name of the bus */
> +       /* Mandatory hooks */
> +       bus_scan_t *scan;            /**< Hook for scanning for devices */
> +       bus_match_t *match;          /**< Hook for matching device & driver */
> +       /* Optional hooks */
> +       bus_dump_t *dump_dev;        /**< Hook for dumping devices on bus */
> +       bus_device_get_t *find_dev;  /**< Search for a device on bus */
> +};
> +
> +/** @internal
> + * Add a device to a bus.
> + *
> + * @param bus
> + *     Bus on which device is to be added
> + * @param dev
> + *     Device handle
> + * @return
> + *     None
> + */
> +void
> +rte_eal_bus_add_device(struct rte_bus *bus, struct rte_device *dev);

Why do we need this? From my understanding the rte_bus->scan() is
adding the devices to the rte_bus->device_list.

> +/** @internal
> + * Remove a device from its bus.
> + *
> + * @param dev
> + *     Device handle to remove
> + * @return
> + *     None
> + */
> +void
> +rte_eal_bus_remove_device(struct rte_device *dev);
> +
> +/** @internal
> + * Associate a driver with a bus.
> + *
> + * @param bus
> + *     Bus on which driver is to be added
> + * @param dev
> + *     Driver handle
> + * @return
> + *     None
> + */
> +void
> +rte_eal_bus_add_driver(struct rte_bus *bus, struct rte_driver *drv);
> +

What happens if a driver is added at runtime to a bus? Does that immediately
trigger a match & probe of unclaimed devices?

> +/** @internal
> + * Disassociate a driver from its bus.
> + *
> + * @param dev
> + *     Driver handle to remove
> + * @return
> + *     None
> + */
> +void
> +rte_eal_bus_remove_driver(struct rte_driver *drv);
> +
> +/**
> + * Register a Bus handler.
> + *
> + * @param driver
> + *   A pointer to a rte_bus structure describing the bus
> + *   to be registered.
> + */
> +void rte_eal_bus_register(struct rte_bus *bus);
> +
> +/**
> + * Unregister a Bus handler.
> + *
> + * @param driver
> + *   A pointer to a rte_bus structure describing the bus
> + *   to be unregistered.
> + */
> +void rte_eal_bus_unregister(struct rte_bus *bus);
> +
> +/**
> + * Obtain handle for bus given its name.
> + *
> + * @param bus_name
> + *     Name of the bus handle to search
> + * @return
> + *     Pointer to Bus object if name matches any registered bus object
> + *     NULL, if no matching bus found
> + */
> +struct rte_bus * rte_eal_get_bus(const char *bus_name);
> +
> +/**
> + * Register a device driver.
> + *
> + * @param driver
> + *   A pointer to a rte_dev structure describing the driver
> + *   to be registered.
> + */
> +void rte_eal_driver_register(struct rte_driver *driver);
> +
> +/**
> + * Unregister a device driver.
> + *
> + * @param driver
> + *   A pointer to a rte_dev structure describing the driver
> + *   to be unregistered.
> + */
> +void rte_eal_driver_unregister(struct rte_driver *driver);
> +
> +/** Helper for Bus registration */
> +#define RTE_PMD_REGISTER_BUS(nm, bus) \
> +RTE_INIT(businitfn_ ##nm); \
> +static void businitfn_ ##nm(void) \
> +{\
> +       (bus).name = RTE_STR(nm);\
> +       rte_eal_bus_register(&bus); \
> +}
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_BUS_H */
> diff --git a/lib/librte_eal/common/include/rte_dev.h b/lib/librte_eal/common/include/rte_dev.h
> index 8840380..b08bab5 100644
> --- a/lib/librte_eal/common/include/rte_dev.h
> +++ b/lib/librte_eal/common/include/rte_dev.h
> @@ -116,12 +116,14 @@ TAILQ_HEAD(rte_device_list, rte_device);
>
>  /* Forward declaration */
>  struct rte_driver;
> +struct rte_bus;
>
>  /**
>   * A structure describing a generic device.
>   */
>  struct rte_device {
>         TAILQ_ENTRY(rte_device) next; /**< Next device */
> +       struct rte_bus *bus;          /**< Bus on which device is placed */
>         struct rte_driver *driver;    /**< Associated driver */
>         int numa_node;                /**< NUMA node connection */
>         struct rte_devargs *devargs;  /**< Device user arguments */
> @@ -144,31 +146,29 @@ void rte_eal_device_insert(struct rte_device *dev);
>  void rte_eal_device_remove(struct rte_device *dev);
>
>  /**
> - * A structure describing a device driver.
> + * @internal
> + * TODO
>   */
> -struct rte_driver {
> -       TAILQ_ENTRY(rte_driver) next;  /**< Next in list. */
> -       const char *name;                   /**< Driver name. */
> -       const char *alias;              /**< Driver alias. */
> -};
> +typedef int (*driver_init_t)(struct rte_device *eth_dev);
>
>  /**
> - * Register a device driver.
> - *
> - * @param driver
> - *   A pointer to a rte_dev structure describing the driver
> - *   to be registered.
> + * @internal
> + * TODO
>   */
> -void rte_eal_driver_register(struct rte_driver *driver);
> +typedef int (*driver_uninit_t)(struct rte_device *eth_dev);
>
>  /**
> - * Unregister a device driver.
> - *
> - * @param driver
> - *   A pointer to a rte_dev structure describing the driver
> - *   to be unregistered.
> + * A structure describing a device driver.
>   */
> -void rte_eal_driver_unregister(struct rte_driver *driver);
> +struct rte_driver {
> +       TAILQ_ENTRY(rte_driver) next;  /**< Next in list. */
> +       struct rte_bus *bus;           /**< Bus which drivers services */

I think this should be TAILQ_ENTRY instead.

> +       const char *name;              /**< Driver name. */
> +       const char *alias;             /**< Driver alias. */
> +       driver_init_t *init;           /**< Driver initialization */
> +       driver_uninit_t *uninit;       /**< Driver uninitialization */

Shouldn't this be probe() and remove()?

> +       unsigned int dev_private_size; /**< Size of device private data ??*/

I don't think that dev_private_size is really required at this level.
First of all this is related to the rte_eth_dev structure and
therefore it really depends on the driver_init_t (aka probe()) if it
is actually allocating an rte_eth_dev or not. Anyway that is up to the
drivers probe() function.

> +};
>
>  /**
>   * Initalize all the registered drivers in this process
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 0/6] Restructure EAL device model for bus support
  2016-11-17  5:29 [RFC PATCH 0/6] Restructure EAL device model for bus support Shreyansh Jain
                   ` (5 preceding siblings ...)
  2016-11-17  5:30 ` [RFC PATCH 6/6] eal: removing eth_driver Shreyansh Jain
@ 2016-11-17 11:55 ` Jan Blunck
  2016-11-17 13:08   ` Shreyansh Jain
  2016-11-20 15:30 ` David Marchand
  7 siblings, 1 reply; 21+ messages in thread
From: Jan Blunck @ 2016-11-17 11:55 UTC (permalink / raw)
  To: Shreyansh Jain; +Cc: David Marchand, dev

On Thu, Nov 17, 2016 at 6:29 AM, Shreyansh Jain <shreyansh.jain@nxp.com> wrote:
> DPDK has been inherently a PCI inclined framework. Because of this, the
> design of device tree (or list) within DPDK is also PCI inclined. A non-PCI
> device doesn't have a way of being expressed without using hooks started from
> EAL to PMD.
>
> With this cover letter, some patches are presented which try to break this
> strict linkage of EAL with PCI devices. Aim is to generalize the device
> hierarchy on the lines of how Linux handles it:
>
>         device A1
>           |
>   +==.===='==============.============+ Bus A.
>      |                    `--> driver A11     \
>   device A2                `-> driver A12      \______
>                                                 |CPU |
>                                                 /`````
>         device A1                              /
>           |                                   /
>   +==.===='==============.============+ Bus A`
>      |                    `--> driver B11
>   device A2                `-> driver B12
>
> Simply put:
>  - a bus is connect to CPU (or core)
>  - devices are conneted to Bus
>  - drivers are running instances which manage one or more devices
>  - bus is responsible for identifying devices (and interrupt propogation)
>  - driver is responsible for initializing the device
>
> (*Reusing text from email [1])
> In context of DPDK EAL:
>  - a generic bus (not a driver, not a device). I don't know how to categorize
>    a bus. It is certainly not a device, and then handler for a bus (physical)
>    can be considered a 'bus driver'. So, just 'rte_bus'.
>  - there is a bus for each physical implementation (or virtual). So, a rte_bus
>    Object for PCI, VDEV, ABC, DEF and so on.
>  - Buses are registered just like a PMD - RTE_PMD_BUS_REGISTER()
>  - Each registered bus is part of a doubly list.
>    -- Each device inherits rte_bus
>    -- Each driver inherits rte_bus
>    -- Device and Drivers lists are part of rte_bus
>  - eth_driver is no more required - it was just a holder for PMDs to register
>    themselves. It can be replaced with rte_xxx_driver and corresponding init/
>    uninit moved to rte_driver
>  - rte_eth_dev modified to disassociate itself from rte_pci_device and connect
>    to generic rte_device
>
> Once again, improvising from [1]:
>
>                                   __ rte_bus_list
>                                  /
>                      +----------'---+
>                      |rte_bus       |
>                      | driver_list------> List of rte_bus specific
>                      | device_list----    devices
>                      | scan         | `-> List of rte_bus associated
>                      | match        |     drivers
>                      | dump         |
>                      | ..some refcnt| (#)
>                      +--|------|----+
>               _________/        \_________
>     +--------/----+                     +-\---------------+
>     |rte_device   |                     |rte_driver       |
>     | rte_bus     |                     | rte_bus         |
>     | rte_driver  |(#)                  | init            |
>     |             |                     | uninit          |
>     |  devargs    |                     | dev_private_size|
>     +---||--------+                     | drv_flags       |(#)
>         ||                              | intr_handle(2*) |(#)
>         | \                             +----------\\\----+
>         |  \_____________                           \\\
>         |                \                          |||
>  +------|---------+ +----|----------+               |||
>  |rte_pci_device  | |rte_xxx_device | (4*)          |||
>  | PCI specific   | | xxx device    |               |||
>  | info (mem,)    | | specific fns  |              / | \
>  +----------------+ +---------------+             /  |  \
>                             _____________________/  /    \
>                            /                    ___/      \
>             +-------------'--+    +------------'---+    +--'------------+
>             |rte_pci_driver  |    |rte_vdev_driver |    |rte_xxx_driver |
>             | PCI id table,  |    | <probably,     |    | ....          |
>             | other driver   |    |  nothing>      |    +---------------+
>             | data           |    +----------------+
>             +----------------+
>
>     (1*) Problem is that probe functions have different arguments. So,
>          generalizing them might be some rework in the respective device
>          layers
>     (2*) Interrupt handling for each driver type might be different. I am not
>          sure how to generalize that either. This is grey area for me.
>     (3*) Probably exposing a bitmask for device capabilities. Nothing similar
>          exists now to relate it. Don't know if that is useful. Allowing
>          applications to question a device about what it supports and what it
>          doesn't - making it more flexible at application layer (but more code
>          as well.)
>     (4*) Even vdev would be an instantiated as a device. It is not being done
>          currently.
>     (#)  Items which are still pending
>
> With this cover letter, some patches have been posted. These are _only_ for
> discussion purpose. They are not complete and neither compilable.
> All the patches, except 0001, have sufficient context about what the changes
> are and rationale for same. Obviously, code is best answer.
>
> === Patch description: ===
>
> Patch 0001: Introduce container_of macro. Originally a patch from Jan.
>
> Patch 0002: introduce changes to rte_device/rte_driver for rte_bus, and
>             rte_bus definition itself.
>
> Patch 0003: Add a new layer for 'bus driver' with linuxapp PCI as an example
>
> Patch 0004: Changes with respect to rte_bus APIs and impact on eal_common_pci

>From my point of view it is beneficial to keep the PCI support in one
place. Moving the match() and scan()
to the drivers directory doesn't seem to be technically required but
it makes the code harder to read and understand.


> Patch 0005: Change to rte_eal_init (of linuxapp only, for now) for supporting
>             bus->scan. Probe is still being done old way, but in a new wrapper
>
> Patch 0006: eth_driver removal and corresponding changes in ixgbe_ethdev, as
>             an example. Includes changes to rte_ethdev to remove most possible
>             PCI references. But, work still remains.

Making rte_ethdev independent from PCI device is not directly related
to the rest of the patches that add bus support. I think it makes
sense to handle this separately because of the impact of refactoring
rte_ethdev.


>
> === Pending Items/Questions: ===
>
>  - Interrupt and notification handling. How to allow drivers to be notified
>    of presence/plugging of a device.
>  - Placement of bus driver/handling code. librte_bus, bus/, drivers/bus?
>  -- Also from a pespective of a external library and whether symbols would be
>     available in that.
>  -- and secondary processes
>  - VDEV bus is missing from current set.
>  - Locking of list for supporting hotplugging. Or, at the least safe add/
>    remove
>  - PMDINFOGEN support or lack of it.
>  - Is there ever a case where rte_eth_dev needs to be resolved from
>    rte_pci_device? I couldn't find any such use and neither a use-case for it.
>  - There should be a way for Bus APIs to return a generic list handle so that
>    EAL doesn't need to bother about bus->driver_list like dereferencing. This
>    is untidy as well as less portable (in terms of code movement, not arch).
>  - Are more helper hooks required for a bus?
>  -- I can think of scan, match, dump, find, plug (device), unplug (device),
>     associate (driver), disassociate (driver). But, most of the work is
>     already being done by lower instances (rte_device/driver etc).
>
> Further:
>  - In next few days I will make all necessary changes on the lines mentioned
>    in the patches. This would include changing the drivers/* and librte_eal/*
>  - As an when review comments float in and agreement reached, I will keep
>    changing the model
>  - There are grey areas like interrupt, notification, locking of bus/list
>    which require more discussion. I will try and post a rfc for those as well
>    or if someone can help me on those - great

As already hinted on IRC I'm taking a look at the interrupt usage of ethdev.

>  - Change would include PCI bus and VDEV bus handling. A new bus (NXP's FSLMC)
>    would also be layered over this series to verify the model of 'bus
>    registration'. This is also part of 17.02 roadmap.
>
> [1] http://dpdk.org/ml/archives/dev/2016-November/050186.html
>
> Jan Viktorin (1):
>   eal: define container macro
>
> Shreyansh Jain (5):
>   eal: introduce bus-device-driver structure
>   bus: add bus driver layer
>   eal/common: handle bus abstraction for device/driver objects
>   eal: supporting bus model in init process
>   eal: removing eth_driver
>
>  bus/Makefile                               |  36 +++
>  bus/pci/Makefile                           |  37 +++
>  bus/pci/linuxapp/pci_bus.c                 | 418 +++++++++++++++++++++++++++++
>  bus/pci/linuxapp/pci_bus.h                 |  55 ++++
>  drivers/net/ixgbe/ixgbe_ethdev.c           |  49 ++--
>  lib/librte_eal/common/eal_common_bus.c     | 188 +++++++++++++
>  lib/librte_eal/common/eal_common_dev.c     |  31 ++-
>  lib/librte_eal/common/eal_common_pci.c     | 226 +++++++++-------
>  lib/librte_eal/common/include/rte_bus.h    | 243 +++++++++++++++++
>  lib/librte_eal/common/include/rte_common.h |  18 ++
>  lib/librte_eal/common/include/rte_dev.h    |  36 +--
>  lib/librte_eal/common/include/rte_pci.h    |  11 +-
>  lib/librte_eal/linuxapp/eal/Makefile       |   1 +
>  lib/librte_eal/linuxapp/eal/eal.c          |  51 +++-
>  lib/librte_eal/linuxapp/eal/eal_pci.c      | 298 --------------------
>  lib/librte_ether/rte_ethdev.c              |  36 ++-
>  lib/librte_ether/rte_ethdev.h              |   6 +-
>  17 files changed, 1262 insertions(+), 478 deletions(-)
>  create mode 100644 bus/Makefile
>  create mode 100644 bus/pci/Makefile
>  create mode 100644 bus/pci/linuxapp/pci_bus.c
>  create mode 100644 bus/pci/linuxapp/pci_bus.h
>  create mode 100644 lib/librte_eal/common/eal_common_bus.c
>  create mode 100644 lib/librte_eal/common/include/rte_bus.h
>
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 1/6] eal: define container macro
  2016-11-17  5:30 ` [RFC PATCH 1/6] eal: define container macro Shreyansh Jain
@ 2016-11-17 12:06   ` Jan Blunck
  2016-11-17 13:01     ` Shreyansh Jain
  0 siblings, 1 reply; 21+ messages in thread
From: Jan Blunck @ 2016-11-17 12:06 UTC (permalink / raw)
  To: Shreyansh Jain; +Cc: David Marchand, dev, Jan Viktorin

On Thu, Nov 17, 2016 at 6:30 AM, Shreyansh Jain <shreyansh.jain@nxp.com> wrote:
> From: Jan Viktorin <viktorin@rehivetech.com>
>
> Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
> ---
>  lib/librte_eal/common/include/rte_common.h | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
>
> diff --git a/lib/librte_eal/common/include/rte_common.h b/lib/librte_eal/common/include/rte_common.h
> index db5ac91..8152bd9 100644
> --- a/lib/librte_eal/common/include/rte_common.h
> +++ b/lib/librte_eal/common/include/rte_common.h
> @@ -331,6 +331,24 @@ rte_bsf32(uint32_t v)
>  #define offsetof(TYPE, MEMBER)  __builtin_offsetof (TYPE, MEMBER)
>  #endif
>
> +/**
> + * Return pointer to the wrapping struct instance.
> + * Example:
> + *
> + *  struct wrapper {
> + *      ...
> + *      struct child c;
> + *      ...
> + *  };
> + *
> + *  struct child *x = obtain(...);
> + *  struct wrapper *w = container_of(x, struct wrapper, c);
> + */
> +#ifndef container_of
> +#define container_of(p, type, member) \
> +       ((type *) (((char *) (p)) - offsetof(type, member)))

Are there any reasons why you choose to implement this in a non-type
safe way? Catching obvious bugs at compile time is in the interest of
us and our users from my point of view.


> +#endif
> +
>  #define _RTE_STR(x) #x
>  /** Take a macro value and get a string version of it */
>  #define RTE_STR(x) _RTE_STR(x)
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 6/6] eal: removing eth_driver
  2016-11-17  5:30 ` [RFC PATCH 6/6] eal: removing eth_driver Shreyansh Jain
@ 2016-11-17 12:53   ` Jan Blunck
  2016-11-18 13:05     ` Shreyansh Jain
  0 siblings, 1 reply; 21+ messages in thread
From: Jan Blunck @ 2016-11-17 12:53 UTC (permalink / raw)
  To: Shreyansh Jain; +Cc: David Marchand, dev

On Thu, Nov 17, 2016 at 6:30 AM, Shreyansh Jain <shreyansh.jain@nxp.com> wrote:
> This patch demonstrates how eth_driver can be replaced with appropriate
> changes for rte_xxx_driver from the PMD itself. It uses ixgbe_ethernet as
> an example.
>
> A large set of changes exists in the rte_ethdev.c - primarily because too
> much PCI centric code (names, assumption of rte_pci_device) still exists
> in it. Most, except symbol naming, has been changed in this patch.
>
> This proposes that:
>  - PMD would declare the rte_xxx_driver. In case of ixgbe, it would be
>    rte_pci_driver.
>  - Probe and remove continue to exists in rte_pci_driver. But, the
>    rte_driver has new hooks for init and uninit. The rationale is that
>    once a ethernet or cryto device is created, the rte_driver->init would
>    be responsible for initializing the device.
>    -- Eth_dev -> rte_driver -> rte_pci_driver
>                           |    `-> probe/remove
>                           `--> init/uninit

Hmm, from my perspective this moves struct rte_driver a step closer to
struct rte_eth_dev instead of decoupling them. It is up to the
rte_driver->probe if it wants to allocate a struct rte_eth_dev,
rte_crypto_dev or the famous rte_foo_dev.

Instead of explicitly modelling rte_eth_dev specifics like init, unit
or dev_private_size I think we should delegate this to the
rte_driver->probe instead. Most of what is in rte_eth_dev_pci_probe()
today is anyway a rte_eth_dev_allocate_priv() anyway. I already have
some patches in this area in my patch stack.


>  - necessary changes in the rte_eth_dev have also been done so that it
>    refers to the rte_device and rte_driver rather than rte_xxx_*. This
>    would imply, ethernet device is 'linked' to a rte_device/rte_driver
>    which in turn is a rte_xxx_device/rte_xxx_driver type.
>    - for all operations related to extraction relvant xxx type,
>      container_of would have to be used.
>
> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
> ---
>  drivers/net/ixgbe/ixgbe_ethdev.c | 49 +++++++++++++++++++++-------------------
>  lib/librte_ether/rte_ethdev.c    | 36 +++++++++++++++++------------
>  lib/librte_ether/rte_ethdev.h    |  6 ++---
>  3 files changed, 51 insertions(+), 40 deletions(-)
>
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
> index edc9b22..acead31 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> @@ -1419,7 +1419,7 @@ eth_ixgbevf_dev_init(struct rte_eth_dev *eth_dev)
>                 return 0;
>         }
>
> -       pci_dev = eth_dev->pci_dev;
> +       pci_dev = container_of(eth_dev->device, struct rte_pci_device, device);
>
>         rte_eth_copy_pci_info(eth_dev, pci_dev);
>
> @@ -1532,7 +1532,9 @@ static int
>  eth_ixgbevf_dev_uninit(struct rte_eth_dev *eth_dev)
>  {
>         struct ixgbe_hw *hw;
> -       struct rte_pci_device *pci_dev = eth_dev->pci_dev;
> +       struct rte_pci_device *pci_dev;
> +
> +       pci_dev = container_of(eth_dev->device, struct rte_pci_device, device);
>
>         PMD_INIT_FUNC_TRACE();
>
> @@ -1562,32 +1564,33 @@ eth_ixgbevf_dev_uninit(struct rte_eth_dev *eth_dev)
>         return 0;
>  }
>
> -static struct eth_driver rte_ixgbe_pmd = {
> -       .pci_drv = {
> -               .id_table = pci_id_ixgbe_map,
> -               .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
> -                       RTE_PCI_DRV_DETACHABLE,
> -               .probe = rte_eth_dev_pci_probe,
> -               .remove = rte_eth_dev_pci_remove,
> +static struct rte_pci_driver rte_ixgbe_pci_driver = {
> +       .id_table = pci_id_ixgbe_map,
> +       .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
> +                    RTE_PCI_DRV_DETACHABLE,
> +       .probe = rte_eth_dev_pci_probe,
> +       .remove = rte_eth_dev_pci_remove,
> +       .driver = {
> +               .driver_init_t= eth_ixgbe_dev_init,
> +               .driver_uninit_t= eth_ixgbe_dev_uninit,
> +               .dev_private_size = sizeof(struct ixgbe_adapter),
>         },
> -       .eth_dev_init = eth_ixgbe_dev_init,
> -       .eth_dev_uninit = eth_ixgbe_dev_uninit,
> -       .dev_private_size = sizeof(struct ixgbe_adapter),
>  };
>
>  /*
>   * virtual function driver struct
>   */
> -static struct eth_driver rte_ixgbevf_pmd = {
> -       .pci_drv = {
> -               .id_table = pci_id_ixgbevf_map,
> -               .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
> -               .probe = rte_eth_dev_pci_probe,
> -               .remove = rte_eth_dev_pci_remove,
> +static struct rte_pci_driver rte_ixgbevf_pci_driver = {
> +       .id_table = pci_id_ixgbevf_map,
> +       .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
> +       .probe = rte_eth_dev_pci_probe,
> +       .remove = rte_eth_dev_pci_remove,
> +       .driver = {
> +               /* rte_driver hooks */
> +               .init = eth_ixgbevf_dev_init,
> +               .uninit = eth_ixgbevf_dev_uninit,
> +               .dev_private_size = sizeof(struct ixgbe_adapter),
>         },
> -       .eth_dev_init = eth_ixgbevf_dev_init,
> -       .eth_dev_uninit = eth_ixgbevf_dev_uninit,
> -       .dev_private_size = sizeof(struct ixgbe_adapter),
>  };
>
>  static int
> @@ -7592,7 +7595,7 @@ ixgbevf_dev_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
>         ixgbevf_dev_interrupt_action(dev);
>  }
>
> -RTE_PMD_REGISTER_PCI(net_ixgbe, rte_ixgbe_pmd.pci_drv);
> +RTE_PMD_REGISTER_PCI(net_ixgbe, rte_ixgbe_pci_driver);
>  RTE_PMD_REGISTER_PCI_TABLE(net_ixgbe, pci_id_ixgbe_map);
> -RTE_PMD_REGISTER_PCI(net_ixgbe_vf, rte_ixgbevf_pmd.pci_drv);
> +RTE_PMD_REGISTER_PCI(net_ixgbe_vf, rte_ixgbevf_pci_driver);
>  RTE_PMD_REGISTER_PCI_TABLE(net_ixgbe_vf, pci_id_ixgbevf_map);
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index fde8112..3535ff4 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -235,13 +235,13 @@ int
>  rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
>                       struct rte_pci_device *pci_dev)
>  {
> -       struct eth_driver    *eth_drv;
> +       struct rte_driver *drv;
>         struct rte_eth_dev *eth_dev;
>         char ethdev_name[RTE_ETH_NAME_MAX_LEN];
>
>         int diag;
>
> -       eth_drv = (struct eth_driver *)pci_drv;
> +       drv = pci_drv->driver;
>
>         rte_eal_pci_device_name(&pci_dev->addr, ethdev_name,
>                         sizeof(ethdev_name));
> @@ -252,13 +252,13 @@ rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
>
>         if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
>                 eth_dev->data->dev_private = rte_zmalloc("ethdev private structure",
> -                                 eth_drv->dev_private_size,
> +                                 drv->dev_private_size,
>                                   RTE_CACHE_LINE_SIZE);
>                 if (eth_dev->data->dev_private == NULL)
>                         rte_panic("Cannot allocate memzone for private port data\n");
>         }
> -       eth_dev->pci_dev = pci_dev;
> -       eth_dev->driver = eth_drv;
> +       eth_dev->device = pci_dev->device;
> +       eth_dev->driver = drv;
>         eth_dev->data->rx_mbuf_alloc_failed = 0;
>
>         /* init user callbacks */
> @@ -270,7 +270,7 @@ rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
>         eth_dev->data->mtu = ETHER_MTU;
>
>         /* Invoke PMD device initialization function */
> -       diag = (*eth_drv->eth_dev_init)(eth_dev);
> +       diag = (*drv->init)(eth_dev);
>         if (diag == 0)
>                 return 0;
>
> @@ -287,7 +287,7 @@ rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
>  int
>  rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev)
>  {
> -       const struct eth_driver *eth_drv;
> +       const struct rte_driver *drv;
>         struct rte_eth_dev *eth_dev;
>         char ethdev_name[RTE_ETH_NAME_MAX_LEN];
>         int ret;
> @@ -302,11 +302,11 @@ rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev)
>         if (eth_dev == NULL)
>                 return -ENODEV;
>
> -       eth_drv = (const struct eth_driver *)pci_dev->driver;
> +       drv = pci_dev->driver;
>
>         /* Invoke PMD device uninit function */
> -       if (*eth_drv->eth_dev_uninit) {
> -               ret = (*eth_drv->eth_dev_uninit)(eth_dev);
> +       if (*drv->uninit) {
> +               ret = (*drv->uninit)(eth_dev);
>                 if (ret)
>                         return ret;
>         }
> @@ -317,7 +317,7 @@ rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev)
>         if (rte_eal_process_type() == RTE_PROC_PRIMARY)
>                 rte_free(eth_dev->data->dev_private);
>
> -       eth_dev->pci_dev = NULL;
> +       eth_dev->device = NULL;
>         eth_dev->driver = NULL;
>         eth_dev->data = NULL;
>
> @@ -1556,7 +1556,7 @@ rte_eth_dev_info_get(uint8_t port_id, struct rte_eth_dev_info *dev_info)
>
>         RTE_FUNC_PTR_OR_RET(*dev->dev_ops->dev_infos_get);
>         (*dev->dev_ops->dev_infos_get)(dev, dev_info);
> -       dev_info->pci_dev = dev->pci_dev;
> +       dev_info->device = dev->device;
>         dev_info->driver_name = dev->data->drv_name;
>         dev_info->nb_rx_queues = dev->data->nb_rx_queues;
>         dev_info->nb_tx_queues = dev->data->nb_tx_queues;
> @@ -2537,6 +2537,7 @@ rte_eth_dev_rx_intr_ctl(uint8_t port_id, int epfd, int op, void *data)
>  {
>         uint32_t vec;
>         struct rte_eth_dev *dev;
> +       struct rte_pci_device *pci_dev;
>         struct rte_intr_handle *intr_handle;
>         uint16_t qid;
>         int rc;
> @@ -2544,6 +2545,10 @@ rte_eth_dev_rx_intr_ctl(uint8_t port_id, int epfd, int op, void *data)
>         RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
>
>         dev = &rte_eth_devices[port_id];
> +       /* TODO intr_handle is currently in rte_pci_device;
> +        * Below is incorrect until that time
> +        */
> +       pci_dev = container_of(dev->device, struct rte_pci_device, device);
>         intr_handle = &dev->pci_dev->intr_handle;
>         if (!intr_handle->intr_vec) {
>                 RTE_PMD_DEBUG_TRACE("RX Intr vector unset\n");
> @@ -2572,7 +2577,7 @@ rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char *ring_name,
>         const struct rte_memzone *mz;
>
>         snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d",
> -                dev->driver->pci_drv.driver.name, ring_name,
> +                dev->driver->name, ring_name,
>                  dev->data->port_id, queue_id);
>
>         mz = rte_memzone_lookup(z_name);
> @@ -2593,6 +2598,7 @@ rte_eth_dev_rx_intr_ctl_q(uint8_t port_id, uint16_t queue_id,
>  {
>         uint32_t vec;
>         struct rte_eth_dev *dev;
> +       struct rte_pci_device *pci_dev;
>         struct rte_intr_handle *intr_handle;
>         int rc;
>
> @@ -2604,7 +2610,9 @@ rte_eth_dev_rx_intr_ctl_q(uint8_t port_id, uint16_t queue_id,
>                 return -EINVAL;
>         }
>
> -       intr_handle = &dev->pci_dev->intr_handle;
> +       /* TODO; Until intr_handle is available in rte_device, below is incorrect */
> +       pci_dev = container_of(dev->device, struct rte_pci_device, device);
> +       intr_handle = &pci_dev->intr_handle;
>         if (!intr_handle->intr_vec) {
>                 RTE_PMD_DEBUG_TRACE("RX Intr vector unset\n");
>                 return -EPERM;
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index 38641e8..2b1d826 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -876,7 +876,7 @@ struct rte_eth_conf {
>   * Ethernet device information
>   */
>  struct rte_eth_dev_info {
> -       struct rte_pci_device *pci_dev; /**< Device PCI information. */
> +       struct rte_device *device; /**< Device PCI information. */

We already the situation that virtual devices don't set the pci_dev
field. I wonder if it really makes sense to replace it with a struct
rte_device because that is not adding a lot of value (only numa_node).
If we break ABI we might want to add numa_node and dev_flags as
suggested by Stephen Hemminger already. If we choose to not break ABI
we can delegate the population of the pci_dev field to dev_infos_get.
I already have that patch in my patch stack too.

The problem with rte_eth_dev_info is that it doesn't support
extensions. Maybe its time to add rte_eth_dev_info_ex() ... or
rte_eth_dev_xinfo() if you don't like the IB verbs API.


>         const char *driver_name; /**< Device Driver name. */
>         unsigned int if_index; /**< Index to bound host interface, or 0 if none.
>                 Use if_indextoname() to translate into an interface name. */
> @@ -1623,9 +1623,9 @@ struct rte_eth_dev {
>         eth_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive function. */
>         eth_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit function. */
>         struct rte_eth_dev_data *data;  /**< Pointer to device data */
> -       const struct eth_driver *driver;/**< Driver for this device */
> +       const struct rte_driver *driver;/**< Driver for this device */
>         const struct eth_dev_ops *dev_ops; /**< Functions exported by PMD */
> -       struct rte_pci_device *pci_dev; /**< PCI info. supplied by probing */
> +       struct rte_device *device; /**< Device instance */
>         /** User application callbacks for NIC interrupts */
>         struct rte_eth_dev_cb_list link_intr_cbs;
>         /**
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 2/6] eal: introduce bus-device-driver structure
  2016-11-17 11:19   ` Jan Blunck
@ 2016-11-17 13:00     ` Shreyansh Jain
  2016-11-17 16:13       ` Jan Blunck
  0 siblings, 1 reply; 21+ messages in thread
From: Shreyansh Jain @ 2016-11-17 13:00 UTC (permalink / raw)
  To: Jan Blunck; +Cc: David Marchand, dev

Hello Jan,

Thanks for comments. Replies inline.

On Thursday 17 November 2016 04:49 PM, Jan Blunck wrote:
> On Thu, Nov 17, 2016 at 6:30 AM, Shreyansh Jain <shreyansh.jain@nxp.com> wrote:
>> A device is connected to a bus and services by a driver associated with
>> the bus. It is responsibility of the bus to identify the devices (scan)
>> and then assign each device to a matching driver.
>>
>> A PMD would allocate a rte_xxx_driver and rte_xxx_device.
>> rte_xxx_driver has rte_driver and rte_bus embedded. Similarly, rte_xxx_device
>> has rte_device and rte_bus embedded.
>
> I don't think so: the rte_xxx_device embeds the generic rte_device and
> references a the rte_bus
> that it is attached to.

You mean?

  struct rte_pci_device {
    struct rte_device device;
    struct rte_bus *bus;
    ...
}

If yes then I have a different view.
'device' is connected to a bus. pci_device is just a type of device. 
Only way it should know about the bus is through the parent rte_device.
rte_device can reference the bus.

>
>> When a ethernet or crypto device (rte_eth_dev, rte_cryptodev) is allocated,
>> it contains a reference of rte_device and rte_driver.
>> Each ethernet device implementation would use container_of for finding the
>> enclosing structure of rte_xxx_*.
>>
>>                             +-------------------+
>>  +--------------+           |rte_pci_device     |
>>  |rte_eth_dev   |           |+-----------------+|
>>  |+------------+|   .-------->rte_device       ||
>>  ||rte_device*-----'        |+-----------------+|
>>  |+------------+|           ||rte_bus          ||
>>  |              |           |+-----------------+|
>>  /              /           +-------------------+
>>
>> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
>> ---
>>  lib/librte_eal/common/include/rte_bus.h | 243 ++++++++++++++++++++++++++++++++
>>  lib/librte_eal/common/include/rte_dev.h |  36 ++---
>>  2 files changed, 261 insertions(+), 18 deletions(-)
>>  create mode 100644 lib/librte_eal/common/include/rte_bus.h
>>
>> diff --git a/lib/librte_eal/common/include/rte_bus.h b/lib/librte_eal/common/include/rte_bus.h
>> new file mode 100644
>> index 0000000..dc3aeb8
>> --- /dev/null
>> +++ b/lib/librte_eal/common/include/rte_bus.h
>> @@ -0,0 +1,243 @@
>> +/*-
>> + *   BSD LICENSE
>> + *
>> + *   Copyright(c) 2016 NXP
>> + *   All rights reserved.
>> + *
>> + *   Redistribution and use in source and binary forms, with or without
>> + *   modification, are permitted provided that the following conditions
>> + *   are met:
>> + *
>> + *     * Redistributions of source code must retain the above copyright
>> + *       notice, this list of conditions and the following disclaimer.
>> + *     * Redistributions in binary form must reproduce the above copyright
>> + *       notice, this list of conditions and the following disclaimer in
>> + *       the documentation and/or other materials provided with the
>> + *       distribution.
>> + *     * Neither the name of NXP nor the names of its
>> + *       contributors may be used to endorse or promote products derived
>> + *       from this software without specific prior written permission.
>> + *
>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> + */
>> +
>> +#ifndef _RTE_BUS_H_
>> +#define _RTE_BUS_H_
>> +
>> +/**
>> + * @file
>> + *
>> + * RTE PMD Bus Abstraction interfaces
>> + *
>> + * This file exposes APIs and Interfaces for Bus Abstraction over the devices
>> + * drivers in EAL.
>> + */
>> +
>> +#ifdef __cplusplus
>> +extern "C" {
>> +#endif
>> +
>> +#include <stdio.h>
>> +#include <sys/queue.h>
>> +
>> +#include <rte_log.h>
>> +#include <rte_dev.h>
>> +
>> +
>> +/** Double linked list of buses */
>> +TAILQ_HEAD(rte_bus_list, rte_bus);
>> +
>> +/**
>> + * Bus specific scan for devices attached on the bus.
>> + * For each bus object, the scan would be reponsible for finding devices and
>> + * adding them to its private device list.
>> + *
>> + * Successful detection of a device results in rte_device object which is
>> + * embedded within the respective device type (rte_pci_device, for example).
>> + * Thereafter, PCI specific bus would need to perform
>> + * container_of(rte_pci_device) to obtain PCI device object.
>> + *
>> + * Scan failure of a bus is not treated as exit criteria for application. Scan
>> + * for all other buses would still continue.
>> + *
>> + * @param void
>> + * @return
>> + *     0 for successful scan
>> + *     !0 (<0) for unsuccessful scan with error value
>> + */
>> +typedef int (* bus_scan_t)(void);
>> +
>> +/**
>> + * Bus specific match for devices and drivers which can service them.
>> + * For each scanned device, during probe the match would link the devices with
>> + * drivers which can service the device.
>> + *
>> + * It is the work of each bus handler to obtain the specific device object
>> + * using container_of (or typecasting, as a less preferred way).
>> + *
>> + * @param drv
>> + *     Driver object attached to the bus
>> + * @param dev
>> + *     Device object which is being probed.
>> + * @return
>> + *     0 for successful match
>> + *     !0 for unsuccessful match
>> + */
>> +typedef int (* bus_match_t)(struct rte_driver *drv, struct rte_device *dev);
>
> Do you think this should do match & probe?

It is only matching as of now.
rte_eal_probe() from eal.c calls this function for the bus and probe for 
each device of that bus. (Look at 'pci_probe_all_drivers' called from 
rte_eal_pci_probe).

>
> I believe it is better to separate this into two functions to match()
> and probe(). The
> result of matching tells if the driver would want to claim the device
> in general. But
> probe()ing the device should only happen if the device isn't claimed
> by a driver yet and
> is not blacklisted.

Agree. This is what rte_eal_pci_probe() is doing for PCI. Similar 
implementation would exists in bus specific area (this is a different 
debate where...) for probe on each driver (associated with that bus).

>
>> +
>> +/**
>> + * Dump the devices on the bus.
>> + * Each bus type can define its own definition of information to dump.
>> + *
>> + * @param bus
>> + *     Handle for bus, device from which are to be dumped.
>> + * @param f
>> + *     Handle to output device or file.
>> + * @return void
>> + */
>> +typedef void (* bus_dump_t)(struct rte_bus *bus, FILE *f);
>> +
>> +/**
>> + * Search for a specific device in device list of the bus
>> + * This would rely on the bus specific addressing. Each implementation would
>> + * extract its specific device type and perform address compare.
>> + *
>> + * @param dev
>> + *     device handle to search for.
>> + * @return
>> + *     rte_device handle for matched device, or NULL
>> + */
>> +typedef struct rte_device * (* bus_device_get_t)(struct rte_device *dev);
>
> From my experience it is better to delegate this to a helper:
>
> int bus_for_each_dev(struct rte_bus *bus, struct rte_device *start,
> int (*match)(struct rte_device *dev, void *data), void *data);
>
> Based on that you can easily implement all kinds of functions like
> bus_find_dev(), bus_find_dev_by_name(), bus_dump(), ...

Interesting idea. Thanks. I will have a look on this.

>
>> +
>> +struct rte_bus {
>> +       TAILQ_ENTRY(rte_bus) next;   /**< Next bus object in linked list */
>> +       struct rte_driver_list driver_list; /**< List of all drivers of bus */
>
> TAILQ_HEAD?

Yes. That should be TAILQ_HEAD. I just reused the definition create in 
rte_dev.h using TAILQ_HEAD. I will change this.

>
>> +       struct rte_device_list device_list; /**< List of all devices on bus */
>
> TAILQ_HEAD?

Same as above. I will change this.

>
>> +       const char *name;            /**< Name of the bus */
>> +       /* Mandatory hooks */
>> +       bus_scan_t *scan;            /**< Hook for scanning for devices */
>> +       bus_match_t *match;          /**< Hook for matching device & driver */
>> +       /* Optional hooks */
>> +       bus_dump_t *dump_dev;        /**< Hook for dumping devices on bus */
>> +       bus_device_get_t *find_dev;  /**< Search for a device on bus */
>> +};
>> +
>> +/** @internal
>> + * Add a device to a bus.
>> + *
>> + * @param bus
>> + *     Bus on which device is to be added
>> + * @param dev
>> + *     Device handle
>> + * @return
>> + *     None
>> + */
>> +void
>> +rte_eal_bus_add_device(struct rte_bus *bus, struct rte_device *dev);
>
> Why do we need this? From my understanding the rte_bus->scan() is
> adding the devices to the rte_bus->device_list.

Plugging a device *after* scan has been completed. Hotplugging.
Specially for vdev, this would be required, in my opinion.

>
>> +/** @internal
>> + * Remove a device from its bus.
>> + *
>> + * @param dev
>> + *     Device handle to remove
>> + * @return
>> + *     None
>> + */
>> +void
>> +rte_eal_bus_remove_device(struct rte_device *dev);
>> +
>> +/** @internal
>> + * Associate a driver with a bus.
>> + *
>> + * @param bus
>> + *     Bus on which driver is to be added
>> + * @param dev
>> + *     Driver handle
>> + * @return
>> + *     None
>> + */
>> +void
>> +rte_eal_bus_add_driver(struct rte_bus *bus, struct rte_driver *drv);
>> +
>
> What happens if a driver is added at runtime to a bus? Does that immediately
> trigger a match & probe of unclaimed devices?

My take:
  - scan the bus for any newly plugged devices. Might be required in 
case a device is logical entity represented by multiple entries in sysfs.
  -- add only those which are not already added - maybe address/id match
  - Trigger driver probe for all devices which don't have a driver 
assigned to them (unclaimed, as you stated).

(This is part of my further changes - I think I forgot to put them as 
note in cover letter).

>
>> +/** @internal
>> + * Disassociate a driver from its bus.
>> + *
>> + * @param dev
>> + *     Driver handle to remove
>> + * @return
>> + *     None
>> + */
>> +void
>> +rte_eal_bus_remove_driver(struct rte_driver *drv);
>> +
>> +/**
>> + * Register a Bus handler.
>> + *
>> + * @param driver
>> + *   A pointer to a rte_bus structure describing the bus
>> + *   to be registered.
>> + */
>> +void rte_eal_bus_register(struct rte_bus *bus);
>> +
>> +/**
>> + * Unregister a Bus handler.
>> + *
>> + * @param driver
>> + *   A pointer to a rte_bus structure describing the bus
>> + *   to be unregistered.
>> + */
>> +void rte_eal_bus_unregister(struct rte_bus *bus);
>> +
>> +/**
>> + * Obtain handle for bus given its name.
>> + *
>> + * @param bus_name
>> + *     Name of the bus handle to search
>> + * @return
>> + *     Pointer to Bus object if name matches any registered bus object
>> + *     NULL, if no matching bus found
>> + */
>> +struct rte_bus * rte_eal_get_bus(const char *bus_name);
>> +
>> +/**
>> + * Register a device driver.
>> + *
>> + * @param driver
>> + *   A pointer to a rte_dev structure describing the driver
>> + *   to be registered.
>> + */
>> +void rte_eal_driver_register(struct rte_driver *driver);
>> +
>> +/**
>> + * Unregister a device driver.
>> + *
>> + * @param driver
>> + *   A pointer to a rte_dev structure describing the driver
>> + *   to be unregistered.
>> + */
>> +void rte_eal_driver_unregister(struct rte_driver *driver);
>> +
>> +/** Helper for Bus registration */
>> +#define RTE_PMD_REGISTER_BUS(nm, bus) \
>> +RTE_INIT(businitfn_ ##nm); \
>> +static void businitfn_ ##nm(void) \
>> +{\
>> +       (bus).name = RTE_STR(nm);\
>> +       rte_eal_bus_register(&bus); \
>> +}
>> +
>> +#ifdef __cplusplus
>> +}
>> +#endif
>> +
>> +#endif /* _RTE_BUS_H */
>> diff --git a/lib/librte_eal/common/include/rte_dev.h b/lib/librte_eal/common/include/rte_dev.h
>> index 8840380..b08bab5 100644
>> --- a/lib/librte_eal/common/include/rte_dev.h
>> +++ b/lib/librte_eal/common/include/rte_dev.h
>> @@ -116,12 +116,14 @@ TAILQ_HEAD(rte_device_list, rte_device);
>>
>>  /* Forward declaration */
>>  struct rte_driver;
>> +struct rte_bus;
>>
>>  /**
>>   * A structure describing a generic device.
>>   */
>>  struct rte_device {
>>         TAILQ_ENTRY(rte_device) next; /**< Next device */
>> +       struct rte_bus *bus;          /**< Bus on which device is placed */
>>         struct rte_driver *driver;    /**< Associated driver */
>>         int numa_node;                /**< NUMA node connection */
>>         struct rte_devargs *devargs;  /**< Device user arguments */
>> @@ -144,31 +146,29 @@ void rte_eal_device_insert(struct rte_device *dev);
>>  void rte_eal_device_remove(struct rte_device *dev);
>>
>>  /**
>> - * A structure describing a device driver.
>> + * @internal
>> + * TODO
>>   */
>> -struct rte_driver {
>> -       TAILQ_ENTRY(rte_driver) next;  /**< Next in list. */
>> -       const char *name;                   /**< Driver name. */
>> -       const char *alias;              /**< Driver alias. */
>> -};
>> +typedef int (*driver_init_t)(struct rte_device *eth_dev);
>>
>>  /**
>> - * Register a device driver.
>> - *
>> - * @param driver
>> - *   A pointer to a rte_dev structure describing the driver
>> - *   to be registered.
>> + * @internal
>> + * TODO
>>   */
>> -void rte_eal_driver_register(struct rte_driver *driver);
>> +typedef int (*driver_uninit_t)(struct rte_device *eth_dev);
>>
>>  /**
>> - * Unregister a device driver.
>> - *
>> - * @param driver
>> - *   A pointer to a rte_dev structure describing the driver
>> - *   to be unregistered.
>> + * A structure describing a device driver.
>>   */
>> -void rte_eal_driver_unregister(struct rte_driver *driver);
>> +struct rte_driver {
>> +       TAILQ_ENTRY(rte_driver) next;  /**< Next in list. */
>> +       struct rte_bus *bus;           /**< Bus which drivers services */
>
> I think this should be TAILQ_ENTRY instead.

rte_bus
       `-> device_list-. <- TAILQ_HEAD (rte_device)
                       |
                 rte_pci_device:TAILQ_ENTRY(rte_device)

rte_device just references the bus it belong to.

Am I missing something?

>
>> +       const char *name;              /**< Driver name. */
>> +       const char *alias;             /**< Driver alias. */
>> +       driver_init_t *init;           /**< Driver initialization */
>> +       driver_uninit_t *uninit;       /**< Driver uninitialization */
>
> Shouldn't this be probe() and remove()?

rte_xxx_driver already includes probe and remove.
Mandate for init is to allocate the ethernet or cryptodev (or some other 
functional unit). Whereas, probe is responsible for driver specific 
intialization - PCI specific, in case of PCI driver.

Initially I had thought of moving rte_pci_driver->probe() to rte_driver. 
But, I couldn't understand where would functional device initialization 
exist. It cannot be rte_xxx_driver. It should be generic place.

Such functional device initialization is in path of the probe function, 
but so does driver specific information. If we remove 
rte_pci_driver->probe and move to rte_driver->probe(), it would only 
mean rte_driver calling a hook within the PCI driver specific domain.

>
>> +       unsigned int dev_private_size; /**< Size of device private data ??*/
>
> I don't think that dev_private_size is really required at this level.
> First of all this is related to the rte_eth_dev structure and
> therefore it really depends on the driver_init_t (aka probe()) if it
> is actually allocating an rte_eth_dev or not. Anyway that is up to the
> drivers probe() function.

I moved it based on input through ML (or IRC, I don't remember).
My understanding: if this is common for all rte_xxx_driver instances 
(for allocating their ethernet/crypto structure), it would actually make 
sense to keep at this level so that init() can use it for allocating 
necessary space before handling over the rte_xxx_driver.

But again, I am OK keeping it in the underlying layer - as you have 
rightly pointed out, it would only be used at rte_xxx_driver layer.

>
>> +};
>>
>>  /**
>>   * Initalize all the registered drivers in this process
>> --
>> 2.7.4
>>
>

-
Shreyansh

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 1/6] eal: define container macro
  2016-11-17 12:06   ` Jan Blunck
@ 2016-11-17 13:01     ` Shreyansh Jain
  0 siblings, 0 replies; 21+ messages in thread
From: Shreyansh Jain @ 2016-11-17 13:01 UTC (permalink / raw)
  To: Jan Blunck; +Cc: David Marchand, dev, Jan Viktorin

On Thursday 17 November 2016 05:36 PM, Jan Blunck wrote:
> On Thu, Nov 17, 2016 at 6:30 AM, Shreyansh Jain <shreyansh.jain@nxp.com> wrote:
>> From: Jan Viktorin <viktorin@rehivetech.com>
>>
>> Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
>> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
>> ---
>>  lib/librte_eal/common/include/rte_common.h | 18 ++++++++++++++++++
>>  1 file changed, 18 insertions(+)
>>
>> diff --git a/lib/librte_eal/common/include/rte_common.h b/lib/librte_eal/common/include/rte_common.h
>> index db5ac91..8152bd9 100644
>> --- a/lib/librte_eal/common/include/rte_common.h
>> +++ b/lib/librte_eal/common/include/rte_common.h
>> @@ -331,6 +331,24 @@ rte_bsf32(uint32_t v)
>>  #define offsetof(TYPE, MEMBER)  __builtin_offsetof (TYPE, MEMBER)
>>  #endif
>>
>> +/**
>> + * Return pointer to the wrapping struct instance.
>> + * Example:
>> + *
>> + *  struct wrapper {
>> + *      ...
>> + *      struct child c;
>> + *      ...
>> + *  };
>> + *
>> + *  struct child *x = obtain(...);
>> + *  struct wrapper *w = container_of(x, struct wrapper, c);
>> + */
>> +#ifndef container_of
>> +#define container_of(p, type, member) \
>> +       ((type *) (((char *) (p)) - offsetof(type, member)))
>
> Are there any reasons why you choose to implement this in a non-type
> safe way? Catching obvious bugs at compile time is in the interest of
> us and our users from my point of view.

No specific reason. I just took an existing patchset floating in ML. If 
you can point me to some better implementation, I will use that.

>
>
>> +#endif
>> +
>>  #define _RTE_STR(x) #x
>>  /** Take a macro value and get a string version of it */
>>  #define RTE_STR(x) _RTE_STR(x)
>> --
>> 2.7.4
>>
>


-- 
-
Shreyansh

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 0/6] Restructure EAL device model for bus support
  2016-11-17 11:55 ` [RFC PATCH 0/6] Restructure EAL device model for bus support Jan Blunck
@ 2016-11-17 13:08   ` Shreyansh Jain
  2016-11-17 16:54     ` Jan Blunck
  0 siblings, 1 reply; 21+ messages in thread
From: Shreyansh Jain @ 2016-11-17 13:08 UTC (permalink / raw)
  To: Jan Blunck; +Cc: David Marchand, dev

On Thursday 17 November 2016 05:25 PM, Jan Blunck wrote:
> On Thu, Nov 17, 2016 at 6:29 AM, Shreyansh Jain <shreyansh.jain@nxp.com> wrote:
>> DPDK has been inherently a PCI inclined framework. Because of this, the
>> design of device tree (or list) within DPDK is also PCI inclined. A non-PCI
>> device doesn't have a way of being expressed without using hooks started from
>> EAL to PMD.
>>
>> With this cover letter, some patches are presented which try to break this
>> strict linkage of EAL with PCI devices. Aim is to generalize the device
>> hierarchy on the lines of how Linux handles it:
>>
>>         device A1
>>           |
>>   +==.===='==============.============+ Bus A.
>>      |                    `--> driver A11     \
>>   device A2                `-> driver A12      \______
>>                                                 |CPU |
>>                                                 /`````
>>         device A1                              /
>>           |                                   /
>>   +==.===='==============.============+ Bus A`
>>      |                    `--> driver B11
>>   device A2                `-> driver B12
>>
>> Simply put:
>>  - a bus is connect to CPU (or core)
>>  - devices are conneted to Bus
>>  - drivers are running instances which manage one or more devices
>>  - bus is responsible for identifying devices (and interrupt propogation)
>>  - driver is responsible for initializing the device
>>
>> (*Reusing text from email [1])
>> In context of DPDK EAL:
>>  - a generic bus (not a driver, not a device). I don't know how to categorize
>>    a bus. It is certainly not a device, and then handler for a bus (physical)
>>    can be considered a 'bus driver'. So, just 'rte_bus'.
>>  - there is a bus for each physical implementation (or virtual). So, a rte_bus
>>    Object for PCI, VDEV, ABC, DEF and so on.
>>  - Buses are registered just like a PMD - RTE_PMD_BUS_REGISTER()
>>  - Each registered bus is part of a doubly list.
>>    -- Each device inherits rte_bus
>>    -- Each driver inherits rte_bus
>>    -- Device and Drivers lists are part of rte_bus
>>  - eth_driver is no more required - it was just a holder for PMDs to register
>>    themselves. It can be replaced with rte_xxx_driver and corresponding init/
>>    uninit moved to rte_driver
>>  - rte_eth_dev modified to disassociate itself from rte_pci_device and connect
>>    to generic rte_device
>>
>> Once again, improvising from [1]:
>>
>>                                   __ rte_bus_list
>>                                  /
>>                      +----------'---+
>>                      |rte_bus       |
>>                      | driver_list------> List of rte_bus specific
>>                      | device_list----    devices
>>                      | scan         | `-> List of rte_bus associated
>>                      | match        |     drivers
>>                      | dump         |
>>                      | ..some refcnt| (#)
>>                      +--|------|----+
>>               _________/        \_________
>>     +--------/----+                     +-\---------------+
>>     |rte_device   |                     |rte_driver       |
>>     | rte_bus     |                     | rte_bus         |
>>     | rte_driver  |(#)                  | init            |
>>     |             |                     | uninit          |
>>     |  devargs    |                     | dev_private_size|
>>     +---||--------+                     | drv_flags       |(#)
>>         ||                              | intr_handle(2*) |(#)
>>         | \                             +----------\\\----+
>>         |  \_____________                           \\\
>>         |                \                          |||
>>  +------|---------+ +----|----------+               |||
>>  |rte_pci_device  | |rte_xxx_device | (4*)          |||
>>  | PCI specific   | | xxx device    |               |||
>>  | info (mem,)    | | specific fns  |              / | \
>>  +----------------+ +---------------+             /  |  \
>>                             _____________________/  /    \
>>                            /                    ___/      \
>>             +-------------'--+    +------------'---+    +--'------------+
>>             |rte_pci_driver  |    |rte_vdev_driver |    |rte_xxx_driver |
>>             | PCI id table,  |    | <probably,     |    | ....          |
>>             | other driver   |    |  nothing>      |    +---------------+
>>             | data           |    +----------------+
>>             +----------------+
>>
>>     (1*) Problem is that probe functions have different arguments. So,
>>          generalizing them might be some rework in the respective device
>>          layers
>>     (2*) Interrupt handling for each driver type might be different. I am not
>>          sure how to generalize that either. This is grey area for me.
>>     (3*) Probably exposing a bitmask for device capabilities. Nothing similar
>>          exists now to relate it. Don't know if that is useful. Allowing
>>          applications to question a device about what it supports and what it
>>          doesn't - making it more flexible at application layer (but more code
>>          as well.)
>>     (4*) Even vdev would be an instantiated as a device. It is not being done
>>          currently.
>>     (#)  Items which are still pending
>>
>> With this cover letter, some patches have been posted. These are _only_ for
>> discussion purpose. They are not complete and neither compilable.
>> All the patches, except 0001, have sufficient context about what the changes
>> are and rationale for same. Obviously, code is best answer.
>>
>> === Patch description: ===
>>
>> Patch 0001: Introduce container_of macro. Originally a patch from Jan.
>>
>> Patch 0002: introduce changes to rte_device/rte_driver for rte_bus, and
>>             rte_bus definition itself.
>>
>> Patch 0003: Add a new layer for 'bus driver' with linuxapp PCI as an example
>>
>> Patch 0004: Changes with respect to rte_bus APIs and impact on eal_common_pci
>
> From my point of view it is beneficial to keep the PCI support in one
> place. Moving the match() and scan()
> to the drivers directory doesn't seem to be technically required but
> it makes the code harder to read and understand.

It is not a technical requirement - just logical placement. These are 
bus specific logic and should exist with the bus driver - where ever 
that is placed.
Though, I am not entirely sure about 'harder to read'. If a person is 
reading through PCI's bus implementation, I am assuming it would be nice 
to have all the hooks (or default hooks) at same place. Isn't it?

Or, maybe you are suggesting move all librte_eal/*/*pci* out to some 
common location. If so, that is something I haven't yet given serious 
thought.

>
>
>> Patch 0005: Change to rte_eal_init (of linuxapp only, for now) for supporting
>>             bus->scan. Probe is still being done old way, but in a new wrapper
>>
>> Patch 0006: eth_driver removal and corresponding changes in ixgbe_ethdev, as
>>             an example. Includes changes to rte_ethdev to remove most possible
>>             PCI references. But, work still remains.
>
> Making rte_ethdev independent from PCI device is not directly related
> to the rest of the patches that add bus support. I think it makes
> sense to handle this separately because of the impact of refactoring
> rte_ethdev.

Once rte_device is available, the change is independent.
Only dependency on it was changes required to rte_ethdev.c file because 
of various PCI naming/variables/functions. And that is indeed a very 
large change - including changes to drivers/* which I haven't yet done.

Are you suggesting breaking out of this series? If so, can be done but 
only when formal patches start coming out.

>
>
>>
>> === Pending Items/Questions: ===
>>
>>  - Interrupt and notification handling. How to allow drivers to be notified
>>    of presence/plugging of a device.
>>  - Placement of bus driver/handling code. librte_bus, bus/, drivers/bus?
>>  -- Also from a pespective of a external library and whether symbols would be
>>     available in that.
>>  -- and secondary processes
>>  - VDEV bus is missing from current set.
>>  - Locking of list for supporting hotplugging. Or, at the least safe add/
>>    remove
>>  - PMDINFOGEN support or lack of it.
>>  - Is there ever a case where rte_eth_dev needs to be resolved from
>>    rte_pci_device? I couldn't find any such use and neither a use-case for it.
>>  - There should be a way for Bus APIs to return a generic list handle so that
>>    EAL doesn't need to bother about bus->driver_list like dereferencing. This
>>    is untidy as well as less portable (in terms of code movement, not arch).
>>  - Are more helper hooks required for a bus?
>>  -- I can think of scan, match, dump, find, plug (device), unplug (device),
>>     associate (driver), disassociate (driver). But, most of the work is
>>     already being done by lower instances (rte_device/driver etc).
>>
>> Further:
>>  - In next few days I will make all necessary changes on the lines mentioned
>>    in the patches. This would include changing the drivers/* and librte_eal/*
>>  - As an when review comments float in and agreement reached, I will keep
>>    changing the model
>>  - There are grey areas like interrupt, notification, locking of bus/list
>>    which require more discussion. I will try and post a rfc for those as well
>>    or if someone can help me on those - great
>
> As already hinted on IRC I'm taking a look at the interrupt usage of ethdev.

Great. Thank you. Do let me know feedback for any changes that you might 
require along the way.

>
>>  - Change would include PCI bus and VDEV bus handling. A new bus (NXP's FSLMC)
>>    would also be layered over this series to verify the model of 'bus
>>    registration'. This is also part of 17.02 roadmap.
>>
>> [1] http://dpdk.org/ml/archives/dev/2016-November/050186.html
>>
>> Jan Viktorin (1):
>>   eal: define container macro
>>
>> Shreyansh Jain (5):
>>   eal: introduce bus-device-driver structure
>>   bus: add bus driver layer
>>   eal/common: handle bus abstraction for device/driver objects
>>   eal: supporting bus model in init process
>>   eal: removing eth_driver
>>
>>  bus/Makefile                               |  36 +++
>>  bus/pci/Makefile                           |  37 +++
>>  bus/pci/linuxapp/pci_bus.c                 | 418 +++++++++++++++++++++++++++++
>>  bus/pci/linuxapp/pci_bus.h                 |  55 ++++
>>  drivers/net/ixgbe/ixgbe_ethdev.c           |  49 ++--
>>  lib/librte_eal/common/eal_common_bus.c     | 188 +++++++++++++
>>  lib/librte_eal/common/eal_common_dev.c     |  31 ++-
>>  lib/librte_eal/common/eal_common_pci.c     | 226 +++++++++-------
>>  lib/librte_eal/common/include/rte_bus.h    | 243 +++++++++++++++++
>>  lib/librte_eal/common/include/rte_common.h |  18 ++
>>  lib/librte_eal/common/include/rte_dev.h    |  36 +--
>>  lib/librte_eal/common/include/rte_pci.h    |  11 +-
>>  lib/librte_eal/linuxapp/eal/Makefile       |   1 +
>>  lib/librte_eal/linuxapp/eal/eal.c          |  51 +++-
>>  lib/librte_eal/linuxapp/eal/eal_pci.c      | 298 --------------------
>>  lib/librte_ether/rte_ethdev.c              |  36 ++-
>>  lib/librte_ether/rte_ethdev.h              |   6 +-
>>  17 files changed, 1262 insertions(+), 478 deletions(-)
>>  create mode 100644 bus/Makefile
>>  create mode 100644 bus/pci/Makefile
>>  create mode 100644 bus/pci/linuxapp/pci_bus.c
>>  create mode 100644 bus/pci/linuxapp/pci_bus.h
>>  create mode 100644 lib/librte_eal/common/eal_common_bus.c
>>  create mode 100644 lib/librte_eal/common/include/rte_bus.h
>>
>> --
>> 2.7.4
>>
>

-
Shreyansh

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 2/6] eal: introduce bus-device-driver structure
  2016-11-17 13:00     ` Shreyansh Jain
@ 2016-11-17 16:13       ` Jan Blunck
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Blunck @ 2016-11-17 16:13 UTC (permalink / raw)
  To: Shreyansh Jain; +Cc: David Marchand, dev

On Thu, Nov 17, 2016 at 2:00 PM, Shreyansh Jain <shreyansh.jain@nxp.com> wrote:
>
> On Thursday 17 November 2016 04:49 PM, Jan Blunck wrote:
>>
>> On Thu, Nov 17, 2016 at 6:30 AM, Shreyansh Jain <shreyansh.jain@nxp.com>
>> wrote:
>>>
>>> A device is connected to a bus and services by a driver associated with
>>> the bus. It is responsibility of the bus to identify the devices (scan)
>>> and then assign each device to a matching driver.
>>>
>>> A PMD would allocate a rte_xxx_driver and rte_xxx_device.
>>> rte_xxx_driver has rte_driver and rte_bus embedded. Similarly,
>>> rte_xxx_device
>>> has rte_device and rte_bus embedded.
>>
>>
>> I don't think so: the rte_xxx_device embeds the generic rte_device and
>> references a the rte_bus
>> that it is attached to.
>
>
> You mean?
>
>  struct rte_pci_device {
>    struct rte_device device;
>    struct rte_bus *bus;
>    ...
> }
>
> If yes then I have a different view.
> 'device' is connected to a bus. pci_device is just a type of device. Only
> way it should know about the bus is through the parent rte_device.
> rte_device can reference the bus.
>

No. Actually my English was bad. I meant that the rte_device
references the rte_bus but shouldn't embed it.

>
>>
>>> When a ethernet or crypto device (rte_eth_dev, rte_cryptodev) is
>>> allocated,
>>> it contains a reference of rte_device and rte_driver.
>>> Each ethernet device implementation would use container_of for finding
>>> the
>>> enclosing structure of rte_xxx_*.
>>>
>>>                             +-------------------+
>>>  +--------------+           |rte_pci_device     |
>>>  |rte_eth_dev   |           |+-----------------+|
>>>  |+------------+|   .-------->rte_device       ||
>>>  ||rte_device*-----'        |+-----------------+|
>>>  |+------------+|           ||rte_bus          ||
>>>  |              |           |+-----------------+|
>>>  /              /           +-------------------+
>>>
>>> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
>>> ---
>>>  lib/librte_eal/common/include/rte_bus.h | 243
>>> ++++++++++++++++++++++++++++++++
>>>  lib/librte_eal/common/include/rte_dev.h |  36 ++---
>>>  2 files changed, 261 insertions(+), 18 deletions(-)
>>>  create mode 100644 lib/librte_eal/common/include/rte_bus.h
>>>
>>> diff --git a/lib/librte_eal/common/include/rte_bus.h
>>> b/lib/librte_eal/common/include/rte_bus.h
>>> new file mode 100644
>>> index 0000000..dc3aeb8
>>> --- /dev/null
>>> +++ b/lib/librte_eal/common/include/rte_bus.h
>>> @@ -0,0 +1,243 @@
>>> +/*-
>>> + *   BSD LICENSE
>>> + *
>>> + *   Copyright(c) 2016 NXP
>>> + *   All rights reserved.
>>> + *
>>> + *   Redistribution and use in source and binary forms, with or without
>>> + *   modification, are permitted provided that the following conditions
>>> + *   are met:
>>> + *
>>> + *     * Redistributions of source code must retain the above copyright
>>> + *       notice, this list of conditions and the following disclaimer.
>>> + *     * Redistributions in binary form must reproduce the above
>>> copyright
>>> + *       notice, this list of conditions and the following disclaimer in
>>> + *       the documentation and/or other materials provided with the
>>> + *       distribution.
>>> + *     * Neither the name of NXP nor the names of its
>>> + *       contributors may be used to endorse or promote products derived
>>> + *       from this software without specific prior written permission.
>>> + *
>>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
>>> FOR
>>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
>>> COPYRIGHT
>>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
>>> INCIDENTAL,
>>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
>>> USE,
>>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
>>> ANY
>>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
>>> USE
>>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
>>> DAMAGE.
>>> + */
>>> +
>>> +#ifndef _RTE_BUS_H_
>>> +#define _RTE_BUS_H_
>>> +
>>> +/**
>>> + * @file
>>> + *
>>> + * RTE PMD Bus Abstraction interfaces
>>> + *
>>> + * This file exposes APIs and Interfaces for Bus Abstraction over the
>>> devices
>>> + * drivers in EAL.
>>> + */
>>> +
>>> +#ifdef __cplusplus
>>> +extern "C" {
>>> +#endif
>>> +
>>> +#include <stdio.h>
>>> +#include <sys/queue.h>
>>> +
>>> +#include <rte_log.h>
>>> +#include <rte_dev.h>
>>> +
>>> +
>>> +/** Double linked list of buses */
>>> +TAILQ_HEAD(rte_bus_list, rte_bus);
>>> +
>>> +/**
>>> + * Bus specific scan for devices attached on the bus.
>>> + * For each bus object, the scan would be reponsible for finding devices
>>> and
>>> + * adding them to its private device list.
>>> + *
>>> + * Successful detection of a device results in rte_device object which
>>> is
>>> + * embedded within the respective device type (rte_pci_device, for
>>> example).
>>> + * Thereafter, PCI specific bus would need to perform
>>> + * container_of(rte_pci_device) to obtain PCI device object.
>>> + *
>>> + * Scan failure of a bus is not treated as exit criteria for
>>> application. Scan
>>> + * for all other buses would still continue.
>>> + *
>>> + * @param void
>>> + * @return
>>> + *     0 for successful scan
>>> + *     !0 (<0) for unsuccessful scan with error value
>>> + */
>>> +typedef int (* bus_scan_t)(void);
>>> +
>>> +/**
>>> + * Bus specific match for devices and drivers which can service them.
>>> + * For each scanned device, during probe the match would link the
>>> devices with
>>> + * drivers which can service the device.
>>> + *
>>> + * It is the work of each bus handler to obtain the specific device
>>> object
>>> + * using container_of (or typecasting, as a less preferred way).
>>> + *
>>> + * @param drv
>>> + *     Driver object attached to the bus
>>> + * @param dev
>>> + *     Device object which is being probed.
>>> + * @return
>>> + *     0 for successful match
>>> + *     !0 for unsuccessful match
>>> + */
>>> +typedef int (* bus_match_t)(struct rte_driver *drv, struct rte_device
>>> *dev);
>>
>>
>> Do you think this should do match & probe?
>
>
> It is only matching as of now.
> rte_eal_probe() from eal.c calls this function for the bus and probe for
> each device of that bus. (Look at 'pci_probe_all_drivers' called from
> rte_eal_pci_probe).
>
>>
>> I believe it is better to separate this into two functions to match()
>> and probe(). The
>> result of matching tells if the driver would want to claim the device
>> in general. But
>> probe()ing the device should only happen if the device isn't claimed
>> by a driver yet and
>> is not blacklisted.
>
>
> Agree. This is what rte_eal_pci_probe() is doing for PCI. Similar
> implementation would exists in bus specific area (this is a different debate
> where...) for probe on each driver (associated with that bus).
>
>
>>
>>> +
>>> +/**
>>> + * Dump the devices on the bus.
>>> + * Each bus type can define its own definition of information to dump.
>>> + *
>>> + * @param bus
>>> + *     Handle for bus, device from which are to be dumped.
>>> + * @param f
>>> + *     Handle to output device or file.
>>> + * @return void
>>> + */
>>> +typedef void (* bus_dump_t)(struct rte_bus *bus, FILE *f);
>>> +
>>> +/**
>>> + * Search for a specific device in device list of the bus
>>> + * This would rely on the bus specific addressing. Each implementation
>>> would
>>> + * extract its specific device type and perform address compare.
>>> + *
>>> + * @param dev
>>> + *     device handle to search for.
>>> + * @return
>>> + *     rte_device handle for matched device, or NULL
>>> + */
>>> +typedef struct rte_device * (* bus_device_get_t)(struct rte_device
>>> *dev);
>>
>>
>> From my experience it is better to delegate this to a helper:
>>
>> int bus_for_each_dev(struct rte_bus *bus, struct rte_device *start,
>> int (*match)(struct rte_device *dev, void *data), void *data);
>>
>> Based on that you can easily implement all kinds of functions like
>> bus_find_dev(), bus_find_dev_by_name(), bus_dump(), ...
>
>
> Interesting idea. Thanks. I will have a look on this.
>
>>
>>> +
>>> +struct rte_bus {
>>> +       TAILQ_ENTRY(rte_bus) next;   /**< Next bus object in linked list
>>> */
>>> +       struct rte_driver_list driver_list; /**< List of all drivers of
>>> bus */
>>
>>
>> TAILQ_HEAD?
>
>
> Yes. That should be TAILQ_HEAD. I just reused the definition create in
> rte_dev.h using TAILQ_HEAD. I will change this.
>
>>
>>> +       struct rte_device_list device_list; /**< List of all devices on
>>> bus */
>>
>>
>> TAILQ_HEAD?
>
>
> Same as above. I will change this.
>
>>
>>> +       const char *name;            /**< Name of the bus */
>>> +       /* Mandatory hooks */
>>> +       bus_scan_t *scan;            /**< Hook for scanning for devices
>>> */
>>> +       bus_match_t *match;          /**< Hook for matching device &
>>> driver */
>>> +       /* Optional hooks */
>>> +       bus_dump_t *dump_dev;        /**< Hook for dumping devices on bus
>>> */
>>> +       bus_device_get_t *find_dev;  /**< Search for a device on bus */
>>> +};
>>> +
>>> +/** @internal
>>> + * Add a device to a bus.
>>> + *
>>> + * @param bus
>>> + *     Bus on which device is to be added
>>> + * @param dev
>>> + *     Device handle
>>> + * @return
>>> + *     None
>>> + */
>>> +void
>>> +rte_eal_bus_add_device(struct rte_bus *bus, struct rte_device *dev);
>>
>>
>> Why do we need this? From my understanding the rte_bus->scan() is
>> adding the devices to the rte_bus->device_list.
>
>
> Plugging a device *after* scan has been completed. Hotplugging.
> Specially for vdev, this would be required, in my opinion.
>
>
>>
>>> +/** @internal
>>> + * Remove a device from its bus.
>>> + *
>>> + * @param dev
>>> + *     Device handle to remove
>>> + * @return
>>> + *     None
>>> + */
>>> +void
>>> +rte_eal_bus_remove_device(struct rte_device *dev);
>>> +
>>> +/** @internal
>>> + * Associate a driver with a bus.
>>> + *
>>> + * @param bus
>>> + *     Bus on which driver is to be added
>>> + * @param dev
>>> + *     Driver handle
>>> + * @return
>>> + *     None
>>> + */
>>> +void
>>> +rte_eal_bus_add_driver(struct rte_bus *bus, struct rte_driver *drv);
>>> +
>>
>>
>> What happens if a driver is added at runtime to a bus? Does that
>> immediately
>> trigger a match & probe of unclaimed devices?
>
>
> My take:
>  - scan the bus for any newly plugged devices. Might be required in case a
> device is logical entity represented by multiple entries in sysfs.
>  -- add only those which are not already added - maybe address/id match
>  - Trigger driver probe for all devices which don't have a driver assigned
> to them (unclaimed, as you stated).
>
> (This is part of my further changes - I think I forgot to put them as note
> in cover letter).
>
>
>>
>>> +/** @internal
>>> + * Disassociate a driver from its bus.
>>> + *
>>> + * @param dev
>>> + *     Driver handle to remove
>>> + * @return
>>> + *     None
>>> + */
>>> +void
>>> +rte_eal_bus_remove_driver(struct rte_driver *drv);
>>> +
>>> +/**
>>> + * Register a Bus handler.
>>> + *
>>> + * @param driver
>>> + *   A pointer to a rte_bus structure describing the bus
>>> + *   to be registered.
>>> + */
>>> +void rte_eal_bus_register(struct rte_bus *bus);
>>> +
>>> +/**
>>> + * Unregister a Bus handler.
>>> + *
>>> + * @param driver
>>> + *   A pointer to a rte_bus structure describing the bus
>>> + *   to be unregistered.
>>> + */
>>> +void rte_eal_bus_unregister(struct rte_bus *bus);
>>> +
>>> +/**
>>> + * Obtain handle for bus given its name.
>>> + *
>>> + * @param bus_name
>>> + *     Name of the bus handle to search
>>> + * @return
>>> + *     Pointer to Bus object if name matches any registered bus object
>>> + *     NULL, if no matching bus found
>>> + */
>>> +struct rte_bus * rte_eal_get_bus(const char *bus_name);
>>> +
>>> +/**
>>> + * Register a device driver.
>>> + *
>>> + * @param driver
>>> + *   A pointer to a rte_dev structure describing the driver
>>> + *   to be registered.
>>> + */
>>> +void rte_eal_driver_register(struct rte_driver *driver);
>>> +
>>> +/**
>>> + * Unregister a device driver.
>>> + *
>>> + * @param driver
>>> + *   A pointer to a rte_dev structure describing the driver
>>> + *   to be unregistered.
>>> + */
>>> +void rte_eal_driver_unregister(struct rte_driver *driver);
>>> +
>>> +/** Helper for Bus registration */
>>> +#define RTE_PMD_REGISTER_BUS(nm, bus) \
>>> +RTE_INIT(businitfn_ ##nm); \
>>> +static void businitfn_ ##nm(void) \
>>> +{\
>>> +       (bus).name = RTE_STR(nm);\
>>> +       rte_eal_bus_register(&bus); \
>>> +}
>>> +
>>> +#ifdef __cplusplus
>>> +}
>>> +#endif
>>> +
>>> +#endif /* _RTE_BUS_H */
>>> diff --git a/lib/librte_eal/common/include/rte_dev.h
>>> b/lib/librte_eal/common/include/rte_dev.h
>>> index 8840380..b08bab5 100644
>>> --- a/lib/librte_eal/common/include/rte_dev.h
>>> +++ b/lib/librte_eal/common/include/rte_dev.h
>>> @@ -116,12 +116,14 @@ TAILQ_HEAD(rte_device_list, rte_device);
>>>
>>>  /* Forward declaration */
>>>  struct rte_driver;
>>> +struct rte_bus;
>>>
>>>  /**
>>>   * A structure describing a generic device.
>>>   */
>>>  struct rte_device {
>>>         TAILQ_ENTRY(rte_device) next; /**< Next device */
>>> +       struct rte_bus *bus;          /**< Bus on which device is placed
>>> */
>>>         struct rte_driver *driver;    /**< Associated driver */
>>>         int numa_node;                /**< NUMA node connection */
>>>         struct rte_devargs *devargs;  /**< Device user arguments */
>>> @@ -144,31 +146,29 @@ void rte_eal_device_insert(struct rte_device *dev);
>>>  void rte_eal_device_remove(struct rte_device *dev);
>>>
>>>  /**
>>> - * A structure describing a device driver.
>>> + * @internal
>>> + * TODO
>>>   */
>>> -struct rte_driver {
>>> -       TAILQ_ENTRY(rte_driver) next;  /**< Next in list. */
>>> -       const char *name;                   /**< Driver name. */
>>> -       const char *alias;              /**< Driver alias. */
>>> -};
>>> +typedef int (*driver_init_t)(struct rte_device *eth_dev);
>>>
>>>  /**
>>> - * Register a device driver.
>>> - *
>>> - * @param driver
>>> - *   A pointer to a rte_dev structure describing the driver
>>> - *   to be registered.
>>> + * @internal
>>> + * TODO
>>>   */
>>> -void rte_eal_driver_register(struct rte_driver *driver);
>>> +typedef int (*driver_uninit_t)(struct rte_device *eth_dev);
>>>
>>>  /**
>>> - * Unregister a device driver.
>>> - *
>>> - * @param driver
>>> - *   A pointer to a rte_dev structure describing the driver
>>> - *   to be unregistered.
>>> + * A structure describing a device driver.
>>>   */
>>> -void rte_eal_driver_unregister(struct rte_driver *driver);
>>> +struct rte_driver {
>>> +       TAILQ_ENTRY(rte_driver) next;  /**< Next in list. */
>>> +       struct rte_bus *bus;           /**< Bus which drivers services */
>>
>>
>> I think this should be TAILQ_ENTRY instead.
>
>
> rte_bus
>       `-> device_list-. <- TAILQ_HEAD (rte_device)
>                       |
>                 rte_pci_device:TAILQ_ENTRY(rte_device)
>
> rte_device just references the bus it belong to.
>
> Am I missing something?
>

This is the rte_driver though. Don't we keep the global list of all
registered drivers? If yes, we need a second TAILQ_ENTRY() that is
registered with the rte_bus.


>>
>>> +       const char *name;              /**< Driver name. */
>>> +       const char *alias;             /**< Driver alias. */
>>> +       driver_init_t *init;           /**< Driver initialization */
>>> +       driver_uninit_t *uninit;       /**< Driver uninitialization */
>>
>>
>> Shouldn't this be probe() and remove()?
>
>
> rte_xxx_driver already includes probe and remove.
> Mandate for init is to allocate the ethernet or cryptodev (or some other
> functional unit). Whereas, probe is responsible for driver specific
> intialization - PCI specific, in case of PCI driver.
>

There is close to zero PCI specific code left in
rte_eth_dev_pci_probe() function. Basically it generates the correct
name from the pci_dev. Everything else it delegates to eth_driver
(which should die anyway).

Example:

struct rte_pci_driver foo_ethernet_driver {
     .driver = {
               "net_foo",
     },
     .id_table = ...,
     .drv_flags = RTE_PCI_DRV_NEED_MAPPING,
     .probe = foo_probe_pci,
}

struct rte_pci_driver foo_crypto_driver {
     .driver = {
               "crypto_foo",
     },
    .id_table = ...,
     .drv_flags = RTE_PCI_DRV_NEED_MAPPING,
     .probe = foo_probe_pci,
}

The bus looks at the driver and does everything requested (mapping
etc) before it calls the driver specific probe. If we delegate the
functional initialization to the next layer (rte_driver) this mean the
probe() operates on a rte_device. In this case it would need to upcast
to the rte_pci_device.

>From my understanding this is going two steps forward and one step
back. Maybe this confusion arises because the rte_bus->probe()
function is missing? My understanding is that the PCI rte_pci_bus is
providing the functionality that is required (kernel driver unloading,
mapping, interrupt wiring, ...) by an instance of a rte_pci_driver.



> Initially I had thought of moving rte_pci_driver->probe() to rte_driver.
> But, I couldn't understand where would functional device initialization
> exist. It cannot be rte_xxx_driver. It should be generic place.
>
> Such functional device initialization is in path of the probe function, but
> so does driver specific information. If we remove rte_pci_driver->probe and
> move to rte_driver->probe(), it would only mean rte_driver calling a hook
> within the PCI driver specific domain.
>
>>
>>> +       unsigned int dev_private_size; /**< Size of device private data
>>> ??*/
>>
>>
>> I don't think that dev_private_size is really required at this level.
>> First of all this is related to the rte_eth_dev structure and
>> therefore it really depends on the driver_init_t (aka probe()) if it
>> is actually allocating an rte_eth_dev or not. Anyway that is up to the
>> drivers probe() function.
>
>
> I moved it based on input through ML (or IRC, I don't remember).
> My understanding: if this is common for all rte_xxx_driver instances (for
> allocating their ethernet/crypto structure), it would actually make sense to
> keep at this level so that init() can use it for allocating necessary space
> before handling over the rte_xxx_driver.
>
> But again, I am OK keeping it in the underlying layer - as you have rightly
> pointed out, it would only be used at rte_xxx_driver layer.
>
>>
>>> +};
>>>
>>>  /**
>>>   * Initalize all the registered drivers in this process
>>> --
>>> 2.7.4
>>>
>>
>
> -
> Shreyansh

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 0/6] Restructure EAL device model for bus support
  2016-11-17 13:08   ` Shreyansh Jain
@ 2016-11-17 16:54     ` Jan Blunck
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Blunck @ 2016-11-17 16:54 UTC (permalink / raw)
  To: Shreyansh Jain; +Cc: David Marchand, dev

On Thu, Nov 17, 2016 at 2:08 PM, Shreyansh Jain <shreyansh.jain@nxp.com> wrote:
> On Thursday 17 November 2016 05:25 PM, Jan Blunck wrote:
>>
>> On Thu, Nov 17, 2016 at 6:29 AM, Shreyansh Jain <shreyansh.jain@nxp.com>
>> wrote:
>>>
>>> DPDK has been inherently a PCI inclined framework. Because of this, the
>>> design of device tree (or list) within DPDK is also PCI inclined. A
>>> non-PCI
>>> device doesn't have a way of being expressed without using hooks started
>>> from
>>> EAL to PMD.
>>>
>>> With this cover letter, some patches are presented which try to break
>>> this
>>> strict linkage of EAL with PCI devices. Aim is to generalize the device
>>> hierarchy on the lines of how Linux handles it:
>>>
>>>         device A1
>>>           |
>>>   +==.===='==============.============+ Bus A.
>>>      |                    `--> driver A11     \
>>>   device A2                `-> driver A12      \______
>>>                                                 |CPU |
>>>                                                 /`````
>>>         device A1                              /
>>>           |                                   /
>>>   +==.===='==============.============+ Bus A`
>>>      |                    `--> driver B11
>>>   device A2                `-> driver B12
>>>
>>> Simply put:
>>>  - a bus is connect to CPU (or core)
>>>  - devices are conneted to Bus
>>>  - drivers are running instances which manage one or more devices
>>>  - bus is responsible for identifying devices (and interrupt propogation)
>>>  - driver is responsible for initializing the device
>>>
>>> (*Reusing text from email [1])
>>> In context of DPDK EAL:
>>>  - a generic bus (not a driver, not a device). I don't know how to
>>> categorize
>>>    a bus. It is certainly not a device, and then handler for a bus
>>> (physical)
>>>    can be considered a 'bus driver'. So, just 'rte_bus'.
>>>  - there is a bus for each physical implementation (or virtual). So, a
>>> rte_bus
>>>    Object for PCI, VDEV, ABC, DEF and so on.
>>>  - Buses are registered just like a PMD - RTE_PMD_BUS_REGISTER()
>>>  - Each registered bus is part of a doubly list.
>>>    -- Each device inherits rte_bus
>>>    -- Each driver inherits rte_bus
>>>    -- Device and Drivers lists are part of rte_bus
>>>  - eth_driver is no more required - it was just a holder for PMDs to
>>> register
>>>    themselves. It can be replaced with rte_xxx_driver and corresponding
>>> init/
>>>    uninit moved to rte_driver
>>>  - rte_eth_dev modified to disassociate itself from rte_pci_device and
>>> connect
>>>    to generic rte_device
>>>
>>> Once again, improvising from [1]:
>>>
>>>                                   __ rte_bus_list
>>>                                  /
>>>                      +----------'---+
>>>                      |rte_bus       |
>>>                      | driver_list------> List of rte_bus specific
>>>                      | device_list----    devices
>>>                      | scan         | `-> List of rte_bus associated
>>>                      | match        |     drivers
>>>                      | dump         |
>>>                      | ..some refcnt| (#)
>>>                      +--|------|----+
>>>               _________/        \_________
>>>     +--------/----+                     +-\---------------+
>>>     |rte_device   |                     |rte_driver       |
>>>     | rte_bus     |                     | rte_bus         |
>>>     | rte_driver  |(#)                  | init            |
>>>     |             |                     | uninit          |
>>>     |  devargs    |                     | dev_private_size|
>>>     +---||--------+                     | drv_flags       |(#)
>>>         ||                              | intr_handle(2*) |(#)
>>>         | \                             +----------\\\----+
>>>         |  \_____________                           \\\
>>>         |                \                          |||
>>>  +------|---------+ +----|----------+               |||
>>>  |rte_pci_device  | |rte_xxx_device | (4*)          |||
>>>  | PCI specific   | | xxx device    |               |||
>>>  | info (mem,)    | | specific fns  |              / | \
>>>  +----------------+ +---------------+             /  |  \
>>>                             _____________________/  /    \
>>>                            /                    ___/      \
>>>             +-------------'--+    +------------'---+    +--'------------+
>>>             |rte_pci_driver  |    |rte_vdev_driver |    |rte_xxx_driver |
>>>             | PCI id table,  |    | <probably,     |    | ....          |
>>>             | other driver   |    |  nothing>      |    +---------------+
>>>             | data           |    +----------------+
>>>             +----------------+
>>>
>>>     (1*) Problem is that probe functions have different arguments. So,
>>>          generalizing them might be some rework in the respective device
>>>          layers
>>>     (2*) Interrupt handling for each driver type might be different. I am
>>> not
>>>          sure how to generalize that either. This is grey area for me.
>>>     (3*) Probably exposing a bitmask for device capabilities. Nothing
>>> similar
>>>          exists now to relate it. Don't know if that is useful. Allowing
>>>          applications to question a device about what it supports and
>>> what it
>>>          doesn't - making it more flexible at application layer (but more
>>> code
>>>          as well.)
>>>     (4*) Even vdev would be an instantiated as a device. It is not being
>>> done
>>>          currently.
>>>     (#)  Items which are still pending
>>>
>>> With this cover letter, some patches have been posted. These are _only_
>>> for
>>> discussion purpose. They are not complete and neither compilable.
>>> All the patches, except 0001, have sufficient context about what the
>>> changes
>>> are and rationale for same. Obviously, code is best answer.
>>>
>>> === Patch description: ===
>>>
>>> Patch 0001: Introduce container_of macro. Originally a patch from Jan.
>>>
>>> Patch 0002: introduce changes to rte_device/rte_driver for rte_bus, and
>>>             rte_bus definition itself.
>>>
>>> Patch 0003: Add a new layer for 'bus driver' with linuxapp PCI as an
>>> example
>>>
>>> Patch 0004: Changes with respect to rte_bus APIs and impact on
>>> eal_common_pci
>>
>>
>> From my point of view it is beneficial to keep the PCI support in one
>> place. Moving the match() and scan()
>> to the drivers directory doesn't seem to be technically required but
>> it makes the code harder to read and understand.
>
>
> It is not a technical requirement - just logical placement. These are bus
> specific logic and should exist with the bus driver - where ever that is
> placed.
> Though, I am not entirely sure about 'harder to read'. If a person is
> reading through PCI's bus implementation, I am assuming it would be nice to
> have all the hooks (or default hooks) at same place. Isn't it?
>
> Or, maybe you are suggesting move all librte_eal/*/*pci* out to some common
> location. If so, that is something I haven't yet given serious thought.
>

Yes, either have everything PCI related stuff in drivers or in
librte_eal (or librte_eal_pci). I don't believe it is required to have
two different locations for the rte_pci_bus and the rte_pci_*
functions that are now in librte_eal.


>>
>>
>>> Patch 0005: Change to rte_eal_init (of linuxapp only, for now) for
>>> supporting
>>>             bus->scan. Probe is still being done old way, but in a new
>>> wrapper
>>>
>>> Patch 0006: eth_driver removal and corresponding changes in ixgbe_ethdev,
>>> as
>>>             an example. Includes changes to rte_ethdev to remove most
>>> possible
>>>             PCI references. But, work still remains.
>>
>>
>> Making rte_ethdev independent from PCI device is not directly related
>> to the rest of the patches that add bus support. I think it makes
>> sense to handle this separately because of the impact of refactoring
>> rte_ethdev.
>
>
> Once rte_device is available, the change is independent.
> Only dependency on it was changes required to rte_ethdev.c file because of
> various PCI naming/variables/functions. And that is indeed a very large
> change - including changes to drivers/* which I haven't yet done.
>
> Are you suggesting breaking out of this series? If so, can be done but only
> when formal patches start coming out.
>

Yes, that is fine too.

>
>>
>>
>>>
>>> === Pending Items/Questions: ===
>>>
>>>  - Interrupt and notification handling. How to allow drivers to be
>>> notified
>>>    of presence/plugging of a device.
>>>  - Placement of bus driver/handling code. librte_bus, bus/, drivers/bus?
>>>  -- Also from a pespective of a external library and whether symbols
>>> would be
>>>     available in that.
>>>  -- and secondary processes
>>>  - VDEV bus is missing from current set.
>>>  - Locking of list for supporting hotplugging. Or, at the least safe add/
>>>    remove
>>>  - PMDINFOGEN support or lack of it.
>>>  - Is there ever a case where rte_eth_dev needs to be resolved from
>>>    rte_pci_device? I couldn't find any such use and neither a use-case
>>> for it.
>>>  - There should be a way for Bus APIs to return a generic list handle so
>>> that
>>>    EAL doesn't need to bother about bus->driver_list like dereferencing.
>>> This
>>>    is untidy as well as less portable (in terms of code movement, not
>>> arch).
>>>  - Are more helper hooks required for a bus?
>>>  -- I can think of scan, match, dump, find, plug (device), unplug
>>> (device),
>>>     associate (driver), disassociate (driver). But, most of the work is
>>>     already being done by lower instances (rte_device/driver etc).
>>>
>>> Further:
>>>  - In next few days I will make all necessary changes on the lines
>>> mentioned
>>>    in the patches. This would include changing the drivers/* and
>>> librte_eal/*
>>>  - As an when review comments float in and agreement reached, I will keep
>>>    changing the model
>>>  - There are grey areas like interrupt, notification, locking of bus/list
>>>    which require more discussion. I will try and post a rfc for those as
>>> well
>>>    or if someone can help me on those - great
>>
>>
>> As already hinted on IRC I'm taking a look at the interrupt usage of
>> ethdev.
>
>
> Great. Thank you. Do let me know feedback for any changes that you might
> require along the way.
>
>
>>
>>>  - Change would include PCI bus and VDEV bus handling. A new bus (NXP's
>>> FSLMC)
>>>    would also be layered over this series to verify the model of 'bus
>>>    registration'. This is also part of 17.02 roadmap.
>>>
>>> [1] http://dpdk.org/ml/archives/dev/2016-November/050186.html
>>>
>>> Jan Viktorin (1):
>>>   eal: define container macro
>>>
>>> Shreyansh Jain (5):
>>>   eal: introduce bus-device-driver structure
>>>   bus: add bus driver layer
>>>   eal/common: handle bus abstraction for device/driver objects
>>>   eal: supporting bus model in init process
>>>   eal: removing eth_driver
>>>
>>>  bus/Makefile                               |  36 +++
>>>  bus/pci/Makefile                           |  37 +++
>>>  bus/pci/linuxapp/pci_bus.c                 | 418
>>> +++++++++++++++++++++++++++++
>>>  bus/pci/linuxapp/pci_bus.h                 |  55 ++++
>>>  drivers/net/ixgbe/ixgbe_ethdev.c           |  49 ++--
>>>  lib/librte_eal/common/eal_common_bus.c     | 188 +++++++++++++
>>>  lib/librte_eal/common/eal_common_dev.c     |  31 ++-
>>>  lib/librte_eal/common/eal_common_pci.c     | 226 +++++++++-------
>>>  lib/librte_eal/common/include/rte_bus.h    | 243 +++++++++++++++++
>>>  lib/librte_eal/common/include/rte_common.h |  18 ++
>>>  lib/librte_eal/common/include/rte_dev.h    |  36 +--
>>>  lib/librte_eal/common/include/rte_pci.h    |  11 +-
>>>  lib/librte_eal/linuxapp/eal/Makefile       |   1 +
>>>  lib/librte_eal/linuxapp/eal/eal.c          |  51 +++-
>>>  lib/librte_eal/linuxapp/eal/eal_pci.c      | 298 --------------------
>>>  lib/librte_ether/rte_ethdev.c              |  36 ++-
>>>  lib/librte_ether/rte_ethdev.h              |   6 +-
>>>  17 files changed, 1262 insertions(+), 478 deletions(-)
>>>  create mode 100644 bus/Makefile
>>>  create mode 100644 bus/pci/Makefile
>>>  create mode 100644 bus/pci/linuxapp/pci_bus.c
>>>  create mode 100644 bus/pci/linuxapp/pci_bus.h
>>>  create mode 100644 lib/librte_eal/common/eal_common_bus.c
>>>  create mode 100644 lib/librte_eal/common/include/rte_bus.h
>>>
>>> --
>>> 2.7.4
>>>
>>
>
> -
> Shreyansh

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 6/6] eal: removing eth_driver
  2016-11-17 12:53   ` Jan Blunck
@ 2016-11-18 13:05     ` Shreyansh Jain
  0 siblings, 0 replies; 21+ messages in thread
From: Shreyansh Jain @ 2016-11-18 13:05 UTC (permalink / raw)
  To: Jan Blunck; +Cc: David Marchand, dev

sorry for delay in responding; somehow I didn't notice this email.

On Thursday 17 November 2016 06:23 PM, Jan Blunck wrote:
> On Thu, Nov 17, 2016 at 6:30 AM, Shreyansh Jain <shreyansh.jain@nxp.com> wrote:
>> This patch demonstrates how eth_driver can be replaced with appropriate
>> changes for rte_xxx_driver from the PMD itself. It uses ixgbe_ethernet as
>> an example.
>>
>> A large set of changes exists in the rte_ethdev.c - primarily because too
>> much PCI centric code (names, assumption of rte_pci_device) still exists
>> in it. Most, except symbol naming, has been changed in this patch.
>>
>> This proposes that:
>>  - PMD would declare the rte_xxx_driver. In case of ixgbe, it would be
>>    rte_pci_driver.
>>  - Probe and remove continue to exists in rte_pci_driver. But, the
>>    rte_driver has new hooks for init and uninit. The rationale is that
>>    once a ethernet or cryto device is created, the rte_driver->init would
>>    be responsible for initializing the device.
>>    -- Eth_dev -> rte_driver -> rte_pci_driver
>>                           |    `-> probe/remove
>>                           `--> init/uninit
>
> Hmm, from my perspective this moves struct rte_driver a step closer to
> struct rte_eth_dev instead of decoupling them. It is up to the
> rte_driver->probe if it wants to allocate a struct rte_eth_dev,
> rte_crypto_dev or the famous rte_foo_dev.

That 'closeness' was my intention - to make rte_eth_dev an 
implementation of rte_device type.

rte_eth_dev == rte_cryptodev == rte_anyother_functional_device
- for the above context. All would include rte_device.

As for rte_driver->probe(), it still comes in the rte_driver->init()'s 
role to initialize the 'generic' functional device associated with the 
driver. And, allowing bus specific driver (like PCI) for its individual 
initialization using rte_xxx_driver->probe.

>
> Instead of explicitly modelling rte_eth_dev specifics like init, unit
> or dev_private_size I think we should delegate this to the
> rte_driver->probe instead. Most of what is in rte_eth_dev_pci_probe()
> today is anyway a rte_eth_dev_allocate_priv() anyway. I already have
> some patches in this area in my patch stack.

Can be done - either way rte_pci_driver->probe() ends up calling 
driver->init() (or erstwhile eth_driver->eth_dev_init()).

But, I still think it is better to keep them separate.
A PCI device is type of rte_device, physically.
A ethernet device is type of rte_device, logically.
They both should exist independently. It will help in splitting the 
functionality from physical layout in future - if need be.

>
>
>>  - necessary changes in the rte_eth_dev have also been done so that it
>>    refers to the rte_device and rte_driver rather than rte_xxx_*. This
>>    would imply, ethernet device is 'linked' to a rte_device/rte_driver
>>    which in turn is a rte_xxx_device/rte_xxx_driver type.
>>    - for all operations related to extraction relvant xxx type,
>>      container_of would have to be used.
>>
>> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
>> ---
>>  drivers/net/ixgbe/ixgbe_ethdev.c | 49 +++++++++++++++++++++-------------------
>>  lib/librte_ether/rte_ethdev.c    | 36 +++++++++++++++++------------
>>  lib/librte_ether/rte_ethdev.h    |  6 ++---
>>  3 files changed, 51 insertions(+), 40 deletions(-)
>>
>> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
>> index edc9b22..acead31 100644
>> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
>> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
>> @@ -1419,7 +1419,7 @@ eth_ixgbevf_dev_init(struct rte_eth_dev *eth_dev)
>>                 return 0;
>>         }
>>
>> -       pci_dev = eth_dev->pci_dev;
>> +       pci_dev = container_of(eth_dev->device, struct rte_pci_device, device);
>>
>>         rte_eth_copy_pci_info(eth_dev, pci_dev);
>>
>> @@ -1532,7 +1532,9 @@ static int
>>  eth_ixgbevf_dev_uninit(struct rte_eth_dev *eth_dev)
>>  {
>>         struct ixgbe_hw *hw;
>> -       struct rte_pci_device *pci_dev = eth_dev->pci_dev;
>> +       struct rte_pci_device *pci_dev;
>> +
>> +       pci_dev = container_of(eth_dev->device, struct rte_pci_device, device);
>>
>>         PMD_INIT_FUNC_TRACE();
>>
>> @@ -1562,32 +1564,33 @@ eth_ixgbevf_dev_uninit(struct rte_eth_dev *eth_dev)
>>         return 0;
>>  }
>>
>> -static struct eth_driver rte_ixgbe_pmd = {
>> -       .pci_drv = {
>> -               .id_table = pci_id_ixgbe_map,
>> -               .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
>> -                       RTE_PCI_DRV_DETACHABLE,
>> -               .probe = rte_eth_dev_pci_probe,
>> -               .remove = rte_eth_dev_pci_remove,
>> +static struct rte_pci_driver rte_ixgbe_pci_driver = {
>> +       .id_table = pci_id_ixgbe_map,
>> +       .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
>> +                    RTE_PCI_DRV_DETACHABLE,
>> +       .probe = rte_eth_dev_pci_probe,
>> +       .remove = rte_eth_dev_pci_remove,
>> +       .driver = {
>> +               .driver_init_t= eth_ixgbe_dev_init,
>> +               .driver_uninit_t= eth_ixgbe_dev_uninit,
>> +               .dev_private_size = sizeof(struct ixgbe_adapter),
>>         },
>> -       .eth_dev_init = eth_ixgbe_dev_init,
>> -       .eth_dev_uninit = eth_ixgbe_dev_uninit,
>> -       .dev_private_size = sizeof(struct ixgbe_adapter),
>>  };
>>
>>  /*
>>   * virtual function driver struct
>>   */
>> -static struct eth_driver rte_ixgbevf_pmd = {
>> -       .pci_drv = {
>> -               .id_table = pci_id_ixgbevf_map,
>> -               .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
>> -               .probe = rte_eth_dev_pci_probe,
>> -               .remove = rte_eth_dev_pci_remove,
>> +static struct rte_pci_driver rte_ixgbevf_pci_driver = {
>> +       .id_table = pci_id_ixgbevf_map,
>> +       .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_DETACHABLE,
>> +       .probe = rte_eth_dev_pci_probe,
>> +       .remove = rte_eth_dev_pci_remove,
>> +       .driver = {
>> +               /* rte_driver hooks */
>> +               .init = eth_ixgbevf_dev_init,
>> +               .uninit = eth_ixgbevf_dev_uninit,
>> +               .dev_private_size = sizeof(struct ixgbe_adapter),
>>         },
>> -       .eth_dev_init = eth_ixgbevf_dev_init,
>> -       .eth_dev_uninit = eth_ixgbevf_dev_uninit,
>> -       .dev_private_size = sizeof(struct ixgbe_adapter),
>>  };
>>
>>  static int
>> @@ -7592,7 +7595,7 @@ ixgbevf_dev_interrupt_handler(__rte_unused struct rte_intr_handle *handle,
>>         ixgbevf_dev_interrupt_action(dev);
>>  }
>>
>> -RTE_PMD_REGISTER_PCI(net_ixgbe, rte_ixgbe_pmd.pci_drv);
>> +RTE_PMD_REGISTER_PCI(net_ixgbe, rte_ixgbe_pci_driver);
>>  RTE_PMD_REGISTER_PCI_TABLE(net_ixgbe, pci_id_ixgbe_map);
>> -RTE_PMD_REGISTER_PCI(net_ixgbe_vf, rte_ixgbevf_pmd.pci_drv);
>> +RTE_PMD_REGISTER_PCI(net_ixgbe_vf, rte_ixgbevf_pci_driver);
>>  RTE_PMD_REGISTER_PCI_TABLE(net_ixgbe_vf, pci_id_ixgbevf_map);
>> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
>> index fde8112..3535ff4 100644
>> --- a/lib/librte_ether/rte_ethdev.c
>> +++ b/lib/librte_ether/rte_ethdev.c
>> @@ -235,13 +235,13 @@ int
>>  rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
>>                       struct rte_pci_device *pci_dev)
>>  {
>> -       struct eth_driver    *eth_drv;
>> +       struct rte_driver *drv;
>>         struct rte_eth_dev *eth_dev;
>>         char ethdev_name[RTE_ETH_NAME_MAX_LEN];
>>
>>         int diag;
>>
>> -       eth_drv = (struct eth_driver *)pci_drv;
>> +       drv = pci_drv->driver;
>>
>>         rte_eal_pci_device_name(&pci_dev->addr, ethdev_name,
>>                         sizeof(ethdev_name));
>> @@ -252,13 +252,13 @@ rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
>>
>>         if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
>>                 eth_dev->data->dev_private = rte_zmalloc("ethdev private structure",
>> -                                 eth_drv->dev_private_size,
>> +                                 drv->dev_private_size,
>>                                   RTE_CACHE_LINE_SIZE);
>>                 if (eth_dev->data->dev_private == NULL)
>>                         rte_panic("Cannot allocate memzone for private port data\n");
>>         }
>> -       eth_dev->pci_dev = pci_dev;
>> -       eth_dev->driver = eth_drv;
>> +       eth_dev->device = pci_dev->device;
>> +       eth_dev->driver = drv;
>>         eth_dev->data->rx_mbuf_alloc_failed = 0;
>>
>>         /* init user callbacks */
>> @@ -270,7 +270,7 @@ rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
>>         eth_dev->data->mtu = ETHER_MTU;
>>
>>         /* Invoke PMD device initialization function */
>> -       diag = (*eth_drv->eth_dev_init)(eth_dev);
>> +       diag = (*drv->init)(eth_dev);
>>         if (diag == 0)
>>                 return 0;
>>
>> @@ -287,7 +287,7 @@ rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv,
>>  int
>>  rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev)
>>  {
>> -       const struct eth_driver *eth_drv;
>> +       const struct rte_driver *drv;
>>         struct rte_eth_dev *eth_dev;
>>         char ethdev_name[RTE_ETH_NAME_MAX_LEN];
>>         int ret;
>> @@ -302,11 +302,11 @@ rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev)
>>         if (eth_dev == NULL)
>>                 return -ENODEV;
>>
>> -       eth_drv = (const struct eth_driver *)pci_dev->driver;
>> +       drv = pci_dev->driver;
>>
>>         /* Invoke PMD device uninit function */
>> -       if (*eth_drv->eth_dev_uninit) {
>> -               ret = (*eth_drv->eth_dev_uninit)(eth_dev);
>> +       if (*drv->uninit) {
>> +               ret = (*drv->uninit)(eth_dev);
>>                 if (ret)
>>                         return ret;
>>         }
>> @@ -317,7 +317,7 @@ rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev)
>>         if (rte_eal_process_type() == RTE_PROC_PRIMARY)
>>                 rte_free(eth_dev->data->dev_private);
>>
>> -       eth_dev->pci_dev = NULL;
>> +       eth_dev->device = NULL;
>>         eth_dev->driver = NULL;
>>         eth_dev->data = NULL;
>>
>> @@ -1556,7 +1556,7 @@ rte_eth_dev_info_get(uint8_t port_id, struct rte_eth_dev_info *dev_info)
>>
>>         RTE_FUNC_PTR_OR_RET(*dev->dev_ops->dev_infos_get);
>>         (*dev->dev_ops->dev_infos_get)(dev, dev_info);
>> -       dev_info->pci_dev = dev->pci_dev;
>> +       dev_info->device = dev->device;
>>         dev_info->driver_name = dev->data->drv_name;
>>         dev_info->nb_rx_queues = dev->data->nb_rx_queues;
>>         dev_info->nb_tx_queues = dev->data->nb_tx_queues;
>> @@ -2537,6 +2537,7 @@ rte_eth_dev_rx_intr_ctl(uint8_t port_id, int epfd, int op, void *data)
>>  {
>>         uint32_t vec;
>>         struct rte_eth_dev *dev;
>> +       struct rte_pci_device *pci_dev;
>>         struct rte_intr_handle *intr_handle;
>>         uint16_t qid;
>>         int rc;
>> @@ -2544,6 +2545,10 @@ rte_eth_dev_rx_intr_ctl(uint8_t port_id, int epfd, int op, void *data)
>>         RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
>>
>>         dev = &rte_eth_devices[port_id];
>> +       /* TODO intr_handle is currently in rte_pci_device;
>> +        * Below is incorrect until that time
>> +        */
>> +       pci_dev = container_of(dev->device, struct rte_pci_device, device);
>>         intr_handle = &dev->pci_dev->intr_handle;
>>         if (!intr_handle->intr_vec) {
>>                 RTE_PMD_DEBUG_TRACE("RX Intr vector unset\n");
>> @@ -2572,7 +2577,7 @@ rte_eth_dma_zone_reserve(const struct rte_eth_dev *dev, const char *ring_name,
>>         const struct rte_memzone *mz;
>>
>>         snprintf(z_name, sizeof(z_name), "%s_%s_%d_%d",
>> -                dev->driver->pci_drv.driver.name, ring_name,
>> +                dev->driver->name, ring_name,
>>                  dev->data->port_id, queue_id);
>>
>>         mz = rte_memzone_lookup(z_name);
>> @@ -2593,6 +2598,7 @@ rte_eth_dev_rx_intr_ctl_q(uint8_t port_id, uint16_t queue_id,
>>  {
>>         uint32_t vec;
>>         struct rte_eth_dev *dev;
>> +       struct rte_pci_device *pci_dev;
>>         struct rte_intr_handle *intr_handle;
>>         int rc;
>>
>> @@ -2604,7 +2610,9 @@ rte_eth_dev_rx_intr_ctl_q(uint8_t port_id, uint16_t queue_id,
>>                 return -EINVAL;
>>         }
>>
>> -       intr_handle = &dev->pci_dev->intr_handle;
>> +       /* TODO; Until intr_handle is available in rte_device, below is incorrect */
>> +       pci_dev = container_of(dev->device, struct rte_pci_device, device);
>> +       intr_handle = &pci_dev->intr_handle;
>>         if (!intr_handle->intr_vec) {
>>                 RTE_PMD_DEBUG_TRACE("RX Intr vector unset\n");
>>                 return -EPERM;
>> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
>> index 38641e8..2b1d826 100644
>> --- a/lib/librte_ether/rte_ethdev.h
>> +++ b/lib/librte_ether/rte_ethdev.h
>> @@ -876,7 +876,7 @@ struct rte_eth_conf {
>>   * Ethernet device information
>>   */
>>  struct rte_eth_dev_info {
>> -       struct rte_pci_device *pci_dev; /**< Device PCI information. */
>> +       struct rte_device *device; /**< Device PCI information. */
>
> We already the situation that virtual devices don't set the pci_dev
> field. I wonder if it really makes sense to replace it with a struct
> rte_device because that is not adding a lot of value (only numa_node).

Sorry, I couldn't understand which way you are pointing:
  - continuing with 'rte_pci_device' in rte_eth_dev_info.
  - completely removing both, rte_pci_device and rte_device

In either case, I am ok. I went through the code usage of 
rte_eth_dev_info and it is mostly being used for getting information. I 
couldn't point out a situation where based on available info 
(rte_eth_dev_info), we need to extract back the device it is 
representing. Is that understanding correct?

If yes, I can remove this (after checking that this member is not being 
used).

> If we break ABI we might want to add numa_node and dev_flags as
> suggested by Stephen Hemminger already. If we choose to not break ABI
> we can delegate the population of the pci_dev field to dev_infos_get.
> I already have that patch in my patch stack too.

We can't avoid the ABI breakage - it is anyway going to happen.
As for 'dev_flags', I am assuming you are referring to moving 
'drv_flags' from rte_pci_driver. And you are suggesting moving that to 
'rte_driver' - is that correct understanding?

I don't know if drv_flags have any significance in rte_device. I thought 
they are driver specific flags (mmap, etc). Or, maybe they are just 
placed in driver for acting on all compatible devices.

>
> The problem with rte_eth_dev_info is that it doesn't support
> extensions. Maybe its time to add rte_eth_dev_info_ex() ... or
> rte_eth_dev_xinfo() if you don't like the IB verbs API.

I have no idea about IB verbs. And as for extensions, I will have to see 
- I don't prefer mixing that with current set. Though, the idea is nice.

>
>
>>         const char *driver_name; /**< Device Driver name. */
>>         unsigned int if_index; /**< Index to bound host interface, or 0 if none.
>>                 Use if_indextoname() to translate into an interface name. */
>> @@ -1623,9 +1623,9 @@ struct rte_eth_dev {
>>         eth_rx_burst_t rx_pkt_burst; /**< Pointer to PMD receive function. */
>>         eth_tx_burst_t tx_pkt_burst; /**< Pointer to PMD transmit function. */
>>         struct rte_eth_dev_data *data;  /**< Pointer to device data */
>> -       const struct eth_driver *driver;/**< Driver for this device */
>> +       const struct rte_driver *driver;/**< Driver for this device */
>>         const struct eth_dev_ops *dev_ops; /**< Functions exported by PMD */
>> -       struct rte_pci_device *pci_dev; /**< PCI info. supplied by probing */
>> +       struct rte_device *device; /**< Device instance */
>>         /** User application callbacks for NIC interrupts */
>>         struct rte_eth_dev_cb_list link_intr_cbs;
>>         /**
>> --
>> 2.7.4
>>
>

-
Shreyansh

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 0/6] Restructure EAL device model for bus support
  2016-11-17  5:29 [RFC PATCH 0/6] Restructure EAL device model for bus support Shreyansh Jain
                   ` (6 preceding siblings ...)
  2016-11-17 11:55 ` [RFC PATCH 0/6] Restructure EAL device model for bus support Jan Blunck
@ 2016-11-20 15:30 ` David Marchand
  2016-11-21  9:08   ` Thomas Monjalon
  2016-11-23  9:45   ` Shreyansh Jain
  7 siblings, 2 replies; 21+ messages in thread
From: David Marchand @ 2016-11-20 15:30 UTC (permalink / raw)
  To: Shreyansh Jain; +Cc: dev

On Thu, Nov 17, 2016 at 6:29 AM, Shreyansh Jain <shreyansh.jain@nxp.com> wrote:
> DPDK has been inherently a PCI inclined framework. Because of this, the
> design of device tree (or list) within DPDK is also PCI inclined. A non-PCI
> device doesn't have a way of being expressed without using hooks started from
> EAL to PMD.
>
> With this cover letter, some patches are presented which try to break this
> strict linkage of EAL with PCI devices. Aim is to generalize the device
> hierarchy on the lines of how Linux handles it:
>
>         device A1
>           |
>   +==.===='==============.============+ Bus A.
>      |                    `--> driver A11     \
>   device A2                `-> driver A12      \______
>                                                 |CPU |
>                                                 /`````
>         device A1                              /
>           |                                   /
>   +==.===='==============.============+ Bus A`
>      |                    `--> driver B11
>   device A2                `-> driver B12
>
> Simply put:
>  - a bus is connect to CPU (or core)
>  - devices are conneted to Bus
>  - drivers are running instances which manage one or more devices
>  - bus is responsible for identifying devices (and interrupt propogation)
>  - driver is responsible for initializing the device
>
> (*Reusing text from email [1])
> In context of DPDK EAL:
>  - a generic bus (not a driver, not a device). I don't know how to categorize
>    a bus. It is certainly not a device, and then handler for a bus (physical)
>    can be considered a 'bus driver'. So, just 'rte_bus'.
>  - there is a bus for each physical implementation (or virtual). So, a rte_bus
>    Object for PCI, VDEV, ABC, DEF and so on.
>  - Buses are registered just like a PMD - RTE_PMD_BUS_REGISTER()
>  - Each registered bus is part of a doubly list.
>    -- Each device inherits rte_bus
>    -- Each driver inherits rte_bus
>    -- Device and Drivers lists are part of rte_bus
>  - eth_driver is no more required - it was just a holder for PMDs to register
>    themselves. It can be replaced with rte_xxx_driver and corresponding init/
>    uninit moved to rte_driver
>  - rte_eth_dev modified to disassociate itself from rte_pci_device and connect
>    to generic rte_device
>
> Once again, improvising from [1]:
>
>                                   __ rte_bus_list
>                                  /
>                      +----------'---+
>                      |rte_bus       |
>                      | driver_list------> List of rte_bus specific
>                      | device_list----    devices
>                      | scan         | `-> List of rte_bus associated
>                      | match        |     drivers
>                      | dump         |
>                      | ..some refcnt| (#)
>                      +--|------|----+
>               _________/        \_________
>     +--------/----+                     +-\---------------+
>     |rte_device   |                     |rte_driver       |
>     | rte_bus     |                     | rte_bus         |
>     | rte_driver  |(#)                  | init            |
>     |             |                     | uninit          |
>     |  devargs    |                     | dev_private_size|
>     +---||--------+                     | drv_flags       |(#)
>         ||                              | intr_handle(2*) |(#)
>         | \                             +----------\\\----+
>         |  \_____________                           \\\
>         |                \                          |||
>  +------|---------+ +----|----------+               |||
>  |rte_pci_device  | |rte_xxx_device | (4*)          |||
>  | PCI specific   | | xxx device    |               |||
>  | info (mem,)    | | specific fns  |              / | \
>  +----------------+ +---------------+             /  |  \
>                             _____________________/  /    \
>                            /                    ___/      \
>             +-------------'--+    +------------'---+    +--'------------+
>             |rte_pci_driver  |    |rte_vdev_driver |    |rte_xxx_driver |
>             | PCI id table,  |    | <probably,     |    | ....          |
>             | other driver   |    |  nothing>      |    +---------------+
>             | data           |    +----------------+
>             +----------------+
>
>     (1*) Problem is that probe functions have different arguments. So,
>          generalizing them might be some rework in the respective device
>          layers
>     (2*) Interrupt handling for each driver type might be different. I am not
>          sure how to generalize that either. This is grey area for me.
>     (3*) Probably exposing a bitmask for device capabilities. Nothing similar
>          exists now to relate it. Don't know if that is useful. Allowing
>          applications to question a device about what it supports and what it
>          doesn't - making it more flexible at application layer (but more code
>          as well.)
>     (4*) Even vdev would be an instantiated as a device. It is not being done
>          currently.
>     (#)  Items which are still pending
>
> With this cover letter, some patches have been posted. These are _only_ for
> discussion purpose. They are not complete and neither compilable.
> All the patches, except 0001, have sufficient context about what the changes
> are and rationale for same. Obviously, code is best answer.

As said by Jan B., I too think that splitting the patches into smaller
patchsets is important.

Looking at your datamodel, some quick comments :
- Why init/uninit in rte_driver ? Why not probe/remove ?
- I don't think dev_private_size makes sense in rte_driver. The bus is
responsible for creating rte_device objects (embedded or not in
rte_$bus_device objects). If a driver needs to allocate some priv (for
whatever needs), the driver should do the allocation itself (or maybe
a helper for current eth_driver), then reference the original
rte_$bus_device object it got from the probe method.


For a first patchset, I would see:
- introduce the rte_bus object. In rte_eal_init, for each bus, we call
the scan method. Then, for each bus, we find the appropriate
rte_driver using the bus match method then call the probe method. If
the probe succeeds, the rte_device points to the associated
rte_driver,
- migrate the pci scan code to a pci bus (scan looks at sysfs for
linux / ioctl for bsd + devargs for blacklist / whitelist ?), match is
the same at what is done in rte_eal_pci_probe_one_driver() at the
moment,
- migrate the vdev init code to a vdev bus (scan looks at devargs):
this is new, we must create rte_device objects for vdev drivers to use
later

Then we can talk about the next steps once the bus is in place.


-- 
David Marchand

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 0/6] Restructure EAL device model for bus support
  2016-11-20 15:30 ` David Marchand
@ 2016-11-21  9:08   ` Thomas Monjalon
  2016-11-21 10:47     ` Jan Blunck
  2016-11-23  9:45   ` Shreyansh Jain
  1 sibling, 1 reply; 21+ messages in thread
From: Thomas Monjalon @ 2016-11-21  9:08 UTC (permalink / raw)
  To: dev; +Cc: David Marchand, Shreyansh Jain, Jan Blunck

2016-11-20 16:30, David Marchand:
> For a first patchset, I would see:
> - introduce the rte_bus object. In rte_eal_init, for each bus, we call
> the scan method. Then, for each bus, we find the appropriate
> rte_driver using the bus match method then call the probe method. If
> the probe succeeds, the rte_device points to the associated
> rte_driver,
> - migrate the pci scan code to a pci bus (scan looks at sysfs for
> linux / ioctl for bsd + devargs for blacklist / whitelist ?), match is
> the same at what is done in rte_eal_pci_probe_one_driver() at the
> moment,
> - migrate the vdev init code to a vdev bus (scan looks at devargs):
> this is new, we must create rte_device objects for vdev drivers to use
> later

I think it can be 3 patchsets.
Who can work on the vdev part please?

> Then we can talk about the next steps once the bus is in place.

Yes

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 0/6] Restructure EAL device model for bus support
  2016-11-21  9:08   ` Thomas Monjalon
@ 2016-11-21 10:47     ` Jan Blunck
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Blunck @ 2016-11-21 10:47 UTC (permalink / raw)
  To: Thomas Monjalon; +Cc: dev, David Marchand, Shreyansh Jain

On Mon, Nov 21, 2016 at 10:08 AM, Thomas Monjalon
<thomas.monjalon@6wind.com> wrote:
> 2016-11-20 16:30, David Marchand:
>> For a first patchset, I would see:
>> - introduce the rte_bus object. In rte_eal_init, for each bus, we call
>> the scan method. Then, for each bus, we find the appropriate
>> rte_driver using the bus match method then call the probe method. If
>> the probe succeeds, the rte_device points to the associated
>> rte_driver,
>> - migrate the pci scan code to a pci bus (scan looks at sysfs for
>> linux / ioctl for bsd + devargs for blacklist / whitelist ?), match is
>> the same at what is done in rte_eal_pci_probe_one_driver() at the
>> moment,
>> - migrate the vdev init code to a vdev bus (scan looks at devargs):
>> this is new, we must create rte_device objects for vdev drivers to use
>> later
>
> I think it can be 3 patchsets.
> Who can work on the vdev part please?

I'll take a look.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 0/6] Restructure EAL device model for bus support
  2016-11-20 15:30 ` David Marchand
  2016-11-21  9:08   ` Thomas Monjalon
@ 2016-11-23  9:45   ` Shreyansh Jain
  1 sibling, 0 replies; 21+ messages in thread
From: Shreyansh Jain @ 2016-11-23  9:45 UTC (permalink / raw)
  To: David Marchand; +Cc: dev

I should have replied to this earlier, apologies.

On Sunday 20 November 2016 09:00 PM, David Marchand wrote:
> On Thu, Nov 17, 2016 at 6:29 AM, Shreyansh Jain <shreyansh.jain@nxp.com> wrote:
>> DPDK has been inherently a PCI inclined framework. Because of this, the
>> design of device tree (or list) within DPDK is also PCI inclined. A non-PCI
>> device doesn't have a way of being expressed without using hooks started from
>> EAL to PMD.
>>
>> With this cover letter, some patches are presented which try to break this
>> strict linkage of EAL with PCI devices. Aim is to generalize the device
>> hierarchy on the lines of how Linux handles it:
>>
>>         device A1
>>           |
>>   +==.===='==============.============+ Bus A.
>>      |                    `--> driver A11     \
>>   device A2                `-> driver A12      \______
>>                                                 |CPU |
>>                                                 /`````
>>         device A1                              /
>>           |                                   /
>>   +==.===='==============.============+ Bus A`
>>      |                    `--> driver B11
>>   device A2                `-> driver B12
>>
>> Simply put:
>>  - a bus is connect to CPU (or core)
>>  - devices are conneted to Bus
>>  - drivers are running instances which manage one or more devices
>>  - bus is responsible for identifying devices (and interrupt propogation)
>>  - driver is responsible for initializing the device
>>
>> (*Reusing text from email [1])
>> In context of DPDK EAL:
>>  - a generic bus (not a driver, not a device). I don't know how to categorize
>>    a bus. It is certainly not a device, and then handler for a bus (physical)
>>    can be considered a 'bus driver'. So, just 'rte_bus'.
>>  - there is a bus for each physical implementation (or virtual). So, a rte_bus
>>    Object for PCI, VDEV, ABC, DEF and so on.
>>  - Buses are registered just like a PMD - RTE_PMD_BUS_REGISTER()
>>  - Each registered bus is part of a doubly list.
>>    -- Each device inherits rte_bus
>>    -- Each driver inherits rte_bus
>>    -- Device and Drivers lists are part of rte_bus
>>  - eth_driver is no more required - it was just a holder for PMDs to register
>>    themselves. It can be replaced with rte_xxx_driver and corresponding init/
>>    uninit moved to rte_driver
>>  - rte_eth_dev modified to disassociate itself from rte_pci_device and connect
>>    to generic rte_device
>>
>> Once again, improvising from [1]:
>>
>>                                   __ rte_bus_list
>>                                  /
>>                      +----------'---+
>>                      |rte_bus       |
>>                      | driver_list------> List of rte_bus specific
>>                      | device_list----    devices
>>                      | scan         | `-> List of rte_bus associated
>>                      | match        |     drivers
>>                      | dump         |
>>                      | ..some refcnt| (#)
>>                      +--|------|----+
>>               _________/        \_________
>>     +--------/----+                     +-\---------------+
>>     |rte_device   |                     |rte_driver       |
>>     | rte_bus     |                     | rte_bus         |
>>     | rte_driver  |(#)                  | init            |
>>     |             |                     | uninit          |
>>     |  devargs    |                     | dev_private_size|
>>     +---||--------+                     | drv_flags       |(#)
>>         ||                              | intr_handle(2*) |(#)
>>         | \                             +----------\\\----+
>>         |  \_____________                           \\\
>>         |                \                          |||
>>  +------|---------+ +----|----------+               |||
>>  |rte_pci_device  | |rte_xxx_device | (4*)          |||
>>  | PCI specific   | | xxx device    |               |||
>>  | info (mem,)    | | specific fns  |              / | \
>>  +----------------+ +---------------+             /  |  \
>>                             _____________________/  /    \
>>                            /                    ___/      \
>>             +-------------'--+    +------------'---+    +--'------------+
>>             |rte_pci_driver  |    |rte_vdev_driver |    |rte_xxx_driver |
>>             | PCI id table,  |    | <probably,     |    | ....          |
>>             | other driver   |    |  nothing>      |    +---------------+
>>             | data           |    +----------------+
>>             +----------------+
>>
>>     (1*) Problem is that probe functions have different arguments. So,
>>          generalizing them might be some rework in the respective device
>>          layers
>>     (2*) Interrupt handling for each driver type might be different. I am not
>>          sure how to generalize that either. This is grey area for me.
>>     (3*) Probably exposing a bitmask for device capabilities. Nothing similar
>>          exists now to relate it. Don't know if that is useful. Allowing
>>          applications to question a device about what it supports and what it
>>          doesn't - making it more flexible at application layer (but more code
>>          as well.)
>>     (4*) Even vdev would be an instantiated as a device. It is not being done
>>          currently.
>>     (#)  Items which are still pending
>>
>> With this cover letter, some patches have been posted. These are _only_ for
>> discussion purpose. They are not complete and neither compilable.
>> All the patches, except 0001, have sufficient context about what the changes
>> are and rationale for same. Obviously, code is best answer.
>
> As said by Jan B., I too think that splitting the patches into smaller
> patchsets is important.
>
> Looking at your datamodel, some quick comments :
> - Why init/uninit in rte_driver ? Why not probe/remove ?

No specific reason. At first I wasn't planning to replace driver->init 
with driver->probe (which rte_pci_driver) is doing. But, I will do this 
in v2.

> - I don't think dev_private_size makes sense in rte_driver. The bus is
> responsible for creating rte_device objects (embedded or not in
> rte_$bus_device objects). If a driver needs to allocate some priv (for
> whatever needs), the driver should do the allocation itself (or maybe
> a helper for current eth_driver), then reference the original
> rte_$bus_device object it got from the probe method.

first off, this came as a suggestion on the ML somewhere as far as I 
remember.
Secondly, your point makes sense. I will move it back.

>
>
> For a first patchset, I would see:
> - introduce the rte_bus object. In rte_eal_init, for each bus, we call
> the scan method. Then, for each bus, we find the appropriate
> rte_driver using the bus match method then call the probe method. If
> the probe succeeds, the rte_device points to the associated
> rte_driver,
> - migrate the pci scan code to a pci bus (scan looks at sysfs for
> linux / ioctl for bsd + devargs for blacklist / whitelist ?), match is
> the same at what is done in rte_eal_pci_probe_one_driver() at the
> moment,
> - migrate the vdev init code to a vdev bus (scan looks at devargs):
> this is new, we must create rte_device objects for vdev drivers to use
> later

Thanks for outlay - I agree with that.

>
> Then we can talk about the next steps once the bus is in place.

I will post the new set probably tomorrow or day-after.

>
>

-
Shreyansh

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2016-11-23  9:42 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-17  5:29 [RFC PATCH 0/6] Restructure EAL device model for bus support Shreyansh Jain
2016-11-17  5:30 ` [RFC PATCH 1/6] eal: define container macro Shreyansh Jain
2016-11-17 12:06   ` Jan Blunck
2016-11-17 13:01     ` Shreyansh Jain
2016-11-17  5:30 ` [RFC PATCH 2/6] eal: introduce bus-device-driver structure Shreyansh Jain
2016-11-17 11:19   ` Jan Blunck
2016-11-17 13:00     ` Shreyansh Jain
2016-11-17 16:13       ` Jan Blunck
2016-11-17  5:30 ` [RFC PATCH 3/6] bus: add bus driver layer Shreyansh Jain
2016-11-17  5:30 ` [RFC PATCH 4/6] eal/common: handle bus abstraction for device/driver objects Shreyansh Jain
2016-11-17  5:30 ` [RFC PATCH 5/6] eal: supporting bus model in init process Shreyansh Jain
2016-11-17  5:30 ` [RFC PATCH 6/6] eal: removing eth_driver Shreyansh Jain
2016-11-17 12:53   ` Jan Blunck
2016-11-18 13:05     ` Shreyansh Jain
2016-11-17 11:55 ` [RFC PATCH 0/6] Restructure EAL device model for bus support Jan Blunck
2016-11-17 13:08   ` Shreyansh Jain
2016-11-17 16:54     ` Jan Blunck
2016-11-20 15:30 ` David Marchand
2016-11-21  9:08   ` Thomas Monjalon
2016-11-21 10:47     ` Jan Blunck
2016-11-23  9:45   ` Shreyansh Jain

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.