From: Thomas Monjalon
Subject: Re: [PATCH 1/4] eventdev: introduce event driven programming model
Date: Thu, 24 Nov 2016 16:35:56 +0100
Message-ID: <1883454.103LptOkIX@xps13>
In-Reply-To: <20161124015912.GA13508@svelivela-lt.caveonetworks.com>
References: <1479447902-3700-1-git-send-email-jerin.jacob@caviumnetworks.com> <3691745.y1f1NvKTEv@xps13> <20161124015912.GA13508@svelivela-lt.caveonetworks.com>
To: Jerin Jacob
Cc: dev@dpdk.org, bruce.richardson@intel.com, harry.van.haaren@intel.com, hemant.agrawal@nxp.com, gage.eads@intel.com

2016-11-24 07:29, Jerin Jacob:
> On Wed, Nov 23, 2016 at 07:39:09PM +0100, Thomas Monjalon wrote:
> > 2016-11-18 11:14, Jerin Jacob:
> > > +Eventdev API - EXPERIMENTAL
> > > +M: Jerin Jacob
> > > +F: lib/librte_eventdev/
> >
> > OK to mark it experimental.
> > What is the plan to remove the experimental word?
>
> IMO, the EXPERIMENTAL status can be changed when
> - at least two event drivers are available (Intel and Cavium are working
>   on SW and HW event drivers)
> - functional test applications are fine with at least two drivers
> - a portable example application showcases the features of the library
> - eventdev is integrated with another DPDK subsystem such as ethdev
>
> Thoughts? I am not sure what criteria were used in the cryptodev case.

Sounds good.
We will be more confident when drivers and tests are implemented.

I think the roadmap for the SW driver targets the release 17.05.
Do you still plan 17.02 for this API and the Cavium driver?

> > > +#define EVENTDEV_NAME_SKELETON_PMD event_skeleton
> > > +/**< Skeleton event device PMD name */
> >
> > I do not understand this #define.
>
> Applications can explicitly request a specific driver through its driver
> name. This will go as the argument to
> rte_event_dev_get_dev_id(const char *name).
> The reason for keeping this #define in rte_eventdev.h is that the
> application needs to include only rte_eventdev.h, not rte_eventdev_pmd.h.

So each driver must register its name in the API?
Is it really needed?

> > > +struct rte_event_dev_config {
> > > +	uint32_t dequeue_wait_ns;
> > > +	/**< rte_event_dequeue() waits for *dequeue_wait_ns* ns on this device.
> >
> > Please explain exactly when the wait occurs and why.
>
> Here is the explanation from the rte_event_dequeue() API definition:
> -
> @param wait
> 0 - no-wait, returns immediately if there is no event.
> >0 - wait for the event. If the device is configured with
> RTE_EVENT_DEV_CFG_PER_DEQUEUE_WAIT, then this function will wait until
> an event is available or until *wait* time has elapsed.
> If the device is not configured with RTE_EVENT_DEV_CFG_PER_DEQUEUE_WAIT,
> then this function will wait until an event is available or for the
> *dequeue_wait_ns* ns which was previously supplied to
> rte_event_dev_configure().
> -
> This gives the application control over how long the implementation
> should wait if an event is not available.
>
> Let me know what exact changes are required if the details are not enough
> in the rte_event_dequeue() API definition.

Maybe "timeout" would be a better name.
It waits only if there is nothing in the queue.
It could be worth highlighting in this comment that this parameter
makes the dequeue function a blocking call.
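To be sure we are talking about the same thing, here is a minimal sketch
of the modes as I understand them from this patch (the exact
rte_event_dequeue() signature and the header name are assumptions from
the patch and may still change):

	#include <rte_eventdev.h>   /* assumed header name from this patch */

	static void
	dequeue_modes(uint8_t dev_id, uint8_t port_id)
	{
		struct rte_event ev;

		/* wait = 0: pure polling, returns immediately if no event */
		rte_event_dequeue(dev_id, port_id, &ev, 0);

		/* wait > 0, device NOT configured with
		 * RTE_EVENT_DEV_CFG_PER_DEQUEUE_WAIT: blocks until an event
		 * arrives or the device-wide dequeue_wait_ns expires */
		rte_event_dequeue(dev_id, port_id, &ev, 1);

		/* wait > 0, device configured with
		 * RTE_EVENT_DEV_CFG_PER_DEQUEUE_WAIT: blocks until an event
		 * arrives or *wait* (implementation-specific unit) expires */
		rte_event_dequeue(dev_id, port_id, &ev, 100);
	}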
> > > +/** Event port configuration structure */
> > > +struct rte_event_port_conf {
> > > +	int32_t new_event_threshold;
> > > +	/**< A backpressure threshold for new event enqueues on this port.
> > > +	 * Use for *closed system* event dev where event capacity is limited,
> > > +	 * and cannot exceed the capacity of the event dev.
> > > +	 * Configuring ports with different thresholds can make higher priority
> > > +	 * traffic less likely to be backpressured.
> > > +	 * For example, a port used to inject NIC Rx packets into the event dev
> > > +	 * can have a lower threshold so as not to overwhelm the device,
> > > +	 * while ports used for worker pools can have a higher threshold.
> > > +	 * This value cannot exceed the *nb_events_limit*
> > > +	 * which was previously supplied to rte_event_dev_configure().
> > > +	 */
> > > +	uint8_t dequeue_depth;
> > > +	/**< Configure the number of bulk dequeues for this event port.
> > > +	 * This value cannot exceed the *nb_event_port_dequeue_depth*
> > > +	 * which was previously supplied to rte_event_dev_configure().
> > > +	 */
> > > +	uint8_t enqueue_depth;
> > > +	/**< Configure the number of bulk enqueues for this event port.
> > > +	 * This value cannot exceed the *nb_event_port_enqueue_depth*
> > > +	 * which was previously supplied to rte_event_dev_configure().
> > > +	 */
> > > +};
> >
> > The depth configuration is not clear to me.
>
> Basically, it is the maximum number of events that can be
> enqueued/dequeued at a time from a given event port.
> A depth of one == non-burst mode.

OK so depth is the queue size. Please could you reword?

> > > +/* Event types to classify the event source */
> >
> > Why is this classification needed?
>
> This is for application pipelining, and for cases where the application
> wants to know which subsystem generated the event.
>
> Example packet forwarding loop on the worker cores:
>
> while (1) {
>	ev = dequeue()
>	// event from ethdev subsystem
>	if (ev.event_type == RTE_EVENT_TYPE_ETHDEV) {
>		- swap the mac address
>		- push to atomic queue for ingress flow order
>		  maintenance by CORE
>	/* events from core */
>	} else if (ev.event_type == RTE_EVENT_TYPE_CORE) {
>		...
>	}
>	enqueue(ev);
> }

I don't know why, but I feel this classification is weak.
You need to track the source of the event.
Does it make sense to go beyond and identify the source device?
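Coming back to your example loop, here is roughly how I read it as real C.
A sketch only: swap_mac() is a placeholder helper, and the single-event
dequeue/enqueue signatures and their return conventions are my assumptions
from this patch, not final API:

	#include <rte_eventdev.h>   /* assumed header name from this patch */

	/* placeholder, not in the patch: swaps src/dst MAC in the mbuf */
	static void swap_mac(struct rte_mbuf *m);

	static void
	worker_loop(uint8_t dev_id, uint8_t port_id)
	{
		struct rte_event ev;

		while (1) {
			/* blocking dequeue; return convention assumed */
			if (rte_event_dequeue(dev_id, port_id, &ev, 1) < 0)
				continue;

			if (ev.event_type == RTE_EVENT_TYPE_ETHDEV) {
				/* packet from the NIC: swap MACs, then
				 * re-inject as a core-generated event on an
				 * atomic queue to keep ingress flow order */
				swap_mac(ev.mbuf);
				ev.event_type = RTE_EVENT_TYPE_CORE;
			} else if (ev.event_type == RTE_EVENT_TYPE_CORE) {
				/* event injected by another lcore */
			}
			rte_event_enqueue(dev_id, port_id, &ev);
		}
	}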
> > > +#define RTE_EVENT_TYPE_ETHDEV 0x0
> > > +/**< The event generated from ethdev subsystem */
> > > +#define RTE_EVENT_TYPE_CRYPTODEV 0x1
> > > +/**< The event generated from cryptodev subsystem */
> > > +#define RTE_EVENT_TYPE_TIMERDEV 0x2
> > > +/**< The event generated from timerdev subsystem */
> > > +#define RTE_EVENT_TYPE_CORE 0x3
> > > +/**< The event generated from core.
> >
> > What is core?
>
> The events are generated by an lcore for pipelining. Any suggestion for
> a better name?

lcore?
What about CPU or SW?

> > > +	/**< Opaque event pointer */
> > > +	struct rte_mbuf *mbuf;
> > > +	/**< mbuf pointer if dequeued event is associated with mbuf */
> >
> > How do we know that an event is associated with an mbuf?
>
> By looking at the event source/type RTE_EVENT_TYPE_*.
>
> > Does it mean that such events are always converted into mbuf even if the
> > application does not need it?
>
> Hardware has a dependency on getting the physical address of the event,
> so any struct that has "phys_addr_t buf_physaddr" works.

I do not understand.
I thought that decoding the event would be the responsibility of the app
by calling a function like
rte_eventdev_convert_to_mbuf(struct rte_event *, struct rte_mbuf *).

> > > +struct rte_eventdev_driver;
> > > +struct rte_eventdev_ops;
> >
> > I think it is better to split the API and the driver interface in two
> > files. (we should do this split in ethdev)
>
> I thought so, but then the "static inline" versions of the northbound
> API (like rte_event_enqueue) would go to another file (due to the fact
> that the implementation needs to dereference "dev->data->ports[port_id]").
> Do you want it that way?
> I would like to keep all the northbound APIs in rte_eventdev.h and not
> any of them in rte_eventdev_pmd.h.

My comment was confusing.
You are doing 2 files, one for the API (what you call northbound, I think)
and the other one for the driver interface (what you call southbound,
I think), it's very fine.

> > > +/**
> > > + * Enqueue the event object supplied in the *rte_event* structure on an
> > > + * event device designated by its *dev_id* through the event port specified by
> > > + * *port_id*. The event object specifies the event queue on which this
> > > + * event will be enqueued.
> > > + *
> > > + * @param dev_id
> > > + *   Event device identifier.
> > > + * @param port_id
> > > + *   The identifier of the event port.
> > > + * @param ev
> > > + *   Pointer to struct rte_event
> > > + *
> > > + * @return
> > > + *  - 0 on success
> > > + *  - <0 on failure. Failure can occur if the event port's output queue is
> > > + *    backpressured, for instance.
> > > + */
> > > +static inline int
> > > +rte_event_enqueue(uint8_t dev_id, uint8_t port_id, struct rte_event *ev)
> >
> > Is it really needed to have a non-burst variant of enqueue/dequeue?
>
> Yes. Certain HW can work only with non-burst variants.

Same comment as Bruce: we must keep only the burst variant.
We cannot have different APIs for different HW.

> > > +/**
> > > + * Converts nanoseconds to *wait* value for rte_event_dequeue()
> > > + *
> > > + * If the device is configured with the RTE_EVENT_DEV_CFG_PER_DEQUEUE_WAIT
> > > + * flag, then the application can use this function to convert a wait value
> > > + * in nanoseconds to the implementation-specific wait value supplied to
> > > + * rte_event_dequeue().
> >
> > Why is it implementation-specific?
> > Why is this conversion not internal to the driver?
>
> This is for performance optimization; otherwise drivers would need to
> convert ns to ticks in the "fast path".

So why not define the unit of this timeout as CPU cycles,
like the ones returned by rte_get_timer_cycles()?
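Something like this rough sketch: only rte_get_timer_hz() and
rte_get_timer_cycles() are existing DPDK calls here, the rest
(helper name, rte_event_dequeue() signature) is illustrative.
The ns-to-cycles conversion happens once at setup time, so the
fast path stays free of it:

	#include <rte_cycles.h>
	#include <rte_eventdev.h>   /* assumed header name from this patch */

	/* done once at configuration time, not per dequeue */
	static uint64_t
	wait_ns_to_cycles(uint64_t ns)
	{
		/* rte_get_timer_hz() is the default timer frequency in Hz;
		 * beware of overflow for very large ns values */
		return rte_get_timer_hz() * ns / 1000000000ULL;
	}

	static void
	fast_path(uint8_t dev_id, uint8_t port_id)
	{
		/* computed once before entering the dequeue loop */
		uint64_t wait = wait_ns_to_cycles(1000); /* 1 us in cycles */
		struct rte_event ev;

		/* the precomputed cycle count is passed through unchanged */
		rte_event_dequeue(dev_id, port_id, &ev, wait);
	}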