From: Jerin Jacob
Subject: Re: [PATCH 2/4] eventdev: implement the northbound APIs
Date: Wed, 23 Nov 2016 05:13:32 +0530
Message-ID: <20161122234331.GA20501@svelivela-lt.caveonetworks.com>
In-Reply-To: <9184057F7FC11744A2107296B6B8EB1E01E331A3@FMSMSX108.amr.corp.intel.com>
To: "Eads, Gage"
Cc: "dev@dpdk.org", "Richardson, Bruce", "Van Haaren, Harry", "hemant.agrawal@nxp.com"

On Tue, Nov 22, 2016 at 10:48:32PM +0000, Eads, Gage wrote:
> > -----Original Message-----
> > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > Sent: Tuesday, November 22, 2016 2:00 PM
> > To: Eads, Gage
> > Cc: dev@dpdk.org; Richardson, Bruce; Van Haaren, Harry; hemant.agrawal@nxp.com
> > Subject: Re: [dpdk-dev] [PATCH 2/4] eventdev: implement the northbound APIs
> >
> > On Tue, Nov 22, 2016 at 07:43:03PM +0000, Eads, Gage wrote:
> > > > > > > One open issue I noticed is that the "typical workflow" description
> > > > > > > starting at rte_eventdev.h:204 conflicts with the centralized
> > > > > > > software PMD that Harry posted last week. Specifically, that PMD
> > > > > > > expects a single core to call the schedule function. We could
> > > > > > > extend the documentation to account for this alternative style of
> > > > > > > scheduler invocation, or discuss ways to make the software PMD
> > > > > > > work with the documented workflow. I prefer the former, but either
> > > > > > > way I think we ought to expose the scheduler's expected usage to
> > > > > > > the user -- perhaps through an RTE_EVENT_DEV_CAP flag?
> > > > > >
> > > > > > I prefer the former too; you can propose the documentation change
> > > > > > required for the software PMD.
> > > > >
> > > > > Sure, proposal follows. The "typical workflow" isn't the most optimal,
> > > > > by having a conditional in the fast path, of course, but it
> > > > > demonstrates the idea simply.
> > > > >
> > > > > (line 204)
> > > > >  * An event driven application has the following typical workflow on
> > > > >  * the fastpath:
> > > > >  * \code{.c}
> > > > >  *	while (1) {
> > > > >  *
> > > > >  *		if (dev_info.event_dev_cap &
> > > > >  *			RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED)
> > > > >  *			rte_event_schedule(dev_id);
> > > >
> > > > Yes, I like the idea of RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED.
> > > > It can be input to the application/subsystem to launch separate core(s)
> > > > for the schedule functions.
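
On the application side it could boil down to something like the sketch
below. This is a rough sketch only -- it assumes the proposed
RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED flag, plus the rte_event_dev_info_get()
and rte_event_schedule() calls from this patch set, together with EAL's
existing rte_eal_remote_launch():

#include <stdbool.h>
#include <stdint.h>

#include <rte_eal.h>
#include <rte_lcore.h>
#include <rte_eventdev.h>

static volatile bool sched_done;

/* Dedicated scheduler loop for a centralized SW scheduler. */
static int
scheduler_loop(void *arg)
{
	uint8_t dev_id = *(uint8_t *)arg;

	while (!sched_done)
		rte_event_schedule(dev_id);
	return 0;
}

/* At configuration time, launch the loop only when the device does not
 * schedule in a distributed fashion; in the distributed case the workers
 * call rte_event_schedule() themselves, as in the proposed doc text.
 * sched_lcore_id is an lcore the application reserved for this purpose. */
static void
setup_scheduler(uint8_t dev_id, unsigned int sched_lcore_id)
{
	/* static: must stay valid while the remote lcore runs */
	static uint8_t id;
	struct rte_event_dev_info dev_info;

	id = dev_id;
	rte_event_dev_info_get(dev_id, &dev_info);
	if (!(dev_info.event_dev_cap & RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED))
		rte_eal_remote_launch(scheduler_loop, &id, sched_lcore_id);
}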
> > > > But, I think, the "dev_info.event_dev_cap &
> > > > RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED" check can be moved inside the
> > > > implementation (to make better decisions and avoid consuming cycles
> > > > on HW-based schedulers).
> > >
> > > How would this check work? Wouldn't it prevent any core from running
> > > the software scheduler in the centralized case?
> >
> > I guess you may not need RTE_EVENT_DEV_CAP here; instead you need a flag
> > for device configure:
> >
> > #define RTE_EVENT_DEV_CFG_DISTRIBUTED_SCHED (1ULL << 1)
> >
> > struct rte_event_dev_config config;
> > config.event_dev_cfg = RTE_EVENT_DEV_CFG_DISTRIBUTED_SCHED;
> > rte_event_dev_configure(.., &config);
> >
> > On the driver side, on configure:
> >
> > if (config.event_dev_cfg & RTE_EVENT_DEV_CFG_DISTRIBUTED_SCHED)
> > 	eventdev->schedule = NULL;
> > else /* centralized case */
> > 	eventdev->schedule = your_centralized_schedule_function;
> >
> > Does that work?
>
> Hm, I fear the API would give users the impression that they can select
> the scheduling behavior of a given eventdev, when a software scheduler is
> more likely to be either distributed or centralized -- not both.

Even if it is a capability flag, it is still per "device", right? And a
capability flag is more of a read-only thing. Am I missing something here?

> What if we use the capability flag, and define rte_event_schedule() as the
> scheduling function for centralized schedulers and rte_event_dequeue() as
> the scheduling function for distributed schedulers? That way, the datapath
> could be the simple dequeue -> process -> enqueue. Applications would
> check the capability flag at configuration time to decide whether or not
> to launch an lcore that calls rte_event_schedule().

I am all for the simple "dequeue -> process -> enqueue". rte_event_schedule()
was added for the SW scheduler only, so it may not make sense to add one more
check on top of rte_event_schedule() in the fastpath to see whether it is
really needed or not.
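
That model would keep the worker loop completely flat. A rough sketch of
such a worker -- the rte_event_dequeue()/rte_event_enqueue() prototypes and
return values below are placeholders, since the final signatures are still
under discussion in this series:

#include <stdbool.h>
#include <stdint.h>

#include <rte_common.h>
#include <rte_eventdev.h>

static volatile bool workers_done;

/* Placeholder for the application's event processing. */
static void
process_event(struct rte_event *ev)
{
	RTE_SET_USED(ev);
}

/* Flat worker datapath: no schedule call and no capability check in the
 * loop. A distributed scheduler does its work inside rte_event_dequeue();
 * a centralized one runs in a dedicated rte_event_schedule() lcore. */
static int
worker_loop(void *arg)
{
	uint8_t dev_id = *(uint8_t *)arg;
	struct rte_event ev;

	while (!workers_done) {
		if (rte_event_dequeue(dev_id, &ev) == 0)
			continue;	/* nothing scheduled to this port */
		process_event(&ev);
		rte_event_enqueue(dev_id, &ev);
	}
	return 0;
}

With that split, the distributed-vs-centralized difference is hidden from
the worker entirely; only the setup code differs.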
> > > > >  *		rte_event_dequeue(...);
> > > > >  *
> > > > >  *		(event processing)
> > > > >  *
> > > > >  *		rte_event_enqueue(...);
> > > > >  *	}
> > > > >  * \endcode
> > > > >  *
> > > > >  * The *schedule* operation is intended to do event scheduling, and
> > > > >  * the *dequeue* operation returns the scheduled events. An
> > > > >  * implementation is free to define the semantics between *schedule*
> > > > >  * and *dequeue*. For example, a system based on a hardware scheduler
> > > > >  * can define its rte_event_schedule() to be a NOOP, whereas a
> > > > >  * software scheduler can use the *schedule* operation to schedule
> > > > >  * events. The RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED capability flag
> > > > >  * indicates whether rte_event_schedule() should be called by all
> > > > >  * cores or by a single (typically dedicated) core.
> > > > >
> > > > > (line 308)
> > > > > #define RTE_EVENT_DEV_CAP_DISTRIBUTED_SCHED (1ULL << 2)
> > > > > /**< Event scheduling implementation is distributed and all cores
> > > > >  * must execute rte_event_schedule(). If unset, the implementation
> > > > >  * is centralized and a single core must execute the schedule
> > > > >  * operation.
> > > > >  *
> > > > >  * \see rte_event_schedule()
> > > > >  */
> > > >
> > > > On the same note, if a software PMD based workflow needs separate
> > > > core(s) for the schedule function, can we hide that from the API
> > > > specification and pass an argument to the SW PMD to define the
> > > > scheduling core(s)?
> > > >
> > > > Something like --vdev=eventsw0,schedule_cmask=0x2
> > >
> > > An API for controlling the scheduler coremask instead of (or perhaps in
> > > addition to) the vdev argument would be good, to allow runtime control.
> > > I can imagine apps that scale the number of cores based on load, and in
> > > doing so may want to migrate the scheduler to a different core.
> >
> > Yes, an API for the number of scheduler cores looks OK. But if we are
> > going to have the service core approach, then we just need to specify it
> > in one place, as the application will not be creating the service
> > functions.
> >
> > > > Just a thought,
> > > >
> > > > Perhaps we could introduce a generic "service" cores concept in DPDK
> > > > to hide the requirement where the implementation needs a dedicated
> > > > core to do certain work. I guess it would be useful for other NPU
> > > > integrations in DPDK.
> > >
> > > That's an interesting idea. As you suggested in the other thread, this
> > > concept could be extended to the "producer" code in the example for
> > > configurations where the NIC requires software to feed into the
> > > eventdev. And to the other subsystems mentioned in your original PDF,
> > > crypto and timer.
> >
> > Yes. Producers should come under the service core category. I think that
> > enables us to have better NPU integration (same application code for NPU
> > vs non-NPU).
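
Just to make the "service" core idea concrete, a purely illustrative sketch
follows. None of these names exist in DPDK today, and the real mechanism
(registration API, coremask handling, EAL options) would need its own RFC:

#include <rte_common.h>

/* Illustrative only: a minimal "service" registry. A PMD that needs a
 * dedicated core (a centralized event scheduler, a NIC->eventdev producer,
 * a crypto or timer driver) registers its work function here, instead of
 * requiring the application to know about it. */
typedef int (*rte_service_func_t)(void *arg);

#define RTE_MAX_SERVICES 8

static struct {
	rte_service_func_t func;
	void *arg;
} services[RTE_MAX_SERVICES];
static unsigned int nb_services;

int
rte_service_register(rte_service_func_t func, void *arg)
{
	if (nb_services == RTE_MAX_SERVICES)
		return -1;
	services[nb_services].func = func;
	services[nb_services].arg = arg;
	nb_services++;
	return 0;
}

/* EAL would run this on each core reserved via a hypothetical
 * --service-cores option, so the same application code works whether the
 * work is done in software (registered and run here) or offloaded to an
 * NPU (nothing registered). */
static int
service_core_loop(void *arg __rte_unused)
{
	while (1) {
		unsigned int i;

		for (i = 0; i < nb_services; i++)
			services[i].func(services[i].arg);
	}
	return 0;
}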