From: "Carrillo, Erik G"
To: "Wiles, Keith"
Cc: rsanford@akamai.com, dev@dpdk.org
Subject: Re: [PATCH 0/3] *** timer library enhancements ***
Date: Wed, 23 Aug 2017 16:19:20 +0000
References: <1503499644-29432-1-git-send-email-erik.g.carrillo@intel.com> <3F9B5E47-8083-443E-96EE-CBC41695BE43@intel.com>
In-Reply-To: <3F9B5E47-8083-443E-96EE-CBC41695BE43@intel.com>
List-Id: DPDK patches and discussions

> -----Original Message-----
> From: Wiles, Keith
> Sent: Wednesday, August 23, 2017 10:02 AM
> To: Carrillo, Erik G
> Cc: rsanford@akamai.com; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 0/3] *** timer library enhancements ***
>
>
> > On Aug 23, 2017, at 9:47 AM, Gabriel Carrillo wrote:
> >
> > In the current implementation of the DPDK timer library, timers can be
> > created and set to be handled by a target lcore by adding them to a
> > skiplist that corresponds to that lcore.  However, if an application
> > enables multiple lcores, and each of these lcores repeatedly attempts
> > to install timers on the same target lcore, overall application
> > throughput will be reduced as all lcores contend to acquire the lock
> > guarding the single skiplist of pending timers.
> >
> > This patchset addresses this scenario by adding an array of skiplists
> > to each lcore's priv_timer struct, such that when lcore i installs a
> > timer on lcore k, the timer will be added to the ith skiplist for
> > lcore k.  If lcore j installs a timer on lcore k simultaneously,
> > lcores i and j can both proceed since they will be acquiring different
> > locks for different lists.
> >
> > When lcore k processes its pending timers, it will traverse each
> > skiplist in its array and acquire a skiplist's lock while a run list
> > is broken out; meanwhile, all other lists can continue to be modified.
> > Then, all run lists for lcore k are collected and traversed together
> > so timers are executed in their global order.
>
> What is the performance and/or latency added to the timeout now?
>
> I worry about the case when just about all of the cores are enabled, which
> could be as high as 128 or more now.

There is a case in the timer_perf_autotest that runs rte_timer_manage with
zero timers, which can give a sense of the added latency.  When run with one
lcore, it completes in around 25 cycles.  When run with 43 lcores (the
highest I have access to at the moment), rte_timer_manage completes in
around 155 cycles.  So it looks like each added lcore adds around 3 cycles
of overhead for checking empty lists in my testing.
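
For reference, the measurement I am describing is roughly of this shape (a
simplified sketch, not the actual timer_perf_autotest code; it assumes
rte_eal_init() has already been called on the measuring lcore, and the
function name measure_empty_manage is made up for the example):

    #include <stdio.h>
    #include <inttypes.h>
    #include <rte_cycles.h>
    #include <rte_timer.h>

    /* Sketch only: measure the average cost of rte_timer_manage() when no
     * timers are pending, i.e. the cost of walking empty pending lists. */
    static void
    measure_empty_manage(unsigned int iterations)
    {
            uint64_t start, end;
            unsigned int i;

            rte_timer_subsystem_init();

            start = rte_rdtsc();
            for (i = 0; i < iterations; i++)
                    rte_timer_manage();
            end = rte_rdtsc();

            printf("avg cycles per rte_timer_manage() call: %" PRIu64 "\n",
                   (end - start) / iterations);
    }

With more lcores enabled, each call walks more (empty) per-installer lists,
which is where the roughly 3 cycles per added lcore shows up.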
> One option is to have the lcore j that wants to install a timer on lcore k
> pass a message via a ring to lcore k to add that timer.  We could even add
> that logic into setting a timer on a different lcore than the caller in the
> current API.  The ring would be multi-producer and single-consumer; we
> still have the lock.  What am I missing here?
>

I did try this approach: initially I had a multi-producer single-consumer
ring that would hold requests to add or delete a timer from lcore k's
skiplist, but it didn't give an appreciable increase in my test
application's throughput (a rough sketch of what I tried is at the end of
this message).  In profiling this solution, the hotspot had moved from
acquiring the skiplist's spinlock to the rte_atomic32_cmpset that the
multi-producer ring code uses to manipulate the head pointer.

Then I tried multiple single-producer single-consumer rings per target
lcore.  This removed the ring hotspot, but the performance didn't increase
as much as with the proposed solution.  These solutions also add overhead
to rte_timer_manage, since it would have to process the rings and then
process the skiplists.

One other thing to note is that a solution that uses such messages changes
the usage model for the timer.  One interesting example is:

 - lcore i enqueues a message to install a timer on lcore k
 - lcore k runs rte_timer_manage, processes its messages, and adds the
   timer to its list
 - lcore i then enqueues a message to stop the same timer, now owned by
   lcore k
 - lcore k does not run rte_timer_manage again
 - lcore i wants to free the timer, but it might not be safe

Even though lcore i has successfully enqueued the request to stop the timer
(and delete it from lcore k's pending list), the timer hasn't actually been
deleted from the list yet, so freeing it could corrupt the list.  This case
exists in the current timer stress tests.

Another interesting scenario is:

 - lcore i resets a timer to install it on lcore k
 - lcore j resets the same timer to install it on lcore k
 - then lcore k runs rte_timer_manage

Lcore j's message supersedes lcore i's message, and it would be wasted work
for lcore k to process it, so we should mark it to be skipped over.
Handling all of the edge cases was more complex than the solution proposed.

> >
> > Gabriel Carrillo (3):
> >   timer: add per-installer pending lists for each lcore
> >   timer: handle timers installed from non-EAL threads
> >   doc: update timer lib docs
> >
> >  doc/guides/prog_guide/timer_lib.rst |  19 ++-
> >  lib/librte_timer/rte_timer.c        | 329 +++++++++++++++++++++++-------------
> >  lib/librte_timer/rte_timer.h        |   9 +-
> >  3 files changed, 231 insertions(+), 126 deletions(-)
> >
> > --
> > 2.6.4
> >
>
> Regards,
> Keith
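
P.S.  The message-passing variant I experimented with looked roughly like
the sketch below.  This is a simplified illustration, not code from the
patchset or from my prototype as-is; the names (timer_msg, timer_msg_ring,
request_timer_add, drain_msg_ring) are made up for the example:

    #include <stdio.h>
    #include <rte_ring.h>
    #include <rte_timer.h>
    #include <rte_lcore.h>
    #include <rte_malloc.h>

    enum timer_msg_op { MSG_ADD, MSG_DEL };

    struct timer_msg {
            enum timer_msg_op op;
            struct rte_timer *tim;
    };

    /* one multi-producer/single-consumer request ring per target lcore */
    static struct rte_ring *timer_msg_ring[RTE_MAX_LCORE];

    static int
    setup_msg_ring(unsigned int target_lcore)
    {
            char name[32];

            snprintf(name, sizeof(name), "tim_msg_%u", target_lcore);
            /* multi-producer enqueue (any installing lcore), single-consumer
             * dequeue (only the target lcore drains it) */
            timer_msg_ring[target_lcore] = rte_ring_create(name, 1024,
                            rte_socket_id(), RING_F_SC_DEQ);
            return timer_msg_ring[target_lcore] == NULL ? -1 : 0;
    }

    /* called by an installing lcore instead of locking lcore k's skiplist */
    static int
    request_timer_add(unsigned int target_lcore, struct rte_timer *tim)
    {
            struct timer_msg *msg = rte_malloc(NULL, sizeof(*msg), 0);

            if (msg == NULL)
                    return -1;
            msg->op = MSG_ADD;
            msg->tim = tim;
            return rte_ring_enqueue(timer_msg_ring[target_lcore], msg);
    }

    /* drained by the target lcore at the top of its rte_timer_manage() pass */
    static void
    drain_msg_ring(unsigned int lcore)
    {
            void *obj;

            while (rte_ring_dequeue(timer_msg_ring[lcore], &obj) == 0) {
                    struct timer_msg *msg = obj;
                    /* ... insert or remove msg->tim in the local skiplist ... */
                    rte_free(msg);
            }
    }

Each installing lcore enqueues a request instead of touching lcore k's
skiplist, and lcore k drains its ring before processing its pending timers;
as noted above, the rte_atomic32_cmpset on the ring head then became the
new hotspot in my test application.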