All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ananyev, Konstantin" <konstantin.ananyev-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
To: Neil Horman <nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org>,
	"Wodkowski,
	PawelX"
	<pawelx.wodkowski-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: "dev-VfR2kkLFssw@public.gmane.org" <dev-VfR2kkLFssw@public.gmane.org>
Subject: Re: [PATCH v2] Change alarm cancel function to thread-safe:
Date: Fri, 26 Sep 2014 14:13:55 +0000	[thread overview]
Message-ID: <2601191342CEEE43887BDE71AB9772582137D7F1@IRSMSX104.ger.corp.intel.com> (raw)
In-Reply-To: <20140926134014.GB3930-B26myB8xz7F8NnZeBjwnZQMhkBWG/bsMQH7oEaQurus@public.gmane.org>

> From: dev [mailto:dev-bounces-VfR2kkLFssw@public.gmane.org] On Behalf Of Neil Horman
> Sent: Friday, September 26, 2014 2:40 PM
> To: Wodkowski, PawelX
> Cc: dev-VfR2kkLFssw@public.gmane.org
> Subject: Re: [dpdk-dev] [PATCH v2] Change alarm cancel function to thread-safe:
> 
> On Fri, Sep 26, 2014 at 12:37:54PM +0000, Wodkowski, PawelX wrote:
> > > So basically cancel() just set ALARM_CANCELLED and leaves actual alarm
> > > deletion to the callback()?
> > > That was the thought, yes.
> > >
> > > > I think it is doable - but I don't see any real advantage with that approach.
> > > > Yes, code will become a bit simpler, as  we'll have one point when we remove
> > > alarm from the list.
> > > Yes, that would be the advantage, that the code would be much simpler.
> > >
> > > > But from other side, imagine such simple test-case:
> > > >
> > > > for (i = 0; i < 0x100000; i++) {
> > > >    rte_eal_alarm_set(ONE_MIN, cb_func, (void *)i);
> > > >    rte_eal_alarm_cancel(cb_func, (void *)i);
> > > > }
> > > >
> > > > We'll endup with 1M of cancelled, but still not removed entries in the
> > > alarm_list.
> > > > With current implementation that means - few MBs of wasted memory,
> > > Thats correct, and the tradeoff to choose between.  Do you want simpler code
> > > that is easier to maintain, or do you want a high speed cancel and set
> > > operation.  I'm not aware of all the use cases, but I have a hard time seeing
> > > a use case in which the in-flight alarm list grows unboundedly large, which in
> > > my mind mitigates the risk of deferred removal, but I'm perfectly willing to
> > > believe that there are use cases which I'm not aware of.

After executing example above - from user perspective there is no active alarms in the system at all.
Though in fact alarm_list contains 1M entries. 

> > >
> > > > plus very slow set() and cancel(), as they'll  have to traverse all entries in the
> > > list.
> > > > And all that - for empty from user perspective alarm_list
> > > > So I still prefer Michal's way.
> > > > After all, it doesn't look that complicated to me.
> > > Except that the need for Michals fix arose from the fact that we have two free
> > > locations that might both get called depending on the situation.  Thats what I'm
> > > trying to address here, the complexity itself, rather than the fix (which I
> > > agree is perfectly valid).

Well, I believe his fix addresses all the open issues, no?

Another thing, as Pawel pointed in another mail, your fix doesn't address properly the situation when
we have a racing alarm_cancel(cb_func) and alarm cb_func rearming itself. 
While the original patch does.

> > >
> > > > BTW, any particular reason you are so negative about pthread_self()?
> > > >
> > > Nothing specifically against it (save for its inverted return code sense, which
> > > made it difficult for me to parse when reviewing).  Its more the complexity
> > > itself in the alarm cancel and callback routine that I was looking at.  Given
> > > that the origional bug happened because an cancel in a callback might produce a
> > > double free, I wanted to fix it by simpifying the code, not adding conditions
> > > which make it more complex.
> > >
> > > You know, looking at it, something else just occured to me.  I think this could
> > > all be fixed without the cancel flag or the pthread_self assignments.  What if
> > > we simply removed the alarm from the list before we called the callback in
> > > rte_eal_alarm_callback()?  That way any cancel operation called from within the
> > > callback would fail, as it wouldn't appear on the list, and the callback
> > > operation would be the only freeing entity?  That would let you still have a
> > > fast set and cancel, and avoid the race.  Thoughts?  Untested sample patch
> > > below


Hmm, and how it would address any of the problems I mentioned above:

1. The alarm_list still would grow by the repetition of: {alarm_set(x); alarm_cansel(x);} 
2. Race between alarm_cancel(cb_func) and alarm cb_func rearming itself.

?

> > >
> > >
> > > > >
> > > > > It also seems like the alarm api as a whole could use some improvement.
> > > The
> > > > > way its written right now, theres no way to refer to a specific alarm (i.e.
> > > > > cancelation relies on the specification of a function and data pointer, which
> > > > > may refer to multiple timers).  Shouldn't rte_eal_alarm_set return an opaque
> > > > > handle to a unique timer instance that can be store by a caller and used to
> > > > > specfically cancel that timer?  Thats how both the bsd and linux timer
> > > > > subsystems model timers.
> > > >
> > > > Yeh,  alarm API looks a bit unusual.
> > > > Though, I suppose that's subject for another patch/discussion :)
> > > >
> > > Yes, agreed :)
> > >
> >
> > Please read quoted message bellow:
> >
> > > >
> > > >
> > > > His solution *does* eliminate race condition too.
> > > >
> > > I applied his patch. And here is the problem
> > > 1	rte_spinlock_lock(&alarm_list_lk);
> > > 2	while ((ap = LIST_FIRST(&alarm_list)) !=NULL &&
> > > 3			gettimeofday(&now, NULL) == 0 &&
> > > 4			(ap->time.tv_sec < now.tv_sec || (ap->time.tv_sec ==
> > > now.tv_sec &&
> > > 5						ap->time.tv_usec <=
> > > now.tv_usec))){
> > > 6		ap->executing |= ALARM_EXECUTING;
> > > 7		if (likely(!(ap->executing & ALARM_CANCELLED))) {
> > > 8				rte_spinlock_unlock(&alarm_list_lk);
> > > 9                           //another thread: rte_alarm_cancel called, mark this timer
> > > canceled and exit ( THE RACE)
> > > 10				ap->cb_fn(ap->cb_arg); // rte_alarm_set called
> > > (THE RACE)
> > > 11
> > > 12				rte_spinlock_lock(&alarm_list_lk);
> > > 13		}
> > > 14
> > > 15		rte_spinlock_lock(&alarm_list_lk);
> > > 16		LIST_REMOVE(ap, next);
> > > 17		rte_free(ap);
> > > 18	}
> > >
> > > Imagine
> > >
> > > Thread 1:                                         Thread2
> > > Execute eal_alarm_callback
> > > Lock list at 1                                   rte_alarm_cancel -> block on spinlock
> > > ....
> > > Realease lock at line 8                  rte_alarm_cancel -> resumes execution -> it
> > > mark this timer canceled
> > > ap->cb_fn is called at line 10       rte_alarm_cancel exits reporting all alarms are
> > > canceled and not executing (which is not true!)
> > >     rte_alarm_set is called
> > >     to rearm itself (THE RACE)
> > >
> > > He only mark timer to * do not execute* but does not assure that it is not
> > > executing while canceling.
> > > Race condition is made by unlocking at line 8 and our solution workarounds this
> > > by looping until all alarms finish execution then cancel what left after callback
> > > finish (*including rearmed alarms*).
> > > Real elimination of the race would be by using recursive locking when *other
> > > thread can't* access the list *while callback* is running but callback *itself can
> > > by using recursive locking*.
> > >
> > > Maybe I don't see something obvious? :)
> 
> I think you're missing the fact that your patch doesn't do what you assert above
> either :)
> 
> First, lets address rte_alarm_set.  There is no notion of "re-arming" in this
> alarm implementation, because theres no ability to refer to a specific alarm
> from the callers perspective.  When you call rte_eal_alarm_set you get a new
> alarm every time.  So I don't really see a race there.  It might not be exactly
> the behavior you want, but its not a race, becuase you're not modifying an alarm
> in the middle of execution, you're just creating a new alarm, which is safe.
> 
> There is a race in what you describe above, insofar as its possible that you
> might call rte_eal_alarm_cancel and return without having canceled all the
> matching alarms.  I don't see any clear documentation on what the behavior is
> supposed to be, but if you want to ensure that all matching alarms are cancelled
> or complete on return from rte_eal_alarm_cancel, thats perfectly fine (in linux
> API parlance, thats usually denoted as a cancel_sync operation).
> 
> For that race condition, you're correct, my patch doesn't address it, I see that
> now.  Though your patch doesn't either.  If you call rte_eal_alarm_cancel from
> within a callback function, then, by definition, you can't wait on the
> completion of the active alarm, because thats a deadlock.  Its a necessecary
> evil, I grant you, but it means that you can't be guaranteed the cancelled and
> complete (cancel_sync) behavior that you want, at least not with the current
> api.  If you want that behavior, you need to do one of two things:
> 
> 1) Modify the api to allow callers to individually reference timer instances, so
> that when cancelling, we can return an appropriate return code to indicate to
> the caller that this alarm is in-progress.  That way you can guarantee the
> caller that the specific alarm that you cancelled is either complete and cancelled
> or currently executing.  Add an api to expicitly wait on a referenced alarm as
> well.  This allows developers to know that, when executing an alarm callback, an
> -ECURRENTLYEXECUTING return code is ok, because they are in the currently
> executing context.
> 
> 
> 2) Understand that the guarantee can't be met, and change nothing, because you
> consider it ok for an in-progress alarm to be ignored when getting cancelled.
> 
> 
> Neil

  parent reply	other threads:[~2014-09-26 14:13 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-25 12:56 [PATCH v2] Change alarm cancel function to thread-safe: Michal Jastrzebski
     [not found] ` <1411649768-8084-1-git-send-email-michalx.k.jastrzebski-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2014-09-25 13:11   ` Ananyev, Konstantin
2014-09-25 15:08   ` Neil Horman
     [not found]     ` <20140925150807.GD32725-B26myB8xz7F8NnZeBjwnZQMhkBWG/bsMQH7oEaQurus@public.gmane.org>
2014-09-25 16:03       ` Ananyev, Konstantin
     [not found]         ` <2601191342CEEE43887BDE71AB977258213769DE-kPTMFJFq+rEu0RiL9chJVbfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2014-09-25 17:23           ` Neil Horman
     [not found]             ` <20140925172358.GG32725-B26myB8xz7F8NnZeBjwnZQMhkBWG/bsMQH7oEaQurus@public.gmane.org>
2014-09-25 23:24               ` Ananyev, Konstantin
     [not found]                 ` <2601191342CEEE43887BDE71AB97725821378B50-kPTMFJFq+rGvNW/NfzhIbrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2014-09-26 11:46                   ` Neil Horman
     [not found]                     ` <20140926114630.GA3930-B26myB8xz7F8NnZeBjwnZQMhkBWG/bsMQH7oEaQurus@public.gmane.org>
2014-09-26 12:37                       ` Wodkowski, PawelX
     [not found]                         ` <F6F2A6264E145F47A18AB6DF8E87425D12B39EB5-kPTMFJFq+rFP9JyJpTNKArfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2014-09-26 13:40                           ` Neil Horman
     [not found]                             ` <20140926134014.GB3930-B26myB8xz7F8NnZeBjwnZQMhkBWG/bsMQH7oEaQurus@public.gmane.org>
2014-09-26 14:01                               ` Wodkowski, PawelX
     [not found]                                 ` <F6F2A6264E145F47A18AB6DF8E87425D12B39F15-kPTMFJFq+rFP9JyJpTNKArfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2014-09-26 15:01                                   ` Neil Horman
     [not found]                                     ` <20140926150156.GB5619-B26myB8xz7F8NnZeBjwnZQMhkBWG/bsMQH7oEaQurus@public.gmane.org>
2014-09-26 15:41                                       ` Ananyev, Konstantin
     [not found]                                         ` <2601191342CEEE43887BDE71AB9772582137D88E-kPTMFJFq+rGvNW/NfzhIbrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2014-09-26 16:21                                           ` Neil Horman
     [not found]                                             ` <20140926162134.GE5619-B26myB8xz7F8NnZeBjwnZQMhkBWG/bsMQH7oEaQurus@public.gmane.org>
2014-09-26 18:07                                               ` Ananyev, Konstantin
     [not found]                                                 ` <2601191342CEEE43887BDE71AB9772582137D95F-kPTMFJFq+rGvNW/NfzhIbrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2014-09-26 19:39                                                   ` Neil Horman
     [not found]                                                     ` <20140926193905.GH5619-B26myB8xz7F8NnZeBjwnZQMhkBWG/bsMQH7oEaQurus@public.gmane.org>
2014-09-28 16:12                                                       ` Ananyev, Konstantin
     [not found]                                                         ` <2601191342CEEE43887BDE71AB9772582138410B-kPTMFJFq+rGvNW/NfzhIbrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2014-09-28 20:47                                                           ` Neil Horman
     [not found]                                                             ` <20140928204754.GC4012-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org>
2014-09-29  6:40                                                               ` Wodkowski, PawelX
     [not found]                                                                 ` <F6F2A6264E145F47A18AB6DF8E87425D12B3A553-kPTMFJFq+rFP9JyJpTNKArfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2014-09-29  9:50                                                                   ` Ananyev, Konstantin
     [not found]                                                                     ` <2601191342CEEE43887BDE71AB977258213874C5-kPTMFJFq+rGvNW/NfzhIbrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2014-09-29 10:11                                                                       ` Wodkowski, PawelX
     [not found]                                                                         ` <F6F2A6264E145F47A18AB6DF8E87425D12B3A80F-kPTMFJFq+rFP9JyJpTNKArfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2014-09-29 10:33                                                                           ` Bruce Richardson
2014-09-30 11:13                                                                             ` Wodkowski, PawelX
     [not found]                                                                               ` <F6F2A6264E145F47A18AB6DF8E87425D12B3B0AC-kPTMFJFq+rFP9JyJpTNKArfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2014-09-30 12:05                                                                                 ` Wodkowski, PawelX
     [not found]                                                                                   ` <F6F2A6264E145F47A18AB6DF8E87425D12B3B1A8-kPTMFJFq+rFP9JyJpTNKArfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2014-09-30 12:30                                                                                     ` Ananyev, Konstantin
     [not found]                                                                                       ` <2601191342CEEE43887BDE71AB9772582138DE32-kPTMFJFq+rEu0RiL9chJVbfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2014-09-30 12:54                                                                                         ` Neil Horman
2014-09-29 11:35                                                                           ` Neil Horman
2014-09-26 14:13                               ` Ananyev, Konstantin [this message]
     [not found]                                 ` <2601191342CEEE43887BDE71AB9772582137D7F1-kPTMFJFq+rGvNW/NfzhIbrfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2014-09-29 10:37                                   ` Bruce Richardson
2014-09-26  6:33               ` Wodkowski, PawelX
     [not found]                 ` <F6F2A6264E145F47A18AB6DF8E87425D12B39AFE-kPTMFJFq+rFP9JyJpTNKArfspsVTdybXVpNB7YpNyf8@public.gmane.org>
2014-09-26  9:49                   ` Wodkowski, PawelX
2014-09-26 13:43                   ` Neil Horman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2601191342CEEE43887BDE71AB9772582137D7F1@IRSMSX104.ger.corp.intel.com \
    --to=konstantin.ananyev-ral2jqcrhueavxtiumwx3w@public.gmane.org \
    --cc=dev-VfR2kkLFssw@public.gmane.org \
    --cc=nhorman-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org \
    --cc=pawelx.wodkowski-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.