From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Honnappa Nagarahalli" <Honnappa.Nagarahalli@arm.com>,
	"Harry van Haaren" <harry.van.haaren@intel.com>, <dev@dpdk.org>
Cc: "Mattias Rönnblom" <mattias.ronnblom@ericsson.com>,
	nd <nd@arm.com>, nd <nd@arm.com>
Subject: RE: [PATCH 2/2] service: fix potential stats race-condition on MT services
Date: Fri, 8 Jul 2022 20:08:26 +0200
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D871BA@smartserver.smartshare.dk>
In-Reply-To: <DBAPR08MB581412228A468F7156FB87D498829@DBAPR08MB5814.eurprd08.prod.outlook.com>

> From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> Sent: Friday, 8 July 2022 19.40
> 
> <snip>
> > > > > >
> > > > > > This commit fixes a potential racey-add that could occur if
> > > multiple
> > > > > service-
> > > > > > lcores were executing the same MT-safe service at the same
> time,
> > > > > > with service statistics collection enabled.
> > > > > >
> > > > > > Because multiple threads can run and execute the service, the
> > > stats
> > > > > values
> > > > > > can have multiple writer threads, resulting in the
> requirement
> > > > > > of
> > > > > using
> > > > > > atomic addition for correctness.
> > > > > >
> > > > > > Note that when a MT unsafe service is executed, a spinlock is
> > > held,
> > > > > so the
> > > > > > stats increments are protected. This fact is used to avoid
> > > executing
> > > > > atomic
> > > > > > add instructions when not required.
> > > > > >
> > > > > > This patch causes a 1.25x increase in cycle-cost for polling
> a
> > > > > > MT
> > > > > safe service
> > > > > > when statistics are enabled. No change was seen for MT unsafe
> > > > > services, or
> > > > > > when statistics are disabled.
> > > > > >
> > > > > > Reported-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
> > > > > > Suggested-by: Honnappa Nagarahalli
> > > > > > <Honnappa.Nagarahalli@arm.com>
> > > > > > Suggested-by: Morten Brørup <mb@smartsharesystems.com>
> > > > > > Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
> > > > > >
> > > > > > ---
> > > > > > ---
> > > > > >  lib/eal/common/rte_service.c | 10 ++++++++--
> > > > > >  1 file changed, 8 insertions(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/lib/eal/common/rte_service.c
> > > > > b/lib/eal/common/rte_service.c
> > > > > > index ef31b1f63c..f045e74ef3 100644
> > > > > > --- a/lib/eal/common/rte_service.c
> > > > > > +++ b/lib/eal/common/rte_service.c
> > > > > > @@ -363,9 +363,15 @@ service_runner_do_callback(struct
> > > > > > rte_service_spec_impl *s,
> > > > > >  		uint64_t start = rte_rdtsc();
> > > > > >  		s->spec.callback(userdata);
> > > > > >  		uint64_t end = rte_rdtsc();
> > > > > > -		s->cycles_spent += end - start;
> > > > > > +		uint64_t cycles = end - start;
> > > > > >  		cs->calls_per_service[service_idx]++;
> > > > > > -		s->calls++;
> > > > > > +		if (service_mt_safe(s)) {
> > > > > > +			__atomic_fetch_add(&s->cycles_spent,
> > cycles,
> > > > > > __ATOMIC_RELAXED);
> > > > > > +			__atomic_fetch_add(&s->calls, 1,
> > > > > > __ATOMIC_RELAXED);
> > > > > > +		} else {
> > > > > > +			s->cycles_spent += cycles;
> > > > > > +			s->calls++;
> > > > > This is still a problem from a reader perspective. It is
> possible
> > > that
> > > > > the writes could be split while a reader is reading the stats.
> > > These
> > > > > need to be atomic adds.
> > > >
> > > > I don't understand what you suggest can go wrong here, Honnappa.
> If
> > > you
> > > > talking about 64 bit counters on 32 bit architectures, then I
> > > understand the
> > > > problem (and have many years of direct experience with it
> myself).
> > > > Otherwise, I hope you can elaborate or direct me to educational
> > > material
> > > > about the issue, considering this a learning opportunity. :-)
> > > I am thinking of the case where the 64b write is split into two 32b
> > > (or
> > > more) write operations either by the compiler or the micro-
> > > architecture. If this were to happen, it causes race conditions
> with
> > > the reader.
> > >
> > > As far as I understand, the compiler does not provide any
> guarantees
> > > on generating non-tearing stores unless an atomic builtin/function
> is
> > > used.
> >
> > This seems like a generic problem for all 64b statistics counters in
> DPDK, and
> > any other C code using 64 bit counters. Being a generic C problem,
> there is
> > probably a generic solution to it.
> Browsing through the code, I see similar problems elsewhere.
> 
> >
> > Is any compiler going to do something that stupid (i.e. tearing a
> store into
> > multiple write operations) to a simple 64b counter on any 64 bit
> architecture
> > (assuming the counter is 64b aligned)? Otherwise, we might only need
> to
> > take special precautions for 32 bit architectures.
> It is always a debate on who is stupid, compiler or programmer 😊

Compilers never cease to surprise me. Thankfully, they are not as unreliable and bug-ridden as they were 25 years ago. :-)
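
To make the torn-store concern concrete, here is a minimal sketch of how the counter updates could be written so that neither the compiler nor a reader sees a split 64-bit access. It assumes the GCC/Clang __atomic builtins already used in the patch; the struct and function names are hypothetical and are not the actual rte_service layout. On some 32 bit targets, 64-bit atomics may be implemented with a lock or require libatomic.

```c
#include <stdint.h>

/* Hypothetical statistics block; NOT the real rte_service_spec_impl. */
struct svc_stats {
	uint64_t calls;
	uint64_t cycles_spent;
};

/*
 * Single writer (e.g. the MT unsafe case, where a spinlock is held):
 * a relaxed atomic load followed by a relaxed atomic store. This is not
 * an atomic read-modify-write, but each access is a single untorn one.
 */
static inline void
stats_add_single_writer(struct svc_stats *s, uint64_t cycles)
{
	__atomic_store_n(&s->cycles_spent,
		__atomic_load_n(&s->cycles_spent, __ATOMIC_RELAXED) + cycles,
		__ATOMIC_RELAXED);
	__atomic_store_n(&s->calls,
		__atomic_load_n(&s->calls, __ATOMIC_RELAXED) + 1,
		__ATOMIC_RELAXED);
}

/* Multiple writers (the MT safe case): a real atomic add is required. */
static inline void
stats_add_multi_writer(struct svc_stats *s, uint64_t cycles)
{
	__atomic_fetch_add(&s->cycles_spent, cycles, __ATOMIC_RELAXED);
	__atomic_fetch_add(&s->calls, 1, __ATOMIC_RELAXED);
}

/* Reader side: a relaxed atomic load, so the value is never read torn. */
static inline uint64_t
stats_read_cycles(struct svc_stats *s)
{
	return __atomic_load_n(&s->cycles_spent, __ATOMIC_RELAXED);
}
```

The single-writer variant avoids the cost of a locked add while still ruling out tearing, which is the distinction the MT safe / MT unsafe branch in the patch is after.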

> 
> Though not the same case, you can look at this discussion where
> compiler generated torn stores [1] when we all thought it has been
> generating a 64b store.
> 
> [1] http://inbox.dpdk.org/dev/d5d563ab-0411-3faf-39ec-
> 4994f2bc9f6f@intel.com/

Good reference.

Technically, that code sets several fields of the rte_lpm_tbl_entry structure (which happens to be 32 bits in total), so it is not completely unreasonable for the compiler to store those fields individually. I wonder whether accessing the rte_lpm_tbl_entry struct as a uint32_t through a union (and ensuring 32-bit alignment) would have solved the problem, so that the __atomic_store() could be avoided?
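
A rough sketch of the union idea, for discussion only. The entry layout below (field names and widths) is an assumption made for illustration and is not the real rte_lpm_tbl_entry; tbl_entry_publish() is likewise a hypothetical helper. Note that the C standard still gives no guarantee that a plain uint32_t store is emitted as one instruction, so this only shows what is being asked, not a proven fix.

```c
#include <stdint.h>

/* Hypothetical 32-bit table entry; NOT the real rte_lpm_tbl_entry. */
struct tbl_entry {
	uint32_t next_hop : 24;
	uint32_t valid    : 1;
	uint32_t ext      : 1;
	uint32_t depth    : 6;
};

/* Overlay the bit-fields with one 32-bit word. */
union tbl_entry_u {
	struct tbl_entry e;
	uint32_t raw;
};

_Static_assert(sizeof(union tbl_entry_u) == sizeof(uint32_t),
	"entry must fit in exactly one 32-bit word");

/*
 * Build the new entry in a local copy, then write it to the shared slot
 * as a single 32-bit value instead of field by field.
 */
static inline void
tbl_entry_publish(union tbl_entry_u *dst, struct tbl_entry new_entry)
{
	union tbl_entry_u tmp = { .e = new_entry };

	/* Plain store: expected, but not guaranteed, to be one instruction. */
	dst->raw = tmp.raw;
}
```

Whether this is enough in practice depends on the compiler; an __atomic_store_n() on the raw member would make the intent explicit.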

> 
> >
> > > If we have to ensure the micro-architecture does not generate split
> > > writes, we need to be careful that future code additions do not
> change
> > > the alignment of the stats.
> >
> > Unless the structure containing the stats counters is packed, the
> contained
> > 64b counters will be 64b aligned (on 64 bit architecture). So we
> should not
> > worry about alignment, except perhaps on 32 bit architectures.
> Agree, future code changes need to be aware of these issues and DPDK
> supports 32b architectures.
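
One way to keep future code additions from silently changing the alignment would be a compile-time check. A minimal sketch, assuming C11; the struct and its fields are hypothetical and only stand in for whatever structure holds the counters. On a 32 bit x86 ABI a uint64_t struct member is only 4-byte aligned by default, which is why the explicit alignment is there.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure holding 64-bit counters next to other fields. */
struct svc_stats_aligned {
	uint32_t flags;                     /* made-up neighbouring field */
	_Alignas(8) uint64_t calls;         /* force 8-byte alignment */
	_Alignas(8) uint64_t cycles_spent;
};

/* Break the build if a later change moves a counter off an 8-byte boundary. */
_Static_assert(offsetof(struct svc_stats_aligned, calls) % 8 == 0,
	"calls must be 8-byte aligned");
_Static_assert(offsetof(struct svc_stats_aligned, cycles_spent) % 8 == 0,
	"cycles_spent must be 8-byte aligned");
```

Alignment alone does not stop the compiler from splitting a store, so this complements atomic accesses rather than replacing them.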
