From: "Kuusisaari, Juhamatti" <Juhamatti.Kuusisaari@coriant.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	"'dev@dpdk.org'" <dev@dpdk.org>
Subject: Re: [PATCH] lib: move rte_ring read barrier to correct location
Date: Fri, 15 Jul 2016 05:40:56 +0000
Message-ID: <HE1PR04MB133704D0F24EDBDB3AB9D95A9D330@HE1PR04MB1337.eurprd04.prod.outlook.com>
In-Reply-To: <2601191342CEEE43887BDE71AB97725836B7DD85@irsmsx105.ger.corp.intel.com>


Hi Konstantin,

> Hi Juhamatti,
> 
> >
> > Hi Konstantin,
> >
> > > > > > > It is quite safe to move the barrier before DEQUEUE because
> > > > > > > after the DEQUEUE there is nothing really that we would want
> > > > > > > to protect with a read barrier.
> > > > > >
> > > > > > I don't think so.
> > > > > > If you remove the barrier after DEQUEUE(), that means on systems
> > > > > > with relaxed memory ordering cons.tail could be updated before
> > > > > > DEQUEUE() is finished and the producer can overwrite queue
> > > > > > entries that were not yet dequeued.
> > > > > > So if the cpu can really do such speculative out-of-order loads,
> > > > > > then we do need for __rte_ring_sc_do_dequeue() something like:
> > > > > >
> > > > > > rte_smp_rmb();
> > > > > > DEQUEUE_PTRS();
> > > > > > rte_smp_rmb();
> > > >
> > > > You have a valid point here: there needs to be a guarantee that
> > > > cons.tail cannot be updated before DEQUEUE is completed.
> > > > Nevertheless, my point was that this is not guaranteed with a read
> > > > barrier anyway. The implementation has the following sequence:
> > > >
> > > > DEQUEUE_PTRS(); (i.e. READ/LOAD)
> > > > rte_smp_rmb();
> > > > ..
> > > > r->cons.tail = cons_next; (i.e WRITE/STORE)
> > > >
> > > > The above read barrier does not guarantee any ordering for the
> > > > following writes/stores. As a guarantee is needed, I think we in
> > > > fact need to change the read barrier on the dequeue to a full
> > > > barrier, which guarantees the read+write order, as follows:
> > > >
> > > > DEQUEUE_PTRS();
> > > > rte_smp_mb();
> > > > ..
> > > > r->cons.tail = cons_next;
> > > >
> > > > If you agree, I can for sure prepare another patch for this issue.
> > >
> > > Hmm, I think for __rte_ring_mc_do_dequeue() we are ok with
> > > smp_rmb(), as we have to read cons.tail anyway.
> >
> > Are you certain that this read creates a strong enough dependency
> > between the read of cons.tail and the write of it in mc_do_dequeue()?
> 
> Yes, I believe so.
> 
> > I think it does not really create any control dependency there, as the
> > next write is not dependent on the result of the read.
> 
> I think it is dependent: cons.tail can be updated only if its current
> value is equal to the cons_head value precomputed before.
> So the cpu has to read the cons.tail value first.

I was thinking that it might match this processing pattern:

S1.         if (a == b)
S2.             a = a + b
S3.         b = a + b

S3 above has no control dependency on S1 and can in theory be executed
in parallel with it.

In the (simplified) implementation we have:

X1.         while (a != b)
X2.             { }
X3.         a = c

If we consider S3 and X3 equivalent, there might be a coherence problem
without a write barrier after the while(). It may of course be purely
theoretical, depending on the HW. This is anyway the reason for my
suggestion to have a write barrier also in mc_do_dequeue(), just before
the tail update; a minimal sketch follows below.
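
For clarity, here is a minimal sketch of the barrier placement I am
proposing (simplified from rte_ring.h: DEQUEUE_PTRS(), rte_smp_rmb(),
rte_smp_wmb() and the cons_head/cons_next names are from the quoted
patch, the comments are mine, and the availability check against
prod.tail is left out):

    /* simplified single-consumer dequeue, barrier placement only */
    cons_head = r->cons.head;
    cons_next = cons_head + n;
    r->cons.head = cons_next;

    rte_smp_rmb();            /* as in the patch: order the earlier
                               * availability check (read of prod.tail,
                               * not shown) before the entry loads below */

    /* copy in table */
    DEQUEUE_PTRS();           /* loads from the ring, stores to obj_table */

    rte_smp_wmb();            /* proposed: keep the tail store below from
                               * becoming visible before DEQUEUE_PTRS() is
                               * done, so the producer cannot overwrite
                               * entries that are still being read */

    r->cons.tail = cons_next; /* publish the freed slots to the producer */

The mc variant would get the same rte_smp_wmb() just before its tail
update, after the wait on r->cons.tail shown as X1/X2 above.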

--
 Juhamatti 

> > The CPU also already knows the value that will be written to cons.tail,
> > and that value does not depend on the previous read either. The CPU
> > does not know we are planning to spin-wait there, so it might do things
> > out-of-order without proper dependencies.
> >
> > > For __rte_ring_sc_do_dequeue(), I think you are right, we might need
> > > something stronger.
> > > I don't want to put rte_smp_mb() here as it would cause a full HW
> > > barrier even on machines with strong memory ordering (IA).
> > > I think that rte_smp_wmb() might be enough here:
> > > it would force the cpu to wait till the writes in DEQUEUE_PTRS()
> > > become visible, which means the reads have to be completed too.
> >
> > In practice I think that rte_smp_wmb() would work fine, even though it
> > is not strictly according to the book. The solution below would be my
> > proposal as a fix to the sc dequeue issue (and also to the mc dequeue,
> > if we really have the problem of the CPU completely ignoring the
> > spin-wait there):
> >
> > DEQUEUE_PTRS();
> > ..
> > rte_smp_wmb();
> > r->cons.tail = cons_next;
> 
> As I said in the previous email, it looks good to me for
> __rte_ring_sc_do_dequeue(), but I am interested to hear what the ARM and
> PPC maintainers think about it.
> Jan, Jerin, do you have any comments on it?
> Chao, sorry, but I am still not sure why PPC is considered an
> architecture with strong memory ordering?
> Maybe I am missing something obvious here.
> Thanks,
> Konstantin
> 
> >
> > --
> >  Juhamatti
> >
> > > Another option would be to define a new macro, rte_weak_mb() or so,
> > > that would expand to a compiler barrier on boxes with a strong memory
> > > model, and to a full memory barrier on machines with relaxed ones.
> > > Interested to hear what the ARM and PPC guys think.
> > > Konstantin
> > >
> > > P.S. Another thing, a bit off-topic, for the PPC guys:
> > > As far as I can see, smp_rmb/smp_wmb are just compiler barriers:
> > >
> > > find lib/librte_eal/common/include/arch/ppc_64/ -type f | xargs grep smp_
> > > lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h:#define rte_smp_mb() rte_mb()
> > > lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h:#define rte_smp_wmb() rte_compiler_barrier()
> > > lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h:#define rte_smp_rmb() rte_compiler_barrier()
> > >
> > > My knowledge of the PPC architecture is rudimentary, but is that really enough?
> > >
> > > >
> > > > Thanks,
> > > > --
> > > >  Juhamatti
> > > >
> > > > > > Konstantin
> > > > > >
> > > > > > > The read barrier is mapped to a compiler barrier on strong
> > > > > > > memory model systems, and this works fine too, as the order of
> > > > > > > the head/tail updates is still guaranteed in the new location.
> > > > > > > Even if the problem is only theoretical on most systems, it is
> > > > > > > worth fixing, as the risk involved in the fix is very low.
> > > > > > >
> > > > > > > --
> > > > > > >  Juhamatti
> > > > > > >
> > > > > > > > Konstantin
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Signed-off-by: Juhamatti Kuusisaari
> > > > > > > > > > > <juhamatti.kuusisaari@coriant.com>
> > > > > > > > > > > ---
> > > > > > > > > > >  lib/librte_ring/rte_ring.h | 6 ++++--
> > > > > > > > > > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > > > > > > > > >
> > > > > > > > > > > diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> > > > > > > > > > > index eb45e41..a923e49 100644
> > > > > > > > > > > --- a/lib/librte_ring/rte_ring.h
> > > > > > > > > > > +++ b/lib/librte_ring/rte_ring.h
> > > > > > > > > > > @@ -662,9 +662,10 @@ __rte_ring_mc_do_dequeue(struct rte_ring *r, void **obj_table,
> > > > > > > > > > >                                               cons_next);
> > > > > > > > > > >         } while (unlikely(success == 0));
> > > > > > > > > > >
> > > > > > > > > > > +       rte_smp_rmb();
> > > > > > > > > > > +
> > > > > > > > > > >         /* copy in table */
> > > > > > > > > > >         DEQUEUE_PTRS();
> > > > > > > > > > > -       rte_smp_rmb();
> > > > > > > > > > >
> > > > > > > > > > >         /*
> > > > > > > > > > >          * If there are other dequeues in progress that preceded us,
> > > > > > > > > > > @@ -746,9 +747,10 @@ __rte_ring_sc_do_dequeue(struct rte_ring *r, void **obj_table,
> > > > > > > > > > >         cons_next = cons_head + n;
> > > > > > > > > > >         r->cons.head = cons_next;
> > > > > > > > > > >
> > > > > > > > > > > +       rte_smp_rmb();
> > > > > > > > > > > +
> > > > > > > > > > >         /* copy in table */
> > > > > > > > > > >         DEQUEUE_PTRS();
> > > > > > > > > > > -       rte_smp_rmb();
> > > > > > > > > > >
> > > > > > > > > > >         __RING_STAT_ADD(r, deq_success, n);
> > > > > > > > > > >         r->cons.tail = cons_next;
> > > > > > > > > > > --
> > > > > > > > > > > 2.9.0
> > > > > > > > > > >


Thread overview: 17+ messages
2016-07-11 10:20 [PATCH] lib: move rte_ring read barrier to correct location Juhamatti Kuusisaari
2016-07-11 10:41 ` Ananyev, Konstantin
2016-07-11 11:22   ` Kuusisaari, Juhamatti
2016-07-11 11:40     ` Olivier Matz
2016-07-12  4:10       ` Kuusisaari, Juhamatti
2016-07-11 12:34     ` Ananyev, Konstantin
2016-07-12  5:27       ` Kuusisaari, Juhamatti
2016-07-12 11:01         ` Ananyev, Konstantin
2016-07-12 17:58           ` Ananyev, Konstantin
2016-07-13  5:27             ` Kuusisaari, Juhamatti
2016-07-13 13:00               ` Ananyev, Konstantin
2016-07-14  4:17                 ` Kuusisaari, Juhamatti
2016-07-14 12:56                   ` Ananyev, Konstantin
2016-07-15  5:40                     ` Kuusisaari, Juhamatti [this message]
2016-07-15  6:29                     ` Jerin Jacob
2016-07-15 10:34                       ` Ananyev, Konstantin
2016-07-18  2:47                         ` Jerin Jacob
