From: "Nikolova, Tatyana E" <tatyana.e.nikolova@intel.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: "dledford@redhat.com" <dledford@redhat.com>,
	"leon@kernel.org" <leon@kernel.org>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>
Subject: RE: [PATCH rdma-core] irdma: Restore full memory barrier for doorbell optimization
Date: Thu, 2 Sep 2021 16:27:37 +0000
Message-ID: <DM6PR11MB4692BD4BB222DAA58C264000CBCE9@DM6PR11MB4692.namprd11.prod.outlook.com>
In-Reply-To: <20210819224408.GE1721383@nvidia.com>



> -----Original Message-----
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Thursday, August 19, 2021 5:44 PM
> To: Nikolova, Tatyana E <tatyana.e.nikolova@intel.com>
> Cc: dledford@redhat.com; leon@kernel.org; linux-rdma@vger.kernel.org
> Subject: Re: [PATCH rdma-core] irdma: Restore full memory barrier for
> doorbell optimization
> 
> On Thu, Aug 19, 2021 at 10:01:50PM +0000, Nikolova, Tatyana E wrote:
> >
> >
> > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > Sent: Wednesday, August 18, 2021 11:50 AM
> > > To: Nikolova, Tatyana E <tatyana.e.nikolova@intel.com>
> > > Cc: dledford@redhat.com; leon@kernel.org; linux-rdma@vger.kernel.org
> > > Subject: Re: [PATCH rdma-core] irdma: Restore full memory barrier
> > > for doorbell optimization
> > >
> > > On Fri, Aug 13, 2021 at 05:25:49PM -0500, Tatyana Nikolova wrote:
> > > > >> 1.	Software writing the valid bit in the WQE.
> > > > >> 2.	Software reading shadow memory (hw_tail) value.
> > > >
> > > > > You are missing an ordered atomic on this read it looks like
> > > >
> > > > Hi Jason,
> > > >
> > > > Why do you think we need atomic ops in this case? We aren't trying
> > > > to protect from multiple threads but CPU re-ordering of a write
> > > > and a read.
> > >
> > > Which is what the atomics will do.
> > >
> > > Barriers are only appropriate when you can't add atomic markers to
> > > the actual data that needs ordering.
> >
> > Hi Jason,
> >
> > We aren't sure what you mean by atomic markers. We ran a few
> > experiments with atomics, but none of the barriers we tried
> > smp_mb__{before,after}_atomic(), smp_load_acquire() and
> > smp_store_release() translates to a full memory barrier on X86.
> 
> Huh? Those are kernel primitives, this is a userspace patch.
> 
> Userspace follows the C11 atomics memory model.
> 
> So I'd expect
> 
>   atomic_store_explicit(tail, memory_order_release)
>   atomic_load_explicit(tail, memory_order_acquire)
> 
> To be the atomics you need. This will ensure that the read/writes to valid
> before the atomics are sequenced correctly, eg no CPU thread can observe
> an updated tail without also observing the set valid.
> 

Hi Jason,

We tried these atomic ops as shown below, but they don't fix the issue:

  atomic_store_explicit(hdr, memory_order_release)
  atomic_load_explicit(tail, memory_order_acquire)

In assembly they look like this:

//set_64bit_val(wqe, 24, hdr);
atomic_store_explicit((_Atomic(uint64_t) *)(wqe + (24 >> 3)), hdr, memory_order_release);
                     2130:       49 89 5f 18             mov    %rbx,0x18(%r15)
/root/CVL-3.0-V26.4C00390/rdma-core-27.0/build/../providers/irdma/uk.c:747


/root/CVL-3.0-V26.4C00390/rdma-core-27.0/build/../providers/irdma/uk.c:123
        temp = atomic_load_explicit((_Atomic(__u64) *)qp->shadow_area, memory_order_acquire);
    1c32:       15 00 00 28 84          adc    $0x84280000,%eax


However, the following works:
 atomic_store_explicit(hdr, memory_order_seq_cst)

//set_64bit_val(wqe, 24, hdr);
 atomic_store_explicit((_Atomic(uint64_t) *)(wqe + (24 >> 3)), hdr,  memory_order_seq_cst);
    2130:       49 89 5f 18             mov    %rbx,0x18(%r15)
    2134:       0f ae f0                mfence
/root/CVL-3.0-V26.4C00390/rdma-core-27.0/build/../providers/irdma/uk.c:748
 

atomic_load_explicit(tail, memory_order_seq_cst) - same assembly as with memory_order_acquire
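
For reference, a minimal sketch of the store-then-load pattern under discussion. It assumes the ordering requirement is exactly "the WQE header store must be globally visible before the shadow-area read", and it uses hypothetical stand-ins (wqe_hdr, hw_tail, post_and_read_tail) rather than the real irdma structures. A release store followed by an acquire load compiles to two plain mov instructions on x86-64, and the CPU may satisfy the load before the store leaves the store buffer. An explicit seq_cst fence between the two is one way to forbid that; whether it is the right fix for the provider is not settled here.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for the WQE header word and the shadow-area tail. */
static _Atomic uint64_t wqe_hdr;
static _Atomic uint64_t hw_tail;

/*
 * Publish the header (which carries the valid bit), then read the tail.
 * The seq_cst fence orders the earlier store before the later load, which
 * release/acquire alone does not guarantee. Compilers targeting x86-64
 * typically emit mfence or a locked instruction for this fence.
 */
static uint64_t post_and_read_tail(uint64_t hdr)
{
    atomic_store_explicit(&wqe_hdr, hdr, memory_order_release);
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&hw_tail, memory_order_acquire);
}
```

Making only the store seq_cst, as in the working variant above, gives the same store-to-load ordering on x86-64 because the compiler places the full barrier right after the store.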
 
Thank you,
Tatyana
