linux-rdma.vger.kernel.org archive mirror
From: "Nikolova, Tatyana E" <tatyana.e.nikolova@intel.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: "dledford@redhat.com" <dledford@redhat.com>,
	"leon@kernel.org" <leon@kernel.org>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>
Subject: RE: [PATCH rdma-core] irdma: Restore full memory barrier for doorbell optimization
Date: Thu, 2 Sep 2021 16:27:37 +0000
Message-ID: <DM6PR11MB4692BD4BB222DAA58C264000CBCE9@DM6PR11MB4692.namprd11.prod.outlook.com>
In-Reply-To: <20210819224408.GE1721383@nvidia.com>



> -----Original Message-----
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Thursday, August 19, 2021 5:44 PM
> To: Nikolova, Tatyana E <tatyana.e.nikolova@intel.com>
> Cc: dledford@redhat.com; leon@kernel.org; linux-rdma@vger.kernel.org
> Subject: Re: [PATCH rdma-core] irdma: Restore full memory barrier for
> doorbell optimization
> 
> On Thu, Aug 19, 2021 at 10:01:50PM +0000, Nikolova, Tatyana E wrote:
> >
> >
> > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > Sent: Wednesday, August 18, 2021 11:50 AM
> > > To: Nikolova, Tatyana E <tatyana.e.nikolova@intel.com>
> > > Cc: dledford@redhat.com; leon@kernel.org; linux-rdma@vger.kernel.org
> > > Subject: Re: [PATCH rdma-core] irdma: Restore full memory barrier
> > > for doorbell optimization
> > >
> > > On Fri, Aug 13, 2021 at 05:25:49PM -0500, Tatyana Nikolova wrote:
> > > > >> 1.	Software writing the valid bit in the WQE.
> > > > >> 2.	Software reading shadow memory (hw_tail) value.
> > > >
> > > > > You are missing an ordered atomic on this read it looks like
> > > >
> > > > Hi Jason,
> > > >
> > > > Why do you think we need atomic ops in this case? We aren't trying
> > > > to protect from multiple threads but CPU re-ordering of a write
> > > > and a read.
> > >
> > > Which is what the atomics will do.
> > >
> > > Barriers are only appropriate when you can't add atomic markers to
> > > the actual data that needs ordering.
> >
> > Hi Jason,
> >
> > We aren't sure what you mean by atomic markers. We ran a few
> > experiments with atomics, but none of the barriers we tried
> > (smp_mb__{before,after}_atomic(), smp_load_acquire() and
> > smp_store_release()) translates to a full memory barrier on x86.
> 
> Huh? Those are kernel primitives, this is a userspace patch.
> 
> Userspace follows the C11 atomics memory model.
> 
> So I'd expect
> 
>   atomic_store_explicit(tail, memory_order_release)
>   atomic_load_explicit(tail, memory_order_acquire)
> 
> to be the atomics you need. This will ensure that the reads/writes to valid
> before the atomics are sequenced correctly, e.g. no CPU thread can observe
> an updated tail without also observing the set valid.
> 

Hi Jason,

We tried these atomic ops as shown below, but they don't fix the issue:

atomic_store_explicit(hdr, memory_order_release)
atomic_load_explicit(tail, memory_order_acquire)

In assembly they look like this:

//set_64bit_val(wqe, 24, hdr);
atomic_store_explicit((_Atomic(uint64_t) *)(wqe + (24 >> 3)), hdr, memory_order_release);
    2130:       49 89 5f 18             mov    %rbx,0x18(%r15)
/root/CVL-3.0-V26.4C00390/rdma-core-27.0/build/../providers/irdma/uk.c:747


/root/CVL-3.0-V26.4C00390/rdma-core-27.0/build/../providers/irdma/uk.c:123
        temp = atomic_load_explicit((_Atomic(__u64) *)qp->shadow_area, memory_order_acquire);
    1c32:       15 00 00 28 84          adc    $0x84280000,%eax


However, the following works:

atomic_store_explicit(hdr, memory_order_seq_cst)

//set_64bit_val(wqe, 24, hdr);
atomic_store_explicit((_Atomic(uint64_t) *)(wqe + (24 >> 3)), hdr, memory_order_seq_cst);
    2130:       49 89 5f 18             mov    %rbx,0x18(%r15)
    2134:       0f ae f0                mfence
/root/CVL-3.0-V26.4C00390/rdma-core-27.0/build/../providers/irdma/uk.c:748
 

atomic_load_explicit(tail, memory_order_seq_cst) - same assembly as with memory_order_acquire
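
For reference, the ordering at stake can be sketched in C11 as follows. This is a minimal illustration under stated assumptions, not the irdma provider code: struct sketch_qp, its fields, and both function names are hypothetical stand-ins for the WQE header word and the shadow-area tail. A release store followed by an acquire load of a different location does not forbid the load from completing before the store is visible to other threads, which is consistent with the plain mov seen for the release store. A seq_cst store followed by a seq_cst load, or a seq_cst fence placed between the two accesses, does order them between CPU threads. Whether that ordering also holds with respect to the device's writes to shadow memory is outside what the C11 model specifies.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins, not the real irdma structures. */
struct sketch_qp {
	_Atomic uint64_t wqe_hdr;  /* WQE word that carries the valid bit */
	_Atomic uint64_t hw_tail;  /* shadow-area tail, updated elsewhere */
};

/*
 * Variant 1: seq_cst store, then seq_cst load.
 * On x86-64, compilers typically emit mov + mfence (or xchg) for the
 * store and a plain mov for the load.
 */
static uint64_t sketch_post_seq_cst(struct sketch_qp *qp, uint64_t hdr)
{
	atomic_store_explicit(&qp->wqe_hdr, hdr, memory_order_seq_cst);
	return atomic_load_explicit(&qp->hw_tail, memory_order_seq_cst);
}

/*
 * Variant 2: relaxed accesses separated by an explicit full fence.
 * The fence keeps the store from being reordered after the load.
 */
static uint64_t sketch_post_fence(struct sketch_qp *qp, uint64_t hdr)
{
	atomic_store_explicit(&qp->wqe_hdr, hdr, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&qp->hw_tail, memory_order_relaxed);
}
```

Both variants return the tail value observed after the header store. The first keeps the ordering attached to the data accesses themselves, while the second is closer to a standalone full barrier.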
 
Thank you,
Tatyana
