From: Jason Gunthorpe <jgg@nvidia.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: linux-cxl@vger.kernel.org,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Vishal L Verma <vishal.l.verma@intel.com>,
	"Weiny, Ira" <ira.weiny@intel.com>,
	"Schofield, Alison" <alison.schofield@intel.com>
Subject: Re: [PATCH v2 2/4] cxl/mem: Fix synchronization mechanism for device removal vs ioctl operations
Date: Tue, 30 Mar 2021 16:26:08 -0300
Message-ID: <20210330192608.GA1430856@nvidia.com>
In-Reply-To: <CAPcyv4igMvwfZNgi-Uap_QUJi+uocMUD3KZBhXUy56AuHZQtqw@mail.gmail.com>

On Tue, Mar 30, 2021 at 12:00:23PM -0700, Dan Williams wrote:

> > > > IMHO this can't use 'dev->kobj.state_in_sysfs' as the RCU protected data.
> > >
> > > This usage of srcu is functionally equivalent to replacing
> > > srcu_read_lock() with down_read() and the shutdown path with:
> >
> > Sort of, but the rules for load/store under RCU are different than for
> > load/store under a normal barriered lock. All the data is unstable for
> > instance and minimially needs READ_ONCE.
> 
> The data is unstable under the srcu_read_lock until the end of the
> next rcu grace period, synchronize_rcu() ensures all active
> srcu_read_lock() sections have completed.

No, that isn't how I would phrase it. *any* write side data under RCU
is always unstable by definition in the read side because the write
side can always race with any reader. Thus one should use the RCU
accessors to deal with that data race, and get some acquire/release
semantics when pointer chasing (though this doesn't matter here).
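
For reference, the acquire/release pairing that the RCU accessors give
for pointer chasing can be sketched in userspace with C11 atomics. This
is only an analogy under stated assumptions: sketch_ops, publish_ops()
and read_ops_version() are hypothetical names, memory_order_acquire is
stronger than the dependency ordering rcu_dereference() relies on, and
grace periods and freeing are left out entirely.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical object that readers reach through a shared pointer. */
struct sketch_ops {
	int version;
};

/* Shared pointer; static storage means it starts out as NULL. */
static _Atomic(struct sketch_ops *) current_ops;

/*
 * Writer side, in the spirit of rcu_assign_pointer(): the release store
 * orders the initialization of *ops before the pointer becomes visible.
 */
static void publish_ops(struct sketch_ops *ops)
{
	atomic_store_explicit(&current_ops, ops, memory_order_release);
}

/*
 * Reader side, in the spirit of rcu_dereference(): load the pointer
 * exactly once, then only use the local copy.
 */
static int read_ops_version(void)
{
	struct sketch_ops *ops =
		atomic_load_explicit(&current_ops, memory_order_acquire);

	return ops ? ops->version : -1;
}
```

The reader never touches current_ops a second time, so it cannot see
the pointer change underneath it within one call.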

> Unless Paul and I misunderstood each other, this scheme of
> synchronizing object state is also used in kill_dax(), and I put
> that comment there the last time this question was raised. If srcu
> was being used to synchronize the liveness of an rcu object like
> @cxlm or a new ops object then I would expect rcu_dereference +
> rcu_assign_pointer around usages of that object. The liveness of the
> object in this case is handled by kobject reference, or inode
> reference in the case of kill_dax() outside of srcu.

It was probably a misunderstanding as I don't think Paul would say
you should read data in thread A and write to it in B without using
READ_ONCE/WRITE_ONCE or a stronger atomic to manage the data race.

The LWN articles on the "big bad compiler" are informative here. You
don't want the compiler to do a transformation where it loads
state_in_sysfs multiple times and gets different answers. This is what
READ_ONCE is aiming to prevent.
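
A minimal userspace sketch of that point, assuming simplified stand-ins
for the kernel's READ_ONCE/WRITE_ONCE (the real macros handle more
cases) and a hypothetical demo_dev with an in_sysfs flag in place of
dev->kobj.state_in_sysfs:

```c
#include <stdbool.h>

/* Simplified stand-ins for the kernel macros (assumption, not the
 * kernel's definitions): force exactly one access through a volatile
 * lvalue so the compiler cannot refetch or tear it. */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical device whose flag is cleared by the removal path. */
struct demo_dev {
	bool in_sysfs;
};

/* Writer side: the flag is only ever cleared. */
static void demo_dev_shutdown(struct demo_dev *dev)
{
	WRITE_ONCE(dev->in_sysfs, false);
}

/* Reader side: take one snapshot and decide from it, rather than
 * reading dev->in_sysfs at several points and risking two answers. */
static int demo_dev_ioctl(struct demo_dev *dev)
{
	bool live = READ_ONCE(dev->in_sysfs);

	if (!live)
		return -1;	/* kernel code would return an errno here */

	/* ... any later decision uses 'live', not a fresh load ... */
	return 0;
}
```

The sketch leaves out the SRCU read-side section and the grace-period
wait; it only shows how the racing load and store are marked.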

Here it is just a boolean flag, and the flag is only cleared, so risks
are low, but it still isn't a technically correct way to use RCU.

(and yes the kernel is full of examples of not following the memory
model strictly)

> > > cdev_device_del(...);
> > > down_write(...);
> > > up_write(...);
> >
> > The lock would have to enclose the store to state_in_sysfs, otherwise
> > as written it has the same data race problems.
> 
> There's no race above. The rule is that any possible observation of
> ->state_in_sysfs == 1, or rcu_dereference() != NULL, must be
> flushed.

It is not about the flushing.

Jason

