All of lore.kernel.org
 help / color / mirror / Atom feed
From: Daniel Vetter <daniel@ffwll.ch>
To: "Noralf Trønnes" <noralf@tronnes.org>
Cc: dri-devel <dri-devel@lists.freedesktop.org>,
	Laurent Pinchart <laurent.pinchart@ideasonboard.com>,
	David Lechner <david@lechnology.com>
Subject: Re: [PATCH 6/6] drm/tinydrm: Support device unplug
Date: Mon, 4 Sep 2017 17:20:04 +0200	[thread overview]
Message-ID: <20170904152004.3dm3wn6x6ezbmccp@phenom.ffwll.local> (raw)
In-Reply-To: <ee8f385e-02af-daad-9511-cc31303a3218@tronnes.org>

On Mon, Sep 04, 2017 at 02:30:05PM +0200, Noralf Trønnes wrote:
> 
> Den 04.09.2017 11.04, skrev Daniel Vetter:
> > On Mon, Sep 04, 2017 at 11:41:05AM +0300, Laurent Pinchart wrote:
> > > Hi Daniel,
> > > 
> > > On Monday, 4 September 2017 10:26:15 EEST Daniel Vetter wrote:
> > > > On Thu, Aug 31, 2017 at 09:22:03PM +0200, Noralf Trønnes wrote:
> > > > > Den 31.08.2017 14.59, skrev Laurent Pinchart:
> > > > > > On Wednesday, 30 August 2017 20:18:49 EEST Daniel Vetter wrote:
> > > > > > > On Wed, Aug 30, 2017 at 6:31 PM, Noralf Trønnes wrote:
> > > > > > > > Den 28.08.2017 23.56, skrev Daniel Vetter:
> > > > > > > > > On Mon, Aug 28, 2017 at 07:17:48PM +0200, Noralf Trønnes wrote:
> > > > > > > > > > +
> > > > > > > > > > +       drm_fbdev_cma_dev_unplug(tdev->fbdev_cma);
> > > > > > > > > > +       drm_dev_unplug(&tdev->drm);
> > > > > > > > > > +
> > > > > > > > > > +       /* Make sure framebuffer flushing is done */
> > > > > > > > > > +       mutex_lock(&tdev->dirty_lock);
> > > > > > > > > > +       mutex_unlock(&tdev->dirty_lock);
> > > > > > > > > Is this really needed? Or, doesn't it just paper over a driver bug
> > > > > > > > > you have already anyway, since native kms userspace can directly
> > > > > > > > > call fb->funcs->dirty too, and you already protect against that.
> > > > > > > > > 
> > > > > > > > > This definitely looks like the fbdev helper is leaking
> > > > > > > > > implementation details to callers where it shouldn't do that.
> > > > > > > > Flushing can happen while drm_dev_unplug() is called, and when we
> > > > > > > > leave this function the device facing resources controlled by devres
> > > > > > > > will be removed. Thus I have to make sure any such flushing is done
> > > > > > > > before leaving so the next flush is stopped by the
> > > > > > > > drm_dev_is_unplugged() check. I don't see any other way of ensuring
> > > > > > > > that.
> > > > > > > > 
> > > > > > > > I see now that I should move the call to
> > > > > > > > drm_atomic_helper_shutdown() after drm_dev_unplug() to properly
> > > > > > > > protect the pipe .enable/.disable callbacks.
> > > > > > > Hm, calling _shutdown when the hw is gone already won't end well.
> > > > > > > Fundamentally this race exists for all use-cases, and I'm somewhat
> > > > > > > leaning towards plugging it in the core.
> > > > > > > 
> > > > > > > The general solution probably involves something that smells a lot
> > > > > > > like srcu, i.e. at every possible entry point into a drm driver
> > > > > > > (ioctl, fbdev, dma-buf sharing, everything really) we take that
> > > > > > > super-cheap read-side look, and drop it when we leave.
> > > > > > That's similar to what we plan to do in V4L2. The idea is to set a
> > > > > > device removed flag at the beginning of the .remove() handler and wait
> > > > > > for all pending operations to complete. The core will reject any new
> > > > > > operation when the flag is set. To wait for completion, every entry
> > > > > > point would increase a use count, and decrease it on exit. When the use
> > > > > > count is decreased to 0 waiters will be woken up. This should solve the
> > > > > > unplug/user race.
> > > > > Ah, such a simple solution, easy to understand and difficult to get wrong!
> > > > > And it's even nestable, no danger of deadlocking.
> > > > > 
> > > > > Maybe I can use it with tinydrm:
> > > > > 
> > > > > * @dev_use: Tracks use of functions acessing the parent device.
> > > > > *           If it is zero, the device is gone. See ...
> > > > > struct tinydrm_device {
> > > > >      atomic_t dev_use;
> > > > > };
> > > > > 
> > > > > /**
> > > > >   * tinydrm_dev_enter - Enter device accessing function
> > > > >   * @tdev: tinydrm device
> > > > >   *
> > > > >   * This function protects against using a device and it's resources after
> > > > >   * it's removed. Should be called at the beginning of the function.
> > > > >   *
> > > > >   * Returns:
> > > > >   * False if the device is still present, true if it is gone.
> > > > >   */
> > > > > static inline bool tinydrm_dev_enter(struct tinydrm_device *tdev)
> > > > > {
> > > > >      return !atomic_inc_not_zero(&tdev->dev_use);
> > > > > }
> > > > > 
> > > > > static inline void tinydrm_dev_exit(struct tinydrm_device *tdev)
> > > > > {
> > > > >      atomic_dec(&tdev->dev_use);
> > > > > }
> > > > > 
> > > > > static inline bool tinydrm_is_unplugged(struct tinydrm_device *tdev)
> > > > > {
> > > > >      bool ret = !atomic_read(&tdev->dev_use);
> > > > >      smp_rmb();
> > > > >      return ret;
> > > > > }
> > > > > 
> > > > > 
> > > > > static int tinydrm_init(...)
> > > > > {
> > > > >      /* initialize */
> > > > > 
> > > > >      /* Set device is present */
> > > > >      atomic_set(&tdev->dev_use, 1);
> > > > > }
> > > > > 
> > > > > static void tinydrm_unregister(...)
> > > > > {
> > > > >      /* Set device gone */
> > > > >      atomic_dec(&tdev->dev_use);
> > > > > 
> > > > >      /* Wait for all device facing functions to finish */
> > > > >      while (!tinydrm_is_unplugged(tdev)) {
> > > > >          cond_resched();
> > > > >      }
> > > > > 
> > > > >      /* proceed with unregistering */
> > > > > }
> > > > > 
> > > > > static int mipi_dbi_fb_dirty(...)
> > > > > {
> > > > >      if (tinydrm_dev_enter(tdev))
> > > > >          return -ENODEV;
> > > > > 
> > > > >      /* flush framebuffer */
> > > > > 
> > > > >      tinydrm_dev_exit(tdev);
> > > > > }
> > > > Yup, expect imo this should be done in drm core (as much as possible, i.e.
> > > > for ioctl at least, standard sysfs), plus then enter/exit exported to
> > > > drivers for their stuff.
> > > > 
> > > > And it really needs to be srcu. For kms drivers the atomic_inc/dec wont
> > > > matter, for render drivers it will show up (some of our ioctls are 100%
> > > > lock less in the fastpath, not a single atomic op in there).
> > > Don't forget that we need to check the disconnect flag when entering ioctls to
> > > return -ENODEV. I'm open to clever solutions, but I'd first aim for
> > > correctness with a lock and then replace the internal implementation.
> > As Noralf pointed out, we already check for drm_dev_is_unplugged(). Maybe
> > not in all places, but that can be fixed.
> > 
> > And you can't first make all ioctl slower and then fix it up, at least not
> > spread over multiple patch series. I guess for developing, doing the
> > simpler atomic counter first is ok. But the same patch series needs to
> > move over to srcu at the end.
> 
> I've never used srcu/rcu before, care to explain a little bit more?
> Is it drm_device we are protecting here?
> 
> How can srcu be better for the fast path you're talking about if it's
> half of srcu (assuming that makes srcu slower)?

We're just protecting drm_device->unplugged with srcu, but the great thing
is that srcu_read_lock/unlock are extremely cheap, and still guarantee
that we can fully synchronize with writers.

Here's the rough sketch:

/* srcu is real cheap on the read side, we can have one for all of drm
DEFINE_STATIC_SRCU(drm_unplug_srcu);

drm_dev_enter()
{
	srcu_read_lock(&drm_srcu);
}

drm_dev_exit()
{
	srcu_read_unlock(&drm_srcu);
}

drm_dev_is_unplugged()
{
	/* checking unplugged state outside of holding the srcu read side
	 * lock is racy, catch this. */
	WARN_ON(!srcu_read_lock_head(&drm_unplug_srcu));

	/* with rcu we can demote this to a simple read, no need for any
	 * atomic or memory barries
	return dev->unplugged;
}

drm_dev_unplug()
{
	dev->unplugged = true;

	/* This is were the real magic is. After it finished any critical
	 * read section is guaranteed to see the new value of ->unplugged,
	 * and any critical section which might still have seen the old
	 * value of ->unplugged is guaranteed to have finished.
	 * It also takes like forever to complete, but that's kinda the
	 * point. */
	synchronize_srcu(&drm_unplug_srcu);
}

Excercise for the reader is poking holes into the simpler atomic_t based
approach and highlighting all the races. You probably need to reach the
complexity of srcu until it's really race free (and fast).

Oh, one more: please make sure you enable CONFIG_PROVE_LOCKING and
especilly all the rcu options in there when working on this, to make sure
we catch all the deadlocks and bugs. This is some really tricky locking
stuff.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

  reply	other threads:[~2017-09-04 15:20 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-28 17:17 [PATCH 0/6] drm/tinydrm: Support device unplug Noralf Trønnes
2017-08-28 17:17 ` [PATCH 1/6] drm/fb-helper: Avoid NULL ptr dereference in fb_set_suspend() Noralf Trønnes
2017-08-28 21:34   ` Daniel Vetter
2017-08-31  9:30     ` Laurent Pinchart
2017-09-02 12:46       ` Noralf Trønnes
2017-08-28 17:17 ` [PATCH 2/6] drm/fb-helper: Support device unplug Noralf Trønnes
2017-08-28 21:41   ` Daniel Vetter
2017-08-29 16:17     ` Noralf Trønnes
2017-08-30  7:29       ` Daniel Vetter
2017-08-30 13:45         ` Noralf Trønnes
2017-08-28 17:17 ` [PATCH 3/6] drm/fb-cma-helper: " Noralf Trønnes
2017-08-28 21:46   ` Daniel Vetter
2017-08-29 17:23     ` Noralf Trønnes
2017-08-30  7:36       ` Daniel Vetter
2017-08-28 17:17 ` [PATCH 4/6] drm/tinydrm: Embed drm_device in tinydrm_device Noralf Trønnes
2017-08-28 21:47   ` Daniel Vetter
2017-08-29 19:09   ` David Lechner
2017-08-31 10:18   ` Laurent Pinchart
2017-08-31 17:16     ` Noralf Trønnes
2017-09-01  7:28       ` Laurent Pinchart
2017-09-01 18:46         ` Noralf Trønnes
2017-08-28 17:17 ` [PATCH 5/6] drm/tinydrm/mi0283qt: Let the display pipe handle power Noralf Trønnes
2017-08-28 17:17 ` [PATCH 6/6] drm/tinydrm: Support device unplug Noralf Trønnes
2017-08-28 21:56   ` Daniel Vetter
2017-08-30 16:31     ` Noralf Trønnes
2017-08-30 17:18       ` Daniel Vetter
2017-08-31 12:59         ` Laurent Pinchart
2017-08-31 19:22           ` Noralf Trønnes
2017-09-01  8:38             ` Laurent Pinchart
2017-09-02 20:59               ` Noralf Trønnes
2017-09-04  7:26             ` Daniel Vetter
2017-09-04  8:41               ` Laurent Pinchart
2017-09-04  9:04                 ` Daniel Vetter
2017-09-04  9:38                   ` Laurent Pinchart
2017-09-04 12:30                   ` Noralf Trønnes
2017-09-04 15:20                     ` Daniel Vetter [this message]
2017-09-04 15:54                       ` Noralf Trønnes
2017-09-04 16:39                         ` Daniel Vetter
2017-08-28 21:58 ` [PATCH 0/6] " Daniel Vetter
2017-08-29 18:05   ` Noralf Trønnes

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170904152004.3dm3wn6x6ezbmccp@phenom.ffwll.local \
    --to=daniel@ffwll.ch \
    --cc=david@lechnology.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=laurent.pinchart@ideasonboard.com \
    --cc=noralf@tronnes.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.