ksummit.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Greg KH <greg@kroah.com>
To: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: James Bottomley <James.Bottomley@hansenpartnership.com>,
	Mark Brown <broonie@kernel.org>,
	Linus Walleij <linus.walleij@linaro.org>,
	Roland Dreier <roland@kernel.org>,
	Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>,
	ksummit@lists.linux.dev, Daniel Vetter <daniel@ffwll.ch>
Subject: cdev/devm_* issues (was Re: [TECH TOPIC] Rust for Linux)
Date: Thu, 8 Jul 2021 08:49:39 +0200	[thread overview]
Message-ID: <YOagA4bgdGYos5aa@kroah.com> (raw)
In-Reply-To: <YOXhlDsMAZUn1EBg@pendragon.ideasonboard.com>

On Wed, Jul 07, 2021 at 08:17:08PM +0300, Laurent Pinchart wrote:
> On Wed, Jul 07, 2021 at 02:45:04PM +0200, Greg KH wrote:
> > On Wed, Jul 07, 2021 at 01:38:44PM +0100, James Bottomley wrote:
> > > On Wed, 2021-07-07 at 14:20 +0200, Greg KH wrote:
> > > > On Wed, Jul 07, 2021 at 12:34:31PM +0100, James Bottomley wrote:
> > > > > On Wed, 2021-07-07 at 11:50 +0100, Mark Brown wrote:
> > > > > > On Tue, Jul 06, 2021 at 10:36:14PM +0200, Linus Walleij wrote:
> > > > > > > On Tue, Jul 6, 2021 at 10:00 PM Roland Dreier wrote:
> > > > > > > > "devres" / devm_xxx was an attempt to deal with this in C,
> > > > > > > > but it only solves some cases of this and has not received a
> > > > > > > > lot of adoption (we can argue about the reasons).
> > > > > > >  
> > > > > > > Really? From my point of view that is adopted all over the map.
> > > > > > > I add new users all the time and use it as much as I can when
> > > > > > > writing new drivers.
> > > > > > 
> > > > > > Yes, it's *super* widely used in most of the kernel.  Perhaps
> > > > > > there's some subsystems that reject it for some reason.
> > > > > > 
> > > > > > > I think it's a formidable success, people just need to learn to
> > > > > > > do it more.
> > > > > > 
> > > > > > There *are* issues with people adopting it too enthusiastically -
> > > > > > as well as the memory lifetime issues that Laurent mentioned it's
> > > > > > easy for it to cause problems with interrupt handlers that are
> > > > > > left live longer than they should be and try to use things that
> > > > > > were already deallocated.
> 
> (CC'ing Daniel Vetter as the author of the DRM devres-based resource
> management)
> 
> I've given this lots of thoughts lately, in the context of V4L2, but it
> should be roughly the same for character devices in general. Here's what
> I think should be done for drivers that expose character devices.
> 
> - Drivers must stop allocating their internal data structure (the one
>   they set as device data with dev_set_drvdata()) with devm_kzalloc().
>   The data structure must instead be allocated with a plain kzalloc()
>   and reference-counted.
> 
>   Most drivers will register a single character device using a
>   subsystem-specific API (e.g. video_register_device() in V4L2). The
>   subsystem needs to provide a .release() callback, called when the
>   last reference to the character device is released. Drivers must
>   implement this, and can simply free their internal data structure at
>   this point.
> 
>   For drivers that register multiple character devices, or in general
>   expose multiple interfaces to userspace or other parts of the kernel,
>   the internal data structure must be properly reference-counted, with a
>   reference released in each .release() callback. There may be ways to
>   simplify this.
> 
>   This can be seen as going back to the pre-devm_kzalloc() era, but it's
>   only about undoing a mistake that was done way too often (to be fair,
>   many drivers used to just kfree() the data in the driver's .remove()
>   operation, so it wasn't always a regression, only enshrining a
>   preexisting bad practice).
> 
>   This is only part of the puzzle though. There's a remove/use race that
>   still needs to be solved.
> 
> - In .remove(), drivers must inform the character device that new access
>   from userspace are not allowed anymore. This would set a flag in
>   struct cdev that would result in all new calls from userspace through
>   file operations to be rejected (with -ENXIO for instance). This should
>   be wrapped in subsystem-specific functions (e.g.
>   video_device_disconnect() wrapping cdev_disconnect()). From now on, no
>   new calls are possible, but existing calls may be in progress.
> 
> - Drivers then need to cancel all pending I/O and wait for completion.
>   I/O (either direct memory-mapped I/O or through bus APIs, such as I2C
>   or USB transactions) are not allowed anymore after .remove() returns.
>   This will have the side effect of waking up userspace contexts that
>   are waiting for I/O completion (through a blocking file I/O operation
>   for instance). Existing calls from userspace will start completing.
> 
>   This is also a good place to start shutting down the device, for
>   instance disabling interrupts.
> 
> - The next step is for drivers to wait until all calls from userspace
>   complete. This should be done with a call count in struct cdev that is
>   updated upon entry and exit of calls from userspace, and a wait queue
>   to wait for that count to go to 0. This should be wrapped in
>   subsystem-specific APIs. As the flag that indicates device removal is
>   set, no new calls are allowed so the counter can only decrease, and as
>   all pending I/O have terminated or have been cancelled, no pending
>   calls should be blocked.
> 
> - Finally, drivers should unregister the character device, through the
>   appropriate subsystem API.
> 
>   At this point, memory mappings and file handles referencing the device
>   may still exist, but no file operation is in progress. The device is
>   quiescent.
> 
>   Care needs to be taken in drivers and subsystems to not start any I/O
>   operation when handling the file .release() operation or the
>   destruction of memory mappings. Overall I don't expect much issues,
>   but I'm sure some drivers do strange things in those code paths.
> 
> - When the least memory mapping is gone and the last file handle is
>   closed, the subsystem will call the driver's .release() callback. At
>   this point, the driver will perform the operations listed in the first
>   item of this list.
> 
> 
> The challenge here will be to make this as easy as possible for drivers,
> to avoid risk of drivers getting it wrong. The DRM/KMS subsystem has
> created a devres-based system to handle this, with the devres associated
> with the drm_device (the abstraction of the cdevs exposed by DRM to
> userspace), *not* the physical device. It has a drawback though, it
> assumes that a DRM driver will only ever want to register a drm_device
> and nothing else, and hardcodes that assumption in the way it releases
> resources. That's fine for most DRM drivers, but if a driver was to
> register a drm_device and something else (such as a V4L2 video_device,
> an input device, ...), the DRM subsystem will get in the way.
> 
> I have two questions:
> 
> - Does the above make sense ?

Yes, totally, thank you for taking the time to write this all out.  It's
been in the back of my mind for over a decade that we need to work on
these issues, but have not had any time to sit down and write it all
down like you have.

> - Assuming it does, how do we get from the current mess to a situation
>   where writing a driver will be a pleasure, not a punishment ? :-)

That's the real question.  Thanks to a lot of cleanups that Christoph
has recently done to the lower-level cdev code, the lower levels are now
in a shape where we can work on them better.  I'm going to try to carve
out some time in the next few months to start to work on these things.
I think that once we get the ideas of what needs to be done, and a
working core change, I can unleash some interns on doing tree-wide
cleanups/changes to help bring everything into alignment.

I'm going to save this email off for reference for later.  But of
course, if others want to start to attack this earlier, all the better :)

thanks,

greg k-h



> 
> > > > > Fixing this should not be beyond the wit of humankind, though.  If
> > > > > we delayed deallocation to module release, that would fix the
> > > > > interrupt issue, wouldn't it?  Perhaps all devres memory for
> > > > > devices should live until then anyway and thus be automatically
> > > > > deallocated instead of needing an explicit free ... the problem
> > > > > with that being compiled in devices currently optimize away the
> > > > > module refcounting, but that should be fixable.
> > > > 
> > > > module code lifespans are different than device structure lifespans,
> > > > it's when people get them confused, as here, that we have problems.
> > > 
> > > I'm not claiming they are.  What I am claiming is that module lifetime
> > > must always encompass the device lifetimes.  Therefore, you can never
> > > be incorrect by using a module lifetime for anything attached to a
> > > device, just inefficient for using memory longer than potentially
> > > needed.  However, in a lot of use cases, the device is created on
> > > module init and destroyed on module exit, so the inefficiency is barely
> > > noticeable.
> > 
> > In almost no use case is the device created on module init of a driver.
> > devices are created by busses, look at the USB code as an example.  The
> > usb bus creates the devices and then individual modules bind to that
> > device as needed.  If the device is removed from the system, wonderful,
> > the device is unbound, but the module is still loaded.  So if you really
> > wanted to, with your change, just do a insert/remove of a USB device a
> > few zillion times and then memory is all gone :(
> > 
> > > The question I'm asking is shouldn't we optimize for this?
> > 
> > No.
> > 
> > > so let people allocate devm memory safe in the knowledge it will be
> > > freed on module release?
> > 
> > No, see above.  Modules are never removed in a normal system.  devices
> > are.
> > 
> > And the drm developers have done great work in unwinding some of these
> > types of mistakes in their drivers, let's not go backwards please.
> 
> -- 
> Regards,
> 
> Laurent Pinchart

  reply	other threads:[~2021-07-08  6:49 UTC|newest]

Thread overview: 200+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-25 22:09 [TECH TOPIC] Rust for Linux Miguel Ojeda
2021-07-05 23:51 ` Linus Walleij
2021-07-06  4:30   ` Leon Romanovsky
2021-07-06  9:55     ` Linus Walleij
2021-07-06 10:16       ` Geert Uytterhoeven
2021-07-06 17:59         ` Linus Walleij
2021-07-06 18:36           ` Miguel Ojeda
2021-07-06 19:12             ` Linus Walleij
2021-07-06 21:32               ` Miguel Ojeda
2021-07-07 14:10             ` Arnd Bergmann
2021-07-07 15:28               ` Miguel Ojeda
2021-07-07 15:50                 ` Andrew Lunn
2021-07-07 16:34                   ` Miguel Ojeda
2021-07-07 16:55                 ` Arnd Bergmann
2021-07-07 17:54                   ` Miguel Ojeda
2021-07-06 10:22       ` Leon Romanovsky
2021-07-06 14:30       ` Miguel Ojeda
2021-07-06 14:32         ` Miguel Ojeda
2021-07-06 15:03         ` Sasha Levin
2021-07-06 15:33           ` Miguel Ojeda
2021-07-06 15:42             ` Laurent Pinchart
2021-07-06 16:09               ` Mike Rapoport
2021-07-06 18:29               ` Miguel Ojeda
2021-07-06 18:38                 ` Laurent Pinchart
2021-07-06 19:45                   ` Steven Rostedt
2021-07-06 19:59                   ` Miguel Ojeda
2021-07-06 18:53             ` Sasha Levin
2021-07-06 21:50               ` Miguel Ojeda
2021-07-07  4:57                 ` Leon Romanovsky
2021-07-07 13:39                 ` Alexandre Belloni
2021-07-07 13:50                   ` Miguel Ojeda
2021-07-06 18:26         ` Linus Walleij
2021-07-06 19:11           ` Miguel Ojeda
2021-07-06 19:13         ` Johannes Berg
2021-07-06 19:43           ` Miguel Ojeda
2021-07-06 10:20     ` James Bottomley
2021-07-06 14:55       ` Miguel Ojeda
2021-07-06 15:01         ` Sasha Levin
2021-07-06 15:36           ` Miguel Ojeda
2021-07-09 10:02         ` Marco Elver
2021-07-09 16:02           ` Miguel Ojeda
2021-07-06 18:09       ` Linus Walleij
2021-07-06 14:24     ` Miguel Ojeda
2021-07-06 14:33       ` Laurent Pinchart
2021-07-06 14:56       ` Leon Romanovsky
2021-07-06 15:29         ` Miguel Ojeda
2021-07-07  4:38           ` Leon Romanovsky
2021-07-06 20:00   ` Roland Dreier
2021-07-06 20:36     ` Linus Walleij
2021-07-06 22:00       ` Laurent Pinchart
2021-07-07  7:27         ` Julia Lawall
2021-07-07  7:45           ` Greg KH
2021-07-07  7:52             ` James Bottomley
2021-07-07 13:49               ` Miguel Ojeda
2021-07-07 14:08                 ` James Bottomley
2021-07-07 15:15                   ` Miguel Ojeda
2021-07-07 15:44                     ` Greg KH
2021-07-07 17:01                       ` Wedson Almeida Filho
2021-07-07 17:20                         ` Greg KH
2021-07-07 19:19                           ` Wedson Almeida Filho
2021-07-07 20:38                             ` Jan Kara
2021-07-07 23:09                               ` Wedson Almeida Filho
2021-07-08  6:11                                 ` Greg KH
2021-07-08 13:36                                   ` Wedson Almeida Filho
2021-07-08 18:51                                     ` Greg KH
2021-07-08 19:31                                       ` Andy Lutomirski
2021-07-08 19:35                                         ` Geert Uytterhoeven
2021-07-08 21:56                                           ` Andy Lutomirski
2021-07-08 19:49                                       ` Linus Walleij
2021-07-08 20:34                                         ` Miguel Ojeda
2021-07-08 22:13                                           ` Linus Walleij
2021-07-09  7:24                                             ` Geert Uytterhoeven
2021-07-19 12:24                                             ` Wedson Almeida Filho
2021-07-19 13:15                                               ` Wedson Almeida Filho
2021-07-19 14:02                                                 ` Arnd Bergmann
2021-07-19 14:13                                                   ` Linus Walleij
2021-07-19 21:32                                                     ` Arnd Bergmann
2021-07-19 21:33                                                     ` Arnd Bergmann
2021-07-20  1:46                                                       ` Miguel Ojeda
2021-07-20  6:43                                                         ` Johannes Berg
2021-07-19 14:43                                                   ` Geert Uytterhoeven
2021-07-19 18:24                                                     ` Miguel Ojeda
2021-07-19 18:47                                                       ` Steven Rostedt
2021-07-19 14:54                                                   ` Miguel Ojeda
2021-07-19 17:32                                                   ` Wedson Almeida Filho
2021-07-19 21:31                                                     ` Arnd Bergmann
2021-07-19 17:37                                                   ` Miguel Ojeda
2021-07-19 16:02                                                 ` Vegard Nossum
2021-07-19 17:45                                                   ` Miguel Ojeda
2021-07-19 17:54                                                     ` Miguel Ojeda
2021-07-19 18:06                                                   ` Wedson Almeida Filho
2021-07-19 19:37                                                     ` Laurent Pinchart
2021-07-19 21:09                                                       ` Wedson Almeida Filho
2021-07-20 23:54                                                         ` Laurent Pinchart
2021-07-21  1:33                                                           ` Andy Lutomirski
2021-07-21  1:42                                                             ` Laurent Pinchart
2021-07-21 13:54                                                               ` Linus Walleij
2021-07-21 14:13                                                                 ` Wedson Almeida Filho
2021-07-21 14:19                                                                   ` Linus Walleij
2021-07-22 11:33                                                                     ` Wedson Almeida Filho
2021-07-23  0:45                                                                       ` Linus Walleij
2021-07-21  4:39                                                             ` Wedson Almeida Filho
2021-07-23  1:04                                                               ` Laurent Pinchart
2021-07-21  4:23                                                           ` Wedson Almeida Filho
2021-07-23  1:13                                                             ` Laurent Pinchart
2021-07-19 22:57                                                 ` Alexandre Belloni
2021-07-20  7:15                                                   ` Miguel Ojeda
2021-07-20  9:39                                                     ` Alexandre Belloni
2021-07-20 12:10                                                       ` Miguel Ojeda
2021-07-19 13:53                                               ` Linus Walleij
2021-07-19 14:42                                                 ` Wedson Almeida Filho
2021-07-19 22:16                                                   ` Linus Walleij
2021-07-20  1:20                                                     ` Wedson Almeida Filho
2021-07-20 13:21                                                       ` Andrew Lunn
2021-07-20 13:38                                                         ` Miguel Ojeda
2021-07-20 14:04                                                           ` Andrew Lunn
2021-07-20 13:55                                                         ` Greg KH
2021-07-20  1:21                                                     ` Miguel Ojeda
2021-07-20 16:00                                                       ` Mark Brown
2021-07-20 22:42                                                       ` Linus Walleij
2021-07-19 14:43                                                 ` Miguel Ojeda
2021-07-19 15:15                                                   ` Andrew Lunn
2021-07-19 15:43                                                     ` Miguel Ojeda
2021-07-09  7:03                                         ` Viresh Kumar
2021-07-09 17:06                                         ` Mark Brown
2021-07-09 17:43                                           ` Miguel Ojeda
2021-07-10  9:53                                             ` Jonathan Cameron
2021-07-10 20:09                                         ` Kees Cook
2021-07-08 13:55                                   ` Miguel Ojeda
2021-07-08 14:58                                     ` Greg KH
2021-07-08 15:02                                       ` Mark Brown
2021-07-08 16:38                                       ` Andy Lutomirski
2021-07-08 18:01                                         ` Greg KH
2021-07-08 18:00                                       ` Miguel Ojeda
2021-07-08 18:44                                         ` Greg KH
2021-07-08 23:09                                           ` Miguel Ojeda
2021-07-08  7:20                                 ` Geert Uytterhoeven
2021-07-08 13:41                                   ` Wedson Almeida Filho
2021-07-08 13:43                                     ` Geert Uytterhoeven
2021-07-08 13:54                                       ` Wedson Almeida Filho
2021-07-08 14:16                                         ` Geert Uytterhoeven
2021-07-08 14:24                                           ` Wedson Almeida Filho
2021-07-09  7:04                                             ` Jerome Glisse
2021-07-08 14:04                                       ` Miguel Ojeda
2021-07-08 14:18                                         ` Geert Uytterhoeven
2021-07-08 14:28                                           ` Miguel Ojeda
2021-07-08 14:33                                             ` Geert Uytterhoeven
2021-07-08 14:35                                               ` Miguel Ojeda
2021-07-09 11:55                                                 ` Geert Uytterhoeven
2021-07-08 16:07                                               ` Andy Lutomirski
2021-07-07 20:58                           ` Miguel Ojeda
2021-07-07 21:47                             ` Laurent Pinchart
2021-07-07 22:44                               ` Miguel Ojeda
2021-07-07 17:01           ` Miguel Ojeda
2021-07-07 10:50       ` Mark Brown
2021-07-07 10:56         ` Julia Lawall
2021-07-07 11:27           ` James Bottomley
2021-07-07 11:34         ` James Bottomley
2021-07-07 12:20           ` Greg KH
2021-07-07 12:38             ` James Bottomley
2021-07-07 12:45               ` Greg KH
2021-07-07 17:17                 ` Laurent Pinchart
2021-07-08  6:49                   ` Greg KH [this message]
2021-07-08  8:23                     ` cdev/devm_* issues (was Re: [TECH TOPIC] Rust for Linux) Laurent Pinchart
2021-07-08 23:06                     ` Linus Walleij
2021-07-09  0:02                       ` Dan Williams
2021-07-09 16:53                       ` Wedson Almeida Filho
2021-07-13  8:59                         ` Linus Walleij
     [not found]                           ` <CAHp75VfW7PxAyU=eYPNWFU_oUY=aStz-4W5gX87KSo402YhMXQ@mail.gmail.com>
2021-07-21 13:46                             ` Linus Walleij
2021-07-21 15:49                               ` Andy Shevchenko
2021-07-10  7:09                     ` Dan Carpenter
2021-07-12 13:42                       ` Jason Gunthorpe
2021-07-15  9:54                     ` Daniel Vetter
2021-07-21  9:08                       ` Dan Carpenter
2021-07-22  9:56                         ` Daniel Vetter
2021-07-22 10:09                           ` Dan Carpenter
2021-07-08  9:08                   ` [TECH TOPIC] Rust for Linux Mauro Carvalho Chehab
2021-07-10 16:42                     ` Laurent Pinchart
2021-07-10 17:18                       ` Andy Lutomirski
2021-07-07 15:17           ` Mark Brown
2021-07-06 21:45     ` Bart Van Assche
2021-07-06 23:08       ` Stephen Hemminger
2021-07-07  2:41         ` Bart Van Assche
2021-07-07 18:57           ` Linus Torvalds
2021-07-07 20:32             ` Bart Van Assche
2021-07-07 20:39               ` Linus Torvalds
2021-07-07 21:40                 ` Laurent Pinchart
2021-07-08  7:22                 ` Geert Uytterhoeven
2021-07-07 21:02               ` Laurent Pinchart
2021-07-07 22:11               ` Miguel Ojeda
2021-07-07 22:43                 ` Laurent Pinchart
2021-07-07 23:21                   ` Miguel Ojeda
2021-07-07 23:40                     ` Laurent Pinchart
2021-07-08  0:27                       ` Miguel Ojeda
2021-07-08  0:56                         ` Laurent Pinchart
2021-07-08  6:26             ` Alexey Dobriyan
2021-07-06 19:05 ` Bart Van Assche
2021-07-06 19:27   ` Miguel Ojeda
2021-07-07 15:48 ` Steven Rostedt
2021-07-07 16:44   ` Miguel Ojeda

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YOagA4bgdGYos5aa@kroah.com \
    --to=greg@kroah.com \
    --cc=James.Bottomley@hansenpartnership.com \
    --cc=broonie@kernel.org \
    --cc=daniel@ffwll.ch \
    --cc=ksummit@lists.linux.dev \
    --cc=laurent.pinchart@ideasonboard.com \
    --cc=linus.walleij@linaro.org \
    --cc=miguel.ojeda.sandonis@gmail.com \
    --cc=roland@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).