From: "Winkler, Tomas" <tomas.winkler@intel.com>
To: Richard Weinberger <richard.weinberger@gmail.com>
Cc: Vignesh Raghavendra <vigneshr@ti.com>,
	Miquel Raynal <miquel.raynal@bootlin.com>,
	Richard Weinberger <richard@nod.at>,
	"intel-gfx@lists.freedesktop.org" <intel-gfx@lists.freedesktop.org>,
	Joonas Lahtinen <joonas.lahtinen@linux.intel.com>,
	"Usyskin, Alexander" <alexander.usyskin@intel.com>,
	Jani Nikula <jani.nikula@linux.intel.com>,
	"linux-mtd@lists.infradead.org" <linux-mtd@lists.infradead.org>,
	"Vivi, Rodrigo" <rodrigo.vivi@intel.com>,
	"Lubart, Vitaly" <vitaly.lubart@intel.com>
Subject: RE: [RFC PATCH 0/9] drm/i915/spi: discrete graphics internal spi
Date: Wed, 17 Feb 2021 08:34:53 +0000
Message-ID: <cb20e706d494458a8957252eeacfb1da@intel.com>
In-Reply-To: <CAFLxGvwP5-O5DHQ07Fs_GnG12dsK24mer8LJfhz2z2UqW9e5cQ@mail.gmail.com>

> On Tue, Feb 16, 2021 at 7:26 PM Tomas Winkler <tomas.winkler@intel.com> wrote:
> > Because the graphic card may undergo reset at any time and basically
> > hot unplug all its child devices, this series also provides a fix to
> > the mtd framework to make the reset graceful.
>
> Well, just because MTD does not work as you expect, it is not broken. :-)

I'm not saying it's broken by design; it just didn't fit this use case.

> In your case i915_spi_remove() blindly removes the MTD, this is not allowed.
> You may remove the MTD only if there are no more users.

I'm not sure it's a good idea to stall the removal on user space. This is
just asking for a deadlock, as user space is not getting what it needs and
may stall. I think it's better that user space fails gracefully; the hw is
not accessible at that stage anyway.

> The current model in MTD is that the driver is in charge of all life cycle
> management.
> Using ->_get_device() and ->_put_device() a driver can implement
> refcounting and deny new users if the MTD is about to disappear.

Please note that the use case you are describing is still valid. I haven't
removed the _get_device() and _put_device() handlers; you can still stall
the removal of the mtd. If that is not the case, it's a bug.

> In the upcoming MUSE driver that mechanism is used too.
> MUSE allows to implement a MTD in userspace. So the FUSE server can
> disappear at *any* time. Just like in your case. Even worse, it can be
> hostile.
> In MUSE the MTD life time is tied to the FUSE connection object,
> muse_mtd_get_device() increments the FUSE connection refcount, and
> muse_mtd_put_device() decrements it.
> That means if the FUSE server disappears all of a sudden but the MTD still has
> users, the MTD will stay. But in this state no new references are allowed and
> all MTD operations of existing users will fail with -ENOTCONN (via FUSE).
> As soon the last user is gone (can be userspace via /dev/mtd* or a in-kernel
> user such as UBIFS), the MTD will be removed.

But in our case the whole i915 is taken hostage: it cannot reset because of
misbehaving user space.

> For the full details, please see:
> https://git.kernel.org/pub/scm/linux/kernel/git/rw/misc.git/tree/fs/fuse/muse.c?h=muse_v3#n1034
>
> Is in your case *really* not possible to do it that way?

Maybe it's possible, but I don't think it's good to stall i915 removal.
Also, it's very easy to crash the kernel. I've posted a snippet to the
mailing list that tried to do that, and the kernel still crashed. Can you
look at it?

> On the other hand, your last patch moves some part of the life cycle
> management into MTD core.
> The MTD will stay as long it has users.
> But that's only one part. The driver is still in charge to make sure that all
> operations fail immediately and that no new users arrive.

I think in that case I would need to validate every HW access to make sure
it's still valid.

> If we want to do all in MTD core we'd have to do it like SCSI disks.
> That means having devices states such as SDEV_RUNNING, SDEV_CANCEL,
> SDEV_OFFLINE, ....
> That way the MTD could be shutdown gracefully, first no new users are
> allowed, then ongoing operations will be cancelled, next all operation will fail
> with -EIO or such, then the device is being removed from sysfs and finally if
> the last user is gone, the MTD can be removed.

Isn't it already that way? You cannot open a new handle. Here I would need
more of your insights.

> I'm not sure whether we want to take that path.

Thanks
Tomas

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/
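
Editorial illustration (not part of the original message): the driver-side
model discussed in the quoted text (take a reference on get, refuse new
users once removal has started, fail operations immediately, release on the
last put) can be sketched in plain userspace C. This is only a minimal
sketch under stated assumptions: struct demo_dev and every demo_* function
are hypothetical names invented here, not MTD, MUSE or i915 code; the error
codes are picked for illustration; and there is no locking, which a real
driver would need.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a driver's private state; not a real MTD structure. */
struct demo_dev {
	int users;      /* number of active holders */
	bool dying;     /* set once the hardware has gone away */
	bool released;  /* set when the device can finally be freed */
};

/* Models a ->_get_device() style hook: refuse new users once removal has started. */
static int demo_get_device(struct demo_dev *d)
{
	if (d->dying)
		return -ENODEV;
	d->users++;
	return 0;
}

/* Models a ->_put_device() style hook: the last put after removal releases the device. */
static void demo_put_device(struct demo_dev *d)
{
	assert(d->users > 0);
	d->users--;
	if (d->dying && d->users == 0)
		d->released = true;
}

/* Removal does not wait for users; it only flips the state. */
static void demo_remove(struct demo_dev *d)
{
	d->dying = true;
	if (d->users == 0)
		d->released = true;
}

/* Every operation checks the state first, so existing users fail fast. */
static int demo_read(const struct demo_dev *d)
{
	if (d->dying)
		return -EIO;
	return 0;
}
```

In this sketch demo_remove() returns immediately even while a user still
holds a reference, which corresponds to the "do not stall removal on user
space" position; the cost is the state check in every operation, which is
the per-access validation mentioned in the reply.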