All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Winkler, Tomas" <tomas.winkler@intel.com>
To: Richard Weinberger <richard.weinberger@gmail.com>
Cc: Vignesh Raghavendra <vigneshr@ti.com>,
	Miquel Raynal <miquel.raynal@bootlin.com>,
	Richard Weinberger <richard@nod.at>,
	"intel-gfx@lists.freedesktop.org"
	<intel-gfx@lists.freedesktop.org>,
	Joonas Lahtinen <joonas.lahtinen@linux.intel.com>,
	"Usyskin, Alexander" <alexander.usyskin@intel.com>,
	Jani Nikula <jani.nikula@linux.intel.com>,
	"linux-mtd@lists.infradead.org" <linux-mtd@lists.infradead.org>,
	"Vivi, Rodrigo" <rodrigo.vivi@intel.com>,
	"Lubart, Vitaly" <vitaly.lubart@intel.com>
Subject: RE: [RFC PATCH 0/9] drm/i915/spi: discrete graphics internal spi
Date: Wed, 17 Feb 2021 08:34:53 +0000	[thread overview]
Message-ID: <cb20e706d494458a8957252eeacfb1da@intel.com> (raw)
In-Reply-To: <CAFLxGvwP5-O5DHQ07Fs_GnG12dsK24mer8LJfhz2z2UqW9e5cQ@mail.gmail.com>


> 
> )On Tue, Feb 16, 2021 at 7:26 PM Tomas Winkler <tomas.winkler@intel.com>
> wrote:
> > Because the graphic card may undergo reset at any time and basically
> > hot unplug all its child devices, this series also provides a fix to
> > the mtd framework to make the reset graceful.
> 
> Well, just because MTD does not work as you expect, it is not broken. :-)
I'm not saying it's broken by design it just didn't fit this use case. 
> 
> In your case i915_spi_remove() blindly removes the MTD, this is not allowed.
> You may remove the MTD only if there are no more users.

I'm not sure it's good idea to stall the removal on user space.
This is just asking for a deadlock as user space is not getting what it needs and may stall
I think it's better the user space will fail gracefully the hw is not accessible in that stage anyway. 
> 
> The current model in MTD is that the driver is in charge of all life cycle
> management.
> Using ->_get_device() and ->_put_device() a driver can implement
> refcounting and deny new users if the MTD is about to disappear.

Please note that this use case you are describing is still valid, I haven't removed _get_device() _put_device() handlers,
You can still stall the removal of mtd, If this is not that way it's a bug

> 
> In the upcoming MUSE driver that mechanism is used too.
> MUSE allows to implement a MTD in userspace. So the FUSE server can
> disappear at
> *any* time. Just like in your case. Even worse, it can be hostile.
> In MUSE the MTD life time is tied to the FUSE connection object,
> muse_mtd_get_device()
> increments the FUSE connection refcount, and muse_mtd_put_device()
> decrements it.
> That means if the FUSE server disappears all of a sudden but the MTD still has
> users, the MTD will stay. But in this state no new references are allowed and
> all MTD operations of existing users will fail with -ENOTCONN (via FUSE).
> As soon the last user is gone (can be userspace via /dev/mtd* or a in-kernel
> user such as UBIFS), the MTD will be removed.

But in our case whole i915 is taken hostage, it cannot reset because of misbehaving user space.

> For the full details, please see:
> https://git.kernel.org/pub/scm/linux/kernel/git/rw/misc.git/tree/fs/fuse/m
> use.c?h=muse_v3#n1034
> 
> Is in your case *really* not possible to do it that way?

Maybe it's possible but I don't think it's good to stall i915 removal. Also It's very easily to crash the kernel.
I've posted a sniped to the mailing list that tried to do that, the kernel still has crashed. Can you looked at?

> On the other hand, your last patch moves some part of the life cycle
> management into MTD core.
> The MTD will stay as long it has users.
> But that's only one part. The driver is still in charge to make sure that all
> operations fail immediately and that no new users arrive.

I think that case I would need to validate every HW access to make sure it's still valid.

> If we want to do all in MTD core we'd have to do it like SCSI disks.
> That means having devices states such as SDEV_RUNNING, SDEV_CANCEL,
> SDEV_OFFLINE, ....
> That way the MTD could be shutdown gracefully, first no new users are
> allowed, then ongoing operations will be cancelled, next all operation will fail
> with -EIO or such, then the device is being removed from sysfs and finally if
> the last user is gone, the MTD can be removed.

Isn't that already that way? You cannot open new handler. That I would need more of your insights.
> 
> I'm not sure whether we want to take that path.

Thanks
Tomas

______________________________________________________
Linux MTD discussion mailing list
http://lists.infradead.org/mailman/listinfo/linux-mtd/

WARNING: multiple messages have this Message-ID (diff)
From: "Winkler, Tomas" <tomas.winkler@intel.com>
To: Richard Weinberger <richard.weinberger@gmail.com>
Cc: Vignesh Raghavendra <vigneshr@ti.com>,
	Miquel Raynal <miquel.raynal@bootlin.com>,
	Richard Weinberger <richard@nod.at>,
	"intel-gfx@lists.freedesktop.org"
	<intel-gfx@lists.freedesktop.org>,
	"Usyskin, Alexander" <alexander.usyskin@intel.com>,
	"linux-mtd@lists.infradead.org" <linux-mtd@lists.infradead.org>,
	"Lubart, Vitaly" <vitaly.lubart@intel.com>
Subject: Re: [Intel-gfx] [RFC PATCH 0/9] drm/i915/spi: discrete graphics internal spi
Date: Wed, 17 Feb 2021 08:34:53 +0000	[thread overview]
Message-ID: <cb20e706d494458a8957252eeacfb1da@intel.com> (raw)
In-Reply-To: <CAFLxGvwP5-O5DHQ07Fs_GnG12dsK24mer8LJfhz2z2UqW9e5cQ@mail.gmail.com>


> 
> )On Tue, Feb 16, 2021 at 7:26 PM Tomas Winkler <tomas.winkler@intel.com>
> wrote:
> > Because the graphic card may undergo reset at any time and basically
> > hot unplug all its child devices, this series also provides a fix to
> > the mtd framework to make the reset graceful.
> 
> Well, just because MTD does not work as you expect, it is not broken. :-)
I'm not saying it's broken by design it just didn't fit this use case. 
> 
> In your case i915_spi_remove() blindly removes the MTD, this is not allowed.
> You may remove the MTD only if there are no more users.

I'm not sure it's good idea to stall the removal on user space.
This is just asking for a deadlock as user space is not getting what it needs and may stall
I think it's better the user space will fail gracefully the hw is not accessible in that stage anyway. 
> 
> The current model in MTD is that the driver is in charge of all life cycle
> management.
> Using ->_get_device() and ->_put_device() a driver can implement
> refcounting and deny new users if the MTD is about to disappear.

Please note that this use case you are describing is still valid, I haven't removed _get_device() _put_device() handlers,
You can still stall the removal of mtd, If this is not that way it's a bug

> 
> In the upcoming MUSE driver that mechanism is used too.
> MUSE allows to implement a MTD in userspace. So the FUSE server can
> disappear at
> *any* time. Just like in your case. Even worse, it can be hostile.
> In MUSE the MTD life time is tied to the FUSE connection object,
> muse_mtd_get_device()
> increments the FUSE connection refcount, and muse_mtd_put_device()
> decrements it.
> That means if the FUSE server disappears all of a sudden but the MTD still has
> users, the MTD will stay. But in this state no new references are allowed and
> all MTD operations of existing users will fail with -ENOTCONN (via FUSE).
> As soon the last user is gone (can be userspace via /dev/mtd* or a in-kernel
> user such as UBIFS), the MTD will be removed.

But in our case whole i915 is taken hostage, it cannot reset because of misbehaving user space.

> For the full details, please see:
> https://git.kernel.org/pub/scm/linux/kernel/git/rw/misc.git/tree/fs/fuse/m
> use.c?h=muse_v3#n1034
> 
> Is in your case *really* not possible to do it that way?

Maybe it's possible but I don't think it's good to stall i915 removal. Also It's very easily to crash the kernel.
I've posted a sniped to the mailing list that tried to do that, the kernel still has crashed. Can you looked at?

> On the other hand, your last patch moves some part of the life cycle
> management into MTD core.
> The MTD will stay as long it has users.
> But that's only one part. The driver is still in charge to make sure that all
> operations fail immediately and that no new users arrive.

I think that case I would need to validate every HW access to make sure it's still valid.

> If we want to do all in MTD core we'd have to do it like SCSI disks.
> That means having devices states such as SDEV_RUNNING, SDEV_CANCEL,
> SDEV_OFFLINE, ....
> That way the MTD could be shutdown gracefully, first no new users are
> allowed, then ongoing operations will be cancelled, next all operation will fail
> with -EIO or such, then the device is being removed from sysfs and finally if
> the last user is gone, the MTD can be removed.

Isn't that already that way? You cannot open new handler. That I would need more of your insights.
> 
> I'm not sure whether we want to take that path.

Thanks
Tomas

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  reply	other threads:[~2021-02-17  8:36 UTC|newest]

Thread overview: 78+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-16 18:19 [RFC PATCH 0/9] drm/i915/spi: discrete graphics internal spi Tomas Winkler
2021-02-16 18:19 ` [Intel-gfx] " Tomas Winkler
2021-02-16 18:19 ` [RFC PATCH 1/9] drm/i915/spi: add spi device for discrete graphics Tomas Winkler
2021-02-16 18:19   ` [Intel-gfx] " Tomas Winkler
2021-02-17 10:42   ` Jani Nikula
2021-02-17 10:42     ` [Intel-gfx] " Jani Nikula
2021-02-17 17:14   ` Lucas De Marchi
2021-02-17 17:14     ` [Intel-gfx] " Lucas De Marchi
2021-02-17 19:02     ` Winkler, Tomas
2021-02-17 19:02       ` [Intel-gfx] " Winkler, Tomas
2021-02-16 18:19 ` [RFC PATCH 2/9] drm/i915/spi: intel_spi_region map Tomas Winkler
2021-02-16 18:19   ` [Intel-gfx] " Tomas Winkler
2021-02-17 10:46   ` Jani Nikula
2021-02-17 10:46     ` [Intel-gfx] " Jani Nikula
2021-02-17 20:45     ` Winkler, Tomas
2021-02-17 20:45       ` [Intel-gfx] " Winkler, Tomas
2021-02-22 10:17       ` Jani Nikula
2021-02-22 10:17         ` [Intel-gfx] " Jani Nikula
2021-02-22 11:30         ` Winkler, Tomas
2021-02-22 11:30           ` [Intel-gfx] " Winkler, Tomas
2021-02-16 18:19 ` [RFC PATCH 3/9] drm/i915/spi: add driver for on-die spi device Tomas Winkler
2021-02-16 18:19   ` [Intel-gfx] " Tomas Winkler
2021-02-17 10:56   ` Jani Nikula
2021-02-17 10:56     ` [Intel-gfx] " Jani Nikula
2021-02-17 20:58     ` Winkler, Tomas
2021-02-17 20:58       ` [Intel-gfx] " Winkler, Tomas
2021-02-18  9:49       ` Lucas De Marchi
2021-02-18  9:49         ` Lucas De Marchi
2021-02-18 10:50         ` Winkler, Tomas
2021-02-18 10:50           ` Winkler, Tomas
2021-02-19  6:06         ` Winkler, Tomas
2021-02-19  6:06           ` Winkler, Tomas
2021-02-19 22:59           ` Lucas De Marchi
2021-02-19 22:59             ` Lucas De Marchi
2021-02-20 17:56             ` Winkler, Tomas
2021-02-20 17:56               ` Winkler, Tomas
2021-02-16 18:19 ` [RFC PATCH 4/9] drm/i915/spi: implement region enumeration Tomas Winkler
2021-02-16 18:19   ` [Intel-gfx] " Tomas Winkler
2021-02-16 18:19 ` [RFC PATCH 5/9] drm/i915/spi: implement spi access functions Tomas Winkler
2021-02-16 18:19   ` [Intel-gfx] " Tomas Winkler
2021-02-16 18:19 ` [RFC PATCH 6/9] drm/i915/spi: spi register with mtd Tomas Winkler
2021-02-16 18:19   ` [Intel-gfx] " Tomas Winkler
2021-02-16 18:19 ` [RFC PATCH 7/9] drm/i915/spi: mtd: implement access handlers Tomas Winkler
2021-02-16 18:19   ` [Intel-gfx] " Tomas Winkler
2021-02-16 18:19 ` [RFC PATCH 8/9] drm/i915/spi: serialize spi access Tomas Winkler
2021-02-16 18:19   ` [Intel-gfx] " Tomas Winkler
2021-02-16 18:19 ` [RFC PATCH 9/9] mtd: use refcount to prevent corruption Tomas Winkler
2021-02-16 18:19   ` [Intel-gfx] " Tomas Winkler
2021-02-16 18:45 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for drm/i915/spi: discrete graphics internal spi Patchwork
2021-02-16 18:47 ` [Intel-gfx] ✗ Fi.CI.SPARSE: " Patchwork
2021-02-16 19:14 ` [Intel-gfx] ✓ Fi.CI.BAT: success " Patchwork
2021-02-16 20:35 ` [Intel-gfx] ✗ Fi.CI.IGT: failure " Patchwork
2021-02-16 23:01 ` [RFC PATCH 0/9] " Richard Weinberger
2021-02-16 23:01   ` [Intel-gfx] " Richard Weinberger
2021-02-17  8:34   ` Winkler, Tomas [this message]
2021-02-17  8:34     ` Winkler, Tomas
2021-02-21  7:10     ` Winkler, Tomas
2021-02-21  7:10       ` [Intel-gfx] " Winkler, Tomas
2021-02-22 22:38       ` Richard Weinberger
2021-02-22 22:38         ` [Intel-gfx] " Richard Weinberger
2021-02-23  6:31         ` Winkler, Tomas
2021-02-23  6:31           ` [Intel-gfx] " Winkler, Tomas
2021-02-28  6:52           ` Winkler, Tomas
2021-02-28  6:52             ` [Intel-gfx] " Winkler, Tomas
2021-02-17 10:36 ` Jani Nikula
2021-02-17 10:36   ` [Intel-gfx] " Jani Nikula
2021-02-17 12:50   ` Winkler, Tomas
2021-02-17 12:50     ` [Intel-gfx] " Winkler, Tomas
2021-02-17 13:35     ` Jani Nikula
2021-02-17 13:35       ` [Intel-gfx] " Jani Nikula
2021-02-17 18:33       ` Rodrigo Vivi
2021-02-17 18:33         ` Rodrigo Vivi
2021-02-17 11:02 ` Jani Nikula
2021-02-17 11:02   ` [Intel-gfx] " Jani Nikula
2021-02-17 13:56   ` Winkler, Tomas
2021-02-17 13:56     ` [Intel-gfx] " Winkler, Tomas
2021-03-01 12:33     ` Jani Nikula
2021-03-01 12:33       ` [Intel-gfx] " Jani Nikula

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cb20e706d494458a8957252eeacfb1da@intel.com \
    --to=tomas.winkler@intel.com \
    --cc=alexander.usyskin@intel.com \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=jani.nikula@linux.intel.com \
    --cc=joonas.lahtinen@linux.intel.com \
    --cc=linux-mtd@lists.infradead.org \
    --cc=miquel.raynal@bootlin.com \
    --cc=richard.weinberger@gmail.com \
    --cc=richard@nod.at \
    --cc=rodrigo.vivi@intel.com \
    --cc=vigneshr@ti.com \
    --cc=vitaly.lubart@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.