From: "Bjørn Mork" <bjorn@mork.no>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Greg KH <gregkh@linuxfoundation.org>, Kay Sievers <kay@vrfy.org>,
	Myron Stowe <mstowe@redhat.com>,
	Myron Stowe <myron.stowe@redhat.com>,
	linux-hotplug@vger.kernel.org, linux-pci@vger.kernel.org,
	yuxiangl@marvell.com, yxlraid@gmail.com,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] udevadm-info: Don't access sysfs 'resource<N>' files
Date: Mon, 18 Mar 2013 19:25:01 +0100
Message-ID: <f153b1ed-69e2-4576-a474-e770b90c87a5@email.android.com>
In-Reply-To: <1363629287.24132.380.camel@bling.home>

Alex Williamson <alex.williamson@redhat.com> wrote:

>On Mon, 2013-03-18 at 18:20 +0100, Bjørn Mork wrote:
>> Alex Williamson <alex.williamson@redhat.com> writes:
>> 
>> > At least for KVM the kernel fix is the addition of the vfio driver which
>> > gives us a non-sysfs way to do this.  If this problem was found a few
>> > years later and we were ready to make the switch I'd support just
>> > removing these resource files.  In the meantime we have userspace that
>> > depends on this interface, so I'm open to suggestions how to fix it.
>> 
>> I am puzzled by a couple of things in this discussion:
>> 
>> 1) do you seriously mean that a userspace application (any, not just
>>    udevadm or qemu or whatever) should be able to read and write these
>>    registers while the device is owned by a driver?  How is that ever
>>    going to work?
>
>The expectation is that the user doesn't mess with the device through
>pci-sysfs while it's running.  This is really no different than config
>space or MMIO space in that respect. 

But it is.  That's the problem. As a user I expect to be able to run e.g. "grep . /sys/devices/whatever/*" with no ill effects. This holds for config space and MMIO space. It does not hold for any reset-on-read register.
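
To make that expectation concrete, here is a minimal userspace sketch of a reader that lists the attributes in a PCI device directory but leaves out the per-BAR 'resource<N>' files (and their 'resource<N>_wc' variants). The function names are hypothetical, and this is only an illustration of the idea in the patch subject, not what udevadm actually does:

```python
import os
import re

# Matches the per-BAR files "resource0", "resource1", ... and the
# write-combining variants such as "resource0_wc". The plain "resource"
# file is a text table and deliberately does not match.
_RESOURCE_N = re.compile(r"^resource\d+(_wc)?$")


def is_bar_resource_file(name):
    """Return True if name looks like a sysfs 'resource<N>' BAR file."""
    return _RESOURCE_N.match(name) is not None


def safe_attributes(device_dir):
    """Return the sorted names of files in device_dir, minus the BAR files.

    Hypothetical helper: it only filters by name and never opens anything.
    """
    names = []
    for entry in sorted(os.listdir(device_dir)):
        path = os.path.join(device_dir, entry)
        if os.path.isfile(path) and not is_bar_resource_file(entry):
            names.append(entry)
    return names
```

A name filter like this only protects the one tool that uses it. It does nothing for a plain grep, which is the point being made above.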


> You can use setpci to break your
>PCI card while it's used by the driver today.  The difference is that
>MMIO spaces side-step the issue by only allowing mmap and config space
>is known not to have read side-effects.

Yes. And that is why there is no problem exporting those. This difference is fundamental. 

>> 2) is it really so that a device can be so fundamentally screwed up by
>>    reading some registers, that a later driver probe cannot properly
>>    reinitialize it?
>
>Never underestimate how broken hardware can be, 

True :)

> though in this case
>reading a device register seems to be causing a system hang/reset.

I understand that it does so if the ahci driver is bound to the device while reading the registers, but does it also hang the system with no bound driver? How does it do that? By killing the bus?

>> I would have thought that the solution to all this was to return -EINVAL
>> on any attempt to read or write these files while a driver is bound to
>> the device.  If userspace is going to use the API, then the application
>> better unbind any driver first.
>> 
>> Or? Am I missing something here?
>
>That doesn't really solve anything though.  Let's pretend the resource
>files only work while the device is bound to pci-stub.  Now what happens
>when you run this udevadm command as admin while it's in use by the
>userspace driver?  All we've done is limit the scope of the problem.

That assumes that the system hangs without driver help and that this brokenness is widespread. I don't think either of those assumptions holds. Do they?
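
For illustration only, the "refuse access while a driver is bound" idea can be approximated from userspace. The sketch below assumes the usual sysfs layout, where a bound device has a "driver" symlink in its directory. The helper names are hypothetical, the check is racy (a driver can bind right after it), and the proposal quoted above is a kernel-side check, which this does not implement:

```python
import errno
import os


def driver_is_bound(device_dir):
    """Return True if device_dir contains a 'driver' symlink.

    sysfs creates this symlink in the device directory while a driver
    is bound to the device.
    """
    return os.path.islink(os.path.join(device_dir, "driver"))


def read_resource_guarded(device_dir, name, length):
    """Read up to length bytes from a resource file, unless a driver is bound.

    Hypothetical helper: raises OSError with errno EINVAL when a driver
    is bound, mirroring the error code suggested in the thread.
    """
    if driver_is_bound(device_dir):
        raise OSError(errno.EINVAL, "driver bound, unbind it first", name)
    with open(os.path.join(device_dir, name), "rb") as f:
        return f.read(length)
```

This covers the case where a kernel driver is bound. It says nothing about the pci-stub case raised in the quoted reply, where a userspace driver is using the device.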

>> > If we want to blacklist this specific device, that's fine, but as others
>> > have pointed out it's really a class problem.  Perhaps we report 1 byte
>> > extra for the file length where EOF-1 is an enable byte?  Is there
>> > anything else in file ops that we could use to make it slightly more
>> > complicated than open(), read() to access the device?  Thanks,
>> 
>> If there really are devices which cannot handle reading at all, and
>> cannot be reset to a sane state by later driver initialization, then a
>> blacklist could be added for those devices.  This should not be a common
>> problem.
>
>Yes, if these are dead registers, let's blacklist and move along.  I
>suspect though that these registers probably work fine if you access
>them according to the device programming model, so blacklisting just
>prevents full use through something like KVM device assignment. 

Well, if the device is that broken then I think it will require the kernel to police the device programming. I don't see how you can leave a bomb like that because it might be useful in a rare and very theoretical case.

Easier to just blacklist it...


Bjørn



