All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
@ 2019-01-18  9:39 Erik Skultety
  2019-01-18 10:16 ` Daniel P. Berrangé
  2019-01-18 12:51 ` Singh, Brijesh
  0 siblings, 2 replies; 23+ messages in thread
From: Erik Skultety @ 2019-01-18  9:39 UTC (permalink / raw)
  To: libvir-list; +Cc: qemu-devel, brijesh.singh, berrange, dinechin, mkletzan

Hi,
this is a summary of a private discussion I've had with guys CC'd on this email
about finding a solution to [1] - basically, the default permissions on
/dev/sev (below) make it impossible to query for SEV platform capabilities,
since by default we run QEMU as qemu:qemu when probing for capabilities. It's
worth noting is that this is only relevant to probing, since for a proper QEMU
VM we create a mount namespace for the process and chown all the nodes (needs a
SEV fix though).

# ll /dev/sev
crw-------. 1 root root

I suggested either force running QEMU as root for probing (despite the obvious
security implications) or using namespaces for probing too. Dan argued that
this would have a significant perf impact and suggested we ask systemd to add a
global udev rule.

I proceeded with cloning [1] to systemd and creating an udev rule that I planned
on submitting to systemd upstream - the initial idea was to mimic /dev/kvm and
make it world accessible to which Brijesh from AMD expressed a concern that
regular users might deplete the resources (limit on the number of guests
allowed by the platform). But since the limit is claimed to be around 4, Dan
discouraged me to continue with restricting the udev rule to only the 'kvm'
group which Laszlo suggested earlier as the limit is so small that a malicious
QEMU could easily deplete this during probing. This fact also ruled out any
kind of ACL we could create dynamically. Instead, he suggested that we filter
out the kvm-capable QEMU and put only that one in the namespace without a
significant perf impact.
    - my take on this is that there could potentially be more than a single
      kvm-enabled QEMU and therefore we'd need to create more than just a
      single namespace.
    - I also argued that I can image that the same kind of DOS attack might be
      possible from within the namespace, even if we created the /dev/sev node
      only in SEV-enabled guests (which we currently don't). All of us have
      agreed that allowing /dev/sev in the namespace for only SEV-enabled
      guests is worth doing nonetheless.

In the meantime, Christophe went through the kernel code to verify how the SEV
resources are managed and what protection is currently in place to mitigate the
chance of a process easily depleting the limit on SEV guests. He found that
ASID, which determines the encryption key, is allocated from a single ASID
bitmap and essentially guarded by a single 'sev->active' flag.

So, in conclusion, we absolutely need input from Brijesh (AMD) whether there
was something more than the low limit on number of guests behind the default
permissions. Also, we'd like to get some details on how the limit is managed,
helping to assess the approaches mentioned above.

Thanks and please do share your ideas,
Erik

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1665400
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1561113

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
  2019-01-18  9:39 [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities Erik Skultety
@ 2019-01-18 10:16 ` Daniel P. Berrangé
  2019-01-18 10:56   ` [Qemu-devel] [libvirt] " Erik Skultety
  2019-01-18 11:11   ` [Qemu-devel] " Martin Kletzander
  2019-01-18 12:51 ` Singh, Brijesh
  1 sibling, 2 replies; 23+ messages in thread
From: Daniel P. Berrangé @ 2019-01-18 10:16 UTC (permalink / raw)
  To: Erik Skultety; +Cc: libvir-list, qemu-devel, brijesh.singh, dinechin, mkletzan

On Fri, Jan 18, 2019 at 10:39:35AM +0100, Erik Skultety wrote:
> Hi,
> this is a summary of a private discussion I've had with guys CC'd on this email
> about finding a solution to [1] - basically, the default permissions on
> /dev/sev (below) make it impossible to query for SEV platform capabilities,
> since by default we run QEMU as qemu:qemu when probing for capabilities. It's
> worth noting is that this is only relevant to probing, since for a proper QEMU
> VM we create a mount namespace for the process and chown all the nodes (needs a
> SEV fix though).
> 
> # ll /dev/sev
> crw-------. 1 root root
> 
> I suggested either force running QEMU as root for probing (despite the obvious
> security implications) or using namespaces for probing too. Dan argued that
> this would have a significant perf impact and suggested we ask systemd to add a
> global udev rule.

I've just realized there is a potential 3rd solution. Remember there is
actually nothing inherantly special about the 'root' user as an account
ID. 'root' gains its powers from the fact that it has many capabilities
by default.  'qemu' can't access /dev/sev because it is owned by a
different user (happens to be root) and 'qemu' does not have capabilities.

So we can make probing work by using our capabilities code to grant
CAP_DAC_OVERRIDE to the qemu process we spawn. So probing still runs
as 'qemu', but can none the less access /dev/sev while it is owned
by root.  We were not using 'qemu' for sake of security, as the probing
process is not executing any untrusthworthy code, so we don't  loose any
security protection by granting CAP_DAC_OVERRIDE.

> I proceeded with cloning [1] to systemd and creating an udev rule that I planned
> on submitting to systemd upstream - the initial idea was to mimic /dev/kvm and
> make it world accessible to which Brijesh from AMD expressed a concern that
> regular users might deplete the resources (limit on the number of guests
> allowed by the platform). But since the limit is claimed to be around 4, Dan
> discouraged me to continue with restricting the udev rule to only the 'kvm'
> group which Laszlo suggested earlier as the limit is so small that a malicious
> QEMU could easily deplete this during probing. This fact also ruled out any
> kind of ACL we could create dynamically. Instead, he suggested that we filter
> out the kvm-capable QEMU and put only that one in the namespace without a
> significant perf impact.

Yes, my suggestion to mimic /dev/kvm was based on the mistaken mis-understanding
that there was not a finite resource limit. Given that there are one or more
finite resource limits, we need access control on which unprivileged users, and
/or which individual QEMU instances are permitted access. This means /dev/sev
must remain with restrictive user/group/permissions that prevent any unprivilegd
account from having access. This means either root:root 0770/0700, or possibly
having an 'sev' group and using root:sev 0770, so that users can be granted
access via 'sev' group membership which (might?) allow unprivileged libvirtd to
use 'sev' if the user was added.

>     - my take on this is that there could potentially be more than a single
>       kvm-enabled QEMU and therefore we'd need to create more than just a
>       single namespace.

True, I guess qemu-system-x86_64 and qemu-system-i386 both get KVM
on an x86_64 host, and likewise for many other 64-bit archs supporting.
32-bit apps.

>     - I also argued that I can image that the same kind of DOS attack might be
>       possible from within the namespace, even if we created the /dev/sev node
>       only in SEV-enabled guests (which we currently don't). All of us have
>       agreed that allowing /dev/sev in the namespace for only SEV-enabled
>       guests is worth doing nonetheless.

There's never any perfect level of protection. We're just striving to
minimize the attack surface by only exposing it where there's a genuine
need to use it.

> In the meantime, Christophe went through the kernel code to verify how the SEV
> resources are managed and what protection is currently in place to mitigate the
> chance of a process easily depleting the limit on SEV guests. He found that
> ASID, which determines the encryption key, is allocated from a single ASID
> bitmap and essentially guarded by a single 'sev->active' flag.
> 
> So, in conclusion, we absolutely need input from Brijesh (AMD) whether there
> was something more than the low limit on number of guests behind the default
> permissions. Also, we'd like to get some details on how the limit is managed,
> helping to assess the approaches mentioned above.

Regardless of this problem, I think it is important to have some docs
in either libvirt or QEMU that describe the resource usage constraints
so that management apps can decide how to best take advantage of SEV.

> 
> Thanks and please do share your ideas,
> Erik
> 
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1665400
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1561113

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] [libvirt] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
  2019-01-18 10:16 ` Daniel P. Berrangé
@ 2019-01-18 10:56   ` Erik Skultety
  2019-01-18 11:11   ` [Qemu-devel] " Martin Kletzander
  1 sibling, 0 replies; 23+ messages in thread
From: Erik Skultety @ 2019-01-18 10:56 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: libvir-list, dinechin, mkletzan, brijesh.singh, qemu-devel

On Fri, Jan 18, 2019 at 10:16:38AM +0000, Daniel P. Berrangé wrote:
> On Fri, Jan 18, 2019 at 10:39:35AM +0100, Erik Skultety wrote:
> > Hi,
> > this is a summary of a private discussion I've had with guys CC'd on this email
> > about finding a solution to [1] - basically, the default permissions on
> > /dev/sev (below) make it impossible to query for SEV platform capabilities,
> > since by default we run QEMU as qemu:qemu when probing for capabilities. It's
> > worth noting is that this is only relevant to probing, since for a proper QEMU
> > VM we create a mount namespace for the process and chown all the nodes (needs a
> > SEV fix though).
> >
> > # ll /dev/sev
> > crw-------. 1 root root
> >
> > I suggested either force running QEMU as root for probing (despite the obvious
> > security implications) or using namespaces for probing too. Dan argued that
> > this would have a significant perf impact and suggested we ask systemd to add a
> > global udev rule.
>
> I've just realized there is a potential 3rd solution. Remember there is
> actually nothing inherantly special about the 'root' user as an account
> ID. 'root' gains its powers from the fact that it has many capabilities
> by default.  'qemu' can't access /dev/sev because it is owned by a
> different user (happens to be root) and 'qemu' does not have capabilities.
>
> So we can make probing work by using our capabilities code to grant
> CAP_DAC_OVERRIDE to the qemu process we spawn. So probing still runs
> as 'qemu', but can none the less access /dev/sev while it is owned
> by root.  We were not using 'qemu' for sake of security, as the probing
> process is not executing any untrusthworthy code, so we don't  loose any
> security protection by granting CAP_DAC_OVERRIDE.

Truth to be told, I was playing with the idea of using CAP_ capabilities,
however, because I'm paranoid (maybe too much) I took your comment "malicious
QEMU" as an axiom for the process not be trusted *at all*, regardless of what
it's doing, I decided to ditch the idea, because with CAP_DAC_OVERRIDE, you're
essentially giving QEMU an open ticket to any file on the filesystem (okay, not
any and not always because QEMU executed by system libvirt will be confined so
SElinux will still be our guardian angel).

[...]

> >
> > So, in conclusion, we absolutely need input from Brijesh (AMD) whether there
> > was something more than the low limit on number of guests behind the default
> > permissions. Also, we'd like to get some details on how the limit is managed,
> > helping to assess the approaches mentioned above.
>
> Regardless of this problem, I think it is important to have some docs
> in either libvirt or QEMU that describe the resource usage constraints
> so that management apps can decide how to best take advantage of SEV.

Agreed. In the meantime, I'll start working on adding /dev/sev into the
namespace only for SEV-enabled guests.

Thanks,
Erik

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
  2019-01-18 10:16 ` Daniel P. Berrangé
  2019-01-18 10:56   ` [Qemu-devel] [libvirt] " Erik Skultety
@ 2019-01-18 11:11   ` Martin Kletzander
  2019-01-18 11:17     ` Daniel P. Berrangé
  1 sibling, 1 reply; 23+ messages in thread
From: Martin Kletzander @ 2019-01-18 11:11 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Erik Skultety, libvir-list, qemu-devel, brijesh.singh, dinechin

[-- Attachment #1: Type: text/plain, Size: 5846 bytes --]

On Fri, Jan 18, 2019 at 10:16:38AM +0000, Daniel P. Berrangé wrote:
>On Fri, Jan 18, 2019 at 10:39:35AM +0100, Erik Skultety wrote:
>> Hi,
>> this is a summary of a private discussion I've had with guys CC'd on this email
>> about finding a solution to [1] - basically, the default permissions on
>> /dev/sev (below) make it impossible to query for SEV platform capabilities,
>> since by default we run QEMU as qemu:qemu when probing for capabilities. It's
>> worth noting is that this is only relevant to probing, since for a proper QEMU
>> VM we create a mount namespace for the process and chown all the nodes (needs a
>> SEV fix though).
>>
>> # ll /dev/sev
>> crw-------. 1 root root
>>
>> I suggested either force running QEMU as root for probing (despite the obvious
>> security implications) or using namespaces for probing too. Dan argued that
>> this would have a significant perf impact and suggested we ask systemd to add a
>> global udev rule.
>

If the creation of namespaces is poses a performance impact, then why don't we
special-case the probing in a sense that we create one namespace for probing,
once, and probe all QEMU binaries in that one namespace?

>I've just realized there is a potential 3rd solution. Remember there is
>actually nothing inherantly special about the 'root' user as an account
>ID. 'root' gains its powers from the fact that it has many capabilities
>by default.  'qemu' can't access /dev/sev because it is owned by a
>different user (happens to be root) and 'qemu' does not have capabilities.
>
>So we can make probing work by using our capabilities code to grant
>CAP_DAC_OVERRIDE to the qemu process we spawn. So probing still runs
>as 'qemu', but can none the less access /dev/sev while it is owned
>by root.  We were not using 'qemu' for sake of security, as the probing
>process is not executing any untrusthworthy code, so we don't  loose any
>security protection by granting CAP_DAC_OVERRIDE.
>

IMHO CAP_DAC_OVERRIDE is a lot, especially on systems without SELinux.

>> I proceeded with cloning [1] to systemd and creating an udev rule that I planned
>> on submitting to systemd upstream - the initial idea was to mimic /dev/kvm and
>> make it world accessible to which Brijesh from AMD expressed a concern that
>> regular users might deplete the resources (limit on the number of guests
>> allowed by the platform). But since the limit is claimed to be around 4, Dan
>> discouraged me to continue with restricting the udev rule to only the 'kvm'
>> group which Laszlo suggested earlier as the limit is so small that a malicious
>> QEMU could easily deplete this during probing. This fact also ruled out any
>> kind of ACL we could create dynamically. Instead, he suggested that we filter
>> out the kvm-capable QEMU and put only that one in the namespace without a
>> significant perf impact.
>
>Yes, my suggestion to mimic /dev/kvm was based on the mistaken mis-understanding
>that there was not a finite resource limit. Given that there are one or more
>finite resource limits, we need access control on which unprivileged users, and
>/or which individual QEMU instances are permitted access. This means /dev/sev
>must remain with restrictive user/group/permissions that prevent any unprivilegd
>account from having access. This means either root:root 0770/0700, or possibly
>having an 'sev' group and using root:sev 0770, so that users can be granted
>access via 'sev' group membership which (might?) allow unprivileged libvirtd to
>use 'sev' if the user was added.
>
>>     - my take on this is that there could potentially be more than a single
>>       kvm-enabled QEMU and therefore we'd need to create more than just a
>>       single namespace.
>
>True, I guess qemu-system-x86_64 and qemu-system-i386 both get KVM
>on an x86_64 host, and likewise for many other 64-bit archs supporting.
>32-bit apps.
>
>>     - I also argued that I can image that the same kind of DOS attack might be
>>       possible from within the namespace, even if we created the /dev/sev node
>>       only in SEV-enabled guests (which we currently don't). All of us have
>>       agreed that allowing /dev/sev in the namespace for only SEV-enabled
>>       guests is worth doing nonetheless.
>
>There's never any perfect level of protection. We're just striving to
>minimize the attack surface by only exposing it where there's a genuine
>need to use it.
>
>> In the meantime, Christophe went through the kernel code to verify how the SEV
>> resources are managed and what protection is currently in place to mitigate the
>> chance of a process easily depleting the limit on SEV guests. He found that
>> ASID, which determines the encryption key, is allocated from a single ASID
>> bitmap and essentially guarded by a single 'sev->active' flag.
>>
>> So, in conclusion, we absolutely need input from Brijesh (AMD) whether there
>> was something more than the low limit on number of guests behind the default
>> permissions. Also, we'd like to get some details on how the limit is managed,
>> helping to assess the approaches mentioned above.
>
>Regardless of this problem, I think it is important to have some docs
>in either libvirt or QEMU that describe the resource usage constraints
>so that management apps can decide how to best take advantage of SEV.
>
>>
>> Thanks and please do share your ideas,
>> Erik
>>
>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1665400
>> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1561113
>
>Regards,
>Daniel
>-- 
>|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
>|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
>|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
  2019-01-18 11:11   ` [Qemu-devel] " Martin Kletzander
@ 2019-01-18 11:17     ` Daniel P. Berrangé
  2019-01-18 11:31       ` Martin Kletzander
  0 siblings, 1 reply; 23+ messages in thread
From: Daniel P. Berrangé @ 2019-01-18 11:17 UTC (permalink / raw)
  To: Martin Kletzander
  Cc: Erik Skultety, libvir-list, qemu-devel, brijesh.singh, dinechin

On Fri, Jan 18, 2019 at 12:11:50PM +0100, Martin Kletzander wrote:
> On Fri, Jan 18, 2019 at 10:16:38AM +0000, Daniel P. Berrangé wrote:
> > I've just realized there is a potential 3rd solution. Remember there is
> > actually nothing inherantly special about the 'root' user as an account
> > ID. 'root' gains its powers from the fact that it has many capabilities
> > by default.  'qemu' can't access /dev/sev because it is owned by a
> > different user (happens to be root) and 'qemu' does not have capabilities.
> > 
> > So we can make probing work by using our capabilities code to grant
> > CAP_DAC_OVERRIDE to the qemu process we spawn. So probing still runs
> > as 'qemu', but can none the less access /dev/sev while it is owned
> > by root.  We were not using 'qemu' for sake of security, as the probing
> > process is not executing any untrusthworthy code, so we don't  loose any
> > security protection by granting CAP_DAC_OVERRIDE.
> > 
> 
> IMHO CAP_DAC_OVERRIDE is a lot, especially on systems without SELinux.

Probing of QEMU capabilities is not a security critical task. QEMU is
run with "-machine none" so does not even create the virtual machine
hardware, nor have any guest image that it would run. All it is running
is the QEMU class initialization code. The only way for that to act in
a malicious way is for a backdoor to have been inserted when QEMU was
built by the OS vendor, or fo the QEMU binary on disk to have been
replaced by something malcious (which would require root privileges
itself).


We must of coure *NEVER* give CAP_DAC_OVERRIDE to a QEMU that is running
the real, untrustworty, end user VM.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
  2019-01-18 11:17     ` Daniel P. Berrangé
@ 2019-01-18 11:31       ` Martin Kletzander
  0 siblings, 0 replies; 23+ messages in thread
From: Martin Kletzander @ 2019-01-18 11:31 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Erik Skultety, libvir-list, qemu-devel, brijesh.singh, dinechin

[-- Attachment #1: Type: text/plain, Size: 2239 bytes --]

On Fri, Jan 18, 2019 at 11:17:11AM +0000, Daniel P. Berrangé wrote:
>On Fri, Jan 18, 2019 at 12:11:50PM +0100, Martin Kletzander wrote:
>> On Fri, Jan 18, 2019 at 10:16:38AM +0000, Daniel P. Berrangé wrote:
>> > I've just realized there is a potential 3rd solution. Remember there is
>> > actually nothing inherantly special about the 'root' user as an account
>> > ID. 'root' gains its powers from the fact that it has many capabilities
>> > by default.  'qemu' can't access /dev/sev because it is owned by a
>> > different user (happens to be root) and 'qemu' does not have capabilities.
>> >
>> > So we can make probing work by using our capabilities code to grant
>> > CAP_DAC_OVERRIDE to the qemu process we spawn. So probing still runs
>> > as 'qemu', but can none the less access /dev/sev while it is owned
>> > by root.  We were not using 'qemu' for sake of security, as the probing
>> > process is not executing any untrusthworthy code, so we don't  loose any
>> > security protection by granting CAP_DAC_OVERRIDE.
>> >
>>
>> IMHO CAP_DAC_OVERRIDE is a lot, especially on systems without SELinux.
>
>Probing of QEMU capabilities is not a security critical task. QEMU is
>run with "-machine none" so does not even create the virtual machine
>hardware, nor have any guest image that it would run. All it is running
>is the QEMU class initialization code. The only way for that to act in
>a malicious way is for a backdoor to have been inserted when QEMU was
>built by the OS vendor, or fo the QEMU binary on disk to have been
>replaced by something malcious (which would require root privileges
>itself).
>

So you are trying to protect from buggy qemu with malicious guest, not really a
malicious qemu.  I got confused as SEV is trying to protect against
untrustworthy host including binaries like qemu.  OK

>
>We must of coure *NEVER* give CAP_DAC_OVERRIDE to a QEMU that is running
>the real, untrustworty, end user VM.
>
>Regards,
>Daniel
>-- 
>|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
>|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
>|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
  2019-01-18  9:39 [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities Erik Skultety
  2019-01-18 10:16 ` Daniel P. Berrangé
@ 2019-01-18 12:51 ` Singh, Brijesh
  2019-01-23 12:55   ` Erik Skultety
  1 sibling, 1 reply; 23+ messages in thread
From: Singh, Brijesh @ 2019-01-18 12:51 UTC (permalink / raw)
  To: Erik Skultety, libvir-list
  Cc: Singh, Brijesh, qemu-devel, berrange, dinechin, mkletzan


On 1/18/19 3:39 AM, Erik Skultety wrote:
> Hi,
> this is a summary of a private discussion I've had with guys CC'd on this email
> about finding a solution to [1] - basically, the default permissions on
> /dev/sev (below) make it impossible to query for SEV platform capabilities,
> since by default we run QEMU as qemu:qemu when probing for capabilities. It's
> worth noting is that this is only relevant to probing, since for a proper QEMU
> VM we create a mount namespace for the process and chown all the nodes (needs a
> SEV fix though).
>
> # ll /dev/sev
> crw-------. 1 root root
>
> I suggested either force running QEMU as root for probing (despite the obvious
> security implications) or using namespaces for probing too. Dan argued that
> this would have a significant perf impact and suggested we ask systemd to add a
> global udev rule.
>
> I proceeded with cloning [1] to systemd and creating an udev rule that I planned
> on submitting to systemd upstream - the initial idea was to mimic /dev/kvm and
> make it world accessible to which Brijesh from AMD expressed a concern that
> regular users might deplete the resources (limit on the number of guests
> allowed by the platform).


During private discussion I didn't realized that we are discussing a
probe issue hence things I have said earlier may not be applicable
during the probe. The /dev/sev is managed by the CCP (aka PSP) driver.
The /dev/sev is used for communicating with the SEV FW running inside
the PSP. The SEV FW offers platform and guest specific services. The
guest specific services are used during the guest launch, these services
are available through KVM driver only. Whereas the platform services can
be invoked at anytime. A typical platform specific services are:

- importing certificates

- exporting certificates

- querying the SEV FW version etc etc

In case of the probe we are not launch SEV guest hence we should not be
worried about depleting the SEV ASID resources.

IIRC, libvirt uses QEMP query-sev-capabilities to probe the SEV support.
QEMU executes the below sequence to complete the request:

1. Exports the platform certificates  (this is when /dev/sev is accessed).

2. Read the host MSR to determine the C-bit and reduced phys-bit position

I don't see any reason why we can't give world a 'read' permission to
/dev/sev. Anyone should be able to export the certificates and query
status etc. I think the main issue is reading MSR -- which I believe is
putting a 'root' requirement. Am I missing something ?


> But since the limit is claimed to be around 4, Dan


FYI, the limit on EPYC is 15.


> discouraged me to continue with restricting the udev rule to only the 'kvm'
> group which Laszlo suggested earlier as the limit is so small that a malicious
> QEMU could easily deplete this during probing. This fact also ruled out any
> kind of ACL we could create dynamically. Instead, he suggested that we filter
> out the kvm-capable QEMU and put only that one in the namespace without a
> significant perf impact.
>     - my take on this is that there could potentially be more than a single
>       kvm-enabled QEMU and therefore we'd need to create more than just a
>       single namespace.
>     - I also argued that I can image that the same kind of DOS attack might be
>       possible from within the namespace, even if we created the /dev/sev node
>       only in SEV-enabled guests (which we currently don't). All of us have
>       agreed that allowing /dev/sev in the namespace for only SEV-enabled
>       guests is worth doing nonetheless.
>
> In the meantime, Christophe went through the kernel code to verify how the SEV
> resources are managed and what protection is currently in place to mitigate the
> chance of a process easily depleting the limit on SEV guests. He found that
> ASID, which determines the encryption key, is allocated from a single ASID
> bitmap and essentially guarded by a single 'sev->active' flag.
>
> So, in conclusion, we absolutely need input from Brijesh (AMD) whether there
> was something more than the low limit on number of guests behind the default
> permissions. Also, we'd like to get some details on how the limit is managed,
> helping to assess the approaches mentioned above.

As explained above, during the guest launch KVM drv directly call a
guest specific services from the /dev/sev so we wanted to ensure that
kvm user also has access to /dev/sev. If admin does not grant access to
/dev/sev then KVM drv should fail to execute the SEV commands.


> Thanks and please do share your ideas,
> Erik
>
> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1665400
> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1561113

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
  2019-01-18 12:51 ` Singh, Brijesh
@ 2019-01-23 12:55   ` Erik Skultety
  2019-01-23 13:10     ` Daniel P. Berrangé
  0 siblings, 1 reply; 23+ messages in thread
From: Erik Skultety @ 2019-01-23 12:55 UTC (permalink / raw)
  To: Singh, Brijesh; +Cc: libvir-list, qemu-devel, berrange, dinechin, mkletzan

On Fri, Jan 18, 2019 at 12:51:50PM +0000, Singh, Brijesh wrote:
>
> On 1/18/19 3:39 AM, Erik Skultety wrote:
> > Hi,
> > this is a summary of a private discussion I've had with guys CC'd on this email
> > about finding a solution to [1] - basically, the default permissions on
> > /dev/sev (below) make it impossible to query for SEV platform capabilities,
> > since by default we run QEMU as qemu:qemu when probing for capabilities. It's
> > worth noting is that this is only relevant to probing, since for a proper QEMU
> > VM we create a mount namespace for the process and chown all the nodes (needs a
> > SEV fix though).
> >
> > # ll /dev/sev
> > crw-------. 1 root root
> >
> > I suggested either force running QEMU as root for probing (despite the obvious
> > security implications) or using namespaces for probing too. Dan argued that
> > this would have a significant perf impact and suggested we ask systemd to add a
> > global udev rule.
> >
> > I proceeded with cloning [1] to systemd and creating an udev rule that I planned
> > on submitting to systemd upstream - the initial idea was to mimic /dev/kvm and
> > make it world accessible to which Brijesh from AMD expressed a concern that
> > regular users might deplete the resources (limit on the number of guests
> > allowed by the platform).
>
>
> During private discussion I didn't realized that we are discussing a
> probe issue hence things I have said earlier may not be applicable
> during the probe. The /dev/sev is managed by the CCP (aka PSP) driver.
> The /dev/sev is used for communicating with the SEV FW running inside
> the PSP. The SEV FW offers platform and guest specific services. The
> guest specific services are used during the guest launch, these services
> are available through KVM driver only. Whereas the platform services can
> be invoked at anytime. A typical platform specific services are:
>
> - importing certificates
>
> - exporting certificates
>
> - querying the SEV FW version etc etc
>
> In case of the probe we are not launch SEV guest hence we should not be
> worried about depleting the SEV ASID resources.
>
> IIRC, libvirt uses QEMP query-sev-capabilities to probe the SEV support.
> QEMU executes the below sequence to complete the request:
>
> 1. Exports the platform certificates  (this is when /dev/sev is accessed).
>
> 2. Read the host MSR to determine the C-bit and reduced phys-bit position
>
> I don't see any reason why we can't give world a 'read' permission to
> /dev/sev. Anyone should be able to export the certificates and query

Okay, makes sense to me. The problem I see is the sev_platform_ioctl function
in QEMU which makes an _IOWR request, therefore the file descriptor being
opened in sev_get_capabilities is O_RDWR. Now, I only understand ioctl from
what I've read in the man page, so I don't quite understand the need for IOWR
here - but my honest guess would be that it's because the commands like
SEV_PDH_CERT_EXPORT or SEV_PLATFORM_STATUS need to be copied from userspace to
kernel to instruct kernel which services we want, ergo _IOWR, is that right?
In any case, a fix of some sort needs to land in QEMU first, because no udev
rule would fix the current situation. Afterwards, I expect that having a rule
like this:

KERNEL=="sev", GROUP="kvm", MODE="0644"

and a selinux policy rule adding the kvm_device_t label, we should be fine, do
we agree on that?

> status etc. I think the main issue is reading MSR -- which I believe is
> putting a 'root' requirement. Am I missing something ?
>
>
> > But since the limit is claimed to be around 4, Dan
>
>
> FYI, the limit on EPYC is 15.

Thanks for correction.

Erik

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
  2019-01-23 12:55   ` Erik Skultety
@ 2019-01-23 13:10     ` Daniel P. Berrangé
  2019-01-23 13:22       ` Erik Skultety
  0 siblings, 1 reply; 23+ messages in thread
From: Daniel P. Berrangé @ 2019-01-23 13:10 UTC (permalink / raw)
  To: Erik Skultety; +Cc: Singh, Brijesh, libvir-list, qemu-devel, dinechin, mkletzan

On Wed, Jan 23, 2019 at 01:55:06PM +0100, Erik Skultety wrote:
> On Fri, Jan 18, 2019 at 12:51:50PM +0000, Singh, Brijesh wrote:
> >
> > On 1/18/19 3:39 AM, Erik Skultety wrote:
> > > Hi,
> > > this is a summary of a private discussion I've had with guys CC'd on this email
> > > about finding a solution to [1] - basically, the default permissions on
> > > /dev/sev (below) make it impossible to query for SEV platform capabilities,
> > > since by default we run QEMU as qemu:qemu when probing for capabilities. It's
> > > worth noting is that this is only relevant to probing, since for a proper QEMU
> > > VM we create a mount namespace for the process and chown all the nodes (needs a
> > > SEV fix though).
> > >
> > > # ll /dev/sev
> > > crw-------. 1 root root
> > >
> > > I suggested either force running QEMU as root for probing (despite the obvious
> > > security implications) or using namespaces for probing too. Dan argued that
> > > this would have a significant perf impact and suggested we ask systemd to add a
> > > global udev rule.
> > >
> > > I proceeded with cloning [1] to systemd and creating an udev rule that I planned
> > > on submitting to systemd upstream - the initial idea was to mimic /dev/kvm and
> > > make it world accessible to which Brijesh from AMD expressed a concern that
> > > regular users might deplete the resources (limit on the number of guests
> > > allowed by the platform).
> >
> >
> > During private discussion I didn't realized that we are discussing a
> > probe issue hence things I have said earlier may not be applicable
> > during the probe. The /dev/sev is managed by the CCP (aka PSP) driver.
> > The /dev/sev is used for communicating with the SEV FW running inside
> > the PSP. The SEV FW offers platform and guest specific services. The
> > guest specific services are used during the guest launch, these services
> > are available through KVM driver only. Whereas the platform services can
> > be invoked at anytime. A typical platform specific services are:
> >
> > - importing certificates
> >
> > - exporting certificates
> >
> > - querying the SEV FW version etc etc
> >
> > In case of the probe we are not launch SEV guest hence we should not be
> > worried about depleting the SEV ASID resources.
> >
> > IIRC, libvirt uses QEMP query-sev-capabilities to probe the SEV support.
> > QEMU executes the below sequence to complete the request:
> >
> > 1. Exports the platform certificates  (this is when /dev/sev is accessed).
> >
> > 2. Read the host MSR to determine the C-bit and reduced phys-bit position
> >
> > I don't see any reason why we can't give world a 'read' permission to
> > /dev/sev. Anyone should be able to export the certificates and query
> 
> Okay, makes sense to me. The problem I see is the sev_platform_ioctl function
> in QEMU which makes an _IOWR request, therefore the file descriptor being
> opened in sev_get_capabilities is O_RDWR. Now, I only understand ioctl from
> what I've read in the man page, so I don't quite understand the need for IOWR
> here - but my honest guess would be that it's because the commands like
> SEV_PDH_CERT_EXPORT or SEV_PLATFORM_STATUS need to be copied from userspace to
> kernel to instruct kernel which services we want, ergo _IOWR, is that right?

I'm not seeing any permissions checks in the sev_ioctl() function in the
kernel, so IIUC, that means any permissions are entirely based on whether
you can open the /dev/sev, once open you can run any ioctl.  What, if anything,
enforces which ioctls you can run when the device is only O_RDONLY vs O_RDWR ?

> In any case, a fix of some sort needs to land in QEMU first, because no udev
> rule would fix the current situation. Afterwards, I expect that having a rule
> like this:
> 
> KERNEL=="sev", GROUP="kvm", MODE="0644"
> 
> and a selinux policy rule adding the kvm_device_t label, we should be fine, do
> we agree on that?

Based on what I think I see above, this looks like a bad idea.

It still looks like we can solve this entirely in libvirt by just giving
the libvirt capabilities probing code CAP_DAC_OVERRIDE. This would make
libvirt work for all currently released SEV support in kernel/qemu.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
  2019-01-23 13:10     ` Daniel P. Berrangé
@ 2019-01-23 13:22       ` Erik Skultety
  2019-01-23 13:24         ` Daniel P. Berrangé
  0 siblings, 1 reply; 23+ messages in thread
From: Erik Skultety @ 2019-01-23 13:22 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Singh, Brijesh, libvir-list, qemu-devel, dinechin, mkletzan

On Wed, Jan 23, 2019 at 01:10:42PM +0000, Daniel P. Berrangé wrote:
> On Wed, Jan 23, 2019 at 01:55:06PM +0100, Erik Skultety wrote:
> > On Fri, Jan 18, 2019 at 12:51:50PM +0000, Singh, Brijesh wrote:
> > >
> > > On 1/18/19 3:39 AM, Erik Skultety wrote:
> > > > Hi,
> > > > this is a summary of a private discussion I've had with guys CC'd on this email
> > > > about finding a solution to [1] - basically, the default permissions on
> > > > /dev/sev (below) make it impossible to query for SEV platform capabilities,
> > > > since by default we run QEMU as qemu:qemu when probing for capabilities. It's
> > > > worth noting is that this is only relevant to probing, since for a proper QEMU
> > > > VM we create a mount namespace for the process and chown all the nodes (needs a
> > > > SEV fix though).
> > > >
> > > > # ll /dev/sev
> > > > crw-------. 1 root root
> > > >
> > > > I suggested either force running QEMU as root for probing (despite the obvious
> > > > security implications) or using namespaces for probing too. Dan argued that
> > > > this would have a significant perf impact and suggested we ask systemd to add a
> > > > global udev rule.
> > > >
> > > > I proceeded with cloning [1] to systemd and creating an udev rule that I planned
> > > > on submitting to systemd upstream - the initial idea was to mimic /dev/kvm and
> > > > make it world accessible to which Brijesh from AMD expressed a concern that
> > > > regular users might deplete the resources (limit on the number of guests
> > > > allowed by the platform).
> > >
> > >
> > > During private discussion I didn't realized that we are discussing a
> > > probe issue hence things I have said earlier may not be applicable
> > > during the probe. The /dev/sev is managed by the CCP (aka PSP) driver.
> > > The /dev/sev is used for communicating with the SEV FW running inside
> > > the PSP. The SEV FW offers platform and guest specific services. The
> > > guest specific services are used during the guest launch, these services
> > > are available through KVM driver only. Whereas the platform services can
> > > be invoked at anytime. A typical platform specific services are:
> > >
> > > - importing certificates
> > >
> > > - exporting certificates
> > >
> > > - querying the SEV FW version etc etc
> > >
> > > In case of the probe we are not launch SEV guest hence we should not be
> > > worried about depleting the SEV ASID resources.
> > >
> > > IIRC, libvirt uses QEMP query-sev-capabilities to probe the SEV support.
> > > QEMU executes the below sequence to complete the request:
> > >
> > > 1. Exports the platform certificates  (this is when /dev/sev is accessed).
> > >
> > > 2. Read the host MSR to determine the C-bit and reduced phys-bit position
> > >
> > > I don't see any reason why we can't give world a 'read' permission to
> > > /dev/sev. Anyone should be able to export the certificates and query
> >
> > Okay, makes sense to me. The problem I see is the sev_platform_ioctl function
> > in QEMU which makes an _IOWR request, therefore the file descriptor being
> > opened in sev_get_capabilities is O_RDWR. Now, I only understand ioctl from
> > what I've read in the man page, so I don't quite understand the need for IOWR
> > here - but my honest guess would be that it's because the commands like
> > SEV_PDH_CERT_EXPORT or SEV_PLATFORM_STATUS need to be copied from userspace to
> > kernel to instruct kernel which services we want, ergo _IOWR, is that right?
>
> I'm not seeing any permissions checks in the sev_ioctl() function in the
> kernel, so IIUC, that means any permissions are entirely based on whether
> you can open the /dev/sev, once open you can run any ioctl.  What, if anything,
> enforces which ioctls you can run when the device is only O_RDONLY vs O_RDWR ?

I don't know, that's why I'm asking, because the manual didn't make it any
clear for me whether there's a connection between the device permissions and
ioctls that you're allowed to run.

>
> > In any case, a fix of some sort needs to land in QEMU first, because no udev
> > rule would fix the current situation. Afterwards, I expect that having a rule
> > like this:
> >
> > KERNEL=="sev", GROUP="kvm", MODE="0644"
> >
> > and a selinux policy rule adding the kvm_device_t label, we should be fine, do
> > we agree on that?
>
> Based on what I think I see above, this looks like a bad idea.
>
> It still looks like we can solve this entirely in libvirt by just giving
> the libvirt capabilities probing code CAP_DAC_OVERRIDE. This would make
> libvirt work for all currently released SEV support in kernel/qemu.

Sure we can, but that would make libvirt the only legitimate user of /dev/sev
and everything else would require the admin to change the permissions
explicitly so that other apps could at least retrieve the platform info, if
it was intended to be for public use?
Additionally, we'll still get shot down by SELinux because svirt_t wouldn't be
able to access /dev/sev by default.

Erik

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
  2019-01-23 13:22       ` Erik Skultety
@ 2019-01-23 13:24         ` Daniel P. Berrangé
  2019-01-23 13:33           ` Erik Skultety
  0 siblings, 1 reply; 23+ messages in thread
From: Daniel P. Berrangé @ 2019-01-23 13:24 UTC (permalink / raw)
  To: Erik Skultety; +Cc: Singh, Brijesh, libvir-list, qemu-devel, dinechin, mkletzan

On Wed, Jan 23, 2019 at 02:22:12PM +0100, Erik Skultety wrote:
> On Wed, Jan 23, 2019 at 01:10:42PM +0000, Daniel P. Berrangé wrote:
> > On Wed, Jan 23, 2019 at 01:55:06PM +0100, Erik Skultety wrote:
> > > On Fri, Jan 18, 2019 at 12:51:50PM +0000, Singh, Brijesh wrote:
> > > >
> > > > On 1/18/19 3:39 AM, Erik Skultety wrote:
> > > > > Hi,
> > > > > this is a summary of a private discussion I've had with guys CC'd on this email
> > > > > about finding a solution to [1] - basically, the default permissions on
> > > > > /dev/sev (below) make it impossible to query for SEV platform capabilities,
> > > > > since by default we run QEMU as qemu:qemu when probing for capabilities. It's
> > > > > worth noting is that this is only relevant to probing, since for a proper QEMU
> > > > > VM we create a mount namespace for the process and chown all the nodes (needs a
> > > > > SEV fix though).
> > > > >
> > > > > # ll /dev/sev
> > > > > crw-------. 1 root root
> > > > >
> > > > > I suggested either force running QEMU as root for probing (despite the obvious
> > > > > security implications) or using namespaces for probing too. Dan argued that
> > > > > this would have a significant perf impact and suggested we ask systemd to add a
> > > > > global udev rule.
> > > > >
> > > > > I proceeded with cloning [1] to systemd and creating an udev rule that I planned
> > > > > on submitting to systemd upstream - the initial idea was to mimic /dev/kvm and
> > > > > make it world accessible to which Brijesh from AMD expressed a concern that
> > > > > regular users might deplete the resources (limit on the number of guests
> > > > > allowed by the platform).
> > > >
> > > >
> > > > During private discussion I didn't realized that we are discussing a
> > > > probe issue hence things I have said earlier may not be applicable
> > > > during the probe. The /dev/sev is managed by the CCP (aka PSP) driver.
> > > > The /dev/sev is used for communicating with the SEV FW running inside
> > > > the PSP. The SEV FW offers platform and guest specific services. The
> > > > guest specific services are used during the guest launch, these services
> > > > are available through KVM driver only. Whereas the platform services can
> > > > be invoked at anytime. A typical platform specific services are:
> > > >
> > > > - importing certificates
> > > >
> > > > - exporting certificates
> > > >
> > > > - querying the SEV FW version etc etc
> > > >
> > > > In case of the probe we are not launch SEV guest hence we should not be
> > > > worried about depleting the SEV ASID resources.
> > > >
> > > > IIRC, libvirt uses QEMP query-sev-capabilities to probe the SEV support.
> > > > QEMU executes the below sequence to complete the request:
> > > >
> > > > 1. Exports the platform certificates  (this is when /dev/sev is accessed).
> > > >
> > > > 2. Read the host MSR to determine the C-bit and reduced phys-bit position
> > > >
> > > > I don't see any reason why we can't give world a 'read' permission to
> > > > /dev/sev. Anyone should be able to export the certificates and query
> > >
> > > Okay, makes sense to me. The problem I see is the sev_platform_ioctl function
> > > in QEMU which makes an _IOWR request, therefore the file descriptor being
> > > opened in sev_get_capabilities is O_RDWR. Now, I only understand ioctl from
> > > what I've read in the man page, so I don't quite understand the need for IOWR
> > > here - but my honest guess would be that it's because the commands like
> > > SEV_PDH_CERT_EXPORT or SEV_PLATFORM_STATUS need to be copied from userspace to
> > > kernel to instruct kernel which services we want, ergo _IOWR, is that right?
> >
> > I'm not seeing any permissions checks in the sev_ioctl() function in the
> > kernel, so IIUC, that means any permissions are entirely based on whether
> > you can open the /dev/sev, once open you can run any ioctl.  What, if anything,
> > enforces which ioctls you can run when the device is only O_RDONLY vs O_RDWR ?
> 
> I don't know, that's why I'm asking, because the manual didn't make it any
> clear for me whether there's a connection between the device permissions and
> ioctls that you're allowed to run.
> 
> >
> > > In any case, a fix of some sort needs to land in QEMU first, because no udev
> > > rule would fix the current situation. Afterwards, I expect that having a rule
> > > like this:
> > >
> > > KERNEL=="sev", GROUP="kvm", MODE="0644"
> > >
> > > and a selinux policy rule adding the kvm_device_t label, we should be fine, do
> > > we agree on that?
> >
> > Based on what I think I see above, this looks like a bad idea.
> >
> > It still looks like we can solve this entirely in libvirt by just giving
> > the libvirt capabilities probing code CAP_DAC_OVERRIDE. This would make
> > libvirt work for all currently released SEV support in kernel/qemu.
> 
> Sure we can, but that would make libvirt the only legitimate user of /dev/sev
> and everything else would require the admin to change the permissions
> explicitly so that other apps could at least retrieve the platform info, if
> it was intended to be for public use?
> Additionally, we'll still get shot down by SELinux because svirt_t wouldn't be
> able to access /dev/sev by default.

That's separate form probing and just needs SELinux policy to define
a new  sev_device_t type and grant svirt_t access to it.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
  2019-01-23 13:24         ` Daniel P. Berrangé
@ 2019-01-23 13:33           ` Erik Skultety
  2019-01-23 13:36             ` Daniel P. Berrangé
  0 siblings, 1 reply; 23+ messages in thread
From: Erik Skultety @ 2019-01-23 13:33 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Singh, Brijesh, libvir-list, qemu-devel, dinechin, mkletzan

On Wed, Jan 23, 2019 at 01:24:13PM +0000, Daniel P. Berrangé wrote:
> On Wed, Jan 23, 2019 at 02:22:12PM +0100, Erik Skultety wrote:
> > On Wed, Jan 23, 2019 at 01:10:42PM +0000, Daniel P. Berrangé wrote:
> > > On Wed, Jan 23, 2019 at 01:55:06PM +0100, Erik Skultety wrote:
> > > > On Fri, Jan 18, 2019 at 12:51:50PM +0000, Singh, Brijesh wrote:
> > > > >
> > > > > On 1/18/19 3:39 AM, Erik Skultety wrote:
> > > > > > Hi,
> > > > > > this is a summary of a private discussion I've had with guys CC'd on this email
> > > > > > about finding a solution to [1] - basically, the default permissions on
> > > > > > /dev/sev (below) make it impossible to query for SEV platform capabilities,
> > > > > > since by default we run QEMU as qemu:qemu when probing for capabilities. It's
> > > > > > worth noting is that this is only relevant to probing, since for a proper QEMU
> > > > > > VM we create a mount namespace for the process and chown all the nodes (needs a
> > > > > > SEV fix though).
> > > > > >
> > > > > > # ll /dev/sev
> > > > > > crw-------. 1 root root
> > > > > >
> > > > > > I suggested either force running QEMU as root for probing (despite the obvious
> > > > > > security implications) or using namespaces for probing too. Dan argued that
> > > > > > this would have a significant perf impact and suggested we ask systemd to add a
> > > > > > global udev rule.
> > > > > >
> > > > > > I proceeded with cloning [1] to systemd and creating an udev rule that I planned
> > > > > > on submitting to systemd upstream - the initial idea was to mimic /dev/kvm and
> > > > > > make it world accessible to which Brijesh from AMD expressed a concern that
> > > > > > regular users might deplete the resources (limit on the number of guests
> > > > > > allowed by the platform).
> > > > >
> > > > >
> > > > > During private discussion I didn't realized that we are discussing a
> > > > > probe issue hence things I have said earlier may not be applicable
> > > > > during the probe. The /dev/sev is managed by the CCP (aka PSP) driver.
> > > > > The /dev/sev is used for communicating with the SEV FW running inside
> > > > > the PSP. The SEV FW offers platform and guest specific services. The
> > > > > guest specific services are used during the guest launch, these services
> > > > > are available through KVM driver only. Whereas the platform services can
> > > > > be invoked at anytime. A typical platform specific services are:
> > > > >
> > > > > - importing certificates
> > > > >
> > > > > - exporting certificates
> > > > >
> > > > > - querying the SEV FW version etc etc
> > > > >
> > > > > In case of the probe we are not launch SEV guest hence we should not be
> > > > > worried about depleting the SEV ASID resources.
> > > > >
> > > > > IIRC, libvirt uses QEMP query-sev-capabilities to probe the SEV support.
> > > > > QEMU executes the below sequence to complete the request:
> > > > >
> > > > > 1. Exports the platform certificates  (this is when /dev/sev is accessed).
> > > > >
> > > > > 2. Read the host MSR to determine the C-bit and reduced phys-bit position
> > > > >
> > > > > I don't see any reason why we can't give world a 'read' permission to
> > > > > /dev/sev. Anyone should be able to export the certificates and query
> > > >
> > > > Okay, makes sense to me. The problem I see is the sev_platform_ioctl function
> > > > in QEMU which makes an _IOWR request, therefore the file descriptor being
> > > > opened in sev_get_capabilities is O_RDWR. Now, I only understand ioctl from
> > > > what I've read in the man page, so I don't quite understand the need for IOWR
> > > > here - but my honest guess would be that it's because the commands like
> > > > SEV_PDH_CERT_EXPORT or SEV_PLATFORM_STATUS need to be copied from userspace to
> > > > kernel to instruct kernel which services we want, ergo _IOWR, is that right?
> > >
> > > I'm not seeing any permissions checks in the sev_ioctl() function in the
> > > kernel, so IIUC, that means any permissions are entirely based on whether
> > > you can open the /dev/sev, once open you can run any ioctl.  What, if anything,
> > > enforces which ioctls you can run when the device is only O_RDONLY vs O_RDWR ?
> >
> > I don't know, that's why I'm asking, because the manual didn't make it any
> > clear for me whether there's a connection between the device permissions and
> > ioctls that you're allowed to run.
> >
> > >
> > > > In any case, a fix of some sort needs to land in QEMU first, because no udev
> > > > rule would fix the current situation. Afterwards, I expect that having a rule
> > > > like this:
> > > >
> > > > KERNEL=="sev", GROUP="kvm", MODE="0644"
> > > >
> > > > and a selinux policy rule adding the kvm_device_t label, we should be fine, do
> > > > we agree on that?
> > >
> > > Based on what I think I see above, this looks like a bad idea.
> > >
> > > It still looks like we can solve this entirely in libvirt by just giving
> > > the libvirt capabilities probing code CAP_DAC_OVERRIDE. This would make
> > > libvirt work for all currently released SEV support in kernel/qemu.
> >
> > Sure we can, but that would make libvirt the only legitimate user of /dev/sev
> > and everything else would require the admin to change the permissions
> > explicitly so that other apps could at least retrieve the platform info, if
> > it was intended to be for public use?
> > Additionally, we'll still get shot down by SELinux because svirt_t wouldn't be
> > able to access /dev/sev by default.
>
> That's separate form probing and just needs SELinux policy to define
> a new  sev_device_t type and grant svirt_t access to it.

I know, I misread "we can solve this entirely in libvirt" then, I thought you
the SELinux part was included in the statement, my bad then. Still, back to the
original issue, we could technically do both, libvirt would have run qemu with
CAP_DAC_OVERRIDE and we keep working with everything's been released for
SEV in kernel/qemu and for everyone else, systemd might add 0644 for /dev/sev,
that way, everyone's happy, not that I'd be a fan of libvirt often having
to work around something because projects underneath wouldn't backport fixes to
all the distros we support, thus leaving the dirty work to us.

Erik

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
  2019-01-23 13:33           ` Erik Skultety
@ 2019-01-23 13:36             ` Daniel P. Berrangé
  2019-01-23 15:02               ` Singh, Brijesh
  0 siblings, 1 reply; 23+ messages in thread
From: Daniel P. Berrangé @ 2019-01-23 13:36 UTC (permalink / raw)
  To: Erik Skultety; +Cc: Singh, Brijesh, libvir-list, qemu-devel, dinechin, mkletzan

On Wed, Jan 23, 2019 at 02:33:01PM +0100, Erik Skultety wrote:
> On Wed, Jan 23, 2019 at 01:24:13PM +0000, Daniel P. Berrangé wrote:
> > On Wed, Jan 23, 2019 at 02:22:12PM +0100, Erik Skultety wrote:
> > > On Wed, Jan 23, 2019 at 01:10:42PM +0000, Daniel P. Berrangé wrote:
> > > > On Wed, Jan 23, 2019 at 01:55:06PM +0100, Erik Skultety wrote:
> > > > > On Fri, Jan 18, 2019 at 12:51:50PM +0000, Singh, Brijesh wrote:
> > > > > >
> > > > > > On 1/18/19 3:39 AM, Erik Skultety wrote:
> > > > > > > Hi,
> > > > > > > this is a summary of a private discussion I've had with guys CC'd on this email
> > > > > > > about finding a solution to [1] - basically, the default permissions on
> > > > > > > /dev/sev (below) make it impossible to query for SEV platform capabilities,
> > > > > > > since by default we run QEMU as qemu:qemu when probing for capabilities. It's
> > > > > > > worth noting is that this is only relevant to probing, since for a proper QEMU
> > > > > > > VM we create a mount namespace for the process and chown all the nodes (needs a
> > > > > > > SEV fix though).
> > > > > > >
> > > > > > > # ll /dev/sev
> > > > > > > crw-------. 1 root root
> > > > > > >
> > > > > > > I suggested either force running QEMU as root for probing (despite the obvious
> > > > > > > security implications) or using namespaces for probing too. Dan argued that
> > > > > > > this would have a significant perf impact and suggested we ask systemd to add a
> > > > > > > global udev rule.
> > > > > > >
> > > > > > > I proceeded with cloning [1] to systemd and creating an udev rule that I planned
> > > > > > > on submitting to systemd upstream - the initial idea was to mimic /dev/kvm and
> > > > > > > make it world accessible to which Brijesh from AMD expressed a concern that
> > > > > > > regular users might deplete the resources (limit on the number of guests
> > > > > > > allowed by the platform).
> > > > > >
> > > > > >
> > > > > > During private discussion I didn't realized that we are discussing a
> > > > > > probe issue hence things I have said earlier may not be applicable
> > > > > > during the probe. The /dev/sev is managed by the CCP (aka PSP) driver.
> > > > > > The /dev/sev is used for communicating with the SEV FW running inside
> > > > > > the PSP. The SEV FW offers platform and guest specific services. The
> > > > > > guest specific services are used during the guest launch, these services
> > > > > > are available through KVM driver only. Whereas the platform services can
> > > > > > be invoked at anytime. A typical platform specific services are:
> > > > > >
> > > > > > - importing certificates
> > > > > >
> > > > > > - exporting certificates
> > > > > >
> > > > > > - querying the SEV FW version etc etc
> > > > > >
> > > > > > In case of the probe we are not launch SEV guest hence we should not be
> > > > > > worried about depleting the SEV ASID resources.
> > > > > >
> > > > > > IIRC, libvirt uses QEMP query-sev-capabilities to probe the SEV support.
> > > > > > QEMU executes the below sequence to complete the request:
> > > > > >
> > > > > > 1. Exports the platform certificates  (this is when /dev/sev is accessed).
> > > > > >
> > > > > > 2. Read the host MSR to determine the C-bit and reduced phys-bit position
> > > > > >
> > > > > > I don't see any reason why we can't give world a 'read' permission to
> > > > > > /dev/sev. Anyone should be able to export the certificates and query
> > > > >
> > > > > Okay, makes sense to me. The problem I see is the sev_platform_ioctl function
> > > > > in QEMU which makes an _IOWR request, therefore the file descriptor being
> > > > > opened in sev_get_capabilities is O_RDWR. Now, I only understand ioctl from
> > > > > what I've read in the man page, so I don't quite understand the need for IOWR
> > > > > here - but my honest guess would be that it's because the commands like
> > > > > SEV_PDH_CERT_EXPORT or SEV_PLATFORM_STATUS need to be copied from userspace to
> > > > > kernel to instruct kernel which services we want, ergo _IOWR, is that right?
> > > >
> > > > I'm not seeing any permissions checks in the sev_ioctl() function in the
> > > > kernel, so IIUC, that means any permissions are entirely based on whether
> > > > you can open the /dev/sev, once open you can run any ioctl.  What, if anything,
> > > > enforces which ioctls you can run when the device is only O_RDONLY vs O_RDWR ?
> > >
> > > I don't know, that's why I'm asking, because the manual didn't make it any
> > > clear for me whether there's a connection between the device permissions and
> > > ioctls that you're allowed to run.
> > >
> > > >
> > > > > In any case, a fix of some sort needs to land in QEMU first, because no udev
> > > > > rule would fix the current situation. Afterwards, I expect that having a rule
> > > > > like this:
> > > > >
> > > > > KERNEL=="sev", GROUP="kvm", MODE="0644"
> > > > >
> > > > > and a selinux policy rule adding the kvm_device_t label, we should be fine, do
> > > > > we agree on that?
> > > >
> > > > Based on what I think I see above, this looks like a bad idea.
> > > >
> > > > It still looks like we can solve this entirely in libvirt by just giving
> > > > the libvirt capabilities probing code CAP_DAC_OVERRIDE. This would make
> > > > libvirt work for all currently released SEV support in kernel/qemu.
> > >
> > > Sure we can, but that would make libvirt the only legitimate user of /dev/sev
> > > and everything else would require the admin to change the permissions
> > > explicitly so that other apps could at least retrieve the platform info, if
> > > it was intended to be for public use?
> > > Additionally, we'll still get shot down by SELinux because svirt_t wouldn't be
> > > able to access /dev/sev by default.
> >
> > That's separate form probing and just needs SELinux policy to define
> > a new  sev_device_t type and grant svirt_t access to it.
> 
> I know, I misread "we can solve this entirely in libvirt" then, I thought you
> the SELinux part was included in the statement, my bad then. Still, back to the
> original issue, we could technically do both, libvirt would have run qemu with
> CAP_DAC_OVERRIDE and we keep working with everything's been released for
> SEV in kernel/qemu and for everyone else, systemd might add 0644 for /dev/sev,
> that way, everyone's happy, not that I'd be a fan of libvirt often having
> to work around something because projects underneath wouldn't backport fixes to
> all the distros we support, thus leaving the dirty work to us.

Setting 0644 for /dev/sev looks unsafe to me unless someone can show 
where the permissions checks take place for the many ioctls that
/dev/sev allows, such that only SEV_PDH_CERT_EXPORT or SEV_PLATFORM_STATUS
is allowed when /dev/sev is opened by a user who doesn't have write
permissions.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
  2019-01-23 13:36             ` Daniel P. Berrangé
@ 2019-01-23 15:02               ` Singh, Brijesh
  2019-01-23 15:29                 ` Erik Skultety
  2019-01-29 16:15                 ` Erik Skultety
  0 siblings, 2 replies; 23+ messages in thread
From: Singh, Brijesh @ 2019-01-23 15:02 UTC (permalink / raw)
  To: Daniel P. Berrangé, Erik Skultety
  Cc: Singh, Brijesh, libvir-list, qemu-devel, dinechin, mkletzan



On 1/23/19 7:36 AM, Daniel P. Berrangé wrote:
> On Wed, Jan 23, 2019 at 02:33:01PM +0100, Erik Skultety wrote:
>> On Wed, Jan 23, 2019 at 01:24:13PM +0000, Daniel P. Berrangé wrote:
>>> On Wed, Jan 23, 2019 at 02:22:12PM +0100, Erik Skultety wrote:
>>>> On Wed, Jan 23, 2019 at 01:10:42PM +0000, Daniel P. Berrangé wrote:
>>>>> On Wed, Jan 23, 2019 at 01:55:06PM +0100, Erik Skultety wrote:
>>>>>> On Fri, Jan 18, 2019 at 12:51:50PM +0000, Singh, Brijesh wrote:
>>>>>>>
>>>>>>> On 1/18/19 3:39 AM, Erik Skultety wrote:
>>>>>>>> Hi,
>>>>>>>> this is a summary of a private discussion I've had with guys CC'd on this email
>>>>>>>> about finding a solution to [1] - basically, the default permissions on
>>>>>>>> /dev/sev (below) make it impossible to query for SEV platform capabilities,
>>>>>>>> since by default we run QEMU as qemu:qemu when probing for capabilities. It's
>>>>>>>> worth noting is that this is only relevant to probing, since for a proper QEMU
>>>>>>>> VM we create a mount namespace for the process and chown all the nodes (needs a
>>>>>>>> SEV fix though).
>>>>>>>>
>>>>>>>> # ll /dev/sev
>>>>>>>> crw-------. 1 root root
>>>>>>>>
>>>>>>>> I suggested either force running QEMU as root for probing (despite the obvious
>>>>>>>> security implications) or using namespaces for probing too. Dan argued that
>>>>>>>> this would have a significant perf impact and suggested we ask systemd to add a
>>>>>>>> global udev rule.
>>>>>>>>
>>>>>>>> I proceeded with cloning [1] to systemd and creating an udev rule that I planned
>>>>>>>> on submitting to systemd upstream - the initial idea was to mimic /dev/kvm and
>>>>>>>> make it world accessible to which Brijesh from AMD expressed a concern that
>>>>>>>> regular users might deplete the resources (limit on the number of guests
>>>>>>>> allowed by the platform).
>>>>>>>
>>>>>>>
>>>>>>> During private discussion I didn't realized that we are discussing a
>>>>>>> probe issue hence things I have said earlier may not be applicable
>>>>>>> during the probe. The /dev/sev is managed by the CCP (aka PSP) driver.
>>>>>>> The /dev/sev is used for communicating with the SEV FW running inside
>>>>>>> the PSP. The SEV FW offers platform and guest specific services. The
>>>>>>> guest specific services are used during the guest launch, these services
>>>>>>> are available through KVM driver only. Whereas the platform services can
>>>>>>> be invoked at anytime. A typical platform specific services are:
>>>>>>>
>>>>>>> - importing certificates
>>>>>>>
>>>>>>> - exporting certificates
>>>>>>>
>>>>>>> - querying the SEV FW version etc etc
>>>>>>>
>>>>>>> In case of the probe we are not launch SEV guest hence we should not be
>>>>>>> worried about depleting the SEV ASID resources.
>>>>>>>
>>>>>>> IIRC, libvirt uses QEMP query-sev-capabilities to probe the SEV support.
>>>>>>> QEMU executes the below sequence to complete the request:
>>>>>>>
>>>>>>> 1. Exports the platform certificates  (this is when /dev/sev is accessed).
>>>>>>>
>>>>>>> 2. Read the host MSR to determine the C-bit and reduced phys-bit position
>>>>>>>
>>>>>>> I don't see any reason why we can't give world a 'read' permission to
>>>>>>> /dev/sev. Anyone should be able to export the certificates and query
>>>>>>
>>>>>> Okay, makes sense to me. The problem I see is the sev_platform_ioctl function
>>>>>> in QEMU which makes an _IOWR request, therefore the file descriptor being
>>>>>> opened in sev_get_capabilities is O_RDWR. Now, I only understand ioctl from
>>>>>> what I've read in the man page, so I don't quite understand the need for IOWR
>>>>>> here - but my honest guess would be that it's because the commands like
>>>>>> SEV_PDH_CERT_EXPORT or SEV_PLATFORM_STATUS need to be copied from userspace to
>>>>>> kernel to instruct kernel which services we want, ergo _IOWR, is that right?
>>>>>
>>>>> I'm not seeing any permissions checks in the sev_ioctl() function in the
>>>>> kernel, so IIUC, that means any permissions are entirely based on whether
>>>>> you can open the /dev/sev, once open you can run any ioctl.  What, if anything,
>>>>> enforces which ioctls you can run when the device is only O_RDONLY vs O_RDWR ?
>>>>
>>>> I don't know, that's why I'm asking, because the manual didn't make it any
>>>> clear for me whether there's a connection between the device permissions and
>>>> ioctls that you're allowed to run.
>>>>
>>>>>
>>>>>> In any case, a fix of some sort needs to land in QEMU first, because no udev
>>>>>> rule would fix the current situation. Afterwards, I expect that having a rule
>>>>>> like this:
>>>>>>
>>>>>> KERNEL=="sev", GROUP="kvm", MODE="0644"
>>>>>>
>>>>>> and a selinux policy rule adding the kvm_device_t label, we should be fine, do
>>>>>> we agree on that?
>>>>>
>>>>> Based on what I think I see above, this looks like a bad idea.
>>>>>
>>>>> It still looks like we can solve this entirely in libvirt by just giving
>>>>> the libvirt capabilities probing code CAP_DAC_OVERRIDE. This would make
>>>>> libvirt work for all currently released SEV support in kernel/qemu.
>>>>
>>>> Sure we can, but that would make libvirt the only legitimate user of /dev/sev
>>>> and everything else would require the admin to change the permissions
>>>> explicitly so that other apps could at least retrieve the platform info, if
>>>> it was intended to be for public use?
>>>> Additionally, we'll still get shot down by SELinux because svirt_t wouldn't be
>>>> able to access /dev/sev by default.
>>>
>>> That's separate form probing and just needs SELinux policy to define
>>> a new  sev_device_t type and grant svirt_t access to it.
>>
>> I know, I misread "we can solve this entirely in libvirt" then, I thought you
>> the SELinux part was included in the statement, my bad then. Still, back to the
>> original issue, we could technically do both, libvirt would have run qemu with
>> CAP_DAC_OVERRIDE and we keep working with everything's been released for
>> SEV in kernel/qemu and for everyone else, systemd might add 0644 for /dev/sev,
>> that way, everyone's happy, not that I'd be a fan of libvirt often having
>> to work around something because projects underneath wouldn't backport fixes to
>> all the distros we support, thus leaving the dirty work to us.
> 
> Setting 0644 for /dev/sev looks unsafe to me unless someone can show
> where the permissions checks take place for the many ioctls that
> /dev/sev allows, such that only SEV_PDH_CERT_EXPORT or SEV_PLATFORM_STATUS
> is allowed when /dev/sev is opened by a user who doesn't have write
> permissions.
> 

Agree its not safe to do 0644.

Currently, anyone who has access to /dev/sev (read or write) will be
able to execute SEV platform command. In other words there is no
permission check per command basis. I must admit that while developing
the driver I was under assumption that only root will ever access the
/dev/sev device hence overlooked it. But now knowing that others may
also need to access the /dev/sev, I can submit patch in kernel to do
per command access control.

Until then, can we follow Daniel's recommendation to elevate privilege
of the probing code?

-Brijesh

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
  2019-01-23 15:02               ` Singh, Brijesh
@ 2019-01-23 15:29                 ` Erik Skultety
  2019-01-29 16:15                 ` Erik Skultety
  1 sibling, 0 replies; 23+ messages in thread
From: Erik Skultety @ 2019-01-23 15:29 UTC (permalink / raw)
  To: Singh, Brijesh
  Cc: Daniel P. Berrangé, libvir-list, qemu-devel, dinechin, mkletzan

On Wed, Jan 23, 2019 at 03:02:28PM +0000, Singh, Brijesh wrote:
>
>
> On 1/23/19 7:36 AM, Daniel P. Berrangé wrote:
> > On Wed, Jan 23, 2019 at 02:33:01PM +0100, Erik Skultety wrote:
> >> On Wed, Jan 23, 2019 at 01:24:13PM +0000, Daniel P. Berrangé wrote:
> >>> On Wed, Jan 23, 2019 at 02:22:12PM +0100, Erik Skultety wrote:
> >>>> On Wed, Jan 23, 2019 at 01:10:42PM +0000, Daniel P. Berrangé wrote:
> >>>>> On Wed, Jan 23, 2019 at 01:55:06PM +0100, Erik Skultety wrote:
> >>>>>> On Fri, Jan 18, 2019 at 12:51:50PM +0000, Singh, Brijesh wrote:
> >>>>>>>
> >>>>>>> On 1/18/19 3:39 AM, Erik Skultety wrote:
> >>>>>>>> Hi,
> >>>>>>>> this is a summary of a private discussion I've had with guys CC'd on this email
> >>>>>>>> about finding a solution to [1] - basically, the default permissions on
> >>>>>>>> /dev/sev (below) make it impossible to query for SEV platform capabilities,
> >>>>>>>> since by default we run QEMU as qemu:qemu when probing for capabilities. It's
> >>>>>>>> worth noting is that this is only relevant to probing, since for a proper QEMU
> >>>>>>>> VM we create a mount namespace for the process and chown all the nodes (needs a
> >>>>>>>> SEV fix though).
> >>>>>>>>
> >>>>>>>> # ll /dev/sev
> >>>>>>>> crw-------. 1 root root
> >>>>>>>>
> >>>>>>>> I suggested either force running QEMU as root for probing (despite the obvious
> >>>>>>>> security implications) or using namespaces for probing too. Dan argued that
> >>>>>>>> this would have a significant perf impact and suggested we ask systemd to add a
> >>>>>>>> global udev rule.
> >>>>>>>>
> >>>>>>>> I proceeded with cloning [1] to systemd and creating an udev rule that I planned
> >>>>>>>> on submitting to systemd upstream - the initial idea was to mimic /dev/kvm and
> >>>>>>>> make it world accessible to which Brijesh from AMD expressed a concern that
> >>>>>>>> regular users might deplete the resources (limit on the number of guests
> >>>>>>>> allowed by the platform).
> >>>>>>>
> >>>>>>>
> >>>>>>> During private discussion I didn't realized that we are discussing a
> >>>>>>> probe issue hence things I have said earlier may not be applicable
> >>>>>>> during the probe. The /dev/sev is managed by the CCP (aka PSP) driver.
> >>>>>>> The /dev/sev is used for communicating with the SEV FW running inside
> >>>>>>> the PSP. The SEV FW offers platform and guest specific services. The
> >>>>>>> guest specific services are used during the guest launch, these services
> >>>>>>> are available through KVM driver only. Whereas the platform services can
> >>>>>>> be invoked at anytime. A typical platform specific services are:
> >>>>>>>
> >>>>>>> - importing certificates
> >>>>>>>
> >>>>>>> - exporting certificates
> >>>>>>>
> >>>>>>> - querying the SEV FW version etc etc
> >>>>>>>
> >>>>>>> In case of the probe we are not launch SEV guest hence we should not be
> >>>>>>> worried about depleting the SEV ASID resources.
> >>>>>>>
> >>>>>>> IIRC, libvirt uses QEMP query-sev-capabilities to probe the SEV support.
> >>>>>>> QEMU executes the below sequence to complete the request:
> >>>>>>>
> >>>>>>> 1. Exports the platform certificates  (this is when /dev/sev is accessed).
> >>>>>>>
> >>>>>>> 2. Read the host MSR to determine the C-bit and reduced phys-bit position
> >>>>>>>
> >>>>>>> I don't see any reason why we can't give world a 'read' permission to
> >>>>>>> /dev/sev. Anyone should be able to export the certificates and query
> >>>>>>
> >>>>>> Okay, makes sense to me. The problem I see is the sev_platform_ioctl function
> >>>>>> in QEMU which makes an _IOWR request, therefore the file descriptor being
> >>>>>> opened in sev_get_capabilities is O_RDWR. Now, I only understand ioctl from
> >>>>>> what I've read in the man page, so I don't quite understand the need for IOWR
> >>>>>> here - but my honest guess would be that it's because the commands like
> >>>>>> SEV_PDH_CERT_EXPORT or SEV_PLATFORM_STATUS need to be copied from userspace to
> >>>>>> kernel to instruct kernel which services we want, ergo _IOWR, is that right?
> >>>>>
> >>>>> I'm not seeing any permissions checks in the sev_ioctl() function in the
> >>>>> kernel, so IIUC, that means any permissions are entirely based on whether
> >>>>> you can open the /dev/sev, once open you can run any ioctl.  What, if anything,
> >>>>> enforces which ioctls you can run when the device is only O_RDONLY vs O_RDWR ?
> >>>>
> >>>> I don't know, that's why I'm asking, because the manual didn't make it any
> >>>> clear for me whether there's a connection between the device permissions and
> >>>> ioctls that you're allowed to run.
> >>>>
> >>>>>
> >>>>>> In any case, a fix of some sort needs to land in QEMU first, because no udev
> >>>>>> rule would fix the current situation. Afterwards, I expect that having a rule
> >>>>>> like this:
> >>>>>>
> >>>>>> KERNEL=="sev", GROUP="kvm", MODE="0644"
> >>>>>>
> >>>>>> and a selinux policy rule adding the kvm_device_t label, we should be fine, do
> >>>>>> we agree on that?
> >>>>>
> >>>>> Based on what I think I see above, this looks like a bad idea.
> >>>>>
> >>>>> It still looks like we can solve this entirely in libvirt by just giving
> >>>>> the libvirt capabilities probing code CAP_DAC_OVERRIDE. This would make
> >>>>> libvirt work for all currently released SEV support in kernel/qemu.
> >>>>
> >>>> Sure we can, but that would make libvirt the only legitimate user of /dev/sev
> >>>> and everything else would require the admin to change the permissions
> >>>> explicitly so that other apps could at least retrieve the platform info, if
> >>>> it was intended to be for public use?
> >>>> Additionally, we'll still get shot down by SELinux because svirt_t wouldn't be
> >>>> able to access /dev/sev by default.
> >>>
> >>> That's separate form probing and just needs SELinux policy to define
> >>> a new  sev_device_t type and grant svirt_t access to it.
> >>
> >> I know, I misread "we can solve this entirely in libvirt" then, I thought you
> >> the SELinux part was included in the statement, my bad then. Still, back to the
> >> original issue, we could technically do both, libvirt would have run qemu with
> >> CAP_DAC_OVERRIDE and we keep working with everything's been released for
> >> SEV in kernel/qemu and for everyone else, systemd might add 0644 for /dev/sev,
> >> that way, everyone's happy, not that I'd be a fan of libvirt often having
> >> to work around something because projects underneath wouldn't backport fixes to
> >> all the distros we support, thus leaving the dirty work to us.
> >
> > Setting 0644 for /dev/sev looks unsafe to me unless someone can show
> > where the permissions checks take place for the many ioctls that
> > /dev/sev allows, such that only SEV_PDH_CERT_EXPORT or SEV_PLATFORM_STATUS
> > is allowed when /dev/sev is opened by a user who doesn't have write
> > permissions.
> >
>
> Agree its not safe to do 0644.
>
> Currently, anyone who has access to /dev/sev (read or write) will be
> able to execute SEV platform command. In other words there is no
> permission check per command basis. I must admit that while developing
> the driver I was under assumption that only root will ever access the
> /dev/sev device hence overlooked it. But now knowing that others may
> also need to access the /dev/sev, I can submit patch in kernel to do
> per command access control.
>
> Until then, can we follow Daniel's recommendation to elevate privilege
> of the probing code?

If that is the case, then I absolutely agree with Dan, I'll start working on
the patches for libvirt, I'm glad I learned that there's no strict relation
between ioctl ops and filesystem permissions unless explicitly defined.

Thanks,
Erik

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
  2019-01-23 15:02               ` Singh, Brijesh
  2019-01-23 15:29                 ` Erik Skultety
@ 2019-01-29 16:15                 ` Erik Skultety
  2019-01-29 18:40                   ` Daniel P. Berrangé
  1 sibling, 1 reply; 23+ messages in thread
From: Erik Skultety @ 2019-01-29 16:15 UTC (permalink / raw)
  To: Singh, Brijesh
  Cc: Daniel P. Berrangé, libvir-list, qemu-devel, dinechin, mkletzan

[-- Attachment #1: Type: text/plain, Size: 10717 bytes --]

On Wed, Jan 23, 2019 at 03:02:28PM +0000, Singh, Brijesh wrote:
>
>
> On 1/23/19 7:36 AM, Daniel P. Berrangé wrote:
> > On Wed, Jan 23, 2019 at 02:33:01PM +0100, Erik Skultety wrote:
> >> On Wed, Jan 23, 2019 at 01:24:13PM +0000, Daniel P. Berrangé wrote:
> >>> On Wed, Jan 23, 2019 at 02:22:12PM +0100, Erik Skultety wrote:
> >>>> On Wed, Jan 23, 2019 at 01:10:42PM +0000, Daniel P. Berrangé wrote:
> >>>>> On Wed, Jan 23, 2019 at 01:55:06PM +0100, Erik Skultety wrote:
> >>>>>> On Fri, Jan 18, 2019 at 12:51:50PM +0000, Singh, Brijesh wrote:
> >>>>>>>
> >>>>>>> On 1/18/19 3:39 AM, Erik Skultety wrote:
> >>>>>>>> Hi,
> >>>>>>>> this is a summary of a private discussion I've had with guys CC'd on this email
> >>>>>>>> about finding a solution to [1] - basically, the default permissions on
> >>>>>>>> /dev/sev (below) make it impossible to query for SEV platform capabilities,
> >>>>>>>> since by default we run QEMU as qemu:qemu when probing for capabilities. It's
> >>>>>>>> worth noting is that this is only relevant to probing, since for a proper QEMU
> >>>>>>>> VM we create a mount namespace for the process and chown all the nodes (needs a
> >>>>>>>> SEV fix though).
> >>>>>>>>
> >>>>>>>> # ll /dev/sev
> >>>>>>>> crw-------. 1 root root
> >>>>>>>>
> >>>>>>>> I suggested either force running QEMU as root for probing (despite the obvious
> >>>>>>>> security implications) or using namespaces for probing too. Dan argued that
> >>>>>>>> this would have a significant perf impact and suggested we ask systemd to add a
> >>>>>>>> global udev rule.
> >>>>>>>>
> >>>>>>>> I proceeded with cloning [1] to systemd and creating an udev rule that I planned
> >>>>>>>> on submitting to systemd upstream - the initial idea was to mimic /dev/kvm and
> >>>>>>>> make it world accessible to which Brijesh from AMD expressed a concern that
> >>>>>>>> regular users might deplete the resources (limit on the number of guests
> >>>>>>>> allowed by the platform).
> >>>>>>>
> >>>>>>>
> >>>>>>> During private discussion I didn't realized that we are discussing a
> >>>>>>> probe issue hence things I have said earlier may not be applicable
> >>>>>>> during the probe. The /dev/sev is managed by the CCP (aka PSP) driver.
> >>>>>>> The /dev/sev is used for communicating with the SEV FW running inside
> >>>>>>> the PSP. The SEV FW offers platform and guest specific services. The
> >>>>>>> guest specific services are used during the guest launch, these services
> >>>>>>> are available through KVM driver only. Whereas the platform services can
> >>>>>>> be invoked at anytime. A typical platform specific services are:
> >>>>>>>
> >>>>>>> - importing certificates
> >>>>>>>
> >>>>>>> - exporting certificates
> >>>>>>>
> >>>>>>> - querying the SEV FW version etc etc
> >>>>>>>
> >>>>>>> In case of the probe we are not launch SEV guest hence we should not be
> >>>>>>> worried about depleting the SEV ASID resources.
> >>>>>>>
> >>>>>>> IIRC, libvirt uses QEMP query-sev-capabilities to probe the SEV support.
> >>>>>>> QEMU executes the below sequence to complete the request:
> >>>>>>>
> >>>>>>> 1. Exports the platform certificates  (this is when /dev/sev is accessed).
> >>>>>>>
> >>>>>>> 2. Read the host MSR to determine the C-bit and reduced phys-bit position
> >>>>>>>
> >>>>>>> I don't see any reason why we can't give world a 'read' permission to
> >>>>>>> /dev/sev. Anyone should be able to export the certificates and query
> >>>>>>
> >>>>>> Okay, makes sense to me. The problem I see is the sev_platform_ioctl function
> >>>>>> in QEMU which makes an _IOWR request, therefore the file descriptor being
> >>>>>> opened in sev_get_capabilities is O_RDWR. Now, I only understand ioctl from
> >>>>>> what I've read in the man page, so I don't quite understand the need for IOWR
> >>>>>> here - but my honest guess would be that it's because the commands like
> >>>>>> SEV_PDH_CERT_EXPORT or SEV_PLATFORM_STATUS need to be copied from userspace to
> >>>>>> kernel to instruct kernel which services we want, ergo _IOWR, is that right?
> >>>>>
> >>>>> I'm not seeing any permissions checks in the sev_ioctl() function in the
> >>>>> kernel, so IIUC, that means any permissions are entirely based on whether
> >>>>> you can open the /dev/sev, once open you can run any ioctl.  What, if anything,
> >>>>> enforces which ioctls you can run when the device is only O_RDONLY vs O_RDWR ?
> >>>>
> >>>> I don't know, that's why I'm asking, because the manual didn't make it any
> >>>> clear for me whether there's a connection between the device permissions and
> >>>> ioctls that you're allowed to run.
> >>>>
> >>>>>
> >>>>>> In any case, a fix of some sort needs to land in QEMU first, because no udev
> >>>>>> rule would fix the current situation. Afterwards, I expect that having a rule
> >>>>>> like this:
> >>>>>>
> >>>>>> KERNEL=="sev", GROUP="kvm", MODE="0644"
> >>>>>>
> >>>>>> and a selinux policy rule adding the kvm_device_t label, we should be fine, do
> >>>>>> we agree on that?
> >>>>>
> >>>>> Based on what I think I see above, this looks like a bad idea.
> >>>>>
> >>>>> It still looks like we can solve this entirely in libvirt by just giving
> >>>>> the libvirt capabilities probing code CAP_DAC_OVERRIDE. This would make
> >>>>> libvirt work for all currently released SEV support in kernel/qemu.
> >>>>
> >>>> Sure we can, but that would make libvirt the only legitimate user of /dev/sev
> >>>> and everything else would require the admin to change the permissions
> >>>> explicitly so that other apps could at least retrieve the platform info, if
> >>>> it was intended to be for public use?
> >>>> Additionally, we'll still get shot down by SELinux because svirt_t wouldn't be
> >>>> able to access /dev/sev by default.
> >>>
> >>> That's separate form probing and just needs SELinux policy to define
> >>> a new  sev_device_t type and grant svirt_t access to it.
> >>
> >> I know, I misread "we can solve this entirely in libvirt" then, I thought you
> >> the SELinux part was included in the statement, my bad then. Still, back to the
> >> original issue, we could technically do both, libvirt would have run qemu with
> >> CAP_DAC_OVERRIDE and we keep working with everything's been released for
> >> SEV in kernel/qemu and for everyone else, systemd might add 0644 for /dev/sev,
> >> that way, everyone's happy, not that I'd be a fan of libvirt often having
> >> to work around something because projects underneath wouldn't backport fixes to
> >> all the distros we support, thus leaving the dirty work to us.
> >
> > Setting 0644 for /dev/sev looks unsafe to me unless someone can show
> > where the permissions checks take place for the many ioctls that
> > /dev/sev allows, such that only SEV_PDH_CERT_EXPORT or SEV_PLATFORM_STATUS
> > is allowed when /dev/sev is opened by a user who doesn't have write
> > permissions.
> >
>
> Agree its not safe to do 0644.
>
> Currently, anyone who has access to /dev/sev (read or write) will be
> able to execute SEV platform command. In other words there is no
> permission check per command basis. I must admit that while developing
> the driver I was under assumption that only root will ever access the
> /dev/sev device hence overlooked it. But now knowing that others may
> also need to access the /dev/sev, I can submit patch in kernel to do
> per command access control.
>
> Until then, can we follow Daniel's recommendation to elevate privilege
> of the probing code?

So, sorry for not coming back earlier, but I'm still fighting a permission
issue when opening the /dev/sev device and I honestly don't know what's
happening. If you apply the patch bellow (or attachment) and you run libvirtd
with LIBVIRT_LOG_OUTPUTS=1:file:<your_log_file> env, you should see something
like this in the logs:

warning : virExec:778 : INHERITABLE CAPS: 'dac_override'
warning : virExec:781 : EFFECTIVE CAPS: 'dac_override'
warning : virExec:784 : PERMITTED CAPS: 'dac_override'
warning : virExec:787 : BOUNDING CAPS: 'dac_override'

...and that is right before we issue execve. Yet, if you debug further into the
QEMU process right after execve, it doesn't report any capabilities at all, so
naturally it'll still get permission denied. Is there something I'm missing
here?

An alternative question I've been playing ever since we exchanged the last few
emails is that can't we wait until the ioctls are compared against permissions
in kernel so that upstream libvirt (and downstream too for that matter) doesn't
have to work around it and stick with that workaround for eternity?

diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index f504db7d05..4d7ec2781a 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -53,6 +53,10 @@
 #include <stdarg.h>
 #include <sys/utsname.h>

+#if WITH_CAPNG
+# include <cap-ng.h>
+#endif
+
 #define VIR_FROM_THIS VIR_FROM_QEMU

 VIR_LOG_INIT("qemu.qemu_capabilities");
@@ -4521,6 +4525,12 @@ virQEMUCapsInitQMPCommandRun(virQEMUCapsInitQMPCommandPtr cmd,
                                     NULL);
     virCommandAddEnvPassCommon(cmd->cmd);
     virCommandClearCaps(cmd->cmd);
+#if WITH_CAPNG
+    /* QEMU might run into permission issues, e.g. /dev/sev (0600), overriding
+     * it just for probing is okay from security POV */
+    virCommandAllowCap(cmd->cmd, CAP_DAC_OVERRIDE);
+#endif
+
     virCommandSetGID(cmd->cmd, cmd->runGid);
     virCommandSetUID(cmd->cmd, cmd->runUid);

diff --git a/src/util/vircommand.c b/src/util/vircommand.c
index d965068369..6d49416704 100644
--- a/src/util/vircommand.c
+++ b/src/util/vircommand.c
@@ -771,6 +771,23 @@ virExec(virCommandPtr cmd)
             goto fork_error;
     }

+#if WITH_CAPNG
+    if (strstr(cmd->args[0], "qemu")) {
+        VIR_WARN("INHERITABLE CAPS: '%s'",
+                 capng_print_caps_text(CAPNG_PRINT_BUFFER,
+                                       CAPNG_INHERITABLE));
+        VIR_WARN("EFFECTIVE CAPS: '%s'",
+                 capng_print_caps_text(CAPNG_PRINT_BUFFER,
+                                       CAPNG_EFFECTIVE));
+        VIR_WARN("PERMITTED CAPS: '%s'",
+                 capng_print_caps_text(CAPNG_PRINT_BUFFER,
+                                       CAPNG_PERMITTED));
+        VIR_WARN("BOUNDING CAPS: '%s'",
+                 capng_print_caps_text(CAPNG_PRINT_BUFFER,
+                                       CAPNG_BOUNDING_SET));
+    }
+#endif
+
     /* Close logging again to ensure no FDs leak to child */
     virLogReset();



Thanks,
Erik



[-- Attachment #2: dac_override.patch --]
[-- Type: text/plain, Size: 1940 bytes --]

diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index f504db7d05..4d7ec2781a 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -53,6 +53,10 @@
 #include <stdarg.h>
 #include <sys/utsname.h>
 
+#if WITH_CAPNG
+# include <cap-ng.h>
+#endif
+
 #define VIR_FROM_THIS VIR_FROM_QEMU
 
 VIR_LOG_INIT("qemu.qemu_capabilities");
@@ -4521,6 +4525,12 @@ virQEMUCapsInitQMPCommandRun(virQEMUCapsInitQMPCommandPtr cmd,
                                     NULL);
     virCommandAddEnvPassCommon(cmd->cmd);
     virCommandClearCaps(cmd->cmd);
+#if WITH_CAPNG
+    /* QEMU might run into permission issues, e.g. /dev/sev (0600), overriding
+     * it just for probing is okay from security POV */
+    virCommandAllowCap(cmd->cmd, CAP_DAC_OVERRIDE);
+#endif
+
     virCommandSetGID(cmd->cmd, cmd->runGid);
     virCommandSetUID(cmd->cmd, cmd->runUid);
 
diff --git a/src/util/vircommand.c b/src/util/vircommand.c
index d965068369..6d49416704 100644
--- a/src/util/vircommand.c
+++ b/src/util/vircommand.c
@@ -771,6 +771,23 @@ virExec(virCommandPtr cmd)
             goto fork_error;
     }
 
+#if WITH_CAPNG
+    if (strstr(cmd->args[0], "qemu")) {
+        VIR_WARN("INHERITABLE CAPS: '%s'",
+                 capng_print_caps_text(CAPNG_PRINT_BUFFER,
+                                       CAPNG_INHERITABLE));
+        VIR_WARN("EFFECTIVE CAPS: '%s'",
+                 capng_print_caps_text(CAPNG_PRINT_BUFFER,
+                                       CAPNG_EFFECTIVE));
+        VIR_WARN("PERMITTED CAPS: '%s'",
+                 capng_print_caps_text(CAPNG_PRINT_BUFFER,
+                                       CAPNG_PERMITTED));
+        VIR_WARN("BOUNDING CAPS: '%s'",
+                 capng_print_caps_text(CAPNG_PRINT_BUFFER,
+                                       CAPNG_BOUNDING_SET));
+    }
+#endif
+
     /* Close logging again to ensure no FDs leak to child */
     virLogReset();
 

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
  2019-01-29 16:15                 ` Erik Skultety
@ 2019-01-29 18:40                   ` Daniel P. Berrangé
  2019-01-30  8:06                     ` Erik Skultety
  0 siblings, 1 reply; 23+ messages in thread
From: Daniel P. Berrangé @ 2019-01-29 18:40 UTC (permalink / raw)
  To: Erik Skultety; +Cc: Singh, Brijesh, libvir-list, qemu-devel, dinechin, mkletzan

On Tue, Jan 29, 2019 at 05:15:42PM +0100, Erik Skultety wrote:
> On Wed, Jan 23, 2019 at 03:02:28PM +0000, Singh, Brijesh wrote:
> >
> >
> > On 1/23/19 7:36 AM, Daniel P. Berrangé wrote:
> > > On Wed, Jan 23, 2019 at 02:33:01PM +0100, Erik Skultety wrote:
> > >> On Wed, Jan 23, 2019 at 01:24:13PM +0000, Daniel P. Berrangé wrote:
> > >>> On Wed, Jan 23, 2019 at 02:22:12PM +0100, Erik Skultety wrote:
> > >>>> On Wed, Jan 23, 2019 at 01:10:42PM +0000, Daniel P. Berrangé wrote:
> > >>>>>
> > >>>>> It still looks like we can solve this entirely in libvirt by just giving
> > >>>>> the libvirt capabilities probing code CAP_DAC_OVERRIDE. This would make
> > >>>>> libvirt work for all currently released SEV support in kernel/qemu.
> > >>>>
> > >>>> Sure we can, but that would make libvirt the only legitimate user of /dev/sev
> > >>>> and everything else would require the admin to change the permissions
> > >>>> explicitly so that other apps could at least retrieve the platform info, if
> > >>>> it was intended to be for public use?
> > >>>> Additionally, we'll still get shot down by SELinux because svirt_t wouldn't be
> > >>>> able to access /dev/sev by default.
> > >>>
> > >>> That's separate form probing and just needs SELinux policy to define
> > >>> a new  sev_device_t type and grant svirt_t access to it.
> > >>
> > >> I know, I misread "we can solve this entirely in libvirt" then, I thought you
> > >> the SELinux part was included in the statement, my bad then. Still, back to the
> > >> original issue, we could technically do both, libvirt would have run qemu with
> > >> CAP_DAC_OVERRIDE and we keep working with everything's been released for
> > >> SEV in kernel/qemu and for everyone else, systemd might add 0644 for /dev/sev,
> > >> that way, everyone's happy, not that I'd be a fan of libvirt often having
> > >> to work around something because projects underneath wouldn't backport fixes to
> > >> all the distros we support, thus leaving the dirty work to us.
> > >
> > > Setting 0644 for /dev/sev looks unsafe to me unless someone can show
> > > where the permissions checks take place for the many ioctls that
> > > /dev/sev allows, such that only SEV_PDH_CERT_EXPORT or SEV_PLATFORM_STATUS
> > > is allowed when /dev/sev is opened by a user who doesn't have write
> > > permissions.
> > >
> >
> > Agree its not safe to do 0644.
> >
> > Currently, anyone who has access to /dev/sev (read or write) will be
> > able to execute SEV platform command. In other words there is no
> > permission check per command basis. I must admit that while developing
> > the driver I was under assumption that only root will ever access the
> > /dev/sev device hence overlooked it. But now knowing that others may
> > also need to access the /dev/sev, I can submit patch in kernel to do
> > per command access control.
> >
> > Until then, can we follow Daniel's recommendation to elevate privilege
> > of the probing code?
> 
> So, sorry for not coming back earlier, but I'm still fighting a permission
> issue when opening the /dev/sev device and I honestly don't know what's
> happening. If you apply the patch bellow (or attachment) and you run libvirtd
> with LIBVIRT_LOG_OUTPUTS=1:file:<your_log_file> env, you should see something
> like this in the logs:
> 
> warning : virExec:778 : INHERITABLE CAPS: 'dac_override'
> warning : virExec:781 : EFFECTIVE CAPS: 'dac_override'
> warning : virExec:784 : PERMITTED CAPS: 'dac_override'
> warning : virExec:787 : BOUNDING CAPS: 'dac_override'
> 
> ...and that is right before we issue execve. Yet, if you debug further into the
> QEMU process right after execve, it doesn't report any capabilities at all, so
> naturally it'll still get permission denied. Is there something I'm missing
> here?

Arrrrggghh.  We're missing the painful thing that I re-learn the hard way
every time we play with capabilities in libvirt.....

*Even* if you have set INHERITABLE capabilities, they are discarded
on execve(), unless the binary you are exec'ing has file capabilities
set that match those you are trying to give it. Of course QEMU binaries
(and almost all binaries) lack such file capabilities.

The capabilities man page does in fact explain this

[quote]
        P’(permitted) = (P(inheritable) & F(inheritable)) |
                        (F(permitted) & cap_bset)

        P’(effective) = F(effective) ? P’(permitted) : 0

where

        P         denotes the value of a thread capability set before the execve(2)

        P'        denotes the value of a thread capability set after the execve(2)

        F         denotes a file capability set

If ... real user ID of the process is 0 (root) then the file inheritable
and permitted sets are defined to be all ones
[/quote]

So this means your code works perfectly *if*  we are executing
QEMU as root, but if we are executing QEMU as non-root then
all our work is for nothing.

Now I have actually intentionally mis-quoted from an old man
page here. In Linux 4.3 the kernel devs invented a new set of
"ambient" capabilities [1] which allow programs to deal with this
insanely annoying behaviour.

[quote]
        P'(ambient)     = (file is privileged) ? 0 : P(ambient)

        P'(permitted)   = (P(inheritable) & F(inheritable)) |
                          (F(permitted) & cap_bset) | P'(ambient)

        P'(effective)   = F(effective) ? P'(permitted) : P'(ambient)
[/quote]

so if we added CAP_DAC_OVERRIDE to the ambient capabilities set,
then your patch would work.

We can do that with this patch:

diff --git a/src/util/virutil.c b/src/util/virutil.c
index 5251b66454..be83e91fee 100644
--- a/src/util/virutil.c
+++ b/src/util/virutil.c
@@ -1601,6 +1601,12 @@ virSetUIDGIDWithCaps(uid_t uid, gid_t gid, gid_t *groups, int ngroups,
         goto cleanup;
     }
 
+    for (i = 0; i <= CAP_LAST_CAP; i++) {
+        if (capBits & (1ULL << i)) {
+            prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, i, 0, 0);
+        }
+    }
+
     ret = 0;
  cleanup:
     return ret;


though, we need a #ifdef check for existance of PR_CAP_AMBIENT 

> An alternative question I've been playing ever since we exchanged the last few
> emails is that can't we wait until the ioctls are compared against permissions
> in kernel so that upstream libvirt (and downstream too for that matter) doesn't
> have to work around it and stick with that workaround for eternity?

IIUC, the SEV feature has already shipped with distros, so we'd effectively
be saying that what we already shipped is unusable to libvirt. This doesn't
feel like a desirable story to me.


Regards,
Daniel

[1] https://s3hh.wordpress.com/2015/07/25/ambient-capabilities/
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
  2019-01-29 18:40                   ` Daniel P. Berrangé
@ 2019-01-30  8:06                     ` Erik Skultety
  2019-01-30 10:37                       ` Daniel P. Berrangé
  0 siblings, 1 reply; 23+ messages in thread
From: Erik Skultety @ 2019-01-30  8:06 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Singh, Brijesh, libvir-list, qemu-devel, dinechin, mkletzan

On Tue, Jan 29, 2019 at 06:40:08PM +0000, Daniel P. Berrangé wrote:
> On Tue, Jan 29, 2019 at 05:15:42PM +0100, Erik Skultety wrote:
> > On Wed, Jan 23, 2019 at 03:02:28PM +0000, Singh, Brijesh wrote:
> > >
> > >
> > > On 1/23/19 7:36 AM, Daniel P. Berrangé wrote:
> > > > On Wed, Jan 23, 2019 at 02:33:01PM +0100, Erik Skultety wrote:
> > > >> On Wed, Jan 23, 2019 at 01:24:13PM +0000, Daniel P. Berrangé wrote:
> > > >>> On Wed, Jan 23, 2019 at 02:22:12PM +0100, Erik Skultety wrote:
> > > >>>> On Wed, Jan 23, 2019 at 01:10:42PM +0000, Daniel P. Berrangé wrote:
> > > >>>>>
> > > >>>>> It still looks like we can solve this entirely in libvirt by just giving
> > > >>>>> the libvirt capabilities probing code CAP_DAC_OVERRIDE. This would make
> > > >>>>> libvirt work for all currently released SEV support in kernel/qemu.
> > > >>>>
> > > >>>> Sure we can, but that would make libvirt the only legitimate user of /dev/sev
> > > >>>> and everything else would require the admin to change the permissions
> > > >>>> explicitly so that other apps could at least retrieve the platform info, if
> > > >>>> it was intended to be for public use?
> > > >>>> Additionally, we'll still get shot down by SELinux because svirt_t wouldn't be
> > > >>>> able to access /dev/sev by default.
> > > >>>
> > > >>> That's separate form probing and just needs SELinux policy to define
> > > >>> a new  sev_device_t type and grant svirt_t access to it.
> > > >>
> > > >> I know, I misread "we can solve this entirely in libvirt" then, I thought you
> > > >> the SELinux part was included in the statement, my bad then. Still, back to the
> > > >> original issue, we could technically do both, libvirt would have run qemu with
> > > >> CAP_DAC_OVERRIDE and we keep working with everything's been released for
> > > >> SEV in kernel/qemu and for everyone else, systemd might add 0644 for /dev/sev,
> > > >> that way, everyone's happy, not that I'd be a fan of libvirt often having
> > > >> to work around something because projects underneath wouldn't backport fixes to
> > > >> all the distros we support, thus leaving the dirty work to us.
> > > >
> > > > Setting 0644 for /dev/sev looks unsafe to me unless someone can show
> > > > where the permissions checks take place for the many ioctls that
> > > > /dev/sev allows, such that only SEV_PDH_CERT_EXPORT or SEV_PLATFORM_STATUS
> > > > is allowed when /dev/sev is opened by a user who doesn't have write
> > > > permissions.
> > > >
> > >
> > > Agree its not safe to do 0644.
> > >
> > > Currently, anyone who has access to /dev/sev (read or write) will be
> > > able to execute SEV platform command. In other words there is no
> > > permission check per command basis. I must admit that while developing
> > > the driver I was under assumption that only root will ever access the
> > > /dev/sev device hence overlooked it. But now knowing that others may
> > > also need to access the /dev/sev, I can submit patch in kernel to do
> > > per command access control.
> > >
> > > Until then, can we follow Daniel's recommendation to elevate privilege
> > > of the probing code?
> >
> > So, sorry for not coming back earlier, but I'm still fighting a permission
> > issue when opening the /dev/sev device and I honestly don't know what's
> > happening. If you apply the patch bellow (or attachment) and you run libvirtd
> > with LIBVIRT_LOG_OUTPUTS=1:file:<your_log_file> env, you should see something
> > like this in the logs:
> >
> > warning : virExec:778 : INHERITABLE CAPS: 'dac_override'
> > warning : virExec:781 : EFFECTIVE CAPS: 'dac_override'
> > warning : virExec:784 : PERMITTED CAPS: 'dac_override'
> > warning : virExec:787 : BOUNDING CAPS: 'dac_override'
> >
> > ...and that is right before we issue execve. Yet, if you debug further into the
> > QEMU process right after execve, it doesn't report any capabilities at all, so
> > naturally it'll still get permission denied. Is there something I'm missing
> > here?
>
> Arrrrggghh.  We're missing the painful thing that I re-learn the hard way
> every time we play with capabilities in libvirt.....
>
> *Even* if you have set INHERITABLE capabilities, they are discarded
> on execve(), unless the binary you are exec'ing has file capabilities
> set that match those you are trying to give it. Of course QEMU binaries
> (and almost all binaries) lack such file capabilities.

Thanks for ^this bit which helped me understand the bits below. When I read the
man page yesterday the first question was, okay, how do I figure out whether
the file capabilities bit is set? Well, use xattrs...which didn't return
anything, so I was puzzled what exactly it should look like, but now that you
explained that most binaries actually lack the file capabilities, I see the
issue clearly :).

>
> The capabilities man page does in fact explain this
>
> [quote]
>         P’(permitted) = (P(inheritable) & F(inheritable)) |
>                         (F(permitted) & cap_bset)
>
>         P’(effective) = F(effective) ? P’(permitted) : 0
>
> where
>
>         P         denotes the value of a thread capability set before the execve(2)
>
>         P'        denotes the value of a thread capability set after the execve(2)
>
>         F         denotes a file capability set
>
> If ... real user ID of the process is 0 (root) then the file inheritable
> and permitted sets are defined to be all ones
> [/quote]
>
> So this means your code works perfectly *if*  we are executing
> QEMU as root, but if we are executing QEMU as non-root then
> all our work is for nothing.
>
> Now I have actually intentionally mis-quoted from an old man
> page here. In Linux 4.3 the kernel devs invented a new set of
> "ambient" capabilities [1] which allow programs to deal with this
> insanely annoying behaviour.

Right, now the ambient stuff makes sense, because at first I didn't understand
what the hell that was about after reading the man page.

>
> [quote]
>         P'(ambient)     = (file is privileged) ? 0 : P(ambient)
>
>         P'(permitted)   = (P(inheritable) & F(inheritable)) |
>                           (F(permitted) & cap_bset) | P'(ambient)
>
>         P'(effective)   = F(effective) ? P'(permitted) : P'(ambient)
> [/quote]
>
> so if we added CAP_DAC_OVERRIDE to the ambient capabilities set,
> then your patch would work.
>
> We can do that with this patch:
>
> diff --git a/src/util/virutil.c b/src/util/virutil.c
> index 5251b66454..be83e91fee 100644
> --- a/src/util/virutil.c
> +++ b/src/util/virutil.c
> @@ -1601,6 +1601,12 @@ virSetUIDGIDWithCaps(uid_t uid, gid_t gid, gid_t *groups, int ngroups,
>          goto cleanup;
>      }
>
> +    for (i = 0; i <= CAP_LAST_CAP; i++) {
> +        if (capBits & (1ULL << i)) {
> +            prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, i, 0, 0);
> +        }
> +    }

I'll squash this in and give it another test.

> +
>      ret = 0;
>   cleanup:
>      return ret;
>
>
> though, we need a #ifdef check for existance of PR_CAP_AMBIENT
>
> > An alternative question I've been playing ever since we exchanged the last few
> > emails is that can't we wait until the ioctls are compared against permissions
> > in kernel so that upstream libvirt (and downstream too for that matter) doesn't
> > have to work around it and stick with that workaround for eternity?
>
> IIUC, the SEV feature has already shipped with distros, so we'd effectively
> be saying that what we already shipped is unusable to libvirt. This doesn't
> feel like a desirable story to me.

It was, but it never worked, it always has been broken in this way. When we
were merging this upstream, we had a terrible shortage of machines and we had
to share, so the first person to provision the machine had already taken care
of the permissions in order to test so that led to this issue having been
overlooked until now. If it ever worked as expected and then we broke it, then
any fix from our side would make sense but otherwise I believe we should fix
this bottom up.

Thanks,
Erik

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
  2019-01-30  8:06                     ` Erik Skultety
@ 2019-01-30 10:37                       ` Daniel P. Berrangé
  2019-01-30 13:39                         ` Erik Skultety
  0 siblings, 1 reply; 23+ messages in thread
From: Daniel P. Berrangé @ 2019-01-30 10:37 UTC (permalink / raw)
  To: Erik Skultety; +Cc: Singh, Brijesh, libvir-list, qemu-devel, dinechin, mkletzan

On Wed, Jan 30, 2019 at 09:06:30AM +0100, Erik Skultety wrote:
> Thanks for ^this bit which helped me understand the bits below. When I read the
> man page yesterday the first question was, okay, how do I figure out whether
> the file capabilities bit is set? Well, use xattrs...which didn't return
> anything, so I was puzzled what exactly it should look like, but now that you
> explained that most binaries actually lack the file capabilities, I see the
> issue clearly :).

The commands you want to experiment with are "getcap" and "setcap" eg

# getcap  qemu-system-x86_64
# setcap cap_dac_override=+ep qemu-system-x86_64
# getcap  qemu-system-x86_64
qemu-system-x86_64 = cap_dac_override+ep
# setcap  cap_dac_override= qemu-system-x86_64
# getcap  qemu-system-x86_64
qemu-system-x86_64 =
# setcap -r qemu-system-x86_64
# getcap  qemu-system-x86_64
# 



> > +
> >      ret = 0;
> >   cleanup:
> >      return ret;
> >
> >
> > though, we need a #ifdef check for existance of PR_CAP_AMBIENT
> >
> > > An alternative question I've been playing ever since we exchanged the last few
> > > emails is that can't we wait until the ioctls are compared against permissions
> > > in kernel so that upstream libvirt (and downstream too for that matter) doesn't
> > > have to work around it and stick with that workaround for eternity?
> >
> > IIUC, the SEV feature has already shipped with distros, so we'd effectively
> > be saying that what we already shipped is unusable to libvirt. This doesn't
> > feel like a desirable story to me.
> 
> It was, but it never worked, it always has been broken in this way. When we
> were merging this upstream, we had a terrible shortage of machines and we had
> to share, so the first person to provision the machine had already taken care
> of the permissions in order to test so that led to this issue having been
> overlooked until now. If it ever worked as expected and then we broke it, then
> any fix from our side would make sense but otherwise I believe we should fix
> this bottom up.

Well technically it would work if libvirt was configured to run as
root:root, but yes, that is not a normal or recommended configuration.

Personally I have a preference for userspace solutions, as those are
pretty straightforward to roll out to people as patches in existing
releases. Deploying kernel updates is a higher bar to cross for an
existing release.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
  2019-01-30 10:37                       ` Daniel P. Berrangé
@ 2019-01-30 13:39                         ` Erik Skultety
  2019-01-30 17:47                           ` Singh, Brijesh
  2019-01-30 18:18                           ` Daniel P. Berrangé
  0 siblings, 2 replies; 23+ messages in thread
From: Erik Skultety @ 2019-01-30 13:39 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Singh, Brijesh, libvir-list, qemu-devel, dinechin, mkletzan

> > > though, we need a #ifdef check for existance of PR_CAP_AMBIENT
> > >
> > > > An alternative question I've been playing ever since we exchanged the last few
> > > > emails is that can't we wait until the ioctls are compared against permissions
> > > > in kernel so that upstream libvirt (and downstream too for that matter) doesn't
> > > > have to work around it and stick with that workaround for eternity?
> > >
> > > IIUC, the SEV feature has already shipped with distros, so we'd effectively
> > > be saying that what we already shipped is unusable to libvirt. This doesn't
> > > feel like a desirable story to me.
> >
> > It was, but it never worked, it always has been broken in this way. When we
> > were merging this upstream, we had a terrible shortage of machines and we had
> > to share, so the first person to provision the machine had already taken care
> > of the permissions in order to test so that led to this issue having been
> > overlooked until now. If it ever worked as expected and then we broke it, then
> > any fix from our side would make sense but otherwise I believe we should fix
> > this bottom up.
>
> Well technically it would work if libvirt was configured to run as
> root:root, but yes, that is not a normal or recommended configuration.
>
> Personally I have a preference for userspace solutions, as those are
> pretty straightforward to roll out to people as patches in existing
> releases. Deploying kernel updates is a higher bar to cross for an
> existing release.

So, can you compile the prctl stuff in kernel conditionally? If so, then that's
a problem because you may end up with a platform where SEV is supported within
kernel, but you don't have the ambient stuff we have to conditionally compile
in libvirt, so you end up with broken SEV support anyway, I wanted to argue
with centos 7, but the ambient set support was backported to 3.10, so the only
distro where we'd have a problem from userspace POV would be debian 8, but then
again the kernel there is so old that neither SEV is supported there.

I understand your point, but it also sounds very agile and I don't think that
compensating with "something that is fast" for "something that is right" is the
way to go in the long term. Especially since we almost never deprecate stuff
and we can't break compatibility. Trying to work around every issue coming
from your dependencies in your project is highly unsustainable.

Thanks,
Erik

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
  2019-01-30 13:39                         ` Erik Skultety
@ 2019-01-30 17:47                           ` Singh, Brijesh
  2019-01-30 18:18                           ` Daniel P. Berrangé
  1 sibling, 0 replies; 23+ messages in thread
From: Singh, Brijesh @ 2019-01-30 17:47 UTC (permalink / raw)
  To: Erik Skultety, Daniel P. Berrangé
  Cc: Singh, Brijesh, libvir-list, qemu-devel, dinechin, mkletzan



On 1/30/19 7:39 AM, Erik Skultety wrote:
>>>> though, we need a #ifdef check for existance of PR_CAP_AMBIENT
>>>>
>>>>> An alternative question I've been playing ever since we exchanged the last few
>>>>> emails is that can't we wait until the ioctls are compared against permissions
>>>>> in kernel so that upstream libvirt (and downstream too for that matter) doesn't
>>>>> have to work around it and stick with that workaround for eternity?
>>>>
>>>> IIUC, the SEV feature has already shipped with distros, so we'd effectively
>>>> be saying that what we already shipped is unusable to libvirt. This doesn't
>>>> feel like a desirable story to me.
>>>
>>> It was, but it never worked, it always has been broken in this way. When we
>>> were merging this upstream, we had a terrible shortage of machines and we had
>>> to share, so the first person to provision the machine had already taken care
>>> of the permissions in order to test so that led to this issue having been
>>> overlooked until now. If it ever worked as expected and then we broke it, then
>>> any fix from our side would make sense but otherwise I believe we should fix
>>> this bottom up.
>>
>> Well technically it would work if libvirt was configured to run as
>> root:root, but yes, that is not a normal or recommended configuration.
>>
>> Personally I have a preference for userspace solutions, as those are
>> pretty straightforward to roll out to people as patches in existing
>> releases. Deploying kernel updates is a higher bar to cross for an
>> existing release.
> 
> So, can you compile the prctl stuff in kernel conditionally? If so, then that's
> a problem because you may end up with a platform where SEV is supported within
> kernel, but you don't have the ambient stuff we have to conditionally compile
> in libvirt, so you end up with broken SEV support anyway, I wanted to argue
> with centos 7, but the ambient set support was backported to 3.10, so the only
> distro where we'd have a problem from userspace POV would be debian 8, but then
> again the kernel there is so old that neither SEV is supported there.
> 


Are you referring to prctl syscall ? If so, I don't think you can
conditionally compile it out. It will be always there. If getting the
libvirt  to run as root:root during the probe is cumbersome and
causing the backward compatibility issues then I guess we can make
/dev/sev 0644. The 0644 will not create any security vulnerability per
say. It may expose us to a DoS attack. e.g a normal user can
open /dev/sev and issue commands to import new certificates and fill the
storage quickly etc. In long run I do want to patch kernel so that a
user without "write" access will not able to issue any command which
will cause the FW to do some flash writes.

In summary, I am against making /dev/sev 0644 if its simplifies the
integrating in libvirt.


> I understand your point, but it also sounds very agile and I don't think that
> compensating with "something that is fast" for "something that is right" is the
> way to go in the long term. Especially since we almost never deprecate stuff
> and we can't break compatibility. Trying to work around every issue coming
> from your dependencies in your project is highly unsustainable.
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
  2019-01-30 13:39                         ` Erik Skultety
  2019-01-30 17:47                           ` Singh, Brijesh
@ 2019-01-30 18:18                           ` Daniel P. Berrangé
  2019-01-31 15:28                             ` Erik Skultety
  1 sibling, 1 reply; 23+ messages in thread
From: Daniel P. Berrangé @ 2019-01-30 18:18 UTC (permalink / raw)
  To: Erik Skultety; +Cc: Singh, Brijesh, libvir-list, qemu-devel, dinechin, mkletzan

On Wed, Jan 30, 2019 at 02:39:54PM +0100, Erik Skultety wrote:
> > > > though, we need a #ifdef check for existance of PR_CAP_AMBIENT
> > > >
> > > > > An alternative question I've been playing ever since we exchanged the last few
> > > > > emails is that can't we wait until the ioctls are compared against permissions
> > > > > in kernel so that upstream libvirt (and downstream too for that matter) doesn't
> > > > > have to work around it and stick with that workaround for eternity?
> > > >
> > > > IIUC, the SEV feature has already shipped with distros, so we'd effectively
> > > > be saying that what we already shipped is unusable to libvirt. This doesn't
> > > > feel like a desirable story to me.
> > >
> > > It was, but it never worked, it always has been broken in this way. When we
> > > were merging this upstream, we had a terrible shortage of machines and we had
> > > to share, so the first person to provision the machine had already taken care
> > > of the permissions in order to test so that led to this issue having been
> > > overlooked until now. If it ever worked as expected and then we broke it, then
> > > any fix from our side would make sense but otherwise I believe we should fix
> > > this bottom up.
> >
> > Well technically it would work if libvirt was configured to run as
> > root:root, but yes, that is not a normal or recommended configuration.
> >
> > Personally I have a preference for userspace solutions, as those are
> > pretty straightforward to roll out to people as patches in existing
> > releases. Deploying kernel updates is a higher bar to cross for an
> > existing release.
> 
> So, can you compile the prctl stuff in kernel conditionally? If so, then that's
> a problem because you may end up with a platform where SEV is supported within
> kernel, but you don't have the ambient stuff we have to conditionally compile
> in libvirt, so you end up with broken SEV support anyway, I wanted to argue
> with centos 7, but the ambient set support was backported to 3.10, so the only
> distro where we'd have a problem from userspace POV would be debian 8, but then
> again the kernel there is so old that neither SEV is supported there.
> 
> I understand your point, but it also sounds very agile and I don't think that
> compensating with "something that is fast" for "something that is right" is the
> way to go in the long term. Especially since we almost never deprecate stuff
> and we can't break compatibility. Trying to work around every issue coming
> from your dependencies in your project is highly unsustainable.

With the launching of VMs we've got to a place where libvirt is pretty
robust about being able to grant access regardless of what the host OS
has done for permissions in /dev.  I think its desirable that this same
flexibility extends to capabilities probing, which is somethign the
dac_override approach allows. IOW, even if the kernel changes /dev/sev
as previously discussed, I would keep the dac_override stuff for probing
capabilities forever. This makes sure we'll work even if the distro in
question has strictly locked down permissions on /dev/kvm or /dev/sev,
diverging from the default udev settup


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
  2019-01-30 18:18                           ` Daniel P. Berrangé
@ 2019-01-31 15:28                             ` Erik Skultety
  0 siblings, 0 replies; 23+ messages in thread
From: Erik Skultety @ 2019-01-31 15:28 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Singh, Brijesh, libvir-list, qemu-devel, dinechin, mkletzan

On Wed, Jan 30, 2019 at 06:18:22PM +0000, Daniel P. Berrangé wrote:
> On Wed, Jan 30, 2019 at 02:39:54PM +0100, Erik Skultety wrote:
> > > > > though, we need a #ifdef check for existance of PR_CAP_AMBIENT
> > > > >
> > > > > > An alternative question I've been playing ever since we exchanged the last few
> > > > > > emails is that can't we wait until the ioctls are compared against permissions
> > > > > > in kernel so that upstream libvirt (and downstream too for that matter) doesn't
> > > > > > have to work around it and stick with that workaround for eternity?
> > > > >
> > > > > IIUC, the SEV feature has already shipped with distros, so we'd effectively
> > > > > be saying that what we already shipped is unusable to libvirt. This doesn't
> > > > > feel like a desirable story to me.
> > > >
> > > > It was, but it never worked, it always has been broken in this way. When we
> > > > were merging this upstream, we had a terrible shortage of machines and we had
> > > > to share, so the first person to provision the machine had already taken care
> > > > of the permissions in order to test so that led to this issue having been
> > > > overlooked until now. If it ever worked as expected and then we broke it, then
> > > > any fix from our side would make sense but otherwise I believe we should fix
> > > > this bottom up.
> > >
> > > Well technically it would work if libvirt was configured to run as
> > > root:root, but yes, that is not a normal or recommended configuration.
> > >
> > > Personally I have a preference for userspace solutions, as those are
> > > pretty straightforward to roll out to people as patches in existing
> > > releases. Deploying kernel updates is a higher bar to cross for an
> > > existing release.
> >
> > So, can you compile the prctl stuff in kernel conditionally? If so, then that's
> > a problem because you may end up with a platform where SEV is supported within
> > kernel, but you don't have the ambient stuff we have to conditionally compile
> > in libvirt, so you end up with broken SEV support anyway, I wanted to argue
> > with centos 7, but the ambient set support was backported to 3.10, so the only
> > distro where we'd have a problem from userspace POV would be debian 8, but then
> > again the kernel there is so old that neither SEV is supported there.
> >
> > I understand your point, but it also sounds very agile and I don't think that
> > compensating with "something that is fast" for "something that is right" is the
> > way to go in the long term. Especially since we almost never deprecate stuff
> > and we can't break compatibility. Trying to work around every issue coming
> > from your dependencies in your project is highly unsustainable.
>
> With the launching of VMs we've got to a place where libvirt is pretty
> robust about being able to grant access regardless of what the host OS
> has done for permissions in /dev.  I think its desirable that this same
> flexibility extends to capabilities probing, which is somethign the
> dac_override approach allows. IOW, even if the kernel changes /dev/sev
> as previously discussed, I would keep the dac_override stuff for probing
> capabilities forever. This makes sure we'll work even if the distro in
> question has strictly locked down permissions on /dev/kvm or /dev/sev,
> diverging from the default udev settup

Fair enough, I resent the series where I only exposed /dev/sev to machines that
need it and applied the DAC_OVERRIDE_PATCH on top of that.
https://www.redhat.com/archives/libvir-list/2019-January/msg01343.html

Erik

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2019-01-31 15:28 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-18  9:39 [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities Erik Skultety
2019-01-18 10:16 ` Daniel P. Berrangé
2019-01-18 10:56   ` [Qemu-devel] [libvirt] " Erik Skultety
2019-01-18 11:11   ` [Qemu-devel] " Martin Kletzander
2019-01-18 11:17     ` Daniel P. Berrangé
2019-01-18 11:31       ` Martin Kletzander
2019-01-18 12:51 ` Singh, Brijesh
2019-01-23 12:55   ` Erik Skultety
2019-01-23 13:10     ` Daniel P. Berrangé
2019-01-23 13:22       ` Erik Skultety
2019-01-23 13:24         ` Daniel P. Berrangé
2019-01-23 13:33           ` Erik Skultety
2019-01-23 13:36             ` Daniel P. Berrangé
2019-01-23 15:02               ` Singh, Brijesh
2019-01-23 15:29                 ` Erik Skultety
2019-01-29 16:15                 ` Erik Skultety
2019-01-29 18:40                   ` Daniel P. Berrangé
2019-01-30  8:06                     ` Erik Skultety
2019-01-30 10:37                       ` Daniel P. Berrangé
2019-01-30 13:39                         ` Erik Skultety
2019-01-30 17:47                           ` Singh, Brijesh
2019-01-30 18:18                           ` Daniel P. Berrangé
2019-01-31 15:28                             ` Erik Skultety

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.