linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] PCI/MSI: Hardening fixes
@ 2023-01-19 17:06 Alexander Shishkin
  2023-01-19 17:06 ` [PATCH 1/2] PCI/MSI: Cache the MSIX table size Alexander Shishkin
  2023-01-19 17:06 ` [PATCH 2/2] PCI/MSI: Validate device supplied MSI table offset and size Alexander Shishkin
  0 siblings, 2 replies; 16+ messages in thread
From: Alexander Shishkin @ 2023-01-19 17:06 UTC (permalink / raw)
  To: Bjorn Helgaas, Thomas Gleixner, linux-pci
  Cc: linux-kernel, Marc Zyngier, darwi, elena.reshetova,
	kirill.shutemov, Alexander Shishkin

Hi,

Here are 2 hardening fixes around MSIX table size/offset handling,
aiming to prevent a malicious device or VMM from triggering bugs by
supplying bogus values.

Alexander Shishkin (2):
  PCI/MSI: Cache the MSIX table size
  PCI/MSI: Validate device supplied MSI table offset and size

 drivers/pci/msi/api.c |  7 ++++++-
 drivers/pci/msi/msi.c | 12 +++++++++---
 include/linux/pci.h   |  1 +
 3 files changed, 16 insertions(+), 4 deletions(-)

-- 
2.39.0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/2] PCI/MSI: Cache the MSIX table size
  2023-01-19 17:06 [PATCH 0/2] PCI/MSI: Hardening fixes Alexander Shishkin
@ 2023-01-19 17:06 ` Alexander Shishkin
  2023-01-22  9:00   ` Leon Romanovsky
  2023-01-19 17:06 ` [PATCH 2/2] PCI/MSI: Validate device supplied MSI table offset and size Alexander Shishkin
  1 sibling, 1 reply; 16+ messages in thread
From: Alexander Shishkin @ 2023-01-19 17:06 UTC (permalink / raw)
  To: Bjorn Helgaas, Thomas Gleixner, linux-pci
  Cc: linux-kernel, Marc Zyngier, darwi, elena.reshetova,
	kirill.shutemov, Alexander Shishkin, Mika Westerberg, stable

A malicious device can change its MSIX table size between the table
ioremap() and subsequent accesses, resulting in a kernel page fault in
pci_write_msg_msix().

To avoid this, cache the table size observed at the moment of table
ioremap() and use the cached value. This, however, does not help drivers
that peek at the PCIE_MSIX_FLAGS register directly.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: stable@vger.kernel.org
---
 drivers/pci/msi/api.c | 7 ++++++-
 drivers/pci/msi/msi.c | 2 +-
 include/linux/pci.h   | 1 +
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/msi/api.c b/drivers/pci/msi/api.c
index b8009aa11f3c..617ea1256487 100644
--- a/drivers/pci/msi/api.c
+++ b/drivers/pci/msi/api.c
@@ -75,8 +75,13 @@ int pci_msix_vec_count(struct pci_dev *dev)
 	if (!dev->msix_cap)
 		return -EINVAL;
 
+	if (dev->flags_qsize)
+		return dev->flags_qsize;
+
 	pci_read_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, &control);
-	return msix_table_size(control);
+	dev->flags_qsize = msix_table_size(control);
+
+	return dev->flags_qsize;
 }
 EXPORT_SYMBOL(pci_msix_vec_count);
 
diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
index 1f716624ca56..d50cd45119f1 100644
--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -715,7 +715,7 @@ static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries,
 
 	pci_read_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, &control);
 	/* Request & Map MSI-X table region */
-	tsize = msix_table_size(control);
+	tsize = pci_msix_vec_count(dev);
 	dev->msix_base = msix_map_region(dev, tsize);
 	if (!dev->msix_base) {
 		ret = -ENOMEM;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index adffd65e84b4..2e1a72a2139d 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -352,6 +352,7 @@ struct pci_dev {
 	u8		rom_base_reg;	/* Config register controlling ROM */
 	u8		pin;		/* Interrupt pin this device uses */
 	u16		pcie_flags_reg;	/* Cached PCIe Capabilities Register */
+	u16		flags_qsize;	/* Cached MSIX table size */
 	unsigned long	*dma_alias_mask;/* Mask of enabled devfn aliases */
 
 	struct pci_driver *driver;	/* Driver bound to this device */
-- 
2.39.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/2] PCI/MSI: Validate device supplied MSI table offset and size
  2023-01-19 17:06 [PATCH 0/2] PCI/MSI: Hardening fixes Alexander Shishkin
  2023-01-19 17:06 ` [PATCH 1/2] PCI/MSI: Cache the MSIX table size Alexander Shishkin
@ 2023-01-19 17:06 ` Alexander Shishkin
  1 sibling, 0 replies; 16+ messages in thread
From: Alexander Shishkin @ 2023-01-19 17:06 UTC (permalink / raw)
  To: Bjorn Helgaas, Thomas Gleixner, linux-pci
  Cc: linux-kernel, Marc Zyngier, darwi, elena.reshetova,
	kirill.shutemov, Alexander Shishkin, Mika Westerberg,
	Ilpo Järvinen, stable

Currently, the MSI table offset supplied by device's config space is
passed directly into ioremap() without validation, allowing, for
example, a malicious VMM to trick the OS into exposing its private
memory.

Correct this by making sure the table with the given number of vectors
fits into its BIR starting at the provided table offset.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reported-by: Elena Reshetova <elena.reshetova@intel.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: stable@vger.kernel.org
---
 drivers/pci/msi/msi.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
index d50cd45119f1..e93e633cb6a3 100644
--- a/drivers/pci/msi/msi.c
+++ b/drivers/pci/msi/msi.c
@@ -552,7 +552,8 @@ static void __iomem *msix_map_region(struct pci_dev *dev,
 				     unsigned int nr_entries)
 {
 	resource_size_t phys_addr;
-	u32 table_offset;
+	u32 table_offset, table_size;
+	resource_size_t bir_size;
 	unsigned long flags;
 	u8 bir;
 
@@ -563,10 +564,15 @@ static void __iomem *msix_map_region(struct pci_dev *dev,
 	if (!flags || (flags & IORESOURCE_UNSET))
 		return NULL;
 
+	bir_size = pci_resource_len(dev, bir);
+	table_size = nr_entries * PCI_MSIX_ENTRY_SIZE;
 	table_offset &= PCI_MSIX_TABLE_OFFSET;
+	if (bir_size < table_size || table_offset > bir_size - table_size)
+		return NULL;
+
 	phys_addr = pci_resource_start(dev, bir) + table_offset;
 
-	return ioremap(phys_addr, nr_entries * PCI_MSIX_ENTRY_SIZE);
+	return ioremap(phys_addr, table_size);
 }
 
 /**
-- 
2.39.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] PCI/MSI: Cache the MSIX table size
  2023-01-19 17:06 ` [PATCH 1/2] PCI/MSI: Cache the MSIX table size Alexander Shishkin
@ 2023-01-22  9:00   ` Leon Romanovsky
  2023-01-22 10:57     ` Marc Zyngier
                       ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Leon Romanovsky @ 2023-01-22  9:00 UTC (permalink / raw)
  To: Alexander Shishkin
  Cc: Bjorn Helgaas, Thomas Gleixner, linux-pci, linux-kernel,
	Marc Zyngier, darwi, elena.reshetova, kirill.shutemov,
	Mika Westerberg, stable

On Thu, Jan 19, 2023 at 07:06:32PM +0200, Alexander Shishkin wrote:
> A malicious device can change its MSIX table size between the table
> ioremap() and subsequent accesses, resulting in a kernel page fault in
> pci_write_msg_msix().
> 
> To avoid this, cache the table size observed at the moment of table
> ioremap() and use the cached value. This, however, does not help drivers
> that peek at the PCIE_MSIX_FLAGS register directly.
> 
> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> Cc: stable@vger.kernel.org
> ---
>  drivers/pci/msi/api.c | 7 ++++++-
>  drivers/pci/msi/msi.c | 2 +-
>  include/linux/pci.h   | 1 +
>  3 files changed, 8 insertions(+), 2 deletions(-)

I'm not security expert here, but not sure that this protects from anything.
1. Kernel relies on working and not-malicious HW. There are gazillion ways
to cause crashes other than changing MSI-X.
2. Device can report large table size, kernel will cache it and
malicious device will reduce it back. It is not handled and will cause
to kernel crash too.

Thanks

> 
> diff --git a/drivers/pci/msi/api.c b/drivers/pci/msi/api.c
> index b8009aa11f3c..617ea1256487 100644
> --- a/drivers/pci/msi/api.c
> +++ b/drivers/pci/msi/api.c
> @@ -75,8 +75,13 @@ int pci_msix_vec_count(struct pci_dev *dev)
>  	if (!dev->msix_cap)
>  		return -EINVAL;
>  
> +	if (dev->flags_qsize)
> +		return dev->flags_qsize;
> +
>  	pci_read_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, &control);
> -	return msix_table_size(control);
> +	dev->flags_qsize = msix_table_size(control);
> +
> +	return dev->flags_qsize;
>  }
>  EXPORT_SYMBOL(pci_msix_vec_count);
>  
> diff --git a/drivers/pci/msi/msi.c b/drivers/pci/msi/msi.c
> index 1f716624ca56..d50cd45119f1 100644
> --- a/drivers/pci/msi/msi.c
> +++ b/drivers/pci/msi/msi.c
> @@ -715,7 +715,7 @@ static int msix_capability_init(struct pci_dev *dev, struct msix_entry *entries,
>  
>  	pci_read_config_word(dev, dev->msix_cap + PCI_MSIX_FLAGS, &control);
>  	/* Request & Map MSI-X table region */
> -	tsize = msix_table_size(control);
> +	tsize = pci_msix_vec_count(dev);
>  	dev->msix_base = msix_map_region(dev, tsize);
>  	if (!dev->msix_base) {
>  		ret = -ENOMEM;
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index adffd65e84b4..2e1a72a2139d 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -352,6 +352,7 @@ struct pci_dev {
>  	u8		rom_base_reg;	/* Config register controlling ROM */
>  	u8		pin;		/* Interrupt pin this device uses */
>  	u16		pcie_flags_reg;	/* Cached PCIe Capabilities Register */
> +	u16		flags_qsize;	/* Cached MSIX table size */
>  	unsigned long	*dma_alias_mask;/* Mask of enabled devfn aliases */
>  
>  	struct pci_driver *driver;	/* Driver bound to this device */
> -- 
> 2.39.0
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] PCI/MSI: Cache the MSIX table size
  2023-01-22  9:00   ` Leon Romanovsky
@ 2023-01-22 10:57     ` Marc Zyngier
  2023-01-22 15:34       ` David Laight
  2023-01-24 11:59       ` Alexander Shishkin
  2023-01-22 10:57     ` Greg KH
  2023-01-24 11:52     ` Alexander Shishkin
  2 siblings, 2 replies; 16+ messages in thread
From: Marc Zyngier @ 2023-01-22 10:57 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Alexander Shishkin, Bjorn Helgaas, Thomas Gleixner, linux-pci,
	linux-kernel, darwi, elena.reshetova, kirill.shutemov,
	Mika Westerberg, stable

On Sun, 22 Jan 2023 09:00:04 +0000,
Leon Romanovsky <leon@kernel.org> wrote:
> 
> On Thu, Jan 19, 2023 at 07:06:32PM +0200, Alexander Shishkin wrote:
> > A malicious device can change its MSIX table size between the table
> > ioremap() and subsequent accesses, resulting in a kernel page fault in
> > pci_write_msg_msix().
> > 
> > To avoid this, cache the table size observed at the moment of table
> > ioremap() and use the cached value. This, however, does not help drivers
> > that peek at the PCIE_MSIX_FLAGS register directly.
> > 
> > Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> > Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> > Cc: stable@vger.kernel.org
> > ---
> >  drivers/pci/msi/api.c | 7 ++++++-
> >  drivers/pci/msi/msi.c | 2 +-
> >  include/linux/pci.h   | 1 +
> >  3 files changed, 8 insertions(+), 2 deletions(-)
> 
> I'm not security expert here, but not sure that this protects from anything.
> 1. Kernel relies on working and not-malicious HW. There are gazillion ways
> to cause crashes other than changing MSI-X.
> 2. Device can report large table size, kernel will cache it and
> malicious device will reduce it back. It is not handled and will cause
> to kernel crash too.
> 

Indeed, this was my exact reaction reading this patch. This only makes
sure the same (potentially wrong) value is used at all times. So while
this results in a consistent use, this doesn't give much guarantee.

The only way to deal with this is to actually handle the resulting
fault, similar to what the kernel does when accessing userspace. Not
sure how possible this is with something like PCIe.

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] PCI/MSI: Cache the MSIX table size
  2023-01-22  9:00   ` Leon Romanovsky
  2023-01-22 10:57     ` Marc Zyngier
@ 2023-01-22 10:57     ` Greg KH
  2023-01-23 10:22       ` Jonathan Cameron
  2023-01-24 11:52     ` Alexander Shishkin
  2 siblings, 1 reply; 16+ messages in thread
From: Greg KH @ 2023-01-22 10:57 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Alexander Shishkin, Bjorn Helgaas, Thomas Gleixner, linux-pci,
	linux-kernel, Marc Zyngier, darwi, elena.reshetova,
	kirill.shutemov, Mika Westerberg, stable

On Sun, Jan 22, 2023 at 11:00:04AM +0200, Leon Romanovsky wrote:
> On Thu, Jan 19, 2023 at 07:06:32PM +0200, Alexander Shishkin wrote:
> > A malicious device can change its MSIX table size between the table
> > ioremap() and subsequent accesses, resulting in a kernel page fault in
> > pci_write_msg_msix().
> > 
> > To avoid this, cache the table size observed at the moment of table
> > ioremap() and use the cached value. This, however, does not help drivers
> > that peek at the PCIE_MSIX_FLAGS register directly.
> > 
> > Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> > Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> > Cc: stable@vger.kernel.org
> > ---
> >  drivers/pci/msi/api.c | 7 ++++++-
> >  drivers/pci/msi/msi.c | 2 +-
> >  include/linux/pci.h   | 1 +
> >  3 files changed, 8 insertions(+), 2 deletions(-)
> 
> I'm not security expert here, but not sure that this protects from anything.
> 1. Kernel relies on working and not-malicious HW. There are gazillion ways
> to cause crashes other than changing MSI-X.

Linux does NOT protect from malicious PCIe devices at this point in
time, you are correct.  If we wish to change that model, then we can
work on that with the explict understanding that most all drivers will
need to change as will the bus logic for the busses involved.

To do piece-meal patches like this for no good reason is not a good idea
as it achieves nothing in the end :(

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH 1/2] PCI/MSI: Cache the MSIX table size
  2023-01-22 10:57     ` Marc Zyngier
@ 2023-01-22 15:34       ` David Laight
  2023-01-24 11:59       ` Alexander Shishkin
  1 sibling, 0 replies; 16+ messages in thread
From: David Laight @ 2023-01-22 15:34 UTC (permalink / raw)
  To: 'Marc Zyngier', Leon Romanovsky
  Cc: Alexander Shishkin, Bjorn Helgaas, Thomas Gleixner, linux-pci,
	linux-kernel, darwi, elena.reshetova, kirill.shutemov,
	Mika Westerberg, stable

From: Marc Zyngier
> Sent: 22 January 2023 10:57
> 
> On Sun, 22 Jan 2023 09:00:04 +0000,
> Leon Romanovsky <leon@kernel.org> wrote:
> >
> > On Thu, Jan 19, 2023 at 07:06:32PM +0200, Alexander Shishkin wrote:
> > > A malicious device can change its MSIX table size between the table
> > > ioremap() and subsequent accesses, resulting in a kernel page fault in
> > > pci_write_msg_msix().
> > >
> > > To avoid this, cache the table size observed at the moment of table
> > > ioremap() and use the cached value. This, however, does not help drivers
> > > that peek at the PCIE_MSIX_FLAGS register directly.
> > >
> > > Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> > > Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> > > Cc: stable@vger.kernel.org
> > > ---
> > >  drivers/pci/msi/api.c | 7 ++++++-
> > >  drivers/pci/msi/msi.c | 2 +-
> > >  include/linux/pci.h   | 1 +
> > >  3 files changed, 8 insertions(+), 2 deletions(-)
> >
> > I'm not security expert here, but not sure that this protects from anything.
> > 1. Kernel relies on working and not-malicious HW. There are gazillion ways
> > to cause crashes other than changing MSI-X.
> > 2. Device can report large table size, kernel will cache it and
> > malicious device will reduce it back. It is not handled and will cause
> > to kernel crash too.
> >
> 
> Indeed, this was my exact reaction reading this patch. This only makes
> sure the same (potentially wrong) value is used at all times. So while
> this results in a consistent use, this doesn't give much guarantee.

Yep, the device can 'choose' to error any PCIe read.

> The only way to deal with this is to actually handle the resulting
> fault, similar to what the kernel does when accessing userspace. Not
> sure how possible this is with something like PCIe.

I don't think you get a fault, the PCIe read completes with value ~0.
You might get an AER indication, that may not be helpful at all.
We've some x86 systems where that ends up as an NMI and panic!

A more valid reason for caching the MSIX table size is to avoid
doing a slow PCIe read.
I'm not sure how fast they are on 'normal' hardware, but the Altera
fpga we use is particularly pedestrian.
I just measured back-to-back reads at 126 clocks of the internal
125MHz clock so almost exactly 1us - or 3000 clocks of a 3GHz cpu.
(I added PCIe trace to the fpga so we can see what goes on.)

There is actually a much more 'interesting' issue with MSIX.
There are 16 bytes of data for each interrupt.

The kernel doesn't even try to ensure they are written as
a single PCIe TLP, and even if it did there is no real
guarantee the writes aren't split before the logic that
raises interrupts reads the values.

It is also quite likely that the interrupt raising logic
doesn't to an atomic read of all 16 bytes, so a cpu write
could split the reads.

This doesn't normally matter - the interrupt is enabled long
after the address/data fields are initialised.
But if the interrupt affinity is changed both the address and
data fields are likely to be changed on an interrupt that is
(nominally) enabled.

It is pretty easy to imagine the new address being used with
the old data (or v.v.) or even a 'torn read' of the 64bit
address field.

I don't remember seeing anything in the MSIX spec about
requirements on the hardware - which puts the onus on
the software to ensure the MSIX data is always valid.
This means that changing the vector needs to:
	Disable the interrupt.
	Delay (a read from the MSIX block is probably enough).
	Update the address and data.
	Delay.
	Enable the interrupt.
But I don't remember seeing the kernel to any of that.

	David.

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] PCI/MSI: Cache the MSIX table size
  2023-01-22 10:57     ` Greg KH
@ 2023-01-23 10:22       ` Jonathan Cameron
  0 siblings, 0 replies; 16+ messages in thread
From: Jonathan Cameron @ 2023-01-23 10:22 UTC (permalink / raw)
  To: Greg KH
  Cc: Leon Romanovsky, Alexander Shishkin, Bjorn Helgaas,
	Thomas Gleixner, linux-pci, linux-kernel, Marc Zyngier, darwi,
	elena.reshetova, kirill.shutemov, Mika Westerberg, stable,
	Lukas Wunner

On Sun, 22 Jan 2023 11:57:58 +0100
Greg KH <gregkh@linuxfoundation.org> wrote:

> On Sun, Jan 22, 2023 at 11:00:04AM +0200, Leon Romanovsky wrote:
> > On Thu, Jan 19, 2023 at 07:06:32PM +0200, Alexander Shishkin wrote:  
> > > A malicious device can change its MSIX table size between the table
> > > ioremap() and subsequent accesses, resulting in a kernel page fault in
> > > pci_write_msg_msix().
> > > 
> > > To avoid this, cache the table size observed at the moment of table
> > > ioremap() and use the cached value. This, however, does not help drivers
> > > that peek at the PCIE_MSIX_FLAGS register directly.
> > > 
> > > Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> > > Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> > > Cc: stable@vger.kernel.org
> > > ---
> > >  drivers/pci/msi/api.c | 7 ++++++-
> > >  drivers/pci/msi/msi.c | 2 +-
> > >  include/linux/pci.h   | 1 +
> > >  3 files changed, 8 insertions(+), 2 deletions(-)  
> > 
> > I'm not security expert here, but not sure that this protects from anything.
> > 1. Kernel relies on working and not-malicious HW. There are gazillion ways
> > to cause crashes other than changing MSI-X.  
> 
> Linux does NOT protect from malicious PCIe devices at this point in
> time, you are correct.  If we wish to change that model, then we can
> work on that with the explict understanding that most all drivers will
> need to change as will the bus logic for the busses involved.
> 
> To do piece-meal patches like this for no good reason is not a good idea
> as it achieves nothing in the end :(
> 
> thanks,
> 
> greg k-h

If you care enough about potential malicious PCIe devices, do device
attestation and reject any devices that don't support it (which means
rejecting pretty much everything today ;).
Or potentially limit what non attested devices are allowed to do.

+CC Lukas who is working on this.

Jonathan

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] PCI/MSI: Cache the MSIX table size
  2023-01-22  9:00   ` Leon Romanovsky
  2023-01-22 10:57     ` Marc Zyngier
  2023-01-22 10:57     ` Greg KH
@ 2023-01-24 11:52     ` Alexander Shishkin
  2023-01-24 12:10       ` Leon Romanovsky
  2023-01-24 15:32       ` Greg KH
  2 siblings, 2 replies; 16+ messages in thread
From: Alexander Shishkin @ 2023-01-24 11:52 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Bjorn Helgaas, Thomas Gleixner, linux-pci, linux-kernel,
	Marc Zyngier, darwi, elena.reshetova, kirill.shutemov,
	Mika Westerberg, stable, alexander.shishkin

Leon Romanovsky <leon@kernel.org> writes:

> On Thu, Jan 19, 2023 at 07:06:32PM +0200, Alexander Shishkin wrote:
>> A malicious device can change its MSIX table size between the table
>> ioremap() and subsequent accesses, resulting in a kernel page fault in
>> pci_write_msg_msix().
>> 
>> To avoid this, cache the table size observed at the moment of table
>> ioremap() and use the cached value. This, however, does not help drivers
>> that peek at the PCIE_MSIX_FLAGS register directly.
>> 
>> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
>> Cc: stable@vger.kernel.org
>> ---
>>  drivers/pci/msi/api.c | 7 ++++++-
>>  drivers/pci/msi/msi.c | 2 +-
>>  include/linux/pci.h   | 1 +
>>  3 files changed, 8 insertions(+), 2 deletions(-)
>
> I'm not security expert here, but not sure that this protects from anything.
> 1. Kernel relies on working and not-malicious HW. There are gazillion ways
> to cause crashes other than changing MSI-X.

This particular bug was preventing our fuzzing from going deeper into
the code and reaching some more of the aforementioned gazillion bugs.

> 2. Device can report large table size, kernel will cache it and
> malicious device will reduce it back. It is not handled and will cause
> to kernel crash too.

How would that happen? If the device decides to have fewer vectors,
they'll all still fit in the ioremapped MSIX table. The worst thing that
can happen is 0xffffffff reads from the mmio space, which a device can
do anyway. But that shouldn't trigger a page fault or otherwise
crash. Or am I missing something?

Thanks,
--
Alex

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] PCI/MSI: Cache the MSIX table size
  2023-01-22 10:57     ` Marc Zyngier
  2023-01-22 15:34       ` David Laight
@ 2023-01-24 11:59       ` Alexander Shishkin
  1 sibling, 0 replies; 16+ messages in thread
From: Alexander Shishkin @ 2023-01-24 11:59 UTC (permalink / raw)
  To: Marc Zyngier, Leon Romanovsky
  Cc: Bjorn Helgaas, Thomas Gleixner, linux-pci, linux-kernel, darwi,
	elena.reshetova, kirill.shutemov, Mika Westerberg, stable,
	alexander.shishkin

Marc Zyngier <maz@kernel.org> writes:

> On Sun, 22 Jan 2023 09:00:04 +0000,
> Leon Romanovsky <leon@kernel.org> wrote:
>> 
>> On Thu, Jan 19, 2023 at 07:06:32PM +0200, Alexander Shishkin wrote:
>> > A malicious device can change its MSIX table size between the table
>> > ioremap() and subsequent accesses, resulting in a kernel page fault in
>> > pci_write_msg_msix().
>> > 
>> > To avoid this, cache the table size observed at the moment of table
>> > ioremap() and use the cached value. This, however, does not help drivers
>> > that peek at the PCIE_MSIX_FLAGS register directly.
>> > 
>> > Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
>> > Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
>> > Cc: stable@vger.kernel.org
>> > ---
>> >  drivers/pci/msi/api.c | 7 ++++++-
>> >  drivers/pci/msi/msi.c | 2 +-
>> >  include/linux/pci.h   | 1 +
>> >  3 files changed, 8 insertions(+), 2 deletions(-)
>> 
>> I'm not security expert here, but not sure that this protects from anything.
>> 1. Kernel relies on working and not-malicious HW. There are gazillion ways
>> to cause crashes other than changing MSI-X.
>> 2. Device can report large table size, kernel will cache it and
>> malicious device will reduce it back. It is not handled and will cause
>> to kernel crash too.
>> 
>
> Indeed, this was my exact reaction reading this patch. This only makes
> sure the same (potentially wrong) value is used at all times. So while
> this results in a consistent use, this doesn't give much guarantee.

It guarantees that the MSIX table is big enough to fit all the vectors,
so it should prevent the page faults from out-of-bounds accesses.

> The only way to deal with this is to actually handle the resulting
> fault, similar to what the kernel does when accessing userspace. Not
> sure how possible this is with something like PCIe.

Do you mean replacing MMIO accesses with exception handling accessors?
That seems like a monumental effort. And then we'd have to figure out
how to handle errors in the __pci_write_msi_msg() path.

Preventing page faults from happening in the first place seems like a
more reasonable solution, or what do you think?

Thanks,
--
Alex

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] PCI/MSI: Cache the MSIX table size
  2023-01-24 11:52     ` Alexander Shishkin
@ 2023-01-24 12:10       ` Leon Romanovsky
  2023-01-24 12:42         ` Alexander Shishkin
  2023-01-24 15:32       ` Greg KH
  1 sibling, 1 reply; 16+ messages in thread
From: Leon Romanovsky @ 2023-01-24 12:10 UTC (permalink / raw)
  To: Alexander Shishkin
  Cc: Bjorn Helgaas, Thomas Gleixner, linux-pci, linux-kernel,
	Marc Zyngier, darwi, elena.reshetova, kirill.shutemov,
	Mika Westerberg, stable

On Tue, Jan 24, 2023 at 01:52:37PM +0200, Alexander Shishkin wrote:
> Leon Romanovsky <leon@kernel.org> writes:
> 
> > On Thu, Jan 19, 2023 at 07:06:32PM +0200, Alexander Shishkin wrote:
> >> A malicious device can change its MSIX table size between the table
> >> ioremap() and subsequent accesses, resulting in a kernel page fault in
> >> pci_write_msg_msix().
> >> 
> >> To avoid this, cache the table size observed at the moment of table
> >> ioremap() and use the cached value. This, however, does not help drivers
> >> that peek at the PCIE_MSIX_FLAGS register directly.
> >> 
> >> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> >> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> >> Cc: stable@vger.kernel.org
> >> ---
> >>  drivers/pci/msi/api.c | 7 ++++++-
> >>  drivers/pci/msi/msi.c | 2 +-
> >>  include/linux/pci.h   | 1 +
> >>  3 files changed, 8 insertions(+), 2 deletions(-)
> >
> > I'm not security expert here, but not sure that this protects from anything.
> > 1. Kernel relies on working and not-malicious HW. There are gazillion ways
> > to cause crashes other than changing MSI-X.
> 
> This particular bug was preventing our fuzzing from going deeper into
> the code and reaching some more of the aforementioned gazillion bugs.

Your commit message says nothing about fuzzing, but talks about
malicious device. 
Do you see "gazillion bugs" for devices which don't change their MSI-X
table size under the hood, which is main kernel assumption?

If yes, you should fix these bugs.

> 
> > 2. Device can report large table size, kernel will cache it and
> > malicious device will reduce it back. It is not handled and will cause
> > to kernel crash too.
> 
> How would that happen? If the device decides to have fewer vectors,
> they'll all still fit in the ioremapped MSIX table. The worst thing that
> can happen is 0xffffffff reads from the mmio space, which a device can
> do anyway. But that shouldn't trigger a page fault or otherwise
> crash. Or am I missing something?

Like I said, I'm no expert. You should tell me if it safe for all
callers of pci_msix_vec_count().

Thanks

> 
> Thanks,
> --
> Alex

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] PCI/MSI: Cache the MSIX table size
  2023-01-24 12:10       ` Leon Romanovsky
@ 2023-01-24 12:42         ` Alexander Shishkin
  2023-01-24 12:59           ` Leon Romanovsky
  0 siblings, 1 reply; 16+ messages in thread
From: Alexander Shishkin @ 2023-01-24 12:42 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Bjorn Helgaas, Thomas Gleixner, linux-pci, linux-kernel,
	Marc Zyngier, darwi, elena.reshetova, kirill.shutemov,
	Mika Westerberg, stable, alexander.shishkin

Leon Romanovsky <leon@kernel.org> writes:

> On Tue, Jan 24, 2023 at 01:52:37PM +0200, Alexander Shishkin wrote:
>> Leon Romanovsky <leon@kernel.org> writes:
>> 
>> > I'm not security expert here, but not sure that this protects from anything.
>> > 1. Kernel relies on working and not-malicious HW. There are gazillion ways
>> > to cause crashes other than changing MSI-X.
>> 
>> This particular bug was preventing our fuzzing from going deeper into
>> the code and reaching some more of the aforementioned gazillion bugs.
>
> Your commit message says nothing about fuzzing, but talks about
> malicious device. 

A malicious device is what the fuzzing is aiming to simulate. The fact
of fuzzing process itself didn't seem relevant to the patch, so I didn't
include it, going instead for the problem statement and proposed
solution. Will the commit message benefit from mentioning fuzzing?

> Do you see "gazillion bugs" for devices which don't change their MSI-X
> table size under the hood, which is main kernel assumption?

Not so far.

> If yes, you should fix these bugs.

That's absolutely the intention.

>> > 2. Device can report large table size, kernel will cache it and
>> > malicious device will reduce it back. It is not handled and will cause
>> > to kernel crash too.
>> 
>> How would that happen? If the device decides to have fewer vectors,
>> they'll all still fit in the ioremapped MSIX table. The worst thing that
>> can happen is 0xffffffff reads from the mmio space, which a device can
>> do anyway. But that shouldn't trigger a page fault or otherwise
>> crash. Or am I missing something?
>
> Like I said, I'm no expert. You should tell me if it safe for all
> callers of pci_msix_vec_count().

Well, since you stated that the reverse will cause a kernel crash, I had
to ask how. I'll include some version of the above paragraph in the
commit message to indicate that we reverse situation has been considered.

Regards,
--
Alex

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] PCI/MSI: Cache the MSIX table size
  2023-01-24 12:42         ` Alexander Shishkin
@ 2023-01-24 12:59           ` Leon Romanovsky
  2023-01-24 15:28             ` Alexander Shishkin
  0 siblings, 1 reply; 16+ messages in thread
From: Leon Romanovsky @ 2023-01-24 12:59 UTC (permalink / raw)
  To: Alexander Shishkin
  Cc: Bjorn Helgaas, Thomas Gleixner, linux-pci, linux-kernel,
	Marc Zyngier, darwi, elena.reshetova, kirill.shutemov,
	Mika Westerberg, stable

On Tue, Jan 24, 2023 at 02:42:11PM +0200, Alexander Shishkin wrote:
> Leon Romanovsky <leon@kernel.org> writes:
> 
> > On Tue, Jan 24, 2023 at 01:52:37PM +0200, Alexander Shishkin wrote:
> >> Leon Romanovsky <leon@kernel.org> writes:
> >> 
> >> > I'm not security expert here, but not sure that this protects from anything.
> >> > 1. Kernel relies on working and not-malicious HW. There are gazillion ways
> >> > to cause crashes other than changing MSI-X.
> >> 
> >> This particular bug was preventing our fuzzing from going deeper into
> >> the code and reaching some more of the aforementioned gazillion bugs.
> >
> > Your commit message says nothing about fuzzing, but talks about
> > malicious device. 
> 
> A malicious device is what the fuzzing is aiming to simulate. The fact
> of fuzzing process itself didn't seem relevant to the patch, so I didn't
> include it, going instead for the problem statement and proposed
> solution. Will the commit message benefit from mentioning fuzzing?

No, for most if not all kernel developers, the fuzzing means some sort of
random user-space input. PCI devices are trusted in the kernel.

> 
> > Do you see "gazillion bugs" for devices which don't change their MSI-X
> > table size under the hood, which is main kernel assumption?
> 
> Not so far.

So please share them with us.

> 
> > If yes, you should fix these bugs.
> 
> That's absolutely the intention.

So let's fix the bugs and not hide them.

> 
> >> > 2. Device can report large table size, kernel will cache it and
> >> > malicious device will reduce it back. It is not handled and will cause
> >> > to kernel crash too.
> >> 
> >> How would that happen? If the device decides to have fewer vectors,
> >> they'll all still fit in the ioremapped MSIX table. The worst thing that
> >> can happen is 0xffffffff reads from the mmio space, which a device can
> >> do anyway. But that shouldn't trigger a page fault or otherwise
> >> crash. Or am I missing something?
> >
> > Like I said, I'm no expert. You should tell me if it safe for all
> > callers of pci_msix_vec_count().
> 
> Well, since you stated that the reverse will cause a kernel crash, I had
> to ask how. I'll include some version of the above paragraph in the
> commit message to indicate that we reverse situation has been considered.

Not really. I didn't see any explanation how will it work if number
of vectors (which MSI-X table represents) is completely different from
seeing by PCI core.

Thanks

> 
> Regards,
> --
> Alex

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] PCI/MSI: Cache the MSIX table size
  2023-01-24 12:59           ` Leon Romanovsky
@ 2023-01-24 15:28             ` Alexander Shishkin
  0 siblings, 0 replies; 16+ messages in thread
From: Alexander Shishkin @ 2023-01-24 15:28 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Bjorn Helgaas, Thomas Gleixner, linux-pci, linux-kernel,
	Marc Zyngier, darwi, elena.reshetova, kirill.shutemov,
	Mika Westerberg, stable, alexander.shishkin

Leon Romanovsky <leon@kernel.org> writes:

> On Tue, Jan 24, 2023 at 02:42:11PM +0200, Alexander Shishkin wrote:
>> Leon Romanovsky <leon@kernel.org> writes:
>> 
>> A malicious device is what the fuzzing is aiming to simulate. The fact
>> of fuzzing process itself didn't seem relevant to the patch, so I didn't
>> include it, going instead for the problem statement and proposed
>> solution. Will the commit message benefit from mentioning fuzzing?
>
> No, for most if not all kernel developers, the fuzzing means some sort of
> random user-space input. PCI devices are trusted in the kernel.

Right, it's a different kind of fuzzing. Apologies, I should have made
it clear.

>> > Do you see "gazillion bugs" for devices which don't change their MSI-X
>> > table size under the hood, which is main kernel assumption?
>> 
>> Not so far.
>
> So please share them with us.

We do, as soon as we find them. This patch is one such instance.

>> > If yes, you should fix these bugs.
>> 
>> That's absolutely the intention.
>
> So let's fix the bugs and not hide them.

Yes, that's what this patch aims to achieve.

Thanks,
--
Alex

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] PCI/MSI: Cache the MSIX table size
  2023-01-24 11:52     ` Alexander Shishkin
  2023-01-24 12:10       ` Leon Romanovsky
@ 2023-01-24 15:32       ` Greg KH
  2023-01-25 12:33         ` Reshetova, Elena
  1 sibling, 1 reply; 16+ messages in thread
From: Greg KH @ 2023-01-24 15:32 UTC (permalink / raw)
  To: Alexander Shishkin
  Cc: Leon Romanovsky, Bjorn Helgaas, Thomas Gleixner, linux-pci,
	linux-kernel, Marc Zyngier, darwi, elena.reshetova,
	kirill.shutemov, Mika Westerberg, stable

On Tue, Jan 24, 2023 at 01:52:37PM +0200, Alexander Shishkin wrote:
> Leon Romanovsky <leon@kernel.org> writes:
> 
> > On Thu, Jan 19, 2023 at 07:06:32PM +0200, Alexander Shishkin wrote:
> >> A malicious device can change its MSIX table size between the table
> >> ioremap() and subsequent accesses, resulting in a kernel page fault in
> >> pci_write_msg_msix().
> >> 
> >> To avoid this, cache the table size observed at the moment of table
> >> ioremap() and use the cached value. This, however, does not help drivers
> >> that peek at the PCIE_MSIX_FLAGS register directly.
> >> 
> >> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> >> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> >> Cc: stable@vger.kernel.org
> >> ---
> >>  drivers/pci/msi/api.c | 7 ++++++-
> >>  drivers/pci/msi/msi.c | 2 +-
> >>  include/linux/pci.h   | 1 +
> >>  3 files changed, 8 insertions(+), 2 deletions(-)
> >
> > I'm not security expert here, but not sure that this protects from anything.
> > 1. Kernel relies on working and not-malicious HW. There are gazillion ways
> > to cause crashes other than changing MSI-X.
> 
> This particular bug was preventing our fuzzing from going deeper into
> the code and reaching some more of the aforementioned gazillion bugs.

As per our documentation, if you are "fixing" things based on a tool you
have, you HAVE TO document that in the changelog.  Otherwise we just get
to reject the patch flat out (hint, this has caused more than one group
of developers to just be flat out banned for ignoring...)

And what kind of tool is now fuzzing PCI config accesses with
"malicious" devices?  Again, this is NOT something that Linux currently
even attempts to want to protect itself against.  If you are wanting to
change that model, wonderful, let's discuss that and work on defining
the scope of your new security threat model and go from there.  Don't
throw random patches at us and expect us to think they are even valid.

Again, Linux trusts PCI devices.  If you don't want to trust PCI
devices, we already have a framework in place to prevent that which is
independant of this area of the PCI code.  Use that, or let's discuss
why that is insufficient.

Note, this is NOT the first time I have told developers from Intel about
this.  Why you all keep ignoring this is beyond me, I think you keep
thinking that if you don't send patches through me, you can just ignore
my statements about this.  Odd...

greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH 1/2] PCI/MSI: Cache the MSIX table size
  2023-01-24 15:32       ` Greg KH
@ 2023-01-25 12:33         ` Reshetova, Elena
  0 siblings, 0 replies; 16+ messages in thread
From: Reshetova, Elena @ 2023-01-25 12:33 UTC (permalink / raw)
  To: Greg KH, Alexander Shishkin
  Cc: Leon Romanovsky, Bjorn Helgaas, Thomas Gleixner, linux-pci,
	linux-kernel, Marc Zyngier, darwi, kirill.shutemov,
	Mika Westerberg, stable


> On Tue, Jan 24, 2023 at 01:52:37PM +0200, Alexander Shishkin wrote:
> > Leon Romanovsky <leon@kernel.org> writes:
> >
> > > On Thu, Jan 19, 2023 at 07:06:32PM +0200, Alexander Shishkin wrote:
> > >> A malicious device can change its MSIX table size between the table
> > >> ioremap() and subsequent accesses, resulting in a kernel page fault in
> > >> pci_write_msg_msix().
> > >>
> > >> To avoid this, cache the table size observed at the moment of table
> > >> ioremap() and use the cached value. This, however, does not help drivers
> > >> that peek at the PCIE_MSIX_FLAGS register directly.
> > >>
> > >> Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> > >> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> > >> Cc: stable@vger.kernel.org
> > >> ---
> > >>  drivers/pci/msi/api.c | 7 ++++++-
> > >>  drivers/pci/msi/msi.c | 2 +-
> > >>  include/linux/pci.h   | 1 +
> > >>  3 files changed, 8 insertions(+), 2 deletions(-)
> > >
> > > I'm not security expert here, but not sure that this protects from anything.
> > > 1. Kernel relies on working and not-malicious HW. There are gazillion ways
> > > to cause crashes other than changing MSI-X.
> >
> > This particular bug was preventing our fuzzing from going deeper into
> > the code and reaching some more of the aforementioned gazillion bugs.
> 
> As per our documentation, if you are "fixing" things based on a tool you
> have, you HAVE TO document that in the changelog.  Otherwise we just get
> to reject the patch flat out (hint, this has caused more than one group
> of developers to just be flat out banned for ignoring...)
> 
> And what kind of tool is now fuzzing PCI config accesses with
> "malicious" devices?  Again, this is NOT something that Linux currently
> even attempts to want to protect itself against.  If you are wanting to
> change that model, wonderful, let's discuss that and work on defining
> the scope of your new security threat model and go from there.  Don't
> throw random patches at us and expect us to think they are even valid.
> 
> Again, Linux trusts PCI devices.  If you don't want to trust PCI
> devices, we already have a framework in place to prevent that which is
> independant of this area of the PCI code.  Use that, or let's discuss
> why that is insufficient.

Sure, I have started a new thread on this in 
https://lore.kernel.org/all/DM8PR11MB57505481B2FE79C3D56C9201E7CE9@DM8PR11MB5750.namprd11.prod.outlook.com/

I think it is much bigger topic to discuss, so better be handled separately. 

Best Regards,
Elena.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2023-01-25 12:33 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-19 17:06 [PATCH 0/2] PCI/MSI: Hardening fixes Alexander Shishkin
2023-01-19 17:06 ` [PATCH 1/2] PCI/MSI: Cache the MSIX table size Alexander Shishkin
2023-01-22  9:00   ` Leon Romanovsky
2023-01-22 10:57     ` Marc Zyngier
2023-01-22 15:34       ` David Laight
2023-01-24 11:59       ` Alexander Shishkin
2023-01-22 10:57     ` Greg KH
2023-01-23 10:22       ` Jonathan Cameron
2023-01-24 11:52     ` Alexander Shishkin
2023-01-24 12:10       ` Leon Romanovsky
2023-01-24 12:42         ` Alexander Shishkin
2023-01-24 12:59           ` Leon Romanovsky
2023-01-24 15:28             ` Alexander Shishkin
2023-01-24 15:32       ` Greg KH
2023-01-25 12:33         ` Reshetova, Elena
2023-01-19 17:06 ` [PATCH 2/2] PCI/MSI: Validate device supplied MSI table offset and size Alexander Shishkin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).