Linux-PCI Archive on lore.kernel.org
* [PATCH] PCI: tegra: Do not allocate MSI target memory
@ 2019-02-28 15:00 Vidya Sagar
  2019-02-28 19:02 ` Lucas Stach
  0 siblings, 1 reply; 8+ messages in thread
From: Vidya Sagar @ 2019-02-28 15:00 UTC (permalink / raw)
  To: bhelgaas, lorenzo.pieralisi, treding, swarren, mperttunen, jonathanh
  Cc: linux-tegra, linux-pci, kthota, mmaddireddy, vidyas

The PCI host bridge found on Tegra SoCs doesn't require the MSI target
address to be backed by physical system memory. Writes are intercepted
within the controller and never make it to the memory pointed to.

Since no actual system memory is required, remove the allocation of a
single page and hardcode the MSI target address with a special address
on a per-SoC basis. Ideally this would be an address to an MMIO memory
region (such as where the controller's registers are located). However,
those addresses don't work reliably across all Tegra generations. The
only set of addresses that work consistently are those that point to
external memory.

This is not ideal, since those addresses could technically be used for
DMA and hence be confusing. However, the first page of external memory
is unlikely to be used and special enough to avoid confusion.

The original base patch was posted by Thierry Reding <treding@nvidia.com>
( http://patchwork.ozlabs.org/patch/848569/ )
The current patch removes the hardcoding of the external RAM start
address and instead gets it using the memblock_start_of_DRAM() API.

Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
---
 drivers/pci/controller/pci-tegra.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/pci/controller/pci-tegra.c b/drivers/pci/controller/pci-tegra.c
index f4f53d092e00..b33de7c78425 100644
--- a/drivers/pci/controller/pci-tegra.c
+++ b/drivers/pci/controller/pci-tegra.c
@@ -23,6 +23,7 @@
 #include <linux/irqdomain.h>
 #include <linux/kernel.h>
 #include <linux/init.h>
+#include <linux/memblock.h>
 #include <linux/module.h>
 #include <linux/msi.h>
 #include <linux/of_address.h>
@@ -231,9 +232,8 @@ struct tegra_msi {
 	struct msi_controller chip;
 	DECLARE_BITMAP(used, INT_PCI_MSI_NR);
 	struct irq_domain *domain;
-	unsigned long pages;
 	struct mutex lock;
-	u64 phys;
+	phys_addr_t phys;
 	int irq;
 };
 
@@ -1548,9 +1548,7 @@ static int tegra_pcie_msi_setup(struct tegra_pcie *pcie)
 		goto err;
 	}
 
-	/* setup AFI/FPCI range */
-	msi->pages = __get_free_pages(GFP_KERNEL, 0);
-	msi->phys = virt_to_phys((void *)msi->pages);
+	msi->phys = memblock_start_of_DRAM();
 	host->msi = &msi->chip;
 
 	return 0;
@@ -1592,8 +1590,6 @@ static void tegra_pcie_msi_teardown(struct tegra_pcie *pcie)
 	struct tegra_msi *msi = &pcie->msi;
 	unsigned int i, irq;
 
-	free_pages(msi->pages, 0);
-
 	if (msi->irq > 0)
 		free_irq(msi->irq, pcie);
 
-- 
2.7.4


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] PCI: tegra: Do not allocate MSI target memory
  2019-02-28 15:00 [PATCH] PCI: tegra: Do not allocate MSI target memory Vidya Sagar
@ 2019-02-28 19:02 ` Lucas Stach
  2019-03-01  3:15   ` Vidya Sagar
  0 siblings, 1 reply; 8+ messages in thread
From: Lucas Stach @ 2019-02-28 19:02 UTC (permalink / raw)
  To: Vidya Sagar, bhelgaas, lorenzo.pieralisi, treding, swarren,
	mperttunen, jonathanh
  Cc: linux-tegra, linux-pci, kthota, mmaddireddy

On Thursday, 28.02.2019 at 20:30 +0530, Vidya Sagar wrote:
> The PCI host bridge found on Tegra SoCs doesn't require the MSI target
> address to be backed by physical system memory. Writes are intercepted
> within the controller and never make it to the memory pointed to.
> 
> Since no actual system memory is required, remove the allocation of a
> single page and hardcode the MSI target address with a special address
> on a per-SoC basis. Ideally this would be an address to an MMIO memory
> region (such as where the controller's register are located). However,
> those addresses don't work reliably across all Tegra generations. The
> only set of addresses that work consistently are those that point to
> external memory.
> 
> This is not ideal, since those addresses could technically be used for
> DMA and hence be confusing. However, the first page of external memory
> is unlikely to be used and special enough to avoid confusion.

So you are trading a slight memory waste of a single page against a
sporadic (and probably hard to debug) DMA failure if any device happens
to initiate DMA to the first page of physical memory? That does not
sound like a good deal...

Also why would the first page of external memory be unlikely to be
used?

Regards,
Lucas

> Original base patch was posted by Thierry Reding <treding@nvidia.com>
> ( http://patchwork.ozlabs.org/patch/848569/ )
> Current patch removes hardcoding of external RAM starting address instead
> gets it using memblock_start_of_DRAM() API
> 
> Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
> ---
>  drivers/pci/controller/pci-tegra.c | 10 +++-------
>  1 file changed, 3 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/pci/controller/pci-tegra.c b/drivers/pci/controller/pci-tegra.c
> index f4f53d092e00..b33de7c78425 100644
> --- a/drivers/pci/controller/pci-tegra.c
> +++ b/drivers/pci/controller/pci-tegra.c
> @@ -23,6 +23,7 @@
>  #include <linux/irqdomain.h>
>  #include <linux/kernel.h>
>  #include <linux/init.h>
> +#include <linux/memblock.h>
>  #include <linux/module.h>
>  #include <linux/msi.h>
>  #include <linux/of_address.h>
> @@ -231,9 +232,8 @@ struct tegra_msi {
>  	struct msi_controller chip;
>  	DECLARE_BITMAP(used, INT_PCI_MSI_NR);
>  	struct irq_domain *domain;
> -	unsigned long pages;
>  	struct mutex lock;
> -	u64 phys;
> +	phys_addr_t phys;
>  	int irq;
>  };
>  
> @@ -1548,9 +1548,7 @@ static int tegra_pcie_msi_setup(struct tegra_pcie *pcie)
>  		goto err;
>  	}
>  
> -	/* setup AFI/FPCI range */
> -	msi->pages = __get_free_pages(GFP_KERNEL, 0);
> -	msi->phys = virt_to_phys((void *)msi->pages);
> +	msi->phys = memblock_start_of_DRAM();
>  	host->msi = &msi->chip;
>  
>  	return 0;
> @@ -1592,8 +1590,6 @@ static void tegra_pcie_msi_teardown(struct tegra_pcie *pcie)
>  	struct tegra_msi *msi = &pcie->msi;
>  	unsigned int i, irq;
>  
> -	free_pages(msi->pages, 0);
> -
>  	if (msi->irq > 0)
>  		free_irq(msi->irq, pcie);
>  



* Re: [PATCH] PCI: tegra: Do not allocate MSI target memory
  2019-02-28 19:02 ` Lucas Stach
@ 2019-03-01  3:15   ` Vidya Sagar
  2019-03-01 14:56     ` Lucas Stach
  0 siblings, 1 reply; 8+ messages in thread
From: Vidya Sagar @ 2019-03-01  3:15 UTC (permalink / raw)
  To: Lucas Stach, bhelgaas, lorenzo.pieralisi, treding, swarren,
	mperttunen, jonathanh
  Cc: linux-tegra, linux-pci, kthota, mmaddireddy


On 3/1/2019 12:32 AM, Lucas Stach wrote:
> Am Donnerstag, den 28.02.2019, 20:30 +0530 schrieb Vidya Sagar:
>> The PCI host bridge found on Tegra SoCs doesn't require the MSI target
>> address to be backed by physical system memory. Writes are intercepted
>> within the controller and never make it to the memory pointed to.
>>
>> Since no actual system memory is required, remove the allocation of a
>> single page and hardcode the MSI target address with a special address
>> on a per-SoC basis. Ideally this would be an address to an MMIO memory
>> region (such as where the controller's register are located). However,
>> those addresses don't work reliably across all Tegra generations. The
>> only set of addresses that work consistently are those that point to
>> external memory.
>>
>> This is not ideal, since those addresses could technically be used for
>> DMA and hence be confusing. However, the first page of external memory
>> is unlikely to be used and special enough to avoid confusion.
> So you are trading a slight memory waste of a single page against a
> sporadic (and probably hard to debug) DMA failure if any device happens
> to initiate DMA to the first page of physical memory? That does not
> sound like a good deal...
>
> Also why would the first page of external memory be unlikely to be
> used?
>
> Regards,
> Lucas

We are not wasting a single page of memory here, and if any device's DMA
tries to access it, the access will still go through. It's just that we
are using that same address for MSI (note that MSI writes don't go beyond
the PCIe IP, as they get decoded at the PCIe IP level itself and only an
interrupt goes to the CPU), and that might be a bit confusing since the
same address is used both as normal memory and as the MSI target address.
Since there can never be any issue with this, would you suggest removing
the last paragraph from the commit description?

>> Original base patch was posted by Thierry Reding <treding@nvidia.com>
>> ( http://patchwork.ozlabs.org/patch/848569/ )
>> Current patch removes hardcoding of external RAM starting address instead
>> gets it using memblock_start_of_DRAM() API
>>
>> Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
>> ---
>>   drivers/pci/controller/pci-tegra.c | 10 +++-------
>>   1 file changed, 3 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/pci/controller/pci-tegra.c b/drivers/pci/controller/pci-tegra.c
>> index f4f53d092e00..b33de7c78425 100644
>> --- a/drivers/pci/controller/pci-tegra.c
>> +++ b/drivers/pci/controller/pci-tegra.c
>> @@ -23,6 +23,7 @@
>>   #include <linux/irqdomain.h>
>>   #include <linux/kernel.h>
>>   #include <linux/init.h>
>> +#include <linux/memblock.h>
>>   #include <linux/module.h>
>>   #include <linux/msi.h>
>>   #include <linux/of_address.h>
>> @@ -231,9 +232,8 @@ struct tegra_msi {
>>   	struct msi_controller chip;
>>   	DECLARE_BITMAP(used, INT_PCI_MSI_NR);
>>   	struct irq_domain *domain;
>> -	unsigned long pages;
>>   	struct mutex lock;
>> -	u64 phys;
>> +	phys_addr_t phys;
>>   	int irq;
>>   };
>>   
>> @@ -1548,9 +1548,7 @@ static int tegra_pcie_msi_setup(struct tegra_pcie *pcie)
>>   		goto err;
>>   	}
>>   
>> -	/* setup AFI/FPCI range */
>> -	msi->pages = __get_free_pages(GFP_KERNEL, 0);
>> -	msi->phys = virt_to_phys((void *)msi->pages);
>> +	msi->phys = memblock_start_of_DRAM();
>>   	host->msi = &msi->chip;
>>   
>>   	return 0;
>> @@ -1592,8 +1590,6 @@ static void tegra_pcie_msi_teardown(struct tegra_pcie *pcie)
>>   	struct tegra_msi *msi = &pcie->msi;
>>   	unsigned int i, irq;
>>   
>> -	free_pages(msi->pages, 0);
>> -
>>   	if (msi->irq > 0)
>>   		free_irq(msi->irq, pcie);
>>   


* Re: [PATCH] PCI: tegra: Do not allocate MSI target memory
  2019-03-01  3:15   ` Vidya Sagar
@ 2019-03-01 14:56     ` Lucas Stach
  2019-03-02  2:50       ` Vidya Sagar
  0 siblings, 1 reply; 8+ messages in thread
From: Lucas Stach @ 2019-03-01 14:56 UTC (permalink / raw)
  To: Vidya Sagar, bhelgaas, lorenzo.pieralisi, treding, swarren,
	mperttunen, jonathanh
  Cc: linux-tegra, linux-pci, kthota, mmaddireddy

On Friday, 01.03.2019 at 08:45 +0530, Vidya Sagar wrote:
> On 3/1/2019 12:32 AM, Lucas Stach wrote:
> > Am Donnerstag, den 28.02.2019, 20:30 +0530 schrieb Vidya Sagar:
> > > The PCI host bridge found on Tegra SoCs doesn't require the MSI target
> > > address to be backed by physical system memory. Writes are intercepted
> > > within the controller and never make it to the memory pointed to.
> > > 
> > > Since no actual system memory is required, remove the allocation of a
> > > single page and hardcode the MSI target address with a special address
> > > on a per-SoC basis. Ideally this would be an address to an MMIO memory
> > > region (such as where the controller's register are located). However,
> > > those addresses don't work reliably across all Tegra generations. The
> > > only set of addresses that work consistently are those that point to
> > > external memory.
> > > 
> > > This is not ideal, since those addresses could technically be used for
> > > DMA and hence be confusing. However, the first page of external memory
> > > is unlikely to be used and special enough to avoid confusion.
> > So you are trading a slight memory waste of a single page against a
> > sporadic (and probably hard to debug) DMA failure if any device happens
> > to initiate DMA to the first page of physical memory? That does not
> > sound like a good deal...
> > 
> > Also why would the first page of external memory be unlikely to be
> > used?
> > 
> > Regards,
> > Lucas
> 
> We are not wasting single page of memory here and if any device's DMA is 
> trying to access it, it will still go through. Its just that we are using that 
> same address for MSI (note that MSI writes don't go beyond PCIe IP as they get
> decoded at PCIe IP level itself and only an interrupt
> goes to CPU) and that might be a bit confusing as same address is used 
> as normal memory as well as MSI target address. Since there can never be any
> issue with this would you suggest to remove the last paragraph from commit
> description?

How does the core distinguish between a normal DMA memory write and a
MSI? If I remember the PCIe spec correctly, there aren't any
differences between the two besides the target address.

So if you now set a non-reserved region of memory to decode as a MSI at
the PCIe host controller level, wouldn't this lead to normal DMA
transactions to this address being wrongfully turned into an MSI and
the write not reaching the targeted location?

Regards,
Lucas



* Re: [PATCH] PCI: tegra: Do not allocate MSI target memory
  2019-03-01 14:56     ` Lucas Stach
@ 2019-03-02  2:50       ` Vidya Sagar
  2019-03-02 14:22         ` Lucas Stach
  0 siblings, 1 reply; 8+ messages in thread
From: Vidya Sagar @ 2019-03-02  2:50 UTC (permalink / raw)
  To: Lucas Stach, bhelgaas, lorenzo.pieralisi, treding, swarren,
	mperttunen, jonathanh
  Cc: linux-tegra, linux-pci, kthota, mmaddireddy


On 3/1/2019 8:26 PM, Lucas Stach wrote:
> Am Freitag, den 01.03.2019, 08:45 +0530 schrieb Vidya Sagar:
>> On 3/1/2019 12:32 AM, Lucas Stach wrote:
>>> Am Donnerstag, den 28.02.2019, 20:30 +0530 schrieb Vidya Sagar:
>>>> The PCI host bridge found on Tegra SoCs doesn't require the MSI target
>>>> address to be backed by physical system memory. Writes are intercepted
>>>> within the controller and never make it to the memory pointed to.
>>>>
>>>> Since no actual system memory is required, remove the allocation of a
>>>> single page and hardcode the MSI target address with a special address
>>>> on a per-SoC basis. Ideally this would be an address to an MMIO memory
>>>> region (such as where the controller's register are located). However,
>>>> those addresses don't work reliably across all Tegra generations. The
>>>> only set of addresses that work consistently are those that point to
>>>> external memory.
>>>>
>>>> This is not ideal, since those addresses could technically be used for
>>>> DMA and hence be confusing. However, the first page of external memory
>>>> is unlikely to be used and special enough to avoid confusion.
>>> So you are trading a slight memory waste of a single page against a
>>> sporadic (and probably hard to debug) DMA failure if any device happens
>>> to initiate DMA to the first page of physical memory? That does not
>>> sound like a good deal...
>>>
>>> Also why would the first page of external memory be unlikely to be
>>> used?
>>>
>>> Regards,
>>> Lucas
>> We are not wasting single page of memory here and if any device's DMA is
>> trying to access it, it will still go through. Its just that we are using that
>> same address for MSI (note that MSI writes don't go beyond PCIe IP as they get
>> decoded at PCIe IP level itself and only an interrupt
>> goes to CPU) and that might be a bit confusing as same address is used
>> as normal memory as well as MSI target address. Since there can never be any
>> issue with this would you suggest to remove the last paragraph from commit
>> description?
> How does the core distinguish between a normal DMA memory write and a
> MSI? If I remember the PCIe spec correctly, there aren't any
> differences between the two besides the target address.
>
> So if you now set a non-reserved region of memory to decode as a MSI at
> the PCIe host controller level, wouldn't this lead to normal DMA
> transactions to this address being wrongfully turned into an MSI and
> the write not reaching the targeted location?
>
> Regards,
> Lucas

You are correct that the core cannot distinguish between a normal DMA
memory write and an MSI. In that case, the only way I see is to allocate
memory using dma_alloc_coherent() and use the IOVA as the MSI target
address. That way, a page gets reserved (in a way wasted, too, as the
MSI writes don't really make it to RAM) and there won't be any address
overlaps with normal DMA writes. I'll push a patch for it.

Thanks,
Vidya Sagar



* Re: [PATCH] PCI: tegra: Do not allocate MSI target memory
  2019-03-02  2:50       ` Vidya Sagar
@ 2019-03-02 14:22         ` Lucas Stach
  2019-03-04  5:38           ` Vidya Sagar
  0 siblings, 1 reply; 8+ messages in thread
From: Lucas Stach @ 2019-03-02 14:22 UTC (permalink / raw)
  To: Vidya Sagar, bhelgaas, lorenzo.pieralisi, treding, swarren,
	mperttunen, jonathanh
  Cc: linux-tegra, linux-pci, kthota, mmaddireddy

On Saturday, 02.03.2019 at 08:20 +0530, Vidya Sagar wrote:
> On 3/1/2019 8:26 PM, Lucas Stach wrote:
> > Am Freitag, den 01.03.2019, 08:45 +0530 schrieb Vidya Sagar:
> > > On 3/1/2019 12:32 AM, Lucas Stach wrote:
> > > > Am Donnerstag, den 28.02.2019, 20:30 +0530 schrieb Vidya Sagar:
> > > > > The PCI host bridge found on Tegra SoCs doesn't require the MSI target
> > > > > address to be backed by physical system memory. Writes are intercepted
> > > > > within the controller and never make it to the memory pointed to.
> > > > > 
> > > > > Since no actual system memory is required, remove the allocation of a
> > > > > single page and hardcode the MSI target address with a special address
> > > > > on a per-SoC basis. Ideally this would be an address to an MMIO memory
> > > > > region (such as where the controller's register are located). However,
> > > > > those addresses don't work reliably across all Tegra generations. The
> > > > > only set of addresses that work consistently are those that point to
> > > > > external memory.
> > > > > 
> > > > > This is not ideal, since those addresses could technically be used for
> > > > > DMA and hence be confusing. However, the first page of external memory
> > > > > is unlikely to be used and special enough to avoid confusion.
> > > > So you are trading a slight memory waste of a single page against a
> > > > sporadic (and probably hard to debug) DMA failure if any device happens
> > > > to initiate DMA to the first page of physical memory? That does not
> > > > sound like a good deal...
> > > > 
> > > > Also why would the first page of external memory be unlikely to be
> > > > used?
> > > > 
> > > > Regards,
> > > > Lucas
> > > We are not wasting single page of memory here and if any device's DMA is
> > > trying to access it, it will still go through. Its just that we are using that
> > > same address for MSI (note that MSI writes don't go beyond PCIe IP as they get
> > > decoded at PCIe IP level itself and only an interrupt
> > > goes to CPU) and that might be a bit confusing as same address is used
> > > as normal memory as well as MSI target address. Since there can never be any
> > > issue with this would you suggest to remove the last paragraph from commit
> > > description?
> > How does the core distinguish between a normal DMA memory write and a
> > MSI? If I remember the PCIe spec correctly, there aren't any
> > differences between the two besides the target address.
> > 
> > So if you now set a non-reserved region of memory to decode as a MSI at
> > the PCIe host controller level, wouldn't this lead to normal DMA
> > transactions to this address being wrongfully turned into an MSI and
> > the write not reaching the targeted location?
> > 
> > Regards,
> > Lucas
> 
> You are correct that core cannot distinguish between a normal DMA memory and
> MSI. In that case, the only way I see is to alloc memory using 
> dma_alloc_coherent()
> and use the IOVA as the MSI target address. That way, a page gets 
> reserved (in a way wasted
> also as the MSI writes don't really make it to RAM) and there won't be 
> any address
> overlaps with normal DMA writes. I'll push a patch for it.

At that point it's no longer different from the current code, which
simply reserves a single page of memory and uses its address as the MSI
target address.

So in conclusion this change should just be dropped.

Regards,
Lucas



* Re: [PATCH] PCI: tegra: Do not allocate MSI target memory
  2019-03-02 14:22         ` Lucas Stach
@ 2019-03-04  5:38           ` Vidya Sagar
  2019-03-04  8:37             ` Thierry Reding
  0 siblings, 1 reply; 8+ messages in thread
From: Vidya Sagar @ 2019-03-04  5:38 UTC (permalink / raw)
  To: Lucas Stach, bhelgaas, lorenzo.pieralisi, treding, swarren,
	mperttunen, jonathanh
  Cc: linux-tegra, linux-pci, kthota, mmaddireddy



On 3/2/2019 7:52 PM, Lucas Stach wrote:
> Am Samstag, den 02.03.2019, 08:20 +0530 schrieb Vidya Sagar:
>> On 3/1/2019 8:26 PM, Lucas Stach wrote:
>>> Am Freitag, den 01.03.2019, 08:45 +0530 schrieb Vidya Sagar:
>>>> On 3/1/2019 12:32 AM, Lucas Stach wrote:
>>>>> Am Donnerstag, den 28.02.2019, 20:30 +0530 schrieb Vidya Sagar:
>>>>>> The PCI host bridge found on Tegra SoCs doesn't require the MSI target
>>>>>> address to be backed by physical system memory. Writes are intercepted
>>>>>> within the controller and never make it to the memory pointed to.
>>>>>>
>>>>>> Since no actual system memory is required, remove the allocation of a
>>>>>> single page and hardcode the MSI target address with a special address
>>>>>> on a per-SoC basis. Ideally this would be an address to an MMIO memory
>>>>>> region (such as where the controller's register are located). However,
>>>>>> those addresses don't work reliably across all Tegra generations. The
>>>>>> only set of addresses that work consistently are those that point to
>>>>>> external memory.
>>>>>>
>>>>>> This is not ideal, since those addresses could technically be used for
>>>>>> DMA and hence be confusing. However, the first page of external memory
>>>>>> is unlikely to be used and special enough to avoid confusion.
>>>>> So you are trading a slight memory waste of a single page against a
>>>>> sporadic (and probably hard to debug) DMA failure if any device happens
>>>>> to initiate DMA to the first page of physical memory? That does not
>>>>> sound like a good deal...
>>>>>
>>>>> Also why would the first page of external memory be unlikely to be
>>>>> used?
>>>>>
>>>>> Regards,
>>>>> Lucas
>>>> We are not wasting single page of memory here and if any device's DMA is
>>>> trying to access it, it will still go through. Its just that we are using that
>>>> same address for MSI (note that MSI writes don't go beyond PCIe IP as they get
>>>> decoded at PCIe IP level itself and only an interrupt
>>>> goes to CPU) and that might be a bit confusing as same address is used
>>>> as normal memory as well as MSI target address. Since there can never be any
>>>> issue with this would you suggest to remove the last paragraph from commit
>>>> description?
>>> How does the core distinguish between a normal DMA memory write and a
>>> MSI? If I remember the PCIe spec correctly, there aren't any
>>> differences between the two besides the target address.
>>>
>>> So if you now set a non-reserved region of memory to decode as a MSI at
>>> the PCIe host controller level, wouldn't this lead to normal DMA
>>> transactions to this address being wrongfully turned into an MSI and
>>> the write not reaching the targeted location?
>>>
>>> Regards,
>>> Lucas
>> You are correct that core cannot distinguish between a normal DMA memory and
>> MSI. In that case, the only way I see is to alloc memory using
>> dma_alloc_coherent()
>> and use the IOVA as the MSI target address. That way, a page gets
>> reserved (in a way wasted
>> also as the MSI writes don't really make it to RAM) and there won't be
>> any address
>> overlaps with normal DMA writes. I'll push a patch for it.
> At that point it's no longer different from the current code, which
> simply reserves a single page of memory and uses its address as the MSI
> target address.
But I think there is still one issue with the current code base. Since
what is used as the MSI target address is the physical address of the
memory page being reserved, there is a chance that some IOVA might become
equal to this physical address, thereby making the PCIe IP raise an MSI
interrupt instead of forwarding the write to memory. Do you agree with
this?
> So in conclusion this change should just be dropped.
>
> Regards,
> Lucas
>



* Re: [PATCH] PCI: tegra: Do not allocate MSI target memory
  2019-03-04  5:38           ` Vidya Sagar
@ 2019-03-04  8:37             ` Thierry Reding
  0 siblings, 0 replies; 8+ messages in thread
From: Thierry Reding @ 2019-03-04  8:37 UTC (permalink / raw)
  To: Vidya Sagar
  Cc: Lucas Stach, bhelgaas, lorenzo.pieralisi, swarren, mperttunen,
	jonathanh, linux-tegra, linux-pci, kthota, mmaddireddy



On Mon, Mar 04, 2019 at 11:08:54AM +0530, Vidya Sagar wrote:
> 
> 
> On 3/2/2019 7:52 PM, Lucas Stach wrote:
> > Am Samstag, den 02.03.2019, 08:20 +0530 schrieb Vidya Sagar:
> > > On 3/1/2019 8:26 PM, Lucas Stach wrote:
> > > > Am Freitag, den 01.03.2019, 08:45 +0530 schrieb Vidya Sagar:
> > > > > On 3/1/2019 12:32 AM, Lucas Stach wrote:
> > > > > > Am Donnerstag, den 28.02.2019, 20:30 +0530 schrieb Vidya Sagar:
> > > > > > > The PCI host bridge found on Tegra SoCs doesn't require the MSI target
> > > > > > > address to be backed by physical system memory. Writes are intercepted
> > > > > > > within the controller and never make it to the memory pointed to.
> > > > > > > 
> > > > > > > Since no actual system memory is required, remove the allocation of a
> > > > > > > single page and hardcode the MSI target address with a special address
> > > > > > > on a per-SoC basis. Ideally this would be an address to an MMIO memory
> > > > > > > region (such as where the controller's register are located). However,
> > > > > > > those addresses don't work reliably across all Tegra generations. The
> > > > > > > only set of addresses that work consistently are those that point to
> > > > > > > external memory.
> > > > > > > 
> > > > > > > This is not ideal, since those addresses could technically be used for
> > > > > > > DMA and hence be confusing. However, the first page of external memory
> > > > > > > is unlikely to be used and special enough to avoid confusion.
> > > > > > So you are trading a slight memory waste of a single page against a
> > > > > > sporadic (and probably hard to debug) DMA failure if any device happens
> > > > > > to initiate DMA to the first page of physical memory? That does not
> > > > > > sound like a good deal...
> > > > > > 
> > > > > > Also why would the first page of external memory be unlikely to be
> > > > > > used?
> > > > > > 
> > > > > > Regards,
> > > > > > Lucas
> > > > > We are not wasting single page of memory here and if any device's DMA is
> > > > > trying to access it, it will still go through. Its just that we are using that
> > > > > same address for MSI (note that MSI writes don't go beyond PCIe IP as they get
> > > > > decoded at PCIe IP level itself and only an interrupt
> > > > > goes to CPU) and that might be a bit confusing as same address is used
> > > > > as normal memory as well as MSI target address. Since there can never be any
> > > > > issue with this would you suggest to remove the last paragraph from commit
> > > > > description?
> > > > How does the core distinguish between a normal DMA memory write and a
> > > > MSI? If I remember the PCIe spec correctly, there aren't any
> > > > differences between the two besides the target address.
> > > > 
> > > > So if you now set a non-reserved region of memory to decode as a MSI at
> > > > the PCIe host controller level, wouldn't this lead to normal DMA
> > > > transactions to this address being wrongfully turned into an MSI and
> > > > the write not reaching the targeted location?
> > > > 
> > > > Regards,
> > > > Lucas
> > > You are correct that core cannot distinguish between a normal DMA memory and
> > > MSI. In that case, the only way I see is to alloc memory using
> > > dma_alloc_coherent()
> > > and use the IOVA as the MSI target address. That way, a page gets
> > > reserved (in a way wasted
> > > also as the MSI writes don't really make it to RAM) and there won't be
> > > any address
> > > overlaps with normal DMA writes. I'll push a patch for it.
> > At that point it's no longer different from the current code, which
> > simply reserves a single page of memory and uses its address as the MSI
> > target address.
> But, I think there is still one issue with the current code base. Since what
> is used as
> MSI target address is the physical address equivalent of the memory page
> being reserved,
> there is a chance that some IOVA might become equal to this physical address
> there by
> making PCIe IP to raise an MSI interrupt instead of forwarding the write to
> memory.
> Do you agree with this?

Yeah, I think we need to make sure that the MSI target address is in the
same address space as other memory allocated for PCI devices. If we
continue to use __get_free_pages() that means we'd have to manually set
up an IOMMU domain for the IOVA mapping of the MSI target address. That
is pretty tedious and would probably also conflict with code in PCI
device drivers that are presumably going to allocate memory using the
DMA API and hence could get a different domain or IOVA space than the
PCI controller itself.

There's also still an issue with 32-bit vs. 64-bit addressing, if I'm
not mistaken. I think we had meant to address that separately, but it
looks like we never did. Or perhaps I'm forgetting what conclusion we
had come to regarding that?

The original proposal to use a fixed address outside of system memory
could've fixed the conflicting address issue and the 32-bit vs. 64-bit
addressing problem, but it also wouldn't have worked for IOVA addresses
because the MSI target address could easily clash with an address from
the PCI IOVA space.

> > So in conclusion this change should just be dropped.

I think we need to do something, even if this change is perhaps not the
best approach in retrospect.

Perhaps as a first step we could try to solve the 32-bit vs. 64-bit
addressing issue in a separate patch and then think about the SMMU case
some more and then resolve that separately.

I'm not exactly sure how SMMU works for PCI on Tegra, but I thought it
was using a single IOVA space (or ASID in Tegra-speak) for anything PCI.
What I don't understand is how that works in conjunction with the memory
resource assignment code. If we enable SMMU for PCI, don't all of the
BARs become invalid? I suppose if only memory accesses upwards of the
PCI controller get translated through the SMMU, then the memory resource
assignments might stay valid. But we would also have to ensure that IOVA
allocations for PCI don't intersect with the PCI address ranges from
which BARs are allocated.

I've been working on a set of patches that address a similar problem for
display. The problem that I'm seeing is that if we enable IOMMU-backed
DMA API for display, then the IOMMU translation for display controllers
gets enabled very early on, a long time before the driver gets a chance
to program the hardware. The result is that if the bootloader has left
the display controller enabled, it will now read from memory that has no
IOVA mapping (the bootloader usually doesn't set up the SMMU). This is a
bit of a bug without DMA/IOMMU as well because the display controller is
reading memory that the kernel will at some point claim and write to.

There are two things we need to fix that: one is to make sure that the
kernel doesn't write to the memory currently being scanned out and the
other is to properly set up the SMMU before it is enabled, but without
requiring the kernel driver to be loaded.

The first I solved using reserved-memory regions. See the documentation
in Documentation/devicetree/bindings/reserved-memory/ for background on
what these are. The idea is to use a reserved memory region to prevent
the kernel from handing the framebuffer memory to the buddy allocator.
The second problem can be fixed by implementing "reserved regions"
support in the SMMU driver. I have an implementation for that ready and
it works fine in conjunction with some bootloader patches that make sure
the correct reserved memory regions are set up when the kernel boots.

While not exactly the same, I think the issue for PCI is conceptually
similar. It doesn't make much sense to add a reserved region for the PCI
apertures because they aren't part of system memory in the first place.
I'm not sure it would hurt, but it's possible that the reserved memory
code is not prepared to deal with memory regions outside of system
memory. However, I think what we want to do is establish reserved
regions in the PCI ASID that represents the PCI apertures. That way we
avoid ever programming PCI devices with DMA addresses that would point
back into PCI memory.

Thierry
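
For reference, the reserved-memory mechanism mentioned above is declared
in the device tree roughly as follows (a sketch only; the addresses,
sizes, and node names are placeholders, see
Documentation/devicetree/bindings/reserved-memory/ for the binding):

	reserved-memory {
		#address-cells = <2>;
		#size-cells = <2>;
		ranges;

		/* Framebuffer left active by the bootloader; "no-map"
		 * keeps the kernel from handing the pages to the buddy
		 * allocator while scanout is still in progress. */
		fb: framebuffer@92ca8000 {
			reg = <0x0 0x92ca8000 0x0 0x00800000>;
			no-map;
		};
	};

A consumer device node then points at the region with
"memory-region = <&fb>;", which is also what lets an IOMMU driver set up
the corresponding identity/reserved mapping before translation is
enabled.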

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

