linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Bjorn Helgaas <helgaas@kernel.org>
To: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: jingoohan1@gmail.com, gustavo.pimentel@synopsys.com,
	lpieralisi@kernel.org, robh@kernel.org, kw@linux.com,
	bhelgaas@google.com, linux-pci@vger.kernel.org,
	linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org,
	quic_bjorande@quicinc.com, fancer.lancer@gmail.com,
	vidyas@nvidia.com
Subject: Re: [PATCH v7 1/2] PCI: designware-ep: Fix DBI access before core init
Date: Tue, 9 Jan 2024 15:12:39 -0600	[thread overview]
Message-ID: <20240109211239.GA2016581@bhelgaas> (raw)
In-Reply-To: <20231120084014.108274-2-manivannan.sadhasivam@linaro.org>

It doesn't look to me like the issues raised by Niklas have really
been resolved:

  https://lore.kernel.org/r/ZWYmX8Y%2F7Q9WMxES@x1-carbon/
  https://lore.kernel.org/r/ZZ2JXMhdOI1Upabx@x1-carbon

so I'm doubtful that we should apply this as-is.  The spurious
/sys/class/dma/ stuff and debugfs warnings sound like things that will
annoy users.

Apart from that, this patch has been floating around a long time, but
I still think this solution is hard to maintain for the same reasons I
mentioned here:
https://lore.kernel.org/linux-pci/20220919224014.GA1030798@bhelgaas/

On Mon, Nov 20, 2023 at 02:10:13PM +0530, Manivannan Sadhasivam wrote:
> The drivers for platforms requiring reference clock from the PCIe host for
> initializing their PCIe EP core, make use of the 'core_init_notifier'
> feature exposed by the DWC common code. On these platforms, access to the
> hw registers like DBI before completing the core initialization will result
> in a whole system hang. But the current DWC EP driver tries to access DBI
> registers during dw_pcie_ep_init() without waiting for core initialization
> and it results in system hang on platforms making use of
> 'core_init_notifier' such as Tegra194 and Qcom SM8450.

I see that only qcom_pcie_epc_features and tegra_pcie_epc_features
*set* "core_init_notifier", but all platforms use it because it's only
tested in dw_pcie_ep_init() (and a test case), which is generic to all
DWC drivers.

"core_init_notifier" is not a notifier.  From reading the code, it
only means "if this is set, skip the rest of dw_pcie_ep_init()".

Based on the code, I assume it implies that drivers that set
core_init_notifier must do some additional initialization or
something, but that initialization isn't connected here.

There should be some symbol, maybe a member of pci_epc_features, that
both *does* this initialization and *tells us* that the driver needs
this initialization.

Right now, I think it's something like:

  1) this driver sets core_init_notifier
  2) that must mean that it also calls dw_pcie_ep_init_notify() somewhere
  3) we must avoid DBI access until it does

There's nothing that directly connects those three things.

> To workaround this issue, users of the above mentioned platforms have to
> maintain the dependency with the PCIe host by booting the PCIe EP after
> host boot. But this won't provide a good user experience, since PCIe EP is
> _one_ of the features of those platforms and it doesn't make sense to
> delay the whole platform booting due to the PCIe dependency.

IIUC, "have to maintain the dependency" refers to the situation
*before* this patch, right?  This patch improves the user experience
by removing the need for users to enforce this "boot host before EP"
ordering?

> So to fix this issue, let's move all the DBI access during
> dw_pcie_ep_init() in the DWC EP driver to the dw_pcie_ep_init_complete()
> API that gets called only after core initialization on these platforms.
> This makes sure that the DBI register accesses are skipped during
> dw_pcie_ep_init() and accessed later once the core initialization happens.

This patch doesn't "skip" them in dw_pcie_ep_init(); it *moves* them
completely to dw_pcie_ep_late_init() and calls that from the end of
dw_pcie_ep_init().

> For the rest of the platforms, DBI access happens as usual.

I don't really understand what "as usual" means here.  I guess it just
means "if the driver doesn't set 'core_init_notifier', nothing
changes"?  I would at least make it specific to make it clear that
"rest of the platforms" means "those that don't set
core_init_notifier".

> Co-developed-by: Vidya Sagar <vidyas@nvidia.com>
> Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
> ---
>  .../pci/controller/dwc/pcie-designware-ep.c   | 139 ++++++++++++------
>  1 file changed, 91 insertions(+), 48 deletions(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
> index f6207989fc6a..b1c79cd8e25f 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-ep.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
> @@ -662,14 +662,19 @@ static unsigned int dw_pcie_ep_find_ext_capability(struct dw_pcie *pci, int cap)
>  	return 0;
>  }
>  
> -int dw_pcie_ep_init_complete(struct dw_pcie_ep *ep)
> +static int dw_pcie_ep_late_init(struct dw_pcie_ep *ep)
>  {
>  	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> +	struct dw_pcie_ep_func *ep_func;
> +	struct device *dev = pci->dev;
> +	struct pci_epc *epc = ep->epc;
>  	unsigned int offset, ptm_cap_base;
>  	unsigned int nbars;
>  	u8 hdr_type;
> +	u8 func_no;
> +	int i, ret;
> +	void *addr;
>  	u32 reg;
> -	int i;
>  
>  	hdr_type = dw_pcie_readb_dbi(pci, PCI_HEADER_TYPE) &
>  		   PCI_HEADER_TYPE_MASK;
> @@ -680,6 +685,55 @@ int dw_pcie_ep_init_complete(struct dw_pcie_ep *ep)
>  		return -EIO;
>  	}
>  
> +	dw_pcie_version_detect(pci);
> +
> +	dw_pcie_iatu_detect(pci);
> +
> +	ret = dw_pcie_edma_detect(pci);
> +	if (ret)
> +		return ret;
> +
> +	if (!ep->ib_window_map) {
> +		ep->ib_window_map = devm_bitmap_zalloc(dev, pci->num_ib_windows,
> +						       GFP_KERNEL);
> +		if (!ep->ib_window_map)
> +			goto err_remove_edma;
> +	}
> +
> +	if (!ep->ob_window_map) {
> +		ep->ob_window_map = devm_bitmap_zalloc(dev, pci->num_ob_windows,
> +						       GFP_KERNEL);
> +		if (!ep->ob_window_map)
> +			goto err_remove_edma;
> +	}
> +
> +	if (!ep->outbound_addr) {
> +		addr = devm_kcalloc(dev, pci->num_ob_windows, sizeof(phys_addr_t),
> +				    GFP_KERNEL);
> +		if (!addr)
> +			goto err_remove_edma;
> +		ep->outbound_addr = addr;
> +	}
> +
> +	for (func_no = 0; func_no < epc->max_functions; func_no++) {
> +
> +		ep_func = dw_pcie_ep_get_func_from_ep(ep, func_no);
> +		if (ep_func)
> +			continue;
> +
> +		ep_func = devm_kzalloc(dev, sizeof(*ep_func), GFP_KERNEL);
> +		if (!ep_func)
> +			goto err_remove_edma;
> +
> +		ep_func->func_no = func_no;
> +		ep_func->msi_cap = dw_pcie_ep_find_capability(ep, func_no,
> +							      PCI_CAP_ID_MSI);
> +		ep_func->msix_cap = dw_pcie_ep_find_capability(ep, func_no,
> +							       PCI_CAP_ID_MSIX);
> +
> +		list_add_tail(&ep_func->list, &ep->func_list);
> +	}
> +
>  	offset = dw_pcie_ep_find_ext_capability(pci, PCI_EXT_CAP_ID_REBAR);
>  	ptm_cap_base = dw_pcie_ep_find_ext_capability(pci, PCI_EXT_CAP_ID_PTM);
>  
> @@ -714,14 +768,38 @@ int dw_pcie_ep_init_complete(struct dw_pcie_ep *ep)
>  	dw_pcie_dbi_ro_wr_dis(pci);
>  
>  	return 0;
> +
> +err_remove_edma:
> +	dw_pcie_edma_remove(pci);
> +
> +	return ret;
> +}
> +
> +int dw_pcie_ep_init_complete(struct dw_pcie_ep *ep)
> +{
> +	struct pci_epc *epc = ep->epc;
> +	int ret;
> +
> +	ret = dw_pcie_ep_late_init(ep);
> +	if (ret)
> +		goto err_cleanup;
> +
> +	return 0;
> +
> +err_cleanup:
> +	pci_epc_mem_free_addr(epc, ep->msi_mem_phys, ep->msi_mem,
> +			      epc->mem->window.page_size);
> +	pci_epc_mem_exit(epc);
> +	if (ep->ops->deinit)
> +		ep->ops->deinit(ep);
> +
> +	return ret;
>  }
>  EXPORT_SYMBOL_GPL(dw_pcie_ep_init_complete);
>  
>  int dw_pcie_ep_init(struct dw_pcie_ep *ep)
>  {
>  	int ret;
> -	void *addr;
> -	u8 func_no;
>  	struct resource *res;
>  	struct pci_epc *epc;
>  	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
> @@ -729,7 +807,6 @@ int dw_pcie_ep_init(struct dw_pcie_ep *ep)
>  	struct platform_device *pdev = to_platform_device(dev);
>  	struct device_node *np = dev->of_node;
>  	const struct pci_epc_features *epc_features;
> -	struct dw_pcie_ep_func *ep_func;
>  
>  	INIT_LIST_HEAD(&ep->func_list);
>  
> @@ -747,26 +824,6 @@ int dw_pcie_ep_init(struct dw_pcie_ep *ep)
>  	if (ep->ops->pre_init)
>  		ep->ops->pre_init(ep);
>  
> -	dw_pcie_version_detect(pci);
> -
> -	dw_pcie_iatu_detect(pci);
> -
> -	ep->ib_window_map = devm_bitmap_zalloc(dev, pci->num_ib_windows,
> -					       GFP_KERNEL);
> -	if (!ep->ib_window_map)
> -		return -ENOMEM;
> -
> -	ep->ob_window_map = devm_bitmap_zalloc(dev, pci->num_ob_windows,
> -					       GFP_KERNEL);
> -	if (!ep->ob_window_map)
> -		return -ENOMEM;
> -
> -	addr = devm_kcalloc(dev, pci->num_ob_windows, sizeof(phys_addr_t),
> -			    GFP_KERNEL);
> -	if (!addr)
> -		return -ENOMEM;
> -	ep->outbound_addr = addr;
> -
>  	epc = devm_pci_epc_create(dev, &epc_ops);
>  	if (IS_ERR(epc)) {
>  		dev_err(dev, "Failed to create epc device\n");
> @@ -780,20 +837,6 @@ int dw_pcie_ep_init(struct dw_pcie_ep *ep)
>  	if (ret < 0)
>  		epc->max_functions = 1;
>  
> -	for (func_no = 0; func_no < epc->max_functions; func_no++) {
> -		ep_func = devm_kzalloc(dev, sizeof(*ep_func), GFP_KERNEL);
> -		if (!ep_func)
> -			return -ENOMEM;
> -
> -		ep_func->func_no = func_no;
> -		ep_func->msi_cap = dw_pcie_ep_find_capability(ep, func_no,
> -							      PCI_CAP_ID_MSI);
> -		ep_func->msix_cap = dw_pcie_ep_find_capability(ep, func_no,
> -							       PCI_CAP_ID_MSIX);
> -
> -		list_add_tail(&ep_func->list, &ep->func_list);
> -	}
> -
>  	if (ep->ops->ep_init)
>  		ep->ops->ep_init(ep);
>  
> @@ -812,25 +855,25 @@ int dw_pcie_ep_init(struct dw_pcie_ep *ep)
>  		goto err_exit_epc_mem;
>  	}
>  
> -	ret = dw_pcie_edma_detect(pci);
> -	if (ret)
> -		goto err_free_epc_mem;
> -
>  	if (ep->ops->get_features) {
>  		epc_features = ep->ops->get_features(ep);
>  		if (epc_features->core_init_notifier)
>  			return 0;
>  	}
>  
> -	ret = dw_pcie_ep_init_complete(ep);
> +	/*
> +	 * NOTE:- Avoid accessing the hardware (Ex:- DBI space) before this
> +	 * step as platforms that implement 'core_init_notifier' feature may
> +	 * not have the hardware ready (i.e. core initialized) for access
> +	 * (Ex: tegra194). Any hardware access on such platforms result
> +	 * in system hang.

What specifically does "before this step" refer to?  I think the
intent is that it's something to do with "core_init_notifier", but
there's no *direct* connection because there's no test of
core_init_notifier except here and the test case.

> +	ret = dw_pcie_ep_late_init(ep);
>  	if (ret)
> -		goto err_remove_edma;
> +		goto err_free_epc_mem;
>  
>  	return 0;
>  
> -err_remove_edma:
> -	dw_pcie_edma_remove(pci);
> -
>  err_free_epc_mem:
>  	pci_epc_mem_free_addr(epc, ep->msi_mem_phys, ep->msi_mem,
>  			      epc->mem->window.page_size);
> -- 
> 2.25.1
> 

  parent reply	other threads:[~2024-01-09 21:12 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-20  8:40 [PATCH v7 0/2] PCI: designware-ep: Fix DBI access before core init Manivannan Sadhasivam
2023-11-20  8:40 ` [PATCH v7 1/2] " Manivannan Sadhasivam
2023-11-28 17:41   ` Niklas Cassel
2023-11-29 11:38     ` Niklas Cassel
2023-11-29 15:51       ` Manivannan Sadhasivam
2023-11-29 16:36         ` Niklas Cassel
2023-11-30  6:38           ` Manivannan Sadhasivam
2023-11-30 11:22             ` Niklas Cassel
2023-11-30 13:57               ` Manivannan Sadhasivam
2023-12-23  1:52                 ` Niklas Cassel
2024-01-06 15:12                   ` Manivannan Sadhasivam
2024-01-18 18:33                     ` Niklas Cassel
2024-01-19  7:28                       ` Manivannan Sadhasivam
2024-01-23  9:18                         ` Niklas Cassel
2024-01-23  9:59                           ` Manivannan Sadhasivam
2023-11-29 14:08     ` Manivannan Sadhasivam
2024-01-09 21:12   ` Bjorn Helgaas [this message]
2023-11-20  8:40 ` [PATCH v7 2/2] PCI: designware-ep: Move pci_epc_init_notify() inside dw_pcie_ep_init_complete() Manivannan Sadhasivam
2024-01-07  7:27 ` [PATCH v7 0/2] PCI: designware-ep: Fix DBI access before core init Krzysztof Wilczyński
2024-01-07  7:34   ` Krzysztof Wilczyński
2024-01-09 17:58   ` Niklas Cassel
2024-01-10  3:11     ` Manivannan Sadhasivam
2024-01-10 10:01       ` Niklas Cassel
2024-01-10 15:27         ` Manivannan Sadhasivam

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240109211239.GA2016581@bhelgaas \
    --to=helgaas@kernel.org \
    --cc=bhelgaas@google.com \
    --cc=fancer.lancer@gmail.com \
    --cc=gustavo.pimentel@synopsys.com \
    --cc=jingoohan1@gmail.com \
    --cc=kw@linux.com \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lpieralisi@kernel.org \
    --cc=manivannan.sadhasivam@linaro.org \
    --cc=quic_bjorande@quicinc.com \
    --cc=robh@kernel.org \
    --cc=vidyas@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).