From: Tom Rix <trix@redhat.com>
To: Max Zhen <max.zhen@xilinx.com>, Lizhi Hou <lizhi.hou@xilinx.com>,
linux-kernel@vger.kernel.org
Cc: linux-fpga@vger.kernel.org, sonal.santan@xilinx.com,
yliu@xilinx.com, michal.simek@xilinx.com, stefanos@xilinx.com,
devicetree@vger.kernel.org, mdf@kernel.org, robh@kernel.org
Subject: Re: [PATCH V4 XRT Alveo 09/20] fpga: xrt: management physical function driver (root)
Date: Wed, 14 Apr 2021 08:40:10 -0700
Message-ID: <9a1a3d6e-3a5d-2272-7235-d46d09589cf8@redhat.com>
In-Reply-To: <f38c7210-85aa-09af-9a52-ebb13ca3442e@xilinx.com>
On 4/9/21 11:50 AM, Max Zhen wrote:
> Hi Tom,
>
>
> On 3/31/21 6:03 AM, Tom Rix wrote:
>> On 3/23/21 10:29 PM, Lizhi Hou wrote:
>>> The PCIE device driver which attaches to management function on Alveo
>>> devices. It instantiates one or more group drivers which, in turn,
>>> instantiate platform drivers. The instantiation of group and platform
>>> drivers is completely dtb driven.
>>>
>>> Signed-off-by: Sonal Santan<sonal.santan@xilinx.com>
>>> Signed-off-by: Max Zhen<max.zhen@xilinx.com>
>>> Signed-off-by: Lizhi Hou<lizhi.hou@xilinx.com>
>>> ---
>>> drivers/fpga/xrt/mgmt/root.c | 333
>>> +++++++++++++++++++++++++++++++++++
>>> 1 file changed, 333 insertions(+)
>>> create mode 100644 drivers/fpga/xrt/mgmt/root.c
>>>
>>> diff --git a/drivers/fpga/xrt/mgmt/root.c
>>> b/drivers/fpga/xrt/mgmt/root.c
>>> new file mode 100644
>>> index 000000000000..f97f92807c01
>>> --- /dev/null
>>> +++ b/drivers/fpga/xrt/mgmt/root.c
>>> @@ -0,0 +1,333 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +/*
>>> + * Xilinx Alveo Management Function Driver
>>> + *
>>> + * Copyright (C) 2020-2021 Xilinx, Inc.
>>> + *
>>> + * Authors:
>>> + * Cheng Zhen<maxz@xilinx.com>
>>> + */
>>> +
>>> +#include <linux/module.h>
>>> +#include <linux/pci.h>
>>> +#include <linux/aer.h>
>>> +#include <linux/vmalloc.h>
>>> +#include <linux/delay.h>
>>> +
>>> +#include "xroot.h"
>>> +#include "xmgnt.h"
>>> +#include "metadata.h"
>>> +
>>> +#define XMGMT_MODULE_NAME "xrt-mgmt"
>> ok
>>> +#define XMGMT_DRIVER_VERSION "4.0.0"
>>> +
>>> +#define XMGMT_PDEV(xm) ((xm)->pdev)
>>> +#define XMGMT_DEV(xm) (&(XMGMT_PDEV(xm)->dev))
>>> +#define xmgmt_err(xm, fmt, args...) \
>>> + dev_err(XMGMT_DEV(xm), "%s: " fmt, __func__, ##args)
>>> +#define xmgmt_warn(xm, fmt, args...) \
>>> + dev_warn(XMGMT_DEV(xm), "%s: " fmt, __func__, ##args)
>>> +#define xmgmt_info(xm, fmt, args...) \
>>> + dev_info(XMGMT_DEV(xm), "%s: " fmt, __func__, ##args)
>>> +#define xmgmt_dbg(xm, fmt, args...) \
>>> + dev_dbg(XMGMT_DEV(xm), "%s: " fmt, __func__, ##args)
>>> +#define XMGMT_DEV_ID(_pcidev) \
>>> + ({ typeof(_pcidev) (pcidev) = (_pcidev); \
>>> + ((pci_domain_nr((pcidev)->bus) << 16) | \
>>> + PCI_DEVID((pcidev)->bus->number, 0)); })
>>> +
>>> +static struct class *xmgmt_class;
>>> +
>>> +/* PCI Device IDs */
>> add a comment on what a golden image is here something like
>>
>> /*
>>
>> * Golden image is preloaded on the device when it is shipped to
>> customer.
>>
>> * Then, customer can load other shells (from Xilinx or some other
>> vendor).
>>
>> * If something goes wrong with the shell, customer can always go back to
>>
>> * golden and start over again.
>>
>> */
>>
>
> Will do.
>
>
>>> +#define PCI_DEVICE_ID_U50_GOLDEN 0xD020
>>> +#define PCI_DEVICE_ID_U50 0x5020
>>> +static const struct pci_device_id xmgmt_pci_ids[] = {
>>> + { PCI_DEVICE(PCI_VENDOR_ID_XILINX, PCI_DEVICE_ID_U50_GOLDEN),
>>> }, /* Alveo U50 (golden) */
>>> + { PCI_DEVICE(PCI_VENDOR_ID_XILINX, PCI_DEVICE_ID_U50), }, /*
>>> Alveo U50 */
>>> + { 0, }
>>> +};
>>> +
>>> +struct xmgmt {
>>> + struct pci_dev *pdev;
>>> + void *root;
>>> +
>>> + bool ready;
>>> +};
>>> +
>>> +static int xmgmt_config_pci(struct xmgmt *xm)
>>> +{
>>> + struct pci_dev *pdev = XMGMT_PDEV(xm);
>>> + int rc;
>>> +
>>> + rc = pcim_enable_device(pdev);
>>> + if (rc < 0) {
>>> + xmgmt_err(xm, "failed to enable device: %d", rc);
>>> + return rc;
>>> + }
>>> +
>>> + rc = pci_enable_pcie_error_reporting(pdev);
>>> + if (rc)
>> ok
>>> + xmgmt_warn(xm, "failed to enable AER: %d", rc);
>>> +
>>> + pci_set_master(pdev);
>>> +
>>> + rc = pcie_get_readrq(pdev);
>>> + if (rc > 512)
>> 512 is magic number, change this to a #define
>
>
> Will do.
>
>
>>> + pcie_set_readrq(pdev, 512);
>>> + return 0;
>>> +}
>>> +
>>> +static int xmgmt_match_slot_and_save(struct device *dev, void *data)
>>> +{
>>> + struct xmgmt *xm = data;
>>> + struct pci_dev *pdev = to_pci_dev(dev);
>>> +
>>> + if (XMGMT_DEV_ID(pdev) == XMGMT_DEV_ID(xm->pdev)) {
>>> + pci_cfg_access_lock(pdev);
>>> + pci_save_state(pdev);
>>> + }
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +static void xmgmt_pci_save_config_all(struct xmgmt *xm)
>>> +{
>>> + bus_for_each_dev(&pci_bus_type, NULL, xm,
>>> xmgmt_match_slot_and_save);
>> refactor expected in v5 when pseudo bus change happens.
>
>
> There might be some mis-understanding here...
>
> No matter how we reorganize our code (using platform_device bus type
> or defining our own bus type), it's a driver that drives a PCIE device
> after all. So, this mgmt/root.c must be a PCIE driver, which may
> interact with a whole bunch of IP drivers through a pseudo bus we are
> about to create.
>
> What this code is doing here is completely of PCIE business (PCIE
> config space access). So, I think it is appropriate code in a PCIE
> driver.
>
> The PCIE device we are driving is a multi-function device. The mgmt pf
> is of function 0, which, according to PCIE spec, can manage other
> functions on the same device. So, I think it's appropriate for mgmt pf
> driver (this root driver) to find it's peer function (through PCIE bus
> type) on the same device and do something about it in certain special
> cases.
>
> Please let me know why you expect this code to be refactored and how
> you want it to be refactored. I might have missed something here...
>
OK, I get it.
Thanks for the explanation.
Tom
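
For anyone following along, the peer matching described above can be
sketched in plain userspace C. This is only an illustration under stated
assumptions, not the driver's code: struct pci_addr, dev_key() and
count_peers() are hypothetical names, and the 16-bit domain is a
simplification. dev_key() mirrors what XMGMT_DEV_ID() computes in the
patch: the domain in the upper bits plus PCI_DEVID(bus, 0), so devfn is
zeroed and every function on the same bus of the same domain compares
equal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the address fields of struct pci_dev. */
struct pci_addr {
	uint16_t domain;	/* PCI segment (domain) number, simplified to 16 bits */
	uint8_t bus;		/* bus number */
	uint8_t devfn;		/* slot and function, encoded as in the kernel */
};

/*
 * Mirrors XMGMT_DEV_ID(): domain shifted left by 16, OR'ed with
 * PCI_DEVID(bus, 0), i.e. the bus number shifted left by 8 with devfn
 * forced to 0. The function number never enters the key.
 */
static uint32_t dev_key(const struct pci_addr *a)
{
	return ((uint32_t)a->domain << 16) | ((uint32_t)a->bus << 8);
}

/* Count entries in list[] whose key equals mgmt's key (mgmt itself included). */
static size_t count_peers(const struct pci_addr *mgmt,
			  const struct pci_addr *list, size_t n)
{
	size_t i, hits = 0;

	assert(mgmt && (list || n == 0));
	for (i = 0; i < n; i++)
		if (dev_key(&list[i]) == dev_key(mgmt))
			hits++;
	return hits;
}
```

With the mgmt pf at domain 0, bus 0x3b, function 0, a list holding
functions 0 and 1 on that bus plus devices on another bus or domain
matches only the first two. In the driver the same comparison runs
inside the bus_for_each_dev() callback.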
>
>>> +}
>>> +
>>> +static int xmgmt_match_slot_and_restore(struct device *dev, void
>>> *data)
>>> +{
>>> + struct xmgmt *xm = data;
>>> + struct pci_dev *pdev = to_pci_dev(dev);
>>> +
>>> + if (XMGMT_DEV_ID(pdev) == XMGMT_DEV_ID(xm->pdev)) {
>>> + pci_restore_state(pdev);
>>> + pci_cfg_access_unlock(pdev);
>>> + }
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +static void xmgmt_pci_restore_config_all(struct xmgmt *xm)
>>> +{
>>> + bus_for_each_dev(&pci_bus_type, NULL, xm,
>>> xmgmt_match_slot_and_restore);
>>> +}
>>> +
>>> +static void xmgmt_root_hot_reset(struct pci_dev *pdev)
>>> +{
>>> + struct xmgmt *xm = pci_get_drvdata(pdev);
>>> + struct pci_bus *bus;
>>> + u8 pci_bctl;
>>> + u16 pci_cmd, devctl;
>>> + int i, ret;
>>> +
>>> + xmgmt_info(xm, "hot reset start");
>>> +
>>> + xmgmt_pci_save_config_all(xm);
>>> +
>>> + pci_disable_device(pdev);
>>> +
>>> + bus = pdev->bus;
>> whitespace, all these nl's are not needed
>
>
> Will remove them.
>
>
>>> +
>>> + /*
>>> + * When flipping the SBR bit, device can fall off the bus.
>>> This is
>>> + * usually no problem at all so long as drivers are working
>>> properly
>>> + * after SBR. However, some systems complain bitterly when the
>>> device
>>> + * falls off the bus.
>>> + * The quick solution is to temporarily disable the SERR
>>> reporting of
>>> + * switch port during SBR.
>>> + */
>>> +
>>> + pci_read_config_word(bus->self, PCI_COMMAND, &pci_cmd);
>>> + pci_write_config_word(bus->self, PCI_COMMAND, (pci_cmd &
>>> ~PCI_COMMAND_SERR));
>>> + pcie_capability_read_word(bus->self, PCI_EXP_DEVCTL, &devctl);
>>> + pcie_capability_write_word(bus->self, PCI_EXP_DEVCTL, (devctl
>>> & ~PCI_EXP_DEVCTL_FERE));
>>> + pci_read_config_byte(bus->self, PCI_BRIDGE_CONTROL, &pci_bctl);
>>> + pci_write_config_byte(bus->self, PCI_BRIDGE_CONTROL, pci_bctl
>>> | PCI_BRIDGE_CTL_BUS_RESET);
>> ok
>>> + msleep(100);
>>> + pci_write_config_byte(bus->self, PCI_BRIDGE_CONTROL, pci_bctl);
>>> + ssleep(1);
>>> +
>>> + pcie_capability_write_word(bus->self, PCI_EXP_DEVCTL, devctl);
>>> + pci_write_config_word(bus->self, PCI_COMMAND, pci_cmd);
>>> +
>>> + ret = pci_enable_device(pdev);
>>> + if (ret)
>>> + xmgmt_err(xm, "failed to enable device, ret %d", ret);
>>> +
>>> + for (i = 0; i < 300; i++) {
>>> + pci_read_config_word(pdev, PCI_COMMAND, &pci_cmd);
>>> + if (pci_cmd != 0xffff)
>>> + break;
>>> + msleep(20);
>>> + }
>>> + if (i == 300)
>>> + xmgmt_err(xm, "time'd out waiting for device to be
>>> online after reset");
>> time'd -> timed
>
>
> Will do.
>
>
> Thanks,
>
> Max
>
>> Tom
>>
>>> +
>>> + xmgmt_info(xm, "waiting for %d ms", i * 20);
>>> + xmgmt_pci_restore_config_all(xm);
>>> + xmgmt_config_pci(xm);
>>> +}
>>> +
>>> +static int xmgmt_create_root_metadata(struct xmgmt *xm, char
>>> **root_dtb)
>>> +{
>>> + char *dtb = NULL;
>>> + int ret;
>>> +
>>> + ret = xrt_md_create(XMGMT_DEV(xm), &dtb);
>>> + if (ret) {
>>> + xmgmt_err(xm, "create metadata failed, ret %d", ret);
>>> + goto failed;
>>> + }
>>> +
>>> + ret = xroot_add_vsec_node(xm->root, dtb);
>>> + if (ret == -ENOENT) {
>>> + /*
>>> + * We may be dealing with a MFG board.
>>> + * Try vsec-golden which will bring up all hard-coded
>>> leaves
>>> + * at hard-coded offsets.
>>> + */
>>> + ret = xroot_add_simple_node(xm->root, dtb,
>>> XRT_MD_NODE_VSEC_GOLDEN);
>>> + } else if (ret == 0) {
>>> + ret = xroot_add_simple_node(xm->root, dtb,
>>> XRT_MD_NODE_MGMT_MAIN);
>>> + }
>>> + if (ret)
>>> + goto failed;
>>> +
>>> + *root_dtb = dtb;
>>> + return 0;
>>> +
>>> +failed:
>>> + vfree(dtb);
>>> + return ret;
>>> +}
>>> +
>>> +static ssize_t ready_show(struct device *dev,
>>> + struct device_attribute *da,
>>> + char *buf)
>>> +{
>>> + struct pci_dev *pdev = to_pci_dev(dev);
>>> + struct xmgmt *xm = pci_get_drvdata(pdev);
>>> +
>>> + return sprintf(buf, "%d\n", xm->ready);
>>> +}
>>> +static DEVICE_ATTR_RO(ready);
>>> +
>>> +static struct attribute *xmgmt_root_attrs[] = {
>>> + &dev_attr_ready.attr,
>>> + NULL
>>> +};
>>> +
>>> +static struct attribute_group xmgmt_root_attr_group = {
>>> + .attrs = xmgmt_root_attrs,
>>> +};
>>> +
>>> +static struct xroot_physical_function_callback xmgmt_xroot_pf_cb = {
>>> + .xpc_hot_reset = xmgmt_root_hot_reset,
>>> +};
>>> +
>>> +static int xmgmt_probe(struct pci_dev *pdev, const struct
>>> pci_device_id *id)
>>> +{
>>> + int ret;
>>> + struct device *dev = &pdev->dev;
>>> + struct xmgmt *xm = devm_kzalloc(dev, sizeof(*xm), GFP_KERNEL);
>>> + char *dtb = NULL;
>>> +
>>> + if (!xm)
>>> + return -ENOMEM;
>>> + xm->pdev = pdev;
>>> + pci_set_drvdata(pdev, xm);
>>> +
>>> + ret = xmgmt_config_pci(xm);
>>> + if (ret)
>>> + goto failed;
>>> +
>>> + ret = xroot_probe(pdev, &xmgmt_xroot_pf_cb, &xm->root);
>>> + if (ret)
>>> + goto failed;
>>> +
>>> + ret = xmgmt_create_root_metadata(xm, &dtb);
>>> + if (ret)
>>> + goto failed_metadata;
>>> +
>>> + ret = xroot_create_group(xm->root, dtb);
>>> + vfree(dtb);
>>> + if (ret)
>>> + xmgmt_err(xm, "failed to create root group: %d", ret);
>>> +
>>> + if (!xroot_wait_for_bringup(xm->root))
>>> + xmgmt_err(xm, "failed to bringup all groups");
>>> + else
>>> + xm->ready = true;
>>> +
>>> + ret = sysfs_create_group(&pdev->dev.kobj,
>>> &xmgmt_root_attr_group);
>>> + if (ret) {
>>> + /* Warning instead of failing the probe. */
>>> + xmgmt_warn(xm, "create xmgmt root attrs failed: %d",
>>> ret);
>>> + }
>>> +
>>> + xroot_broadcast(xm->root, XRT_EVENT_POST_CREATION);
>>> + xmgmt_info(xm, "%s started successfully", XMGMT_MODULE_NAME);
>>> + return 0;
>>> +
>>> +failed_metadata:
>>> + xroot_remove(xm->root);
>>> +failed:
>>> + pci_set_drvdata(pdev, NULL);
>>> + return ret;
>>> +}
>>> +
>>> +static void xmgmt_remove(struct pci_dev *pdev)
>>> +{
>>> + struct xmgmt *xm = pci_get_drvdata(pdev);
>>> +
>>> + xroot_broadcast(xm->root, XRT_EVENT_PRE_REMOVAL);
>>> + sysfs_remove_group(&pdev->dev.kobj, &xmgmt_root_attr_group);
>>> + xroot_remove(xm->root);
>>> + pci_disable_pcie_error_reporting(xm->pdev);
>>> + xmgmt_info(xm, "%s cleaned up successfully", XMGMT_MODULE_NAME);
>>> +}
>>> +
>>> +static struct pci_driver xmgmt_driver = {
>>> + .name = XMGMT_MODULE_NAME,
>>> + .id_table = xmgmt_pci_ids,
>>> + .probe = xmgmt_probe,
>>> + .remove = xmgmt_remove,
>>> +};
>>> +
>>> +static int __init xmgmt_init(void)
>>> +{
>>> + int res = 0;
>>> +
>>> + res = xmgmt_register_leaf();
>>> + if (res)
>>> + return res;
>>> +
>>> + xmgmt_class = class_create(THIS_MODULE, XMGMT_MODULE_NAME);
>>> + if (IS_ERR(xmgmt_class))
>>> + return PTR_ERR(xmgmt_class);
>>> +
>>> + res = pci_register_driver(&xmgmt_driver);
>>> + if (res) {
>>> + class_destroy(xmgmt_class);
>>> + return res;
>>> + }
>>> +
>>> + return 0;
>>> +}
>>> +
>>> +static __exit void xmgmt_exit(void)
>>> +{
>>> + pci_unregister_driver(&xmgmt_driver);
>>> + class_destroy(xmgmt_class);
>>> + xmgmt_unregister_leaf();
>>> +}
>>> +
>>> +module_init(xmgmt_init);
>>> +module_exit(xmgmt_exit);
>>> +
>>> +MODULE_DEVICE_TABLE(pci, xmgmt_pci_ids);
>>> +MODULE_VERSION(XMGMT_DRIVER_VERSION);
>>> +MODULE_AUTHOR("XRT Team<runtime@xilinx.com>");
>>> +MODULE_DESCRIPTION("Xilinx Alveo management function driver");
>>> +MODULE_LICENSE("GPL v2");
>