All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yazen Ghannam <yazen.ghannam@amd.com>
To: Naveen Krishna Chatradhi <nchatrad@amd.com>
Cc: linux-edac@vger.kernel.org, x86@kernel.org,
	linux-kernel@vger.kernel.org, bp@alien8.de, mingo@redhat.com,
	mchehab@kernel.org, Muralidhara M K <muralimk@amd.com>
Subject: Re: [PATCH v5 1/5] x86/amd_nb: Add support for northbridges on Aldebaran
Date: Wed, 27 Oct 2021 20:54:58 +0000	[thread overview]
Message-ID: <YXm8or71P9rcukk1@yaz-ubuntu> (raw)
In-Reply-To: <20211025145018.29985-2-nchatrad@amd.com>

On Mon, Oct 25, 2021 at 08:20:14PM +0530, Naveen Krishna Chatradhi wrote:
...
> +/* GPU Data Fabric ID Device 24 Function 1 */
> +#define PCI_DEVICE_ID_AMD_ALDEBARAN_DF_F1 0x14d1
> +
> +static struct pci_dev *get_gpu_df_f1(void)
> +{
> +	return pci_get_device(PCI_VENDOR_ID_AMD,
> +			      PCI_DEVICE_ID_AMD_ALDEBARAN_DF_F1, NULL);
> +}
> +
> +/* DF18xF1 registers on Aldebaran GPU */
> +#define REG_LOCAL_NODE_TYPE_MAP		0x144
> +#define REG_RMT_NODE_TYPE_MAP		0x148
> +
> +/*
> + * Newer AMD CPUs and GPUs whose data fabrics can be connected via custom xGMI
> + * links, comes with registers to gather local and remote node type map info.
> + *
> + * "Local Node Type" refers to nodes with the same type as that from which the
> + * register is read, and "Remote Node Type" refers to nodes with a different type.
> + *
> + * This function, reads the registers from GPU DF function 1.
> + * Hence, local nodes are GPU and remote nodes are CPUs.
> + */
> +static int amd_get_node_map(struct pci_dev *pdev)
> +{
> +	struct amd_node_map *nodemap;
> +	u32 tmp;
> +
> +	nodemap = kmalloc(sizeof(*nodemap), GFP_KERNEL);
> +	if (!nodemap)
> +		return -ENOMEM;
> +
> +	pci_read_config_dword(pdev, REG_LOCAL_NODE_TYPE_MAP, &tmp);
> +	nodemap->gpu_node_start_id = tmp & 0xFFF;
> +
> +	pci_read_config_dword(pdev, REG_RMT_NODE_TYPE_MAP, &tmp);
> +	nodemap->cpu_node_count = tmp >> 16 & 0xFFF;
> +
> +	amd_northbridges.nodemap = nodemap;
> +	return 0;
> +}
> +
...
>  int amd_cache_northbridges(void)
>  {
>  	const struct pci_device_id *misc_ids = amd_nb_misc_ids;
>  	const struct pci_device_id *link_ids = amd_nb_link_ids;
>  	const struct pci_device_id *root_ids = amd_root_ids;
> -	struct pci_dev *root, *misc, *link;
> +	struct pci_dev *root, *misc, *link, *dff1;
...
> +	/*
> +	 * The number of miscs, roots and roots_per_misc might vary on different
> +	 * nodes of a heterogeneous system.
> +	 * Calculate roots_per_misc accordingly in order to skip the redundant
> +	 * roots and map the DF/SMN interfaces to correct PCI roots.
> +	 */
> +	if (gpu_root_count && gpu_misc_count) {
> +		dff1 = get_gpu_df_f1();

dff1 is only used in this section, so it can be declared here.

> +		if (!dff1) {
> +			pr_debug("Failed to gather GPU node info.\n");

This message doesn't make sense to me. The failure is that a particular PCI
device was not found. Gathering node info hasn't even been attempted yet.

> +			return -ENODEV;
> +		}
> +
> +		if (amd_get_node_map(dff1))
> +			return -ENOMEM;
> +

Also, why do these functions need to be split up? If it was because of the
return values, then you could have done the following.

		int ret = amd_get_node_map();

		if (ret)
			return ret;

Thanks,
Yazen

  reply	other threads:[~2021-10-27 20:55 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-25 14:50 [PATCH v5 0/5] x86/edac/amd64: Add heterogeneous node support Naveen Krishna Chatradhi
2021-10-25 14:50 ` [PATCH v5 1/5] x86/amd_nb: Add support for northbridges on Aldebaran Naveen Krishna Chatradhi
2021-10-27 20:54   ` Yazen Ghannam [this message]
2021-10-25 14:50 ` [PATCH v5 2/5] EDAC/mce_amd: Extract node id from MCA_IPID Naveen Krishna Chatradhi
2021-10-25 14:50 ` [PATCH v5 3/5] EDAC/amd64: Extend family ops functions Naveen Krishna Chatradhi
2021-10-27 21:08   ` Yazen Ghannam
2021-10-25 14:50 ` [PATCH v5 4/5] EDAC/amd64: Move struct fam_type into amd64_pvt structure Naveen Krishna Chatradhi
2021-10-27 21:12   ` Yazen Ghannam
2021-10-25 14:50 ` [PATCH v5 5/5] EDAC/amd64: Enumerate memory on Aldebaran GPU nodes Naveen Krishna Chatradhi
2021-10-27 22:32   ` Yazen Ghannam

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YXm8or71P9rcukk1@yaz-ubuntu \
    --to=yazen.ghannam@amd.com \
    --cc=bp@alien8.de \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mchehab@kernel.org \
    --cc=mingo@redhat.com \
    --cc=muralimk@amd.com \
    --cc=nchatrad@amd.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.