From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: linuxppc-dev@lists.ozlabs.org
Cc: kvm@vger.kernel.org, Alexey Kardashevskiy <aik@ozlabs.ru>,
Alistair Popple <alistair@popple.id.au>,
Alex Williamson <alex.williamson@redhat.com>,
Oliver O'Halloran <oohall@gmail.com>,
David Gibson <david@gibson.dropbear.id.au>
Subject: [PATCH kernel RFC 1/4] powerpc/powernv/ioda: Rework for huge DMA window at 4GB
Date: Mon, 2 Dec 2019 12:59:50 +1100 [thread overview]
Message-ID: <20191202015953.127902-2-aik@ozlabs.ru> (raw)
In-Reply-To: <20191202015953.127902-1-aik@ozlabs.ru>
This moves code to make the next patches look simpler. In particular:
1. Separate locals declaration as we will be allocating a smaller DMA
window if a TVE1_4GB option (allows a huge DMA windows at 4GB) is enabled;
2. Pass the bypass offset directly to pnv_pci_ioda2_create_table()
as it is the only information needed from @pe;
3. Use PAGE_SHIFT for it_map allocation estimate and @tceshift for
the IOMMU page size; this makes the distinction clear and allows
easy switching between different IOMMU page size.
These changes should not cause behavioral change.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
I really need 1), 2) makes the code less dependent on the PE struct member
value (==easier to follow), 3) is to enable 2MB quickly for the default
DMA window for debugging/performance testing.
---
arch/powerpc/platforms/powernv/pci-ioda.c | 38 ++++++++++++-----------
1 file changed, 20 insertions(+), 18 deletions(-)
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 9e55bd9d7ecd..328fb27a966f 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2349,15 +2349,10 @@ static void pnv_pci_ioda2_set_bypass(struct pnv_ioda_pe *pe, bool enable)
pe->tce_bypass_enabled = enable;
}
-static long pnv_pci_ioda2_create_table(struct iommu_table_group *table_group,
- int num, __u32 page_shift, __u64 window_size, __u32 levels,
+static long pnv_pci_ioda2_create_table(int nid, int num, __u64 bus_offset,
+ __u32 page_shift, __u64 window_size, __u32 levels,
bool alloc_userspace_copy, struct iommu_table **ptbl)
{
- struct pnv_ioda_pe *pe = container_of(table_group, struct pnv_ioda_pe,
- table_group);
- int nid = pe->phb->hose->node;
- __u64 bus_offset = num ?
- pe->table_group.tce64_start : table_group->tce32_start;
long ret;
struct iommu_table *tbl;
@@ -2384,21 +2379,23 @@ static long pnv_pci_ioda2_setup_default_config(struct pnv_ioda_pe *pe)
{
struct iommu_table *tbl = NULL;
long rc;
- unsigned long res_start, res_end;
+ u64 max_memory, maxblock, window_size;
+ const unsigned int tceshift = PAGE_SHIFT;
+ unsigned long res_start, res_end, tces_order, tcelevel_order, levels;
/*
* crashkernel= specifies the kdump kernel's maximum memory at
* some offset and there is no guaranteed the result is a power
* of 2, which will cause errors later.
*/
- const u64 max_memory = __rounddown_pow_of_two(memory_hotplug_max());
+ max_memory = __rounddown_pow_of_two(memory_hotplug_max());
/*
* In memory constrained environments, e.g. kdump kernel, the
* DMA window can be larger than available memory, which will
* cause errors later.
*/
- const u64 maxblock = 1UL << (PAGE_SHIFT + MAX_ORDER - 1);
+ maxblock = 1UL << (PAGE_SHIFT + MAX_ORDER - 1);
/*
* We create the default window as big as we can. The constraint is
@@ -2408,11 +2405,11 @@ static long pnv_pci_ioda2_setup_default_config(struct pnv_ioda_pe *pe)
* to support crippled devices (i.e. not fully 64bit DMAble) only.
*/
/* iommu_table::it_map uses 1 bit per IOMMU page, hence 8 */
- const u64 window_size = min((maxblock * 8) << PAGE_SHIFT, max_memory);
+ window_size = min((maxblock * 8) << tceshift, max_memory);
/* Each TCE level cannot exceed maxblock so go multilevel if needed */
- unsigned long tces_order = ilog2(window_size >> PAGE_SHIFT);
- unsigned long tcelevel_order = ilog2(maxblock >> 3);
- unsigned int levels = tces_order / tcelevel_order;
+ tces_order = ilog2(window_size >> tceshift);
+ tcelevel_order = ilog2(maxblock >> 3);
+ levels = tces_order / tcelevel_order;
if (tces_order % tcelevel_order)
levels += 1;
@@ -2422,8 +2419,8 @@ static long pnv_pci_ioda2_setup_default_config(struct pnv_ioda_pe *pe)
*/
levels = max_t(unsigned int, levels, POWERNV_IOMMU_DEFAULT_LEVELS);
- rc = pnv_pci_ioda2_create_table(&pe->table_group, 0, PAGE_SHIFT,
- window_size, levels, false, &tbl);
+ rc = pnv_pci_ioda2_create_table(pe->phb->hose->node,
+ 0, 0, tceshift, window_size, levels, false, &tbl);
if (rc) {
pe_err(pe, "Failed to create 32-bit TCE table, err %ld",
rc);
@@ -2525,8 +2522,13 @@ static long pnv_pci_ioda2_create_table_userspace(
int num, __u32 page_shift, __u64 window_size, __u32 levels,
struct iommu_table **ptbl)
{
- long ret = pnv_pci_ioda2_create_table(table_group,
- num, page_shift, window_size, levels, true, ptbl);
+ struct pnv_ioda_pe *pe = container_of(table_group, struct pnv_ioda_pe,
+ table_group);
+ __u64 bus_offset = num ?
+ pe->table_group.tce64_start : table_group->tce32_start;
+ long ret = pnv_pci_ioda2_create_table(pe->phb->hose->node,
+ num, bus_offset, page_shift, window_size, levels, true,
+ ptbl);
if (!ret)
(*ptbl)->it_allocated_size = pnv_pci_ioda2_get_table_size(
--
2.17.1
next prev parent reply other threads:[~2019-12-02 2:13 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-12-02 1:59 [PATCH kernel RFC 0/4] powerpc/powenv/ioda: Allow huge DMA window at 4GB Alexey Kardashevskiy
2019-12-02 1:59 ` Alexey Kardashevskiy [this message]
2019-12-02 1:59 ` [PATCH kernel RFC 2/4] powerpc/powernv/ioda: Allow smaller TCE table levels Alexey Kardashevskiy
2019-12-02 1:59 ` [PATCH kernel RFC 3/4] powerpc/powernv/phb4: Add 4GB IOMMU bypass mode Alexey Kardashevskiy
2019-12-02 1:59 ` [PATCH kernel RFC 4/4] vfio/spapr_tce: Advertise and allow a huge DMA windows at 4GB Alexey Kardashevskiy
2019-12-02 5:36 ` [PATCH kernel RFC 0/4] powerpc/powenv/ioda: Allow huge DMA window " Alistair Popple
2019-12-02 5:58 ` Alexey Kardashevskiy
2019-12-02 5:51 ` [PATCH kernel RFC 00/4] powerpc/powernv/ioda: Move TCE bypass base to PE Alexey Kardashevskiy
2020-01-10 4:18 ` [PATCH kernel RFC 0/4] powerpc/powenv/ioda: Allow huge DMA window at 4GB Alexey Kardashevskiy
2020-01-23 0:53 ` Alexey Kardashevskiy
2020-01-23 1:17 ` David Gibson
2020-01-23 8:42 ` Alexey Kardashevskiy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20191202015953.127902-2-aik@ozlabs.ru \
--to=aik@ozlabs.ru \
--cc=alex.williamson@redhat.com \
--cc=alistair@popple.id.au \
--cc=david@gibson.dropbear.id.au \
--cc=kvm@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=oohall@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).