xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Stefano Stabellini <sstabellini@kernel.org>
To: Oleksandr <olekstysh@gmail.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>,
	Julien Grall <julien@xen.org>,
	 xen-devel@lists.xenproject.org,
	 Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>,
	 Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>,
	 Henry Wang <Henry.Wang@arm.com>,
	 Bertrand Marquis <bertrand.marquis@arm.com>,
	Wei Chen <Wei.Chen@arm.com>,
	 fnuv@xilinx.com
Subject: Re: [PATCH V2 2/3] xen/arm: Add handling of extended regions for Dom0
Date: Wed, 22 Sep 2021 13:50:10 -0700 (PDT)	[thread overview]
Message-ID: <alpine.DEB.2.21.2109221349350.17979@sstabellini-ThinkPad-T480s> (raw)
In-Reply-To: <1ac58681-ef42-bb32-25b6-620d51d4f075@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 12828 bytes --]

On Wed, 22 Sep 2021, Oleksandr wrote:
> On 22.09.21 01:00, Stefano Stabellini wrote:
> > On Tue, 21 Sep 2021, Oleksandr wrote:
> > > On 21.09.21 02:21, Stefano Stabellini wrote:
> > > > On Sun, 19 Sep 2021, Oleksandr wrote:
> > > > > > On 18/09/2021 03:37, Stefano Stabellini wrote:
> > > > > > > On Fri, 17 Sep 2021, Stefano Stabellini wrote:
> > > > > > > > On Fri, 17 Sep 2021, Oleksandr wrote:
> > > > > > > > > > > +
> > > > > > > > > > > +    dt_dprintk("Find unallocated memory for extended
> > > > > > > > > > > regions\n");
> > > > > > > > > > > +
> > > > > > > > > > > +    unalloc_mem = rangeset_new(NULL, NULL, 0);
> > > > > > > > > > > +    if ( !unalloc_mem )
> > > > > > > > > > > +        return -ENOMEM;
> > > > > > > > > > > +
> > > > > > > > > > > +    /* Start with all available RAM */
> > > > > > > > > > > +    for ( i = 0; i < bootinfo.mem.nr_banks; i++ )
> > > > > > > > > > > +    {
> > > > > > > > > > > +        start = bootinfo.mem.bank[i].start;
> > > > > > > > > > > +        end = bootinfo.mem.bank[i].start +
> > > > > > > > > > > bootinfo.mem.bank[i].size - 1;
> > > > > > > > > > > +        res = rangeset_add_range(unalloc_mem, start,
> > > > > > > > > > > end);
> > > > > > > > > > > +        if ( res )
> > > > > > > > > > > +        {
> > > > > > > > > > > +            printk(XENLOG_ERR "Failed to add:
> > > > > > > > > > > %#"PRIx64"->%#"PRIx64"\n",
> > > > > > > > > > > +                   start, end);
> > > > > > > > > > > +            goto out;
> > > > > > > > > > > +        }
> > > > > > > > > > > +    }
> > > > > > > > > > > +
> > > > > > > > > > > +    /* Remove RAM assigned to Dom0 */
> > > > > > > > > > > +    for ( i = 0; i < assign_mem->nr_banks; i++ )
> > > > > > > > > > > +    {
> > > > > > > > > > > +        start = assign_mem->bank[i].start;
> > > > > > > > > > > +        end = assign_mem->bank[i].start +
> > > > > > > > > > > assign_mem->bank[i].size - 1;
> > > > > > > > > > > +        res = rangeset_remove_range(unalloc_mem, start,
> > > > > > > > > > > end);
> > > > > > > > > > > +        if ( res )
> > > > > > > > > > > +        {
> > > > > > > > > > > +            printk(XENLOG_ERR "Failed to remove:
> > > > > > > > > > > %#"PRIx64"->%#"PRIx64"\n",
> > > > > > > > > > > +                   start, end);
> > > > > > > > > > > +            goto out;
> > > > > > > > > > > +        }
> > > > > > > > > > > +    }
> > > > > > > > > > > +
> > > > > > > > > > > +    /* Remove reserved-memory regions */
> > > > > > > > > > > +    for ( i = 0; i < bootinfo.reserved_mem.nr_banks; i++
> > > > > > > > > > > )
> > > > > > > > > > > +    {
> > > > > > > > > > > +        start = bootinfo.reserved_mem.bank[i].start;
> > > > > > > > > > > +        end = bootinfo.reserved_mem.bank[i].start +
> > > > > > > > > > > +            bootinfo.reserved_mem.bank[i].size - 1;
> > > > > > > > > > > +        res = rangeset_remove_range(unalloc_mem, start,
> > > > > > > > > > > end);
> > > > > > > > > > > +        if ( res )
> > > > > > > > > > > +        {
> > > > > > > > > > > +            printk(XENLOG_ERR "Failed to remove:
> > > > > > > > > > > %#"PRIx64"->%#"PRIx64"\n",
> > > > > > > > > > > +                   start, end);
> > > > > > > > > > > +            goto out;
> > > > > > > > > > > +        }
> > > > > > > > > > > +    }
> > > > > > > > > > > +
> > > > > > > > > > > +    /* Remove grant table region */
> > > > > > > > > > > +    start = kinfo->gnttab_start;
> > > > > > > > > > > +    end = kinfo->gnttab_start + kinfo->gnttab_size - 1;
> > > > > > > > > > > +    res = rangeset_remove_range(unalloc_mem, start, end);
> > > > > > > > > > > +    if ( res )
> > > > > > > > > > > +    {
> > > > > > > > > > > +        printk(XENLOG_ERR "Failed to remove:
> > > > > > > > > > > %#"PRIx64"->%#"PRIx64"\n",
> > > > > > > > > > > +               start, end);
> > > > > > > > > > > +        goto out;
> > > > > > > > > > > +    }
> > > > > > > > > > > +
> > > > > > > > > > > +    start = EXT_REGION_START;
> > > > > > > > > > > +    end = min((1ULL << p2m_ipa_bits) - 1,
> > > > > > > > > > > EXT_REGION_END);
> > > > > > > > > > > +    res = rangeset_report_ranges(unalloc_mem, start, end,
> > > > > > > > > > > +                                 add_ext_regions,
> > > > > > > > > > > ext_regions);
> > > > > > > > > > > +    if ( res )
> > > > > > > > > > > +        ext_regions->nr_banks = 0;
> > > > > > > > > > > +    else if ( !ext_regions->nr_banks )
> > > > > > > > > > > +        res = -ENOENT;
> > > > > > > > > > > +
> > > > > > > > > > > +out:
> > > > > > > > > > > +    rangeset_destroy(unalloc_mem);
> > > > > > > > > > > +
> > > > > > > > > > > +    return res;
> > > > > > > > > > > +}
> > > > > > > > > > > +
> > > > > > > > > > > +static int __init find_memory_holes(const struct
> > > > > > > > > > > kernel_info
> > > > > > > > > > > *kinfo,
> > > > > > > > > > > +                                    struct meminfo
> > > > > > > > > > > *ext_regions)
> > > > > > > > > > > +{
> > > > > > > > > > > +    struct dt_device_node *np;
> > > > > > > > > > > +    struct rangeset *mem_holes;
> > > > > > > > > > > +    paddr_t start, end;
> > > > > > > > > > > +    unsigned int i;
> > > > > > > > > > > +    int res;
> > > > > > > > > > > +
> > > > > > > > > > > +    dt_dprintk("Find memory holes for extended
> > > > > > > > > > > regions\n");
> > > > > > > > > > > +
> > > > > > > > > > > +    mem_holes = rangeset_new(NULL, NULL, 0);
> > > > > > > > > > > +    if ( !mem_holes )
> > > > > > > > > > > +        return -ENOMEM;
> > > > > > > > > > > +
> > > > > > > > > > > +    /* Start with maximum possible addressable physical
> > > > > > > > > > > memory
> > > > > > > > > > > range */
> > > > > > > > > > > +    start = EXT_REGION_START;
> > > > > > > > > > > +    end = min((1ULL << p2m_ipa_bits) - 1,
> > > > > > > > > > > EXT_REGION_END);
> > > > > > > > > > > +    res = rangeset_add_range(mem_holes, start, end);
> > > > > > > > > > > +    if ( res )
> > > > > > > > > > > +    {
> > > > > > > > > > > +        printk(XENLOG_ERR "Failed to add:
> > > > > > > > > > > %#"PRIx64"->%#"PRIx64"\n",
> > > > > > > > > > > +               start, end);
> > > > > > > > > > > +        goto out;
> > > > > > > > > > > +    }
> > > > > > > > > > > +
> > > > > > > > > > > +    /* Remove all regions described by "reg" property
> > > > > > > > > > > (MMIO,
> > > > > > > > > > > RAM,
> > > > > > > > > > > etc) */
> > > > > > > > > > Well... The loop below is not going to handle all the
> > > > > > > > > > regions
> > > > > > > > > > described in
> > > > > > > > > > the property "reg". Instead, it will cover a subset of "reg"
> > > > > > > > > > where
> > > > > > > > > > the
> > > > > > > > > > memory is addressable.
> > > > > > > > > As I understand, we are only interested in subset of "reg"
> > > > > > > > > where
> > > > > > > > > the
> > > > > > > > > memory is
> > > > > > > > > addressable.
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > > You will also need to cover "ranges" that will describe the
> > > > > > > > > > BARs
> > > > > > > > > > for
> > > > > > > > > > the PCI
> > > > > > > > > > devices.
> > > > > > > > > Good point.
> > > > > > > > Yes, very good point!
> > > > > > > > 
> > > > > > > > 
> > > > > > > > > Could you please clarify how to recognize whether it is a PCI
> > > > > > > > > device as long as PCI support is not merged? Or just to find
> > > > > > > > > any
> > > > > > > > > device nodes
> > > > > > > > > with non-empty "ranges" property
> > > > > > > > > and retrieve addresses?
> > > > > > > > Normally any bus can have a ranges property with the aperture
> > > > > > > > and
> > > > > > > > possible address translations, including /amba (compatible =
> > > > > > > > "simple-bus"). However, in these cases dt_device_get_address
> > > > > > > > already
> > > > > > > > takes care of it, see
> > > > > > > > xen/common/device_tree.c:dt_device_get_address.
> > > > > > > > 
> > > > > > > > The PCI bus is special for 2 reasons:
> > > > > > > > - the ranges property has a different format
> > > > > > > > - the bus is hot-pluggable
> > > > > > > > 
> > > > > > > > So I think the only one that we need to treat specially is PCI.
> > > > > > > > 
> > > > > > > > As far as I am aware PCI is the only bus (or maybe just the only
> > > > > > > > bus
> > > > > > > > that we support?) where ranges means the aperture.
> > > > > > > Now that I think about this, there is another "hotpluggable"
> > > > > > > scenario
> > > > > > > we
> > > > > > > need to think about:
> > > > > > > 
> > > > > > > [1] https://marc.info/?l=xen-devel&m=163056546214978
> > > > > > > 
> > > > > > > Xilinx devices have FPGA regions with apertures currently not
> > > > > > > described
> > > > > > > in device tree, where things can programmed in PL at runtime
> > > > > > > making
> > > > > > > new
> > > > > > > devices appear with new MMIO regions out of thin air.
> > > > > > > 
> > > > > > > Now let me start by saying that yes, the entire programmable
> > > > > > > region
> > > > > > > aperture could probably be described in device tree, however, in
> > > > > > > reality it is not currently done in any of the device trees we use
> > > > > > > (including the upstream device trees in linux.git).
> > > > > > This is rather annoying, but not unheard. There are a couple of
> > > > > > platforms
> > > > > > where the MMIOs are not fully described in the DT.
> > > > > > 
> > > > > > In fact, we have a callback 'specific_mappings' which create
> > > > > > additional
> > > > > > mappings (e.g. on the omap5) for dom0.
> > > > > > 
> > > > > > > So, we have a problem :-(
> > > > > > > 
> > > > > > > 
> > > > > > > I can work toward getting the right info on device tree, but in
> > > > > > > reality
> > > > > > > that is going to take time and for now the device tree doesn't
> > > > > > > have
> > > > > > > the
> > > > > > > FPGA aperture in it. So if we accept this series as is, it is
> > > > > > > going to
> > > > > > > stop features like [1] from working. >
> > > > > > > If we cannot come up with any better plans, I think it would be
> > > > > > > better
> > > > > > > to drop find_memory_holes, only rely on find_unallocated_memory
> > > > > > > even
> > > > > > > when the IOMMU is on. One idea is that we could add on top of the
> > > > > > > regions found by find_unallocated_memory any MMIO regions marked
> > > > > > > as
> > > > > > > xen,passthrough: they are safe because they are not going to dom0
> > > > > > > anyway.
> > > > > > (Oleksandr, it looks like some rationale about the different
> > > > > > approach is
> > > > > > missing in the commit message. Can you add it?)
> > > > > Yes sure, but let me please clarify what is different approach in this
> > > > > context. Is it to *also* take into the account MMIO regions of the
> > > > > devices
> > > > > for
> > > > > passthrough for case when IOMMU is off (in addition to unallocated
> > > > > memory)? If
> > > > > yes, I wonder whether we will gain much with that according to that
> > > > > device's
> > > > > MMIO regions are usually not big enough and we stick to allocate
> > > > > extended
> > > > > regions with bigger size (> 64MB).
> > > > That's fair enough. There are a couple of counter examples where the
> > > > MMIO regions for the device to assign are quite large, for instance a
> > > > GPU, Xilinx AIEngine, or the PCIe Root Complex with the entire aperture,
> > > > but maybe they are not that common. I am not sure if it is worth
> > > > scanning the tree for xen,passthrough regions every time at boot for
> > > > this.
> > > ok, I will add a few sentences to commit message about this different
> > > approach
> > > for now. At least this could be implemented later on if there is a need.
> > One thing that worries me about this is that if we take an old Xen with
> > this series and run it on a new board, it might cause problems. At the
> > very least [1] wouldn't work.
> 
> I got it.
> 
> 
> > 
> > Can we have a Xen command line argument to disable extended regions as
> > an emergecy toggle?
> 
> I think, yes. If no preference for the argument name I will name it
> "no-ext-region".

It is better to introduce it as ext-regions=yes/no with yes as default.
So that in the future we could extending it to ext-regions=start,size if
we wanted to.

  reply	other threads:[~2021-09-22 20:50 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-10 18:18 [PATCH V2 0/3] Add handling of extended regions (safe ranges) on Arm (Was "xen/memory: Introduce a hypercall to provide unallocated space") Oleksandr Tyshchenko
2021-09-10 18:18 ` [PATCH V2 1/3] xen: Introduce "gpaddr_bits" field to XEN_SYSCTL_physinfo Oleksandr Tyshchenko
2021-09-16 14:49   ` Jan Beulich
2021-09-16 15:43     ` Oleksandr
2021-09-16 15:47       ` Jan Beulich
2021-09-16 16:05         ` Oleksandr
2021-09-10 18:18 ` [PATCH V2 2/3] xen/arm: Add handling of extended regions for Dom0 Oleksandr Tyshchenko
2021-09-14  0:55   ` Stefano Stabellini
2021-09-15 19:10     ` Oleksandr
2021-09-15 21:21       ` Stefano Stabellini
2021-09-16 20:57         ` Oleksandr
2021-09-16 21:30           ` Stefano Stabellini
2021-09-17  7:28             ` Oleksandr
2021-09-17 14:08       ` Oleksandr
2021-09-17 15:52       ` Julien Grall
2021-09-17 20:13         ` Oleksandr
2021-09-17 15:48   ` Julien Grall
2021-09-17 19:51     ` Oleksandr
2021-09-17 21:56       ` Stefano Stabellini
2021-09-17 22:37         ` Stefano Stabellini
2021-09-19 14:34           ` Julien Grall
2021-09-19 20:18             ` Oleksandr
2021-09-20 23:21               ` Stefano Stabellini
2021-09-21 18:14                 ` Oleksandr
2021-09-21 22:00                   ` Stefano Stabellini
2021-09-22 18:25                     ` Oleksandr
2021-09-22 20:50                       ` Stefano Stabellini [this message]
2021-09-23 10:10                         ` Oleksandr
2021-09-20 23:55             ` Stefano Stabellini
2021-09-21 19:43         ` Oleksandr
2021-09-22 18:18           ` Oleksandr
2021-09-22 21:05             ` Stefano Stabellini
2021-09-23 10:11               ` Oleksandr
2021-09-18 16:59       ` Oleksandr
2021-09-23 10:41         ` Oleksandr
2021-09-23 16:38           ` Stefano Stabellini
2021-09-23 17:44             ` Oleksandr
2021-09-19 14:00       ` Julien Grall
2021-09-19 17:59         ` Oleksandr
2021-09-10 18:18 ` [PATCH V2 3/3] libxl/arm: Add handling of extended regions for DomU Oleksandr Tyshchenko
2021-09-16 22:35   ` Stefano Stabellini
2021-09-20 20:07     ` Oleksandr
2021-09-21 17:35       ` Oleksandr

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.DEB.2.21.2109221349350.17979@sstabellini-ThinkPad-T480s \
    --to=sstabellini@kernel.org \
    --cc=Henry.Wang@arm.com \
    --cc=Volodymyr_Babchuk@epam.com \
    --cc=Wei.Chen@arm.com \
    --cc=bertrand.marquis@arm.com \
    --cc=fnuv@xilinx.com \
    --cc=julien@xen.org \
    --cc=oleksandr_tyshchenko@epam.com \
    --cc=olekstysh@gmail.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).