All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] Supporting multiple CPU AddressSpaces and memory transaction attributes
@ 2014-09-04 17:47 Peter Maydell
  2014-09-06  0:26 ` Peter Crosthwaite
  2014-09-07  1:47 ` Edgar E. Iglesias
  0 siblings, 2 replies; 11+ messages in thread
From: Peter Maydell @ 2014-09-04 17:47 UTC (permalink / raw)
  To: QEMU Developers
  Cc: Paolo Bonzini, Peter Crosthwaite, Richard Henderson,
	Edgar E. Iglesias, Greg Bellows

One of the parts of the ARM TrustZone/Security Extensions
which the patchsets we've seen so far haven't attempted to
tackle is the problem of Secure vs NonSecure memory accesses.
Architecturally, every memory transaction should have
an S/NS bit accompanying the physical address, effectively
making an extra address bit (you could in theory have
completely different physical memory maps for S and NS,
though usually they're similar). We can't fake this up
by having the device call back into the CPU to check its
current status, because the CPU can make NS accesses
when it is in the Secure world (controlled by page table bits).

We have some other cases where devices would really
like to have some kind of "memory transaction attributes"
information:
 * watchpoints on ARM need to know whether the access
   was userspace or system (ie which mmu_idx it was).
   [The LDRT/STRT instructions which let the kernel
   do userspace-privilege accesses mean that the current
   state of the CPU isn't sufficient to determine this.]
   They'll also want to know the S/NS info.
 * The GIC wants to provide a different set of registers
   to each CPU (currently we do this with a hacky use of
   current_cpu), as do some other devices.

Paolo also mentioned that x86 SMM has some situations
where devices need to be only visible in some cases.

(Another oddball usecase is the Cortex-M split I and D
bus for low memory, where instruction and data accesses
go out via different buses and might map to different things,
but for now I think I'm happy to ignore this as more a
theoretical question than a practical one...)

Here's one idea which deals with some of these...

We introduce the concept of memory transaction attributes,
which are a guest-CPU specific bunch of bits (say, a
uint32_t). We also allow the CPU to have more than one
AddressSpace, and have the guest CPU code provide
a function returning the AddressSpace to use for a
given set of transaction attributes. For ARM we'd
put all of (S/NS, mmu_idx, CPU number) into the attributes,
use an AddressSpace each for S and NS, and use the
S/NS bit in the attributes to select the AddressSpace
The default is that we have one AddressSpace and always
use that, ie the transaction attributes are ignored.
(Maybe the function should return an integer index
into a cpu->as[] array?)

tlb_set_page() takes an extra argument specifying the
transaction attributes. For RAM accesses we can just
immediately use this to get the AddressSpace to pass
to address_space_translate_for_iotlb(). For IO accesses
we need to stash the attributes in the iotlb[], which
means extending that from an array of hwaddrs to an
array of struct {hwaddr, attributes}, which is easy enough.
Then the io_read/write glue functions in softmmu_template.h
can fish the attributes out of the iotlb and use them to
pick the AddressSpace to pass to iotlb_to_region().
More importantly, we can arrange to pass them through
to the device read/write callbacks (either directly,
or indirectly by saving them in the CPU struct like we
do for mem_io_vaddr; since changing the prototypes on
every device read and write callback would be insane
we probably want to have fields in MemoryRegionOps for
read_with_attrs and write_with_attrs function pointers).

We would need APIs so bus masters other than CPUs can
specify the transaction attributes (and it's probably
a good idea for the guest CPU to arrange its attribute
bit definitions so that "0" means "no attribute info",
to account for legacy bus masters).

The watchpoint read/write callbacks would just stuff
the transaction attribute word for the access that
triggered the watchpoint into the CPUWatchpoint struct
so that target-specific code can look at it later.

This combination of things would let us handle both
"some devices and some RAM are only visible in
the Secure address space" (by having the board
construct the S AddressSpace with them and the NS
without them) and also "some devices are TrustZone
aware and behave differently" (by having the device
read/write callbacks look at the attributes word).

Thoughts?

-- PMM

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] Supporting multiple CPU AddressSpaces and memory transaction attributes
  2014-09-04 17:47 [Qemu-devel] Supporting multiple CPU AddressSpaces and memory transaction attributes Peter Maydell
@ 2014-09-06  0:26 ` Peter Crosthwaite
  2014-09-06 10:30   ` Peter Maydell
  2014-09-07  1:47 ` Edgar E. Iglesias
  1 sibling, 1 reply; 11+ messages in thread
From: Peter Crosthwaite @ 2014-09-06  0:26 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Paolo Bonzini, Greg Bellows, Edgar E. Iglesias, QEMU Developers,
	Richard Henderson

On Fri, Sep 5, 2014 at 3:47 AM, Peter Maydell <peter.maydell@linaro.org> wrote:
> One of the parts of the ARM TrustZone/Security Extensions
> which the patchsets we've seen so far haven't attempted to
> tackle is the problem of Secure vs NonSecure memory accesses.
> Architecturally, every memory transaction should have
> an S/NS bit accompanying the physical address, effectively
> making an extra address bit (you could in theory have
> completely different physical memory maps for S and NS,
> though usually they're similar). We can't fake this up
> by having the device call back into the CPU to check its
> current status, because the CPU can make NS accesses
> when it is in the Secure world (controlled by page table bits).
>
> We have some other cases where devices would really
> like to have some kind of "memory transaction attributes"
> information:
>  * watchpoints on ARM need to know whether the access
>    was userspace or system (ie which mmu_idx it was).
>    [The LDRT/STRT instructions which let the kernel
>    do userspace-privilege accesses mean that the current
>    state of the CPU isn't sufficient to determine this.]
>    They'll also want to know the S/NS info.
>  * The GIC wants to provide a different set of registers
>    to each CPU (currently we do this with a hacky use of
>    current_cpu), as do some other devices.
>
> Paolo also mentioned that x86 SMM has some situations
> where devices need to be only visible in some cases.
>
> (Another oddball usecase is the Cortex-M split I and D
> bus for low memory, where instruction and data accesses
> go out via different buses and might map to different things,
> but for now I think I'm happy to ignore this as more a
> theoretical question than a practical one...)
>

Similar problem that I ran into was trying to differentiate I from D
for unassigned memory accesses. I'm trying to correctly implement
prefetch and data aborts for unassigned but the memory API has no
information on whether an access is I or D. Attributes could solve
this.

> Here's one idea which deals with some of these...
>
> We introduce the concept of memory transaction attributes,
> which are a guest-CPU specific bunch of bits (say, a
> uint32_t). We also allow the CPU to have more than one
> AddressSpace,

Or any master for that matter. I have an example of this for a dma
device already:

http://lists.gnu.org/archive/html/qemu-devel/2014-06/msg00370.html

 and have the guest CPU code provide
> a function returning the AddressSpace to use for a
> given set of transaction attributes. For ARM we'd
> put all of (S/NS, mmu_idx, CPU number) into the attributes,

Many buses have master ID or transaction ID for master identification,
so I think the concept of CPU number should be softened to just
master-id.

> use an AddressSpace each for S and NS, and use the
> S/NS bit in the attributes to select the AddressSpace
> The default is that we have one AddressSpace and always
> use that, ie the transaction attributes are ignored.
> (Maybe the function should return an integer index
> into a cpu->as[] array?)
>
> tlb_set_page() takes an extra argument specifying the
> transaction attributes. For RAM accesses we can just
> immediately use this to get the AddressSpace to pass
> to address_space_translate_for_iotlb(). For IO accesses
> we need to stash the attributes in the iotlb[], which
> means extending that from an array of hwaddrs to an
> array of struct {hwaddr, attributes}, which is easy enough.
> Then the io_read/write glue functions in softmmu_template.h
> can fish the attributes out of the iotlb and use them to
> pick the AddressSpace to pass to iotlb_to_region().
> More importantly, we can arrange to pass them through
> to the device read/write callbacks (either directly,
> or indirectly by saving them in the CPU struct like we
> do for mem_io_vaddr; since changing the prototypes on
> every device read and write callback would be insane
> we probably want to have fields in MemoryRegionOps for
> read_with_attrs and write_with_attrs function pointers).
>
> We would need APIs so bus masters other than CPUs can
> specify the transaction attributes (and it's probably
> a good idea for the guest CPU to arrange its attribute
> bit definitions so that "0" means "no attribute info",
> to account for legacy bus masters).
>
> The watchpoint read/write callbacks would just stuff
> the transaction attribute word for the access that
> triggered the watchpoint into the CPUWatchpoint struct
> so that target-specific code can look at it later.
>
> This combination of things would let us handle both
> "some devices and some RAM are only visible in
> the Secure address space" (by having the board
> construct the S AddressSpace with them and the NS
> without them) and also "some devices are TrustZone
> aware and behave differently" (by having the device
> read/write callbacks look at the attributes word).
>
> Thoughts?
>

The transaction attributes concept you are proposing sounds orthogonal
to the multi-address space work I have been working on, so I guess I
should proceed with that series to solve some of the multi-address
space requirements independently of the transaction attrs extension.

Overall sounds ok, more than just addr and data needs to come on IO
bus transactions and there's lots of features that require it.

Regards,
Peter

> -- PMM
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] Supporting multiple CPU AddressSpaces and memory transaction attributes
  2014-09-06  0:26 ` Peter Crosthwaite
@ 2014-09-06 10:30   ` Peter Maydell
  0 siblings, 0 replies; 11+ messages in thread
From: Peter Maydell @ 2014-09-06 10:30 UTC (permalink / raw)
  To: Peter Crosthwaite
  Cc: Paolo Bonzini, Greg Bellows, Edgar E. Iglesias, QEMU Developers,
	Richard Henderson

On 6 September 2014 01:26, Peter Crosthwaite
<peter.crosthwaite@xilinx.com> wrote:
> On Fri, Sep 5, 2014 at 3:47 AM, Peter Maydell <peter.maydell@linaro.org> wrote:
>> (Another oddball usecase is the Cortex-M split I and D
>> bus for low memory, where instruction and data accesses
>> go out via different buses and might map to different things,
>> but for now I think I'm happy to ignore this as more a
>> theoretical question than a practical one...)

> Similar problem that I ran into was trying to differentiate I from D
> for unassigned memory accesses. I'm trying to correctly implement
> prefetch and data aborts for unassigned but the memory API has no
> information on whether an access is I or D. Attributes could solve
> this.

Nice; that's been something I've vaguely thought we should
do for a while now. (Watch out for all the places which do
ldl_phys() or equivalent and implicitly assume it won't
longjump out on them, though.)

>> Here's one idea which deals with some of these...
>>
>> We introduce the concept of memory transaction attributes,
>> which are a guest-CPU specific bunch of bits (say, a
>> uint32_t). We also allow the CPU to have more than one
>> AddressSpace,
>
> Or any master for that matter. I have an example of this for a dma
> device already:
>
> http://lists.gnu.org/archive/html/qemu-devel/2014-06/msg00370.html

Sure, but non-CPU masters are much simpler because they
don't have to deal with the TLB data structure. The DMA
APIs already take an AddressSpace, I think. You're right
that TZ-aware bus masters will want to have multiple
AddressSpaces though.

>  and have the guest CPU code provide
>> a function returning the AddressSpace to use for a
>> given set of transaction attributes. For ARM we'd
>> put all of (S/NS, mmu_idx, CPU number) into the attributes,
>
> Many buses have master ID or transaction ID for master identification,
> so I think the concept of CPU number should be softened to just
> master-id.

Yes, I was sort of using CPU number as a shorthand there.

>> Thoughts?

> The transaction attributes concept you are proposing sounds orthogonal
> to the multi-address space work I have been working on, so I guess I
> should proceed with that series to solve some of the multi-address
> space requirements independently of the transaction attrs extension.
>
> Overall sounds ok, more than just addr and data needs to come on IO
> bus transactions and there's lots of features that require it.

Sounds good.

-- PMM

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] Supporting multiple CPU AddressSpaces and memory transaction attributes
  2014-09-04 17:47 [Qemu-devel] Supporting multiple CPU AddressSpaces and memory transaction attributes Peter Maydell
  2014-09-06  0:26 ` Peter Crosthwaite
@ 2014-09-07  1:47 ` Edgar E. Iglesias
  2014-09-08 11:53   ` Peter Maydell
  1 sibling, 1 reply; 11+ messages in thread
From: Edgar E. Iglesias @ 2014-09-07  1:47 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Paolo Bonzini, Peter Crosthwaite, Richard Henderson,
	QEMU Developers, Greg Bellows

On Thu, Sep 04, 2014 at 06:47:58PM +0100, Peter Maydell wrote:
> One of the parts of the ARM TrustZone/Security Extensions
> which the patchsets we've seen so far haven't attempted to
> tackle is the problem of Secure vs NonSecure memory accesses.
> Architecturally, every memory transaction should have
> an S/NS bit accompanying the physical address, effectively
> making an extra address bit (you could in theory have
> completely different physical memory maps for S and NS,
> though usually they're similar). We can't fake this up
> by having the device call back into the CPU to check its
> current status, because the CPU can make NS accesses
> when it is in the Secure world (controlled by page table bits).
> 
> We have some other cases where devices would really
> like to have some kind of "memory transaction attributes"
> information:
>  * watchpoints on ARM need to know whether the access
>    was userspace or system (ie which mmu_idx it was).
>    [The LDRT/STRT instructions which let the kernel
>    do userspace-privilege accesses mean that the current
>    state of the CPU isn't sufficient to determine this.]
>    They'll also want to know the S/NS info.
>  * The GIC wants to provide a different set of registers
>    to each CPU (currently we do this with a hacky use of
>    current_cpu), as do some other devices.
> 
> Paolo also mentioned that x86 SMM has some situations
> where devices need to be only visible in some cases.
> 
> (Another oddball usecase is the Cortex-M split I and D
> bus for low memory, where instruction and data accesses
> go out via different buses and might map to different things,
> but for now I think I'm happy to ignore this as more a
> theoretical question than a practical one...)
> 
> Here's one idea which deals with some of these...

Hi Peter,

Thanks for writing this down.

I was looking at some of this stuff some weeks ago and exploring something
similar to what you suggest but not exactly the same. Never fully implemented
it though. I'll comment on your suggestion and explain what I did and what
I ended up using.


> We introduce the concept of memory transaction attributes,
> which are a guest-CPU specific bunch of bits (say, a
> uint32_t). We also allow the CPU to have more than one
> AddressSpace, and have the guest CPU code provide
> a function returning the AddressSpace to use for a
> given set of transaction attributes. For ARM we'd
> put all of (S/NS, mmu_idx, CPU number) into the attributes,
> use an AddressSpace each for S and NS, and use the
> S/NS bit in the attributes to select the AddressSpace
> The default is that we have one AddressSpace and always
> use that, ie the transaction attributes are ignored.
> (Maybe the function should return an integer index
> into a cpu->as[] array?)

The new read_with_attrs (or another name) sounds good.
It allows us to incrementally change the codebase. It would be nice if the
arguments to that function be passed in a structure so we can
make future changes easier.

Maybe it could even end up going this far:
void access(BusPayload *pay);

With *pay having fields for attributes, data, slave error signaling etc.

Im a little concerned about the guest CPU defining the attribute
format. Maybe we could start by making the attributes generic
(look the same for everyone). BUS specialization and convertion
as transactions make their way through the bus could be future work.

> 
> tlb_set_page() takes an extra argument specifying the
> transaction attributes. For RAM accesses we can just
> immediately use this to get the AddressSpace to pass
> to address_space_translate_for_iotlb(). For IO accesses
> we need to stash the attributes in the iotlb[], which
> means extending that from an array of hwaddrs to an
> array of struct {hwaddr, attributes}, which is easy enough.
> Then the io_read/write glue functions in softmmu_template.h
> can fish the attributes out of the iotlb and use them to
> pick the AddressSpace to pass to iotlb_to_region().
> More importantly, we can arrange to pass them through
> to the device read/write callbacks (either directly,
> or indirectly by saving them in the CPU struct like we
> do for mem_io_vaddr; since changing the prototypes on
> every device read and write callback would be insane
> we probably want to have fields in MemoryRegionOps for
> read_with_attrs and write_with_attrs function pointers).

I think this will mostly work but could become a bit hard
to deal with when IOMMUs come into the picture that may want
to modify the attributes and AS.

Maybe we could consider having a pointer to a bundle of
AS and attributes stored in the iotlb? example:

memory.h:
typedef struct BusAttrSomething
{
    AddressSpace *as;
    MemoryTransactionAttr *attr;
} BusAttrSomthing;

So that the stuff stored in the IOTLB is not specific
to the CPU in question but can be created by any
IOMMU along the bus path. See below for more info.

> 
> We would need APIs so bus masters other than CPUs can
> specify the transaction attributes (and it's probably
> a good idea for the guest CPU to arrange its attribute
> bit definitions so that "0" means "no attribute info",
> to account for legacy bus masters).
> 
> The watchpoint read/write callbacks would just stuff
> the transaction attribute word for the access that
> triggered the watchpoint into the CPUWatchpoint struct
> so that target-specific code can look at it later.

I'll explain what I played around with.

I added a concept of TLB attribute indexes, similar to the mmu_idx. I actually
first considered reusing the mmu_idx and even spliting up the mmu_idx into
fields containing the tlb_attr_idx but didn't in the end.

To memory.h:
/* TODO: generic for now but could maybe be specialized for different
 * buses in the future.  */
typedef struct MemoryTransactionAttr
{
    unsigned int secure : 1;
    ...
} MemoryTransactionAttr;


The CPU then for example defines:
typedef struct CPUBusAttr
{
    AddressSpace *as;
    MemoryTransactionAttr *attr;
} CPUBusAttr;

IOTLB:
unsigned int attr_idx[NB_MMU_MODES][CPU_TLB_SIZE];
CPUBusAttr iotlb_attr[NB_TLB_ATTR];

cpu.h:
#define NB_TLB_ATTR 2
#define TLB_ATTR_NS 0
#define TLB_ATTR_SEC 1

At CPU reset/init we do something like the following:
    env->iotlb_attr[TLB_ATTR_NS].as = cpu->as_ns;
    env->iotlb_attr[TLB_ATTR_SEC].as = cpu->as_secure;
    env->iotlb_attr[TLB_ATTR_SEC].attr = (MemoryTransactionAttr[]) {{ .secure = true }};

The MMU code the CPUs selects the initial TLB attr index based on the
current CPU state. Address translators are then allowed to downgrade
the index as they see fit. tlb_set_page takes the attr_index as input.
The base attr_index could also be passed from the translators but would
need further changes (or the attr_idx encoded in mmu_idx tricks).

In practice, this looks something like:

static int get_phys_addr_lpae_common(CPUARMState *env,
                                     ...
                                     unsigned int *attr_idx)
{
    ...
    if (stage == 1) {
        if (attrs & (1 << 3)) {
            /* NS downgrade */
            *attr_idx = TLB_ATTR_NS;
        }


int arm_cpu_handle_mmu_fault(CPUState *cs, vaddr address,
                             int access_type, int mmu_idx)
{
    ...
    bool secure = arm_is_secure(env);
    unsigned int attr_idx = secure ? TLB_ATTR_SEC : TLB_ATTR_NS;

    ret = get_phys_addr(env, address, access_type, cur_el, &phys_addr, &prot,
                        &page_size, 2, &attr_idx);
    ...
    tlb_set_page_attr(cs, address, phys_addr, prot, mmu_idx, page_size,
                          attr_idx);

Were things got messier is when IOMMUs come into the picture.
I've got some patches that add support for CPUs to sit behind IOMMUs.

Somewhere at this point I abandonned the .access with attributes approach
and am now using AddresSpaces only. I still have the CPUBusAttr in the iotlb
so that MMU translation can select and modify the AS (i.e downgrade S to NS)
but I don't use the attributes parts of it to propagate the info to devs.

At the device level, devs export multiple mr ports that internally map to
common access functions with S/NS argsw. IOMMUs and other masters do not know
about or set specific attributes, they operate only on AddressSpaces.
My reasons for going this path were maybe not so technical, it was more
pragmatism to get something working with little changes to current QEMU.

Having mem attributes support upstream would be nice :-)

Cheers,
Edgar

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] Supporting multiple CPU AddressSpaces and memory transaction attributes
  2014-09-07  1:47 ` Edgar E. Iglesias
@ 2014-09-08 11:53   ` Peter Maydell
  2014-09-09  1:09     ` Edgar E. Iglesias
  2015-05-12 14:47     ` Peter Maydell
  0 siblings, 2 replies; 11+ messages in thread
From: Peter Maydell @ 2014-09-08 11:53 UTC (permalink / raw)
  To: Edgar E. Iglesias
  Cc: Paolo Bonzini, Peter Crosthwaite, Richard Henderson,
	QEMU Developers, Greg Bellows

On 7 September 2014 02:47, Edgar E. Iglesias <edgar.iglesias@gmail.com> wrote:
> On Thu, Sep 04, 2014 at 06:47:58PM +0100, Peter Maydell wrote:
>> We introduce the concept of memory transaction attributes,
>> which are a guest-CPU specific bunch of bits (say, a
>> uint32_t). We also allow the CPU to have more than one
>> AddressSpace, and have the guest CPU code provide
>> a function returning the AddressSpace to use for a
>> given set of transaction attributes. For ARM we'd
>> put all of (S/NS, mmu_idx, CPU number) into the attributes,
>> use an AddressSpace each for S and NS, and use the
>> S/NS bit in the attributes to select the AddressSpace
>> The default is that we have one AddressSpace and always
>> use that, ie the transaction attributes are ignored.
>> (Maybe the function should return an integer index
>> into a cpu->as[] array?)
>
> The new read_with_attrs (or another name) sounds good.
> It allows us to incrementally change the codebase. It would be nice if the
> arguments to that function be passed in a structure so we can
> make future changes easier.
>
> Maybe it could even end up going this far:
> void access(BusPayload *pay);
>
> With *pay having fields for attributes, data, slave error signaling etc.

Yes, picking an API that gives us expansion room
for things like "let device signal an error response"
would be a good idea.

> Im a little concerned about the guest CPU defining the attribute
> format. Maybe we could start by making the attributes generic
> (look the same for everyone).

I don't think this works at all -- our very first
use case is ARM S/NS signalling, which is a very
ARM specific concept. We could make it be partly
generic and partly guest-defined, but really the
attributes seem to me to be pretty much bus
specific, and guest-CPU-specific is about the
best proxy we have for that. I'd rather just have
the core code pass these things around as an
opaque pile of bits.

> BUS specialization and convertion
> as transactions make their way through the bus
> could be future work.

Conversion of attributes by bus fabric seems a bit tricky
since QEMU's design rather assumes you can completely
squash the bus hierarchy down to a flat view ahead of
time, which means that intercepting transactions is painful.

>> tlb_set_page() takes an extra argument specifying the
>> transaction attributes. For RAM accesses we can just
>> immediately use this to get the AddressSpace to pass
>> to address_space_translate_for_iotlb(). For IO accesses
>> we need to stash the attributes in the iotlb[], which
>> means extending that from an array of hwaddrs to an
>> array of struct {hwaddr, attributes}, which is easy enough.
>> Then the io_read/write glue functions in softmmu_template.h
>> can fish the attributes out of the iotlb and use them to
>> pick the AddressSpace to pass to iotlb_to_region().
>> More importantly, we can arrange to pass them through
>> to the device read/write callbacks (either directly,
>> or indirectly by saving them in the CPU struct like we
>> do for mem_io_vaddr; since changing the prototypes on
>> every device read and write callback would be insane
>> we probably want to have fields in MemoryRegionOps for
>> read_with_attrs and write_with_attrs function pointers).
>
> I think this will mostly work but could become a bit hard
> to deal with when IOMMUs come into the picture that may want
> to modify the attributes and AS.
>
> Maybe we could consider having a pointer to a bundle of
> AS and attributes stored in the iotlb? example:
>
> memory.h:
> typedef struct BusAttrSomething
> {
>     AddressSpace *as;
>     MemoryTransactionAttr *attr;
> } BusAttrSomthing;
>
> So that the stuff stored in the IOTLB is not specific
> to the CPU in question but can be created by any
> IOMMU along the bus path. See below for more info.

Mmm, we probably want to allow for IOTLBs, so more
flexibility than a simple index into the CPU's list
of address spaces does seem warranted.

> Somewhere at this point I abandonned the .access with attributes approach
> and am now using AddresSpaces only. I still have the CPUBusAttr in the iotlb
> so that MMU translation can select and modify the AS (i.e downgrade S to NS)
> but I don't use the attributes parts of it to propagate the info to devs.
>
> At the device level, devs export multiple mr ports that internally map to
> common access functions with S/NS argsw. IOMMUs and other masters do not know
> about or set specific attributes, they operate only on AddressSpaces.
> My reasons for going this path were maybe not so technical, it was more
> pragmatism to get something working with little changes to current QEMU.

Yeah, you can implement the S/NS distinction purely by setting
up memory regions and having every device that cares have
multiple exported regions; I started out thinking about doing
that, but it makes the board code pretty nasty, and we need
to pass in attributes for user/system privilege level anyway.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] Supporting multiple CPU AddressSpaces and memory transaction attributes
  2014-09-08 11:53   ` Peter Maydell
@ 2014-09-09  1:09     ` Edgar E. Iglesias
  2015-05-12 14:47     ` Peter Maydell
  1 sibling, 0 replies; 11+ messages in thread
From: Edgar E. Iglesias @ 2014-09-09  1:09 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Paolo Bonzini, Peter Crosthwaite, Richard Henderson,
	QEMU Developers, Greg Bellows

On Mon, Sep 08, 2014 at 12:53:57PM +0100, Peter Maydell wrote:
> On 7 September 2014 02:47, Edgar E. Iglesias <edgar.iglesias@gmail.com> wrote:
> > On Thu, Sep 04, 2014 at 06:47:58PM +0100, Peter Maydell wrote:
> >> We introduce the concept of memory transaction attributes,
> >> which are a guest-CPU specific bunch of bits (say, a
> >> uint32_t). We also allow the CPU to have more than one
> >> AddressSpace, and have the guest CPU code provide
> >> a function returning the AddressSpace to use for a
> >> given set of transaction attributes. For ARM we'd
> >> put all of (S/NS, mmu_idx, CPU number) into the attributes,
> >> use an AddressSpace each for S and NS, and use the
> >> S/NS bit in the attributes to select the AddressSpace
> >> The default is that we have one AddressSpace and always
> >> use that, ie the transaction attributes are ignored.
> >> (Maybe the function should return an integer index
> >> into a cpu->as[] array?)
> >
> > The new read_with_attrs (or another name) sounds good.
> > It allows us to incrementally change the codebase. It would be nice if the
> > arguments to that function be passed in a structure so we can
> > make future changes easier.
> >
> > Maybe it could even end up going this far:
> > void access(BusPayload *pay);
> >
> > With *pay having fields for attributes, data, slave error signaling etc.
> 
> Yes, picking an API that gives us expansion room
> for things like "let device signal an error response"
> would be a good idea.
> 
> > Im a little concerned about the guest CPU defining the attribute
> > format. Maybe we could start by making the attributes generic
> > (look the same for everyone).
> 
> I don't think this works at all -- our very first
> use case is ARM S/NS signalling, which is a very
> ARM specific concept. We could make it be partly
> generic and partly guest-defined, but really the
> attributes seem to me to be pretty much bus
> specific, and guest-CPU-specific is about the
> best proxy we have for that. I'd rather just have
> the core code pass these things around as an
> opaque pile of bits.

I'll try to clarify what worries me.

If we let the guest decide the bit format and ordering we'll end up with
something like following:

struct {
    union {
        struct {
            unsigned int secure : 1;
            unsigned int featureA : 1;
            unsigned int featureB : 1;
        } arm;
        struct {
            unsigned int featureC : 1;
            unsigned int featureD : 1;
        } mips;
        struct {
            unsigned int featureX : 1;
            unsigned int featureY : 1;
        } cris;
    };
};

Once we start adding masters and slaves all writing and reading these bits
in different formats we will have trouble mixing and matching because the
interpretations will clash. If we add a type to the
attr object, we could disallow incompatible setups or maybe add
conversion routines along the path (this is what I meant with attribute
specialization and conversion below) but this becomes quite complex
to enforce.

What happens on real SoCs is that allthough various CPUs typically have their
native bus protocols, they can speak to all kinds of devices through
bus bridges. Typically bus specific features get translated or dropped.

If we instead describe the attributes like the following
(no union, i.e the attrs have globally allocated bits and look and mean
the same to everyone):

struct {
    struct {
        unsigned int master_id;
        unsigned int featureA;
    } generic;

    struct {
        unsigned int secure : 1;
        unsigned int featureB : 1;
    } axi;

    struct {
        unsigned int featureC : 1;
    } ocp;

    struct {
        unsigned int featureX : 1;
        unsigned int featureY : 1;
    } smif;
};

As soon as a feature bit needs to be in multiple bus specific sub structs,
it becomes generic. As there will never be dups, the sub structs may not
make much sense, i.e everything is really generic. It's the highlevel
feature we describe, not the fact that it's on a specific bus protocol.

Anyway, masters that set axi.secure can talk to non-TZ OCP slaves without
enabling ocp.featureC accidentally. Features will naturally get dropped
along the buspath if they don't match between master and slave. E.g,
a slave that does not know the notion of a secure bit will never try to
read it. IMO this is a friendlier default behaviour to work with.

For cases where we need specific attr translations, this can be done through
modeling bridges. The IOMMU framework with attributes support would work for that
as a bridge would be a subset of an IOMMU (does attribute translation but no
address translation). I think this would become the exception rather than the
rule.

Cheers,
Edgar

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] Supporting multiple CPU AddressSpaces and memory transaction attributes
  2014-09-08 11:53   ` Peter Maydell
  2014-09-09  1:09     ` Edgar E. Iglesias
@ 2015-05-12 14:47     ` Peter Maydell
  2015-05-13  6:41       ` Edgar E. Iglesias
  1 sibling, 1 reply; 11+ messages in thread
From: Peter Maydell @ 2015-05-12 14:47 UTC (permalink / raw)
  To: Edgar E. Iglesias
  Cc: Paolo Bonzini, Peter Crosthwaite, Richard Henderson,
	QEMU Developers, Greg Bellows

Resurrecting a six month old thread (and starting with
a big long quote for context):

On 8 September 2014 at 12:53, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 7 September 2014 02:47, Edgar E. Iglesias <edgar.iglesias@gmail.com> wrote:
>> On Thu, Sep 04, 2014 at 06:47:58PM +0100, Peter Maydell wrote:
>>> tlb_set_page() takes an extra argument specifying the
>>> transaction attributes. For RAM accesses we can just
>>> immediately use this to get the AddressSpace to pass
>>> to address_space_translate_for_iotlb(). For IO accesses
>>> we need to stash the attributes in the iotlb[], which
>>> means extending that from an array of hwaddrs to an
>>> array of struct {hwaddr, attributes}, which is easy enough.
>>> Then the io_read/write glue functions in softmmu_template.h
>>> can fish the attributes out of the iotlb and use them to
>>> pick the AddressSpace to pass to iotlb_to_region().
>>> More importantly, we can arrange to pass them through
>>> to the device read/write callbacks (either directly,
>>> or indirectly by saving them in the CPU struct like we
>>> do for mem_io_vaddr; since changing the prototypes on
>>> every device read and write callback would be insane
>>> we probably want to have fields in MemoryRegionOps for
>>> read_with_attrs and write_with_attrs function pointers).
>>
>> I think this will mostly work but could become a bit hard
>> to deal with when IOMMUs come into the picture that may want
>> to modify the attributes and AS.
>>
>> Maybe we could consider having a pointer to a bundle of
>> AS and attributes stored in the iotlb? example:
>>
>> memory.h:
>> typedef struct BusAttrSomething
>> {
>>     AddressSpace *as;
>>     MemoryTransactionAttr *attr;
>> } BusAttrSomthing;
>>
>> So that the stuff stored in the IOTLB is not specific
>> to the CPU in question but can be created by any
>> IOMMU along the bus path. See below for more info.
>
> Mmm, we probably want to allow for IOTLBs, so more
> flexibility than a simple index into the CPU's list
> of address spaces does seem warranted.

Now that the tx-attributes patches are in master I'm
looking at the "multiple AddressSpaces per CPU" part. In
the intervening time, this code has been somewhat complicated
by Paolo's RCU patches. In particular having actual
AddressSpace pointers in the iotlb doesn't look like it
will work given the way we now cache the memory_dispatch
pointer.

So we could deal with this by just falling back to
"CPUs have N AddressSpaces and when the target code
calls tlb_set_page_with_attrs it passes in the index
of the AddressSpace as well as the paddr" (and we then
can stash the index in the iotlb for later use, as
well as handing it to address_space_translate_for_iotlb).
Internally exec.c would also maintain an array of
AddressSpaceDispatch pointers corresponding to the
AddressSpaces (so effectively cs->as and cs->memory_dispatch
become arrays, though likely with some syntactic sugar
so we don't have to change all the uses of cs->as to
cs->as[0] for CPUs which only have 1 AS).

Or is there a better approach? Edgar, is your IOMMU
stuff sufficiently far advanced that you can see how
it ought to fit into the code at the moment?

thanks
-- PMM

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] Supporting multiple CPU AddressSpaces and memory transaction attributes
  2015-05-12 14:47     ` Peter Maydell
@ 2015-05-13  6:41       ` Edgar E. Iglesias
  2015-05-13  9:28         ` Paolo Bonzini
  0 siblings, 1 reply; 11+ messages in thread
From: Edgar E. Iglesias @ 2015-05-13  6:41 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Paolo Bonzini, Peter Crosthwaite, Richard Henderson,
	QEMU Developers, Greg Bellows

On Tue, May 12, 2015 at 03:47:00PM +0100, Peter Maydell wrote:
> Resurrecting a six month old thread (and starting with
> a big long quote for context):
> 
> On 8 September 2014 at 12:53, Peter Maydell <peter.maydell@linaro.org> wrote:
> > On 7 September 2014 02:47, Edgar E. Iglesias <edgar.iglesias@gmail.com> wrote:
> >> On Thu, Sep 04, 2014 at 06:47:58PM +0100, Peter Maydell wrote:
> >>> tlb_set_page() takes an extra argument specifying the
> >>> transaction attributes. For RAM accesses we can just
> >>> immediately use this to get the AddressSpace to pass
> >>> to address_space_translate_for_iotlb(). For IO accesses
> >>> we need to stash the attributes in the iotlb[], which
> >>> means extending that from an array of hwaddrs to an
> >>> array of struct {hwaddr, attributes}, which is easy enough.
> >>> Then the io_read/write glue functions in softmmu_template.h
> >>> can fish the attributes out of the iotlb and use them to
> >>> pick the AddressSpace to pass to iotlb_to_region().
> >>> More importantly, we can arrange to pass them through
> >>> to the device read/write callbacks (either directly,
> >>> or indirectly by saving them in the CPU struct like we
> >>> do for mem_io_vaddr; since changing the prototypes on
> >>> every device read and write callback would be insane
> >>> we probably want to have fields in MemoryRegionOps for
> >>> read_with_attrs and write_with_attrs function pointers).
> >>
> >> I think this will mostly work but could become a bit hard
> >> to deal with when IOMMUs come into the picture that may want
> >> to modify the attributes and AS.
> >>
> >> Maybe we could consider having a pointer to a bundle of
> >> AS and attributes stored in the iotlb? example:
> >>
> >> memory.h:
> >> typedef struct BusAttrSomething
> >> {
> >>     AddressSpace *as;
> >>     MemoryTransactionAttr *attr;
> >> } BusAttrSomthing;
> >>
> >> So that the stuff stored in the IOTLB is not specific
> >> to the CPU in question but can be created by any
> >> IOMMU along the bus path. See below for more info.
> >
> > Mmm, we probably want to allow for IOTLBs, so more
> > flexibility than a simple index into the CPU's list
> > of address spaces does seem warranted.
> 
> Now that the tx-attributes patches are in master I'm
> looking at the "multiple AddressSpaces per CPU" part. In
> the intervening time, this code has been somewhat complicated
> by Paolo's RCU patches. In particular having actual
> AddressSpace pointers in the iotlb doesn't look like it
> will work given the way we now cache the memory_dispatch
> pointer.
> 
> So we could deal with this by just falling back to
> "CPUs have N AddressSpaces and when the target code
> calls tlb_set_page_with_attrs it passes in the index
> of the AddressSpace as well as the paddr" (and we then
> can stash the index in the iotlb for later use, as
> well as handing it to address_space_translate_for_iotlb).
> Internally exec.c would also maintain an array of
> AddressSpaceDispatch pointers corresponding to the
> AddressSpaces (so effectively cs->as and cs->memory_dispatch
> become arrays, though likely with some syntactic sugar
> so we don't have to change all the uses of cs->as to
> cs->as[0] for CPUs which only have 1 AS).

Your suggestion sounds good to me.

I think it would be nice if address_space_translate_for_iotlb
was allowed to modify the attributes so that an IOMMU in front
of a CPU could for example down-grade a secure to a non-secure accesse
(once we add IOMMU support in front of CPUs). If I understood correctly
the memattrs would stay as a separate field in the iotlb, so this
would be easy to implement?

The naive implementation I have keeps pointers to AS and the memattrs
in the iotlb. address_space_translate_for_iotlb walks any IOMMU
translate() fns, if it hits a RAM it returns the host addr as usual.
If it hits MMIO behind IOMMUs it returns the first memsection, i.e
the one pointing to the iommu-ops and lets the IO access helpers
deal with the access via address_space_rw (for every IO access, slow).

Cheers,
Edgar

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] Supporting multiple CPU AddressSpaces and memory transaction attributes
  2015-05-13  6:41       ` Edgar E. Iglesias
@ 2015-05-13  9:28         ` Paolo Bonzini
  2015-05-13 11:21           ` Edgar E. Iglesias
  0 siblings, 1 reply; 11+ messages in thread
From: Paolo Bonzini @ 2015-05-13  9:28 UTC (permalink / raw)
  To: Edgar E. Iglesias, Peter Maydell
  Cc: Peter Crosthwaite, Richard Henderson, QEMU Developers, Greg Bellows



On 13/05/2015 08:41, Edgar E. Iglesias wrote:
> I think it would be nice if address_space_translate_for_iotlb
> was allowed to modify the attributes so that an IOMMU in front
> of a CPU could for example down-grade a secure to a non-secure accesse
> (once we add IOMMU support in front of CPUs). If I understood correctly
> the memattrs would stay as a separate field in the iotlb, so this
> would be easy to implement?

Yes, it should.

> The naive implementation I have keeps pointers to AS and the memattrs
> in the iotlb.  address_space_translate_for_iotlb walks any IOMMU
> translate() fns, if it hits a RAM it returns the host addr as usual.

How is the TLB invalidated on IOMMU changes?

Paolo

> If it hits MMIO behind IOMMUs it returns the first memsection, i.e
> the one pointing to the iommu-ops and lets the IO access helpers
> deal with the access via address_space_rw (for every IO access, slow).

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] Supporting multiple CPU AddressSpaces and memory transaction attributes
  2015-05-13  9:28         ` Paolo Bonzini
@ 2015-05-13 11:21           ` Edgar E. Iglesias
  2015-05-13 11:36             ` Paolo Bonzini
  0 siblings, 1 reply; 11+ messages in thread
From: Edgar E. Iglesias @ 2015-05-13 11:21 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Peter Maydell, Peter Crosthwaite, Richard Henderson,
	QEMU Developers, Greg Bellows

On Wed, May 13, 2015 at 11:28:38AM +0200, Paolo Bonzini wrote:
> 
> 
> On 13/05/2015 08:41, Edgar E. Iglesias wrote:
> > I think it would be nice if address_space_translate_for_iotlb
> > was allowed to modify the attributes so that an IOMMU in front
> > of a CPU could for example down-grade a secure to a non-secure accesse
> > (once we add IOMMU support in front of CPUs). If I understood correctly
> > the memattrs would stay as a separate field in the iotlb, so this
> > would be easy to implement?
> 
> Yes, it should.
> 
> > The naive implementation I have keeps pointers to AS and the memattrs
> > in the iotlb.  address_space_translate_for_iotlb walks any IOMMU
> > translate() fns, if it hits a RAM it returns the host addr as usual.
> 
> How is the TLB invalidated on IOMMU changes?

Hi Paolo,

I have a hack in the iommu model that triggers an AS change and
makes the CPU flush it's TLB:

        memory_region_notify_iommu(&s->masters[i].iommu, entry);
        /* Temporary hack.  */
        memory_region_transaction_begin();
        memory_region_set_readonly(&s->masters[i].iommu, false);
        memory_region_set_readonly(&s->masters[i].iommu, true);
        memory_region_transaction_commit();

It was not clear to me if CPUs should hook into the iommu notification
system or if we should make the iommu notification code signal changes
through AS change notifications.

The latter would be easy to get right I guess but we wouldn't be
able to have any granularity in the flushing so performance could
be better if the CPU somehow knows what parts have changed.

Cheers,
Edgar


> 
> Paolo
> 
> > If it hits MMIO behind IOMMUs it returns the first memsection, i.e
> > the one pointing to the iommu-ops and lets the IO access helpers
> > deal with the access via address_space_rw (for every IO access, slow).
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [Qemu-devel] Supporting multiple CPU AddressSpaces and memory transaction attributes
  2015-05-13 11:21           ` Edgar E. Iglesias
@ 2015-05-13 11:36             ` Paolo Bonzini
  0 siblings, 0 replies; 11+ messages in thread
From: Paolo Bonzini @ 2015-05-13 11:36 UTC (permalink / raw)
  To: Edgar E. Iglesias
  Cc: Peter Maydell, Peter Crosthwaite, Richard Henderson,
	QEMU Developers, Greg Bellows



On 13/05/2015 13:21, Edgar E. Iglesias wrote:
> 
> It was not clear to me if CPUs should hook into the iommu notification
> system or if we should make the iommu notification code signal changes
> through AS change notifications.
> 
> The latter would be easy to get right I guess but we wouldn't be
> able to have any granularity in the flushing so performance could
> be better if the CPU somehow knows what parts have changed.

I think it's CPUs that should hook and flush their TLBs.

Paolo

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2015-05-13 11:36 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-04 17:47 [Qemu-devel] Supporting multiple CPU AddressSpaces and memory transaction attributes Peter Maydell
2014-09-06  0:26 ` Peter Crosthwaite
2014-09-06 10:30   ` Peter Maydell
2014-09-07  1:47 ` Edgar E. Iglesias
2014-09-08 11:53   ` Peter Maydell
2014-09-09  1:09     ` Edgar E. Iglesias
2015-05-12 14:47     ` Peter Maydell
2015-05-13  6:41       ` Edgar E. Iglesias
2015-05-13  9:28         ` Paolo Bonzini
2015-05-13 11:21           ` Edgar E. Iglesias
2015-05-13 11:36             ` Paolo Bonzini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.