On Mon, Sep 1, 2014 at 11:07 PM, Julien Grall wrote:
> Hello Tamas,
>
> On 01/09/14 10:22, Tamas K Lengyel wrote:
>> This patch enables to store, set, check and deliver LPAE R/W mem_events.
>
> I would expand the commit message a bit more to explain the logic of
> mem_event on ARM.
>
> I know you already explain it in patch #8 ("p2m type definitions and
> changes") but I think it's more relevant to explain it where the "real"
> code is.

Ack.

>> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
>> index a6dea5b..e8f5671 100644
>> --- a/xen/arch/arm/p2m.c
>> +++ b/xen/arch/arm/p2m.c
>> @@ -10,6 +10,7 @@
>>  #include
>>  #include
>>  #include
>> +#include
>>  #include
>>  #include
>>  #include
>> @@ -148,6 +149,74 @@ static lpae_t *p2m_map_first(struct p2m_domain *p2m, paddr_t addr)
>>      return __map_domain_page(page);
>>  }
>>
>> +static void p2m_set_permission(lpae_t *e, p2m_type_t t, p2m_access_t a)
>> +{
>> +    /* First apply type permissions */
>> +    switch (t)
>
> switch ( t )

Ack.

> [..]
>
>> +    /* Then restrict with access permissions */
>> +    switch(a)
>
> switch ( a )
>
> [..]

Ack.

>>  /*
>>   * Lookup the MFN corresponding to a domain's PFN.
>>   *
>> @@ -228,7 +297,7 @@ int p2m_pod_decrease_reservation(struct domain *d,
>>  }
>>
>>  static lpae_t mfn_to_p2m_entry(unsigned long mfn, unsigned int mattr,
>> -                               p2m_type_t t)
>> +                               p2m_type_t t, p2m_access_t a)
>
> It looks strange to modify nearly all prototypes except this one in #6.

Ack.

> [..]
>
>> @@ -346,7 +385,7 @@ static int p2m_create_table(struct domain *d, lpae_t *entry,
>>          for ( i=0 ; i < LPAE_ENTRIES; i++ )
>>          {
>>              pte = mfn_to_p2m_entry(base_pfn + (i<<(level_shift-LPAE_SHIFT)),
>> -                                   MATTR_MEM, t);
>> +                                   MATTR_MEM, t, p2m->default_access);
>
> I'm not familiar with xen memaccess. Do we need to track access to
> intermediate page tables (i.e. level 1 & 2)?

Yes, we need to track whether a violation happened as a result of
mem-access restricting the pte permissions or for some other reason. From
that perspective there is no difference whether the violation happened on
a large or a 4k page. From a tracing perspective it makes more sense to
have more granularity (4k pages only), so I do shatter the large pages
when setting mem-access permissions.
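To illustrate what I mean, a rough, self-contained sketch (placeholder
types only, not the real lpae_t layout or the actual p2m code): the
superpage covering the target pfn is split into 512 identical 4k entries
that inherit its attributes, and only the entry for the target pfn gets
the restricted access, so later violations can be attributed at 4k
granularity.

/* Illustrative only: fake types, not Xen code. */
#include <stdio.h>
#include <stdbool.h>

#define ENTRIES_PER_TABLE 512

typedef enum { ACCESS_RWX, ACCESS_RX, ACCESS_R, ACCESS_N } access_t;

typedef struct {
    bool valid;
    bool superpage;          /* covers 512 * 4k pages when set */
    unsigned long base_pfn;  /* first pfn mapped by this entry */
    access_t access;
} fake_pte_t;

/* Split a 2M entry into a table of 4k entries inheriting its attributes. */
static void shatter(const fake_pte_t *l2e, fake_pte_t l3[ENTRIES_PER_TABLE])
{
    for ( unsigned int i = 0; i < ENTRIES_PER_TABLE; i++ )
    {
        l3[i].valid = true;
        l3[i].superpage = false;
        l3[i].base_pfn = l2e->base_pfn + i;
        l3[i].access = l2e->access;   /* inherit the old permissions */
    }
}

int main(void)
{
    fake_pte_t l2e = { true, true, 0x80000, ACCESS_RWX };
    fake_pte_t l3[ENTRIES_PER_TABLE];
    unsigned long target_pfn = 0x80042;

    shatter(&l2e, l3);                                 /* 2M -> 512 x 4k */
    l3[target_pfn - l2e.base_pfn].access = ACCESS_R;   /* restrict one page */

    printf("pfn %#lx -> access %d, neighbour %#lx -> access %d\n",
           target_pfn, (int)l3[target_pfn - l2e.base_pfn].access,
           target_pfn + 1, (int)l3[target_pfn - l2e.base_pfn + 1].access);
    return 0;
}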
>>
>>          /*
>>           * First and second level super pages set p2m.table = 0, but
>> @@ -366,7 +405,7 @@ static int p2m_create_table(struct domain *d, lpae_t *entry,
>>
>>      unmap_domain_page(p);
>>
>> -    pte = mfn_to_p2m_entry(page_to_mfn(page), MATTR_MEM, p2m_invalid);
>> +    pte = mfn_to_p2m_entry(page_to_mfn(page), MATTR_MEM, p2m_invalid, p2m->default_access);
>
> Same question here.
>
> [..]
>
>> +int p2m_mem_access_check(paddr_t gpa, vaddr_t gla, struct npfec npfec)
>
> This function only returns 0/1, I would use bool_t here.
>
> To reply to your answer on the last version, this function is not used in
> common code and x86 also uses bool_t as the return type.

You are right, I confused it with p2m_set_mem_access.

>> +{
>> +    struct vcpu *v = current;
>> +    struct p2m_domain *p2m = p2m_get_hostp2m(v->domain);
>> +    mem_event_request_t *req = NULL;
>> +    xenmem_access_t xma;
>> +    bool_t violation;
>> +    int rc;
>> +
>> +    rc = p2m_get_mem_access(v->domain, paddr_to_pfn(gpa), &xma);
>> +    if ( rc )
>> +    {
>> +        /* No setting was found, reinject */
>> +        return 1;
>> +    }
>> +    else
>> +    {
>> +        /* First, handle rx2rw and n2rwx conversion automatically. */
>> +        if ( npfec.write_access && xma == XENMEM_access_rx2rw )
>> +        {
>> +            rc = p2m_set_mem_access(v->domain, paddr_to_pfn(gpa), 1,
>> +                                    0, ~0, XENMEM_access_rw);
>> +            ASSERT(rc == 0);
>
> It's not a good idea to ASSERT here. The call to radix_tree_insert in
> p2m_set_mem_access may fail because there is a memory allocation via
> xmalloc inside.
>
> I suspect x86 adds the ASSERT because the mapping always exists and there
> is no memory allocation in set_entry. I will let the x86 folks confirm or
> deny my assumption.

We can certainly adjust it here if required, this is mainly just
copy-paste from x86.

>> +            return 0;
>> +        }
>> +        else if ( xma == XENMEM_access_n2rwx )
>> +        {
>> +            rc = p2m_set_mem_access(v->domain, paddr_to_pfn(gpa), 1,
>> +                                    0, ~0, XENMEM_access_rwx);
>> +            ASSERT(rc == 0);
>
> Same remark here.

Ack.

>> +        }
>> +    }
>> +
>> +    /* Otherwise, check if there is a memory event listener, and send the message along */
>> +    if ( !mem_event_check_ring( &v->domain->mem_event->access ) )
>> +    {
>> +        /* No listener */
>> +        if ( p2m->access_required )
>> +        {
>> +            gdprintk(XENLOG_INFO, "Memory access permissions failure, "
>> +                                  "no mem_event listener VCPU %d, dom %d\n",
>> +                     v->vcpu_id, v->domain->domain_id);
>> +            domain_crash(v->domain);
>> +        }
>> +        else
>> +        {
>> +            /* n2rwx was already handled */
>> +            if ( xma != XENMEM_access_n2rwx)
>> +            {
>> +                /* A listener is not required, so clear the access
>> +                 * restrictions. */
>> +                rc = p2m_set_mem_access(v->domain, paddr_to_pfn(gpa), 1,
>> +                                        0, ~0, XENMEM_access_rwx);
>> +                ASSERT(rc == 0);
>
> Same here.

Ack.

>> +void p2m_mem_access_resume(struct domain *d)
>> +{
>> +    mem_event_response_t rsp;
>> +
>> +    /* Pull all responses off the ring */
>> +    while( mem_event_get_response(d, &d->mem_event->access, &rsp) )
>> +    {
>> +        struct vcpu *v;
>> +
>> +        if ( rsp.flags & MEM_EVENT_FLAG_DUMMY )
>> +            continue;
>> +
>> +        /* Validate the vcpu_id in the response. */
>> +        if ( (rsp.vcpu_id >= d->max_vcpus) || !d->vcpu[rsp.vcpu_id] )
>> +            continue;
>> +
>> +        v = d->vcpu[rsp.vcpu_id];
>> +
>> +        /* Unpause domain */
>> +        if ( rsp.flags & MEM_EVENT_FLAG_VCPU_PAUSED )
>> +            mem_event_vcpu_unpause(v);
>> +    }
>> +}
>
> This function looks very similar to, if not a copy of, the x86 one. Can't
> we share the code?

That would make sense, abstract it out into common mem_access code.
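Roughly what I have in mind for the common helper (untested sketch derived
from the function above; the name and the extra ring parameter are
placeholders, so the arch-specific p2m_mem_access_resume() would just pass
its access ring):

/* Untested sketch of a possible common helper (names are placeholders). */
void mem_event_resume(struct domain *d, struct mem_event_domain *med)
{
    mem_event_response_t rsp;

    /* Pull all responses off the given ring */
    while ( mem_event_get_response(d, med, &rsp) )
    {
        struct vcpu *v;

        if ( rsp.flags & MEM_EVENT_FLAG_DUMMY )
            continue;

        /* Validate the vcpu_id in the response. */
        if ( (rsp.vcpu_id >= d->max_vcpus) || !d->vcpu[rsp.vcpu_id] )
            continue;

        v = d->vcpu[rsp.vcpu_id];

        /* Unpause the vcpu if the listener paused it for this event */
        if ( rsp.flags & MEM_EVENT_FLAG_VCPU_PAUSED )
            mem_event_vcpu_unpause(v);
    }
}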
>> +/* Set access type for a region of pfns.
>> + * If start_pfn == -1ul, sets the default access type */
>> +long p2m_set_mem_access(struct domain *d, unsigned long pfn, uint32_t nr,
>> +                        uint32_t start, uint32_t mask, xenmem_access_t access)
>> +{
>> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
>> +    p2m_access_t a;
>> +    long rc = 0;
>> +
>> +    static const p2m_access_t memaccess[] = {
>> +#define ACCESS(ac) [XENMEM_access_##ac] = p2m_access_##ac
>> +        ACCESS(n),
>> +        ACCESS(r),
>> +        ACCESS(w),
>> +        ACCESS(rw),
>> +        ACCESS(x),
>> +        ACCESS(rx),
>> +        ACCESS(wx),
>> +        ACCESS(rwx),
>> +#undef ACCESS
>> +    };
>> +
>> +    switch ( access )
>> +    {
>> +    case 0 ... ARRAY_SIZE(memaccess) - 1:
>> +        a = memaccess[access];
>> +        break;
>> +    case XENMEM_access_default:
>> +        a = p2m->default_access;
>> +        break;
>> +    default:
>> +        return -EINVAL;
>> +    }
>> +
>> +    /* If request to set default access */
>> +    if ( pfn == ~0ul )
>> +    {
>> +        p2m->default_access = a;
>> +        return 0;
>> +    }
>> +
>> +    spin_lock(&p2m->lock);
>> +    for ( pfn += start; nr > start; ++pfn )
>
> Why don't you reuse apply_p2m_changes? Everything needed to get/update a
> pte is there and it contains a few optimizations.
>
> Also this would avoid duplicating the shatter code in p2m_set_entry.

Honestly, I tried to wrap my head around apply_p2m_changes and it is
already quite complex. While I see I could apply the type/access
permissions with it over a range, I'm not entirely sure how I would make
it force-shatter all large pages. It was just easier (for now) to do it
manually.

>> +    {
>> +
>> +        bool_t pte_update = p2m_set_entry(d, pfn_to_paddr(pfn), a);
>> +
>> +        if ( !pte_update )
>> +            break;
>
> Shouldn't you continue here? The other pages in the batch may require
> updates.

This is the same approach as in x86.

> [..]
>
>> +int p2m_get_mem_access(struct domain *d, unsigned long gpfn,
>> +                       xenmem_access_t *access)
>> +{
>
> I think this function is not complete. You only set the variable access
> when the page frame number has been found in the radix tree. But the page
> may use the default access, which could result in a trap in Xen.
>
> On x86, we agreed that the default access value is stored in the entry.
> So if the default access value changes, Xen will retrieve the value
> previously stored in the pte.
>
> With your current solution, reproducing this behavior on ARM will be
> difficult, unless you add every page to the radix tree.
>
> But I don't have a better idea for now.

Right, I missed page additions with default_access. With so few software
programmable bits available in the PTE, we have no other choice but to
store the permission settings separately. I just need to make sure that
the radix tree is updated when a pte is added/removed.
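Roughly what I mean by keeping the tree in sync (untested sketch; the
helper name and the exact call sites in the pte add/remove paths are
hypothetical): whenever a pte is written, the setting for that pfn is
inserted, replaced, or dropped in the same place.

/* Untested sketch (helper name and placement are hypothetical). */
static int p2m_mem_access_radix_set(struct p2m_domain *p2m,
                                    unsigned long gpfn, p2m_access_t a)
{
    int rc;

    /* No restriction: drop any stale setting for this pfn. */
    if ( a == p2m_access_rwx )
    {
        radix_tree_delete(&p2m->mem_access_settings, gpfn);
        return 0;
    }

    rc = radix_tree_insert(&p2m->mem_access_settings, gpfn,
                           radix_tree_int_to_ptr(a));
    if ( rc == -EEXIST )
    {
        /* A setting already exists for this pfn: just replace it. */
        radix_tree_replace_slot(
            radix_tree_lookup_slot(&p2m->mem_access_settings, gpfn),
            radix_tree_int_to_ptr(a));
        rc = 0;
    }

    return rc;
}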
>> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
>> +    void *i;
>> +    int index;
>> +
>> +    static const xenmem_access_t memaccess[] = {
>> +#define ACCESS(ac) [XENMEM_access_##ac] = XENMEM_access_##ac
>> +        ACCESS(n),
>> +        ACCESS(r),
>> +        ACCESS(w),
>> +        ACCESS(rw),
>> +        ACCESS(x),
>> +        ACCESS(rx),
>> +        ACCESS(wx),
>> +        ACCESS(rwx),
>> +#undef ACCESS
>> +    };
>> +
>> +    /* If request to get default access */
>> +    if ( gpfn == ~0ull )
>> +    {
>> +        *access = memaccess[p2m->default_access];
>> +        return 0;
>> +    }
>> +
>> +    spin_lock(&p2m->lock);
>> +
>> +    i = radix_tree_lookup(&p2m->mem_access_settings, gpfn);
>> +
>> +    spin_unlock(&p2m->lock);
>> +
>> +    if (!i)
>
> if ( !i )

Ack.

>> +        return -ESRCH;
>> +
>> +    index = radix_tree_ptr_to_int(i);
>> +
>> +    if ( (unsigned) index >= ARRAY_SIZE(memaccess) )
>> +        return -ERANGE;
>
> You are casting index to unsigned at every use within this function. Why
> not directly define index as an "unsigned int"?

The radix tree returns an int but I guess I could do that.

>> +
>> +    *access = memaccess[ (unsigned) index];
>
> memaccess[(unsigned)index];

>> diff --git a/xen/include/asm-arm/processor.h b/xen/include/asm-arm/processor.h
>> index 0cc5b6d..b844f1d 100644
>> --- a/xen/include/asm-arm/processor.h
>> +++ b/xen/include/asm-arm/processor.h
>> @@ -262,6 +262,36 @@ enum dabt_size {
>>      DABT_DOUBLE_WORD = 3,
>>  };
>>
>> +/* Data abort data fetch status codes */
>> +enum dabt_dfsc {
>> +    DABT_DFSC_ADDR_SIZE_0       = 0b000000,
>> +    DABT_DFSC_ADDR_SIZE_1       = 0b000001,
>> +    DABT_DFSC_ADDR_SIZE_2       = 0b000010,
>> +    DABT_DFSC_ADDR_SIZE_3       = 0b000011,
>> +    DABT_DFSC_TRANSLATION_0     = 0b000100,
>> +    DABT_DFSC_TRANSLATION_1     = 0b000101,
>> +    DABT_DFSC_TRANSLATION_2     = 0b000110,
>> +    DABT_DFSC_TRANSLATION_3     = 0b000111,
>> +    DABT_DFSC_ACCESS_1          = 0b001001,
>> +    DABT_DFSC_ACCESS_2          = 0b001010,
>> +    DABT_DFSC_ACCESS_3          = 0b001011,
>> +    DABT_DFSC_PERMISSION_1      = 0b001101,
>> +    DABT_DFSC_PERMISSION_2      = 0b001110,
>> +    DABT_DFSC_PERMISSION_3      = 0b001111,
>> +    DABT_DFSC_SYNC_EXT          = 0b010000,
>> +    DABT_DFSC_SYNC_PARITY       = 0b011000,
>> +    DABT_DFSC_SYNC_EXT_TTW_0    = 0b010100,
>> +    DABT_DFSC_SYNC_EXT_TTW_1    = 0b010101,
>> +    DABT_DFSC_SYNC_EXT_TTW_2    = 0b010110,
>> +    DABT_DFSC_SYNC_EXT_TTW_3    = 0b010111,
>> +    DABT_DFSC_SYNC_PARITY_TTW_0 = 0b011100,
>> +    DABT_DFSC_SYNC_PARITY_TTW_1 = 0b011101,
>> +    DABT_DFSC_SYNC_PARITY_TTW_2 = 0b011110,
>> +    DABT_DFSC_SYNC_PARITY_TTW_3 = 0b011111,
>> +    DABT_DFSC_ALIGNMENT         = 0b100001,
>> +    DABT_DFSC_TLB_CONFLICT      = 0b110000,
>> +};
>
> I'm not sure it's necessary to define every possible value.

I figured it would make sense for the sake of completeness. I really only
use the PERMISSION values, so the rest could be removed.

Thanks,
Tamas

> Regards,
>
> --
> Julien Grall
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel