* [RFC PATCH] p2m-pt: avoid getting the pte flags repeatedly.
@ 2016-11-17  8:24 Liang Li
  2016-11-17 11:29 ` Jan Beulich
  0 siblings, 1 reply; 3+ messages in thread
From: Liang Li @ 2016-11-17  8:24 UTC
  To: xen-devel; +Cc: george.dunlap, andrew.cooper3, Liang Li, jbeulich

A lot of code fetches the pte flags repeatedly. Why not save the
result and reuse it in the code that follows? Doing so can save some
CPU cycles and make the code cleaner, no?

I am not sure whether this is the right direction, so this patch
changes only one place.

Signed-off-by: Liang Li <liang.z.li@intel.com>
---
 xen/arch/x86/mm/p2m-pt.c | 35 +++++++++++++++++++++--------------
 1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/xen/arch/x86/mm/p2m-pt.c b/xen/arch/x86/mm/p2m-pt.c
index 3b025d5..bbaa54f 100644
--- a/xen/arch/x86/mm/p2m-pt.c
+++ b/xen/arch/x86/mm/p2m-pt.c
@@ -64,7 +64,7 @@
 #define clear_recalc(level, ent) level##e_add_flags(ent, RECALC_FLAGS)
 #define _needs_recalc(flags) (!((flags) & _PAGE_USER))
 #define needs_recalc(level, ent) _needs_recalc(level##e_get_flags(ent))
-#define valid_recalc(level, ent) (!(level##e_get_flags(ent) & _PAGE_ACCESSED))
+#define valid_recalc(flags) (!((flags) & _PAGE_ACCESSED))
 
 static const unsigned long pgt[] = {
     PGT_l1_page_table,
@@ -367,7 +367,7 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
 {
     void *table;
     unsigned long gfn_remainder = gfn;
-    unsigned int level = 4;
+    unsigned int level = 4, pte_flags;
     l1_pgentry_t *pent;
     int err = 0;
 
@@ -378,15 +378,20 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
 
         pent = p2m_find_entry(table, &remainder, gfn,
                               level * PAGETABLE_ORDER, 1 << PAGETABLE_ORDER);
-        if ( !pent || !(l1e_get_flags(*pent) & _PAGE_PRESENT) )
+        if ( !pent )
+            goto out;
+
+        pte_flags = l1e_get_flags(*pent);
+
+        if ( !(pte_flags & _PAGE_PRESENT) )
             goto out;
 
-        if ( l1e_get_flags(*pent) & _PAGE_PSE )
+        if ( pte_flags & _PAGE_PSE )
         {
             unsigned long mask = ~0UL << (level * PAGETABLE_ORDER);
 
-            if ( !needs_recalc(l1, *pent) ||
-                 !p2m_is_changeable(p2m_flags_to_type(l1e_get_flags(*pent))) ||
+            if ( !_needs_recalc(pte_flags) ||
+                 !p2m_is_changeable(p2m_flags_to_type(pte_flags)) ||
                  p2m_is_logdirty_range(p2m, gfn & mask, gfn | ~mask) >= 0 )
                 break;
         }
@@ -397,21 +402,21 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
         if ( err )
             goto out;
 
-        if ( needs_recalc(l1, *pent) )
+        if ( _needs_recalc(pte_flags) )
         {
             l1_pgentry_t e = *pent, *ptab = table;
             unsigned int i;
 
-            if ( !valid_recalc(l1, e) )
+            if ( !valid_recalc(pte_flags) )
                 P2M_DEBUG("bogus recalc state at d%d:%lx:%u\n",
                           p2m->domain->domain_id, gfn, level);
             remainder = gfn_remainder;
             for ( i = 0; i < (1 << PAGETABLE_ORDER); ++i )
             {
                 l1_pgentry_t ent = ptab[i];
+                unsigned int flags = l1e_get_flags(ent);
 
-                if ( (l1e_get_flags(ent) & _PAGE_PRESENT) &&
-                     !needs_recalc(l1, ent) )
+                if ( (flags & _PAGE_PRESENT) && !_needs_recalc(flags) )
                 {
                     set_recalc(l1, ent);
                     p2m->write_p2m_entry(p2m, gfn - remainder, &ptab[i],
@@ -428,15 +433,17 @@ static int do_recalc(struct p2m_domain *p2m, unsigned long gfn)
 
     pent = p2m_find_entry(table, &gfn_remainder, gfn,
                           level * PAGETABLE_ORDER, 1 << PAGETABLE_ORDER);
-    if ( pent && (l1e_get_flags(*pent) & _PAGE_PRESENT) &&
-         needs_recalc(l1, *pent) )
+    if ( !pent )
+        goto out;
+    pte_flags = l1e_get_flags(*pent);
+    if ( (pte_flags & _PAGE_PRESENT) && _needs_recalc(pte_flags) )
     {
         l1_pgentry_t e = *pent;
 
-        if ( !valid_recalc(l1, e) )
+        if ( !valid_recalc(pte_flags) )
             P2M_DEBUG("bogus recalc leaf at d%d:%lx:%u\n",
                       p2m->domain->domain_id, gfn, level);
-        if ( p2m_is_changeable(p2m_flags_to_type(l1e_get_flags(e))) )
+        if ( p2m_is_changeable(p2m_flags_to_type(pte_flags)) )
         {
             unsigned long mask = ~0UL << (level * PAGETABLE_ORDER);
             p2m_type_t p2mt = p2m_is_logdirty_range(p2m, gfn & mask, gfn | ~mask)
-- 
1.9.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: [RFC PATCH] p2m-pt: avoid getting the pte flags repeatedly.
  2016-11-17  8:24 [RFC PATCH] p2m-pt: avoid getting the pte flags repeatedly Liang Li
@ 2016-11-17 11:29 ` Jan Beulich
  2016-11-17 13:42   ` Li, Liang Z
  0 siblings, 1 reply; 3+ messages in thread
From: Jan Beulich @ 2016-11-17 11:29 UTC
  To: Liang Li; +Cc: george.dunlap, andrew.cooper3, xen-devel

>>> On 17.11.16 at 09:24, <liang.z.li@intel.com> wrote:
> A lot of code fetches the pte flags repeatedly. Why not save the
> result and reuse it in the code that follows? Doing so can save some
> CPU cycles and make the code cleaner, no?
>
> I am not sure whether this is the right direction, so this patch
> changes only one place.

Did you compare the generated code with and without this change? A
compiler that optimizes well may produce better code in at least some
of the cases: it may be possible to elide the calculation involved in
lNe_get_flags() completely, and hence your patch may increase register
pressure / stack consumption due to the new variable. But of course a
benefit of the change is better overall readability ...

In any event you'd need to address various style issues before
such a patch could be accepted.

Jan



* Re: [RFC PATCH] p2m-pt: avoid getting the pte flags repeatedly.
  2016-11-17 11:29 ` Jan Beulich
@ 2016-11-17 13:42   ` Li, Liang Z
  0 siblings, 0 replies; 3+ messages in thread
From: Li, Liang Z @ 2016-11-17 13:42 UTC
  To: Jan Beulich; +Cc: george.dunlap, andrew.cooper3, xen-devel

> >>> On 17.11.16 at 09:24, <liang.z.li@intel.com> wrote:
> > A lot of code fetches the pte flags repeatedly. Why not save the
> > result and reuse it in the code that follows? Doing so can save some
> > CPU cycles and make the code cleaner, no?
> >
> > I am not sure whether this is the right direction, so this patch
> > changes only one place.
> 
> Did you compare generated code with and without this change? A compiler

No.

> that optimizes well may produce better code in at least some of the
> cases: it may be possible to elide the calculation involved in
> lNe_get_flags() completely, and hence your patch may increase register
> pressure / stack consumption due to the new variable. But of course a
> benefit of the change is better overall readability ...
> 
Good to know that the change is worth making.

> In any event you'd need to address various style issues before such a patch
> could be accepted.

Of course. I will also try to change the other places and resend the patch.

Thanks!
Liang
> 
> Jan


