* [PATCH] x86: fix paging_log_dirty_op to work with paging guests
@ 2018-12-12 14:54 Roger Pau Monne
  2018-12-12 16:07 ` Jan Beulich
From: Roger Pau Monne @ 2018-12-12 14:54 UTC (permalink / raw)
  To: xen-devel
  Cc: George Dunlap, Andrew Cooper, Wei Liu, Jan Beulich, Roger Pau Monne

When the caller of paging_log_dirty_op is a paging mode guest, Xen
chokes when trying to copy the dirty bitmap to the guest-provided
buffer: the paging lock of the target is already held, and trying to
acquire the p2m lock of the caller (in order to translate the buffer
address) makes the mm lock order checks trigger:

(XEN) mm locking order violation: 64 > 16
(XEN) Xen BUG at ./mm-locks.h:143
(XEN) ----[ Xen-4.12-unstable  x86_64  debug=y   Tainted:  C   ]----
(XEN) CPU:    4
(XEN) RIP:    e008:[<ffff82d080328581>] p2m.c#_mm_read_lock+0x41/0x50
(XEN) RFLAGS: 0000000000010286   CONTEXT: hypervisor (d0v3)
[...]
(XEN) Xen call trace:
(XEN)    [<ffff82d080328581>] p2m.c#_mm_read_lock+0x41/0x50
(XEN)    [<ffff82d080322ef6>] vmac.c#p2m_get_page_from_gfn+0x46/0x200
(XEN)    [<ffff82d08035249a>] vmac.c#hap_p2m_ga_to_gfn_4_levels+0x4a/0x270
(XEN)    [<ffff82d0802eb103>] vmac.c#hvm_translate_get_page+0x33/0x1c0
(XEN)    [<ffff82d0802eb35d>] hvm.c#__hvm_copy+0x8d/0x230
(XEN)    [<ffff82d0802eada6>] vmac.c#hvm_copy_to_guest_linear+0x46/0x60
(XEN)    [<ffff82d0802eb5c9>] vmac.c#copy_to_user_hvm+0x79/0x90
(XEN)    [<ffff82d0803314f9>] paging.c#paging_log_dirty_op+0x2c9/0x6d0
(XEN)    [<ffff82d08027384e>] vmac.c#arch_do_domctl+0x63e/0x1c00
(XEN)    [<ffff82d080205720>] vmac.c#do_domctl+0x450/0x1020
(XEN)    [<ffff82d0802ef929>] vmac.c#hvm_hypercall+0x1c9/0x4b0
(XEN)    [<ffff82d080313161>] vmac.c#vmx_vmexit_handler+0x6c1/0xea0
(XEN)    [<ffff82d08031a84a>] vmac.c#vmx_asm_vmexit_handler+0xfa/0x270
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 4:
(XEN) Xen BUG at ./mm-locks.h:143
(XEN) ****************************************

Fix this by releasing the target's paging lock before copying the
dirty bitmap into the caller-provided buffer, and then forcing a
restart of the whole process in case the dirty bitmap tables have
changed in the meantime.

Note that the path for non-paging guests remains the same, and the
paging lock is not dropped in that case.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
---
 xen/arch/x86/mm/paging.c     | 37 +++++++++++++++++++++++++++++++++---
 xen/include/asm-x86/domain.h |  1 +
 2 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
index d5836eb688..752f827f38 100644
--- a/xen/arch/x86/mm/paging.c
+++ b/xen/arch/x86/mm/paging.c
@@ -408,6 +408,7 @@ static int paging_log_dirty_op(struct domain *d,
     unsigned long *l1 = NULL;
     int i4, i3, i2;
 
+ again:
     if ( !resuming )
     {
         /*
@@ -468,17 +469,18 @@ static int paging_log_dirty_op(struct domain *d,
     l4 = paging_map_log_dirty_bitmap(d);
     i4 = d->arch.paging.preempt.log_dirty.i4;
     i3 = d->arch.paging.preempt.log_dirty.i3;
+    i2 = d->arch.paging.preempt.log_dirty.i2;
     pages = d->arch.paging.preempt.log_dirty.done;
 
     for ( ; (pages < sc->pages) && (i4 < LOGDIRTY_NODE_ENTRIES); i4++, i3 = 0 )
     {
         l3 = (l4 && mfn_valid(l4[i4])) ? map_domain_page(l4[i4]) : NULL;
-        for ( ; (pages < sc->pages) && (i3 < LOGDIRTY_NODE_ENTRIES); i3++ )
+        for ( ; (pages < sc->pages) && (i3 < LOGDIRTY_NODE_ENTRIES);
+              i3++, i2 = 0 )
         {
             l2 = ((l3 && mfn_valid(l3[i3])) ?
                   map_domain_page(l3[i3]) : NULL);
-            for ( i2 = 0;
-                  (pages < sc->pages) && (i2 < LOGDIRTY_NODE_ENTRIES);
+            for ( ; (pages < sc->pages) && (i2 < LOGDIRTY_NODE_ENTRIES);
                   i2++ )
             {
                 unsigned int bytes = PAGE_SIZE;
@@ -488,6 +490,16 @@ static int paging_log_dirty_op(struct domain *d,
                     bytes = (unsigned int)((sc->pages - pages + 7) >> 3);
                 if ( likely(peek) )
                 {
+                    if ( paging_mode_enabled(current->domain) )
+                        /*
+                         * Drop the target p2m lock, or else Xen will panic
+                         * when trying to acquire the p2m lock of the caller
+                         * due to invalid lock order. Note that there are no
+                         * lock ordering issues here, and the panic is due to
+                         * the fact that the lock level tracking doesn't record
+                         * the domain the lock belongs to.
+                         */
+                        paging_unlock(d);
                     if ( (l1 ? copy_to_guest_offset(sc->dirty_bitmap,
                                                     pages >> 3, (uint8_t *)l1,
                                                     bytes)
@@ -505,6 +517,23 @@ static int paging_log_dirty_op(struct domain *d,
                         clear_page(l1);
                     unmap_domain_page(l1);
                 }
+                if ( paging_mode_enabled(current->domain) && peek )
+                {
+                    d->arch.paging.preempt.log_dirty.i4 = i4;
+                    d->arch.paging.preempt.log_dirty.i3 = i3;
+                    d->arch.paging.preempt.log_dirty.i2 = i2 + 1;
+                    d->arch.paging.preempt.log_dirty.done = pages;
+                    d->arch.paging.preempt.dom = current->domain;
+                    d->arch.paging.preempt.op = sc->op;
+                    resuming = 1;
+                    if ( l2 )
+                        unmap_domain_page(l2);
+                    if ( l3 )
+                        unmap_domain_page(l3);
+                    if ( l4 )
+                        unmap_domain_page(l4);
+                    goto again;
+                }
             }
             if ( l2 )
                 unmap_domain_page(l2);
@@ -513,6 +542,7 @@ static int paging_log_dirty_op(struct domain *d,
             {
                 d->arch.paging.preempt.log_dirty.i4 = i4;
                 d->arch.paging.preempt.log_dirty.i3 = i3 + 1;
+                d->arch.paging.preempt.log_dirty.i2 = 0;
                 rv = -ERESTART;
                 break;
             }
@@ -525,6 +555,7 @@ static int paging_log_dirty_op(struct domain *d,
         {
             d->arch.paging.preempt.log_dirty.i4 = i4 + 1;
             d->arch.paging.preempt.log_dirty.i3 = 0;
+            d->arch.paging.preempt.log_dirty.i2 = 0;
             rv = -ERESTART;
         }
         if ( rv )
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 277f99f633..b3e527cd51 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -212,6 +212,7 @@ struct paging_domain {
                 unsigned long done:PADDR_BITS - PAGE_SHIFT;
                 unsigned long i4:PAGETABLE_ORDER;
                 unsigned long i3:PAGETABLE_ORDER;
+                unsigned long i2:PAGETABLE_ORDER;
             } log_dirty;
         };
     } preempt;
-- 
2.19.2



* Re: [PATCH] x86: fix paging_log_dirty_op to work with paging guests
  2018-12-12 14:54 [PATCH] x86: fix paging_log_dirty_op to work with paging guests Roger Pau Monne
@ 2018-12-12 16:07 ` Jan Beulich
  2018-12-13 11:39   ` Roger Pau Monné
From: Jan Beulich @ 2018-12-12 16:07 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: George Dunlap, Andrew Cooper, Wei Liu, xen-devel

>>> On 12.12.18 at 15:54, <roger.pau@citrix.com> wrote:
> Fix this by releasing the target paging lock before attempting to
> perform the copy of the dirty bitmap, and then forcing a restart of
> the whole process in case there have been changes to the dirty bitmap
> tables.

I'm afraid it's not that simple: The writer side (paging_mark_pfn_dirty())
uses the same lock, and I think the assumption is that while the copying
takes place no updates to the bitmap can occur. Then again the subject
domain gets paused anyway, so updates probably can't happen (the
"probably" of course needs to be eliminated). But at the very least I'd
expect a more thorough discussion here as to why dropping the lock
intermediately is fine. The preemption mechanism can't be used as sole
reference, because that one doesn't drop the lock in the middle of an
iteration.

> --- a/xen/arch/x86/mm/paging.c
> +++ b/xen/arch/x86/mm/paging.c
> @@ -408,6 +408,7 @@ static int paging_log_dirty_op(struct domain *d,
>      unsigned long *l1 = NULL;
>      int i4, i3, i2;
>  
> + again:
>      if ( !resuming )
>      {

Considering that you set "resuming" to 1 before jumping here,
why don't you put the label after this if()? I'd even consider pulling
the label even further down (perhaps to immediately before the
last if() ahead of the main loop), and have the path branching there
re-acquire the lock (and drop some of the other state setting that
then becomes unnecessary).

> @@ -488,6 +490,16 @@ static int paging_log_dirty_op(struct domain *d,
>                      bytes = (unsigned int)((sc->pages - pages + 7) >> 3);
>                  if ( likely(peek) )
>                  {
> +                    if ( paging_mode_enabled(current->domain) )
> +                        /*
> +                         * Drop the target p2m lock, or else Xen will panic
> +                         * when trying to acquire the p2m lock of the caller
> +                         * due to invalid lock order. Note that there are no
> +                         * lock ordering issues here, and the panic is due to
> +                         * the fact that the lock level tracking doesn't record
> +                         * the domain the lock belongs to.
> +                         */
> +                        paging_unlock(d);

This makes it sound as if tracking the domain would help. It doesn't,
at least not as long as there is not also some sort of ordering or
hierarchy between domains. IOW I'd prefer if the "doesn't" became
"can't".

Now before we go this route I think we need to consider whether
this really is the only place where the two locks get into one
another's way. That's because I don't think we want to introduce
more of such, well, hackery, and hence we'd otherwise need a
better solution. For example the locking model could be adjusted
to allow such nesting in the general case: Dom0 and any domain
whose ->target matches the subject domain here could be given
a slightly different "weight" in the lock order violation check logic.

I'm also uncertain about the overall effect this change has, in
that now the lock gets dropped for _every_ page to be copied.

> @@ -505,6 +517,23 @@ static int paging_log_dirty_op(struct domain *d,
>                          clear_page(l1);
>                      unmap_domain_page(l1);
>                  }
> +                if ( paging_mode_enabled(current->domain) && peek )
> +                {
> +                    d->arch.paging.preempt.log_dirty.i4 = i4;
> +                    d->arch.paging.preempt.log_dirty.i3 = i3;
> +                    d->arch.paging.preempt.log_dirty.i2 = i2 + 1;

If i2 + 1 == LOGDIRTY_NODE_ENTRIES I think it would be better to
store zero here and advance i3 instead, matching the preemption paths
further down.

> +                    d->arch.paging.preempt.log_dirty.done = pages;
> +                    d->arch.paging.preempt.dom = current->domain;
> +                    d->arch.paging.preempt.op = sc->op;
> +                    resuming = 1;

true?

> +                    if ( l2 )
> +                        unmap_domain_page(l2);
> +                    if ( l3 )
> +                        unmap_domain_page(l3);
> +                    if ( l4 )
> +                        unmap_domain_page(l4);
> +                    goto again;

At the very least for safety reasons you should set l2, l3, and l4
back to NULL here.

> @@ -513,6 +542,7 @@ static int paging_log_dirty_op(struct domain *d,
>              {
>                  d->arch.paging.preempt.log_dirty.i4 = i4;
>                  d->arch.paging.preempt.log_dirty.i3 = i3 + 1;
> +                d->arch.paging.preempt.log_dirty.i2 = 0;
>                  rv = -ERESTART;
>                  break;
>              }
> @@ -525,6 +555,7 @@ static int paging_log_dirty_op(struct domain *d,
>          {
>              d->arch.paging.preempt.log_dirty.i4 = i4 + 1;
>              d->arch.paging.preempt.log_dirty.i3 = 0;
> +            d->arch.paging.preempt.log_dirty.i2 = 0;
>              rv = -ERESTART;
>          }

With these I don't see why you need storage in the domain
structure: All you should need is that the local variable retain its
value across the "goto again". You never exit the function with i2
non-zero afaict.

Jan



* Re: [PATCH] x86: fix paging_log_dirty_op to work with paging guests
  2018-12-12 16:07 ` Jan Beulich
@ 2018-12-13 11:39   ` Roger Pau Monné
  2018-12-13 12:51     ` Jan Beulich
From: Roger Pau Monné @ 2018-12-13 11:39 UTC (permalink / raw)
  To: Jan Beulich; +Cc: George Dunlap, Andrew Cooper, Wei Liu, xen-devel

On Wed, Dec 12, 2018 at 09:07:14AM -0700, Jan Beulich wrote:
> >>> On 12.12.18 at 15:54, <roger.pau@citrix.com> wrote:
> > Fix this by releasing the target paging lock before attempting to
> > perform the copy of the dirty bitmap, and then forcing a restart of
> > the whole process in case there have been changes to the dirty bitmap
> > tables.
> 
> I'm afraid it's not that simple: The writer side (paging_mark_pfn_dirty())
> uses the same lock, and I think the assumption is that while the copying
> takes place no updates to the bitmap can occur. Then again the subject
> domain gets paused anyway, so updates probably can't happen (the
> "probably" of course needs to be eliminated).

I've looked into this, and I think you are right: the copy must be
done with the paging lock held. Even if the domain is paused, the
device model can still mark gfns as dirty using
XEN_DMOP_modified_memory, so the only option would be to copy into a
temporary page while holding the lock, release the lock, and then
copy the temporary page to the caller's buffer.
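
Something along these lines (untested sketch; error/cleanup paths
omitted, and the interaction with preemption not considered):

uint8_t *tmp = alloc_xenheap_page();

if ( !tmp )
    rv = -ENOMEM;
else
{
    if ( l1 )
        memcpy(tmp, l1, bytes);   /* snapshot with the paging lock held */
    else
        clear_page(tmp);
    paging_unlock(d);             /* the copy only touches the snapshot */
    if ( copy_to_guest_offset(sc->dirty_bitmap, pages >> 3, tmp, bytes) )
        rv = -EFAULT;
    paging_lock(d);               /* re-acquire before resuming the walk */
    free_xenheap_page(tmp);
}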

> > @@ -488,6 +490,16 @@ static int paging_log_dirty_op(struct domain *d,
> >                      bytes = (unsigned int)((sc->pages - pages + 7) >> 3);
> >                  if ( likely(peek) )
> >                  {
> > +                    if ( paging_mode_enabled(current->domain) )
> > +                        /*
> > +                         * Drop the target p2m lock, or else Xen will panic
> > +                         * when trying to acquire the p2m lock of the caller
> > +                         * due to invalid lock order. Note that there are no
> > +                         * lock ordering issues here, and the panic is due to
> > +                         * the fact that the lock level tracking doesn't record
> > +                         * the domain the lock belongs to.
> > +                         */
> > +                        paging_unlock(d);
> 
> This makes it sound as if tracking the domain would help. It doesn't,
> at least not as long as there is not also some sort of ordering or
> hierarchy between domains. IOW I'd prefer if the "doesn't" became
> "can't".

Well, just keeping a correct order between each domain's locks should
be enough?

Ie: exactly the same that Xen currently does but on a per-domain
basis. This is feasible, but each CPU would need to store the lock
order of each possible domain:

DECLARE_PER_CPU(uint8_t, mm_lock_level[DOMID_FIRST_RESERVED]);

This would consume ~32KB per CPU, which is not that much but seems a
waste when most of the time a single entry will be used.

> Now before we go this route I think we need to consider whether
> this really is the only place where the two locks get into one
> another's way. That's because I don't think we want to introduce
> more of such, well, hackery, and hence we'd otherwise need a
> better solution. For example the locking model could be adjusted
> to allow such nesting in the general case: Dom0 and any domain
> whose ->target matches the subject domain here could be given
> a slightly different "weight" in the lock order violation check logic.

So locks from domains != current would be given a lower order, let's
say:

#define MM_LOCK_ORDER_nestedp2m               8
#define MM_LOCK_ORDER_p2m                    16
#define MM_LOCK_ORDER_per_page_sharing       24
#define MM_LOCK_ORDER_altp2mlist             32
#define MM_LOCK_ORDER_altp2m                 40
#define MM_LOCK_ORDER_pod                    48
#define MM_LOCK_ORDER_page_alloc             56
#define MM_LOCK_ORDER_paging                 64
#define MM_LOCK_ORDER_MAX                    MM_LOCK_ORDER_paging

If domain != current, the above values are used. If domain == current
the values above are used + MM_LOCK_ORDER_MAX? So in that case the
order of MM_LOCK_ORDER_p2m against the current domain would be 16 + 64
= 80?

This has the slight inconvenience that not all mm lock call sites
have the target domain available, but that can be solved.
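
As a sketch, the bias could be applied based on the lock's owner
(names made up; assumes the call sites can pass the owner down):

static inline int mm_lock_order(const struct domain *owner, int level)
{
    /* Locks of the caller's own domain nest inside subject locks. */
    return level + (owner == current->domain ? MM_LOCK_ORDER_MAX : 0);
}

/* e.g. MM_LOCK_ORDER_p2m on current->domain: 16 + 64 = 80. */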

Thanks, Roger.


* Re: [PATCH] x86: fix paging_log_dirty_op to work with paging guests
  2018-12-13 11:39   ` Roger Pau Monné
@ 2018-12-13 12:51     ` Jan Beulich
  2018-12-13 14:14       ` Roger Pau Monné
From: Jan Beulich @ 2018-12-13 12:51 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: George Dunlap, Andrew Cooper, Wei Liu, xen-devel

>>> On 13.12.18 at 12:39, <roger.pau@citrix.com> wrote:
> On Wed, Dec 12, 2018 at 09:07:14AM -0700, Jan Beulich wrote:
>> >>> On 12.12.18 at 15:54, <roger.pau@citrix.com> wrote:
>> > @@ -488,6 +490,16 @@ static int paging_log_dirty_op(struct domain *d,
>> >                      bytes = (unsigned int)((sc->pages - pages + 7) >> 3);
>> >                  if ( likely(peek) )
>> >                  {
>> > +                    if ( paging_mode_enabled(current->domain) )
>> > +                        /*
>> > +                         * Drop the target p2m lock, or else Xen will panic
>> > +                         * when trying to acquire the p2m lock of the caller
>> > +                         * due to invalid lock order. Note that there are no
>> > +                         * lock ordering issues here, and the panic is due to
>> > +                         * the fact that the lock level tracking doesn't record
>> > +                         * the domain the lock belongs to.
>> > +                         */
>> > +                        paging_unlock(d);
>> 
>> This makes it sound as if tracking the domain would help. It doesn't,
>> at least not as long as there is not also some sort of ordering or
>> hierarchy between domains. IOW I'd prefer if the "doesn't" became
>> "can't".
> 
> Well, Just keeping correct order between each domain locks should be
> enough?
> 
> Ie: exactly the same that Xen currently does but on a per-domain
> basis. This is feasible, but each CPU would need to store the lock
> order of each possible domain:
> 
> DECLARE_PER_CPU(uint8_t, mm_lock_level[DOMID_FIRST_RESERVED]);
> 
> This would consume ~32KB per CPU, which is not that much but seems a
> waste when most of the time a single entry will be used.

Well, tracking by domain ID wouldn't help you - the controlling
domain may well have a higher ID than the one being controlled,
i.e. the nesting you want needs to be independent of domain ID.

>> Now before we go this route I think we need to consider whether
>> this really is the only place where the two locks get into one
>> another's way. That's because I don't think we want to introduce
>> more of such, well, hackery, and hence we'd otherwise need a
>> better solution. For example the locking model could be adjusted
>> to allow such nesting in the general case: Dom0 and any domain
>> whose ->target matches the subject domain here could be given
>> a slightly different "weight" in the lock order violation check logic.
> 
> So locks from domains != current would be given a lower order, let's
> say:
> 
> #define MM_LOCK_ORDER_nestedp2m               8
> #define MM_LOCK_ORDER_p2m                    16
> #define MM_LOCK_ORDER_per_page_sharing       24
> #define MM_LOCK_ORDER_altp2mlist             32
> #define MM_LOCK_ORDER_altp2m                 40
> #define MM_LOCK_ORDER_pod                    48
> #define MM_LOCK_ORDER_page_alloc             56
> #define MM_LOCK_ORDER_paging                 64
> #define MM_LOCK_ORDER_MAX                    MM_LOCK_ORDER_paging
> 
> If domain != current, the above values are used. If domain == current
> the values above are used + MM_LOCK_ORDER_MAX? So in that case the
> order of MM_LOCK_ORDER_p2m against the current domain would be 16 + 64
> = 80?
> 
> This has the slight inconvenience that not all mm lock call sites have
> the target domain available, but can be solved.

No, I wasn't envisioning a really generic approach, i.e.
permitting this for all of the locks. Do we need this? Till now
I was rather considering doing this for just the paging lock,
with Dom0 and the controlling domains using just a small
(+/- 2 and +/- 1) offset from the normal paging one.
However, I now realize that indeed there may be more such
nesting, and you've merely hit a specific case of it.

In any event it seems to me that you've got the ordering
inverted, i.e. domain == current (or really: the general case,
i.e. after excluding the two special cases) would use base
values, domain == currd->target would use some offset, and
Dom0 would use double the offset.

With it extended to all locks I'm no longer sure though that
the end result would be safe. I think it would be helpful to
rope in whoever did originally design this locking model.

Jan



* Re: [PATCH] x86: fix paging_log_dirty_op to work with paging guests
  2018-12-13 12:51     ` Jan Beulich
@ 2018-12-13 14:14       ` Roger Pau Monné
  2018-12-13 14:52         ` Jan Beulich
From: Roger Pau Monné @ 2018-12-13 14:14 UTC (permalink / raw)
  To: Jan Beulich; +Cc: George Dunlap, Andrew Cooper, Tim Deegan, Wei Liu, xen-devel

On Thu, Dec 13, 2018 at 05:51:51AM -0700, Jan Beulich wrote:
> >>> On 13.12.18 at 12:39, <roger.pau@citrix.com> wrote:
> > On Wed, Dec 12, 2018 at 09:07:14AM -0700, Jan Beulich wrote:
> >> >>> On 12.12.18 at 15:54, <roger.pau@citrix.com> wrote:
> >> > @@ -488,6 +490,16 @@ static int paging_log_dirty_op(struct domain *d,
> >> >                      bytes = (unsigned int)((sc->pages - pages + 7) >> 3);
> >> >                  if ( likely(peek) )
> >> >                  {
> >> > +                    if ( paging_mode_enabled(current->domain) )
> >> > +                        /*
> >> > +                         * Drop the target p2m lock, or else Xen will panic
> >> > +                         * when trying to acquire the p2m lock of the caller
> >> > +                         * due to invalid lock order. Note that there are no
> >> > +                         * lock ordering issues here, and the panic is due to
> >> > +                         * the fact that the lock level tracking doesn't record
> >> > +                         * the domain the lock belongs to.
> >> > +                         */
> >> > +                        paging_unlock(d);
> >> 
> >> This makes it sound as if tracking the domain would help. It doesn't,
> >> at least not as long as there is not also some sort of ordering or
> >> hierarchy between domains. IOW I'd prefer if the "doesn't" became
> >> "can't".
> > 
> > Well, Just keeping correct order between each domain locks should be
> > enough?
> > 
> > Ie: exactly the same that Xen currently does but on a per-domain
> > basis. This is feasible, but each CPU would need to store the lock
> > order of each possible domain:
> > 
> > DECLARE_PER_CPU(uint8_t, mm_lock_level[DOMID_FIRST_RESERVED]);
> > 
> > This would consume ~32KB per CPU, which is not that much but seems a
> > waste when most of the time a single entry will be used.
> 
> Well, tracking by domain ID wouldn't help you - the controlling
> domain may well have a higher ID than the being controlled one,
> i.e. the nesting you want needs to be independent of domain ID.

It's not tracking the domain ID, but rather tracking the lock level of
each different domain, hence the need for the array in the pcpu
structure. The lock checker would take a domain id and a level, and
perform the check as:

if ( mm_lock_level[domid] > level )
    panic
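
Or in a slightly more complete (untested) form, also updating the
recorded level on acquisition:

DECLARE_PER_CPU(uint8_t, mm_lock_level[DOMID_FIRST_RESERVED]);

static inline void check_lock_level(domid_t domid, uint8_t level)
{
    if ( unlikely(this_cpu(mm_lock_level)[domid] > level) )
        panic("mm locking order violation for d%d: %u > %u\n",
              domid, this_cpu(mm_lock_level)[domid], level);
    this_cpu(mm_lock_level)[domid] = level;
}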

> >> Now before we go this route I think we need to consider whether
> >> this really is the only place where the two locks get into one
> >> another's way. That's because I don't think we want to introduce
> >> more of such, well, hackery, and hence we'd otherwise need a
> >> better solution. For example the locking model could be adjusted
> >> to allow such nesting in the general case: Dom0 and any domain
> >> whose ->target matches the subject domain here could be given
> >> a slightly different "weight" in the lock order violation check logic.
> > 
> > So locks from domains != current would be given a lower order, let's
> > say:
> > 
> > #define MM_LOCK_ORDER_nestedp2m               8
> > #define MM_LOCK_ORDER_p2m                    16
> > #define MM_LOCK_ORDER_per_page_sharing       24
> > #define MM_LOCK_ORDER_altp2mlist             32
> > #define MM_LOCK_ORDER_altp2m                 40
> > #define MM_LOCK_ORDER_pod                    48
> > #define MM_LOCK_ORDER_page_alloc             56
> > #define MM_LOCK_ORDER_paging                 64
> > #define MM_LOCK_ORDER_MAX                    MM_LOCK_ORDER_paging
> > 
> > If domain != current, the above values are used. If domain == current
> > the values above are used + MM_LOCK_ORDER_MAX? So in that case the
> > order of MM_LOCK_ORDER_p2m against the current domain would be 16 + 64
> > = 80?
> > 
> > This has the slight inconvenience that not all mm lock call sites have
> > the target domain available, but can be solved.
> 
> No, I wasn't envisioning a really generic approach, i.e.
> permitting this for all of the locks. Do we need this? Till now
> I was rather considering to do this for just the paging lock,
> with Dom0 and the controlling domains using just a small
> (+/- 2 and +/- 1) offset from the normal paging one.
> However, I now realize that indeed there may be more of
> such nesting, and you've merely hit a specific case of it.

So far that's the only case I've hit, but I'm quite sure there is a
non-trivial number of hypercalls that were only ever used by a PV
Dom0, so I wouldn't rule out finding other such instances.

> In any event it seems to me that you've got the ordering
> inversed, i.e. domain == current (or really: the general case,
> i.e. after excluding the two special cases) would use base
> values, domain == currd->target would use some offset, and
> Dom0 would use double the offset.
> 
> With it extended to all locks I'm no longer sure though that
> the end result would be safe. I think it would be helpful to
> rope in whoever did originally design this locking model.

I'm adding Tim, who I think is the original author of the mm lock
infrastructure.

Thanks, Roger.


* Re: [PATCH] x86: fix paging_log_dirty_op to work with paging guests
  2018-12-13 14:14       ` Roger Pau Monné
@ 2018-12-13 14:52         ` Jan Beulich
  2018-12-13 15:34           ` Roger Pau Monné
From: Jan Beulich @ 2018-12-13 14:52 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: George Dunlap, Andrew Cooper, Tim Deegan, Wei Liu, xen-devel

>>> On 13.12.18 at 15:14, <roger.pau@citrix.com> wrote:
> On Thu, Dec 13, 2018 at 05:51:51AM -0700, Jan Beulich wrote:
>> >>> On 13.12.18 at 12:39, <roger.pau@citrix.com> wrote:
>> > Well, Just keeping correct order between each domain locks should be
>> > enough?
>> > 
>> > Ie: exactly the same that Xen currently does but on a per-domain
>> > basis. This is feasible, but each CPU would need to store the lock
>> > order of each possible domain:
>> > 
>> > DECLARE_PER_CPU(uint8_t, mm_lock_level[DOMID_FIRST_RESERVED]);
>> > 
>> > This would consume ~32KB per CPU, which is not that much but seems a
>> > waste when most of the time a single entry will be used.
>> 
>> Well, tracking by domain ID wouldn't help you - the controlling
>> domain may well have a higher ID than the being controlled one,
>> i.e. the nesting you want needs to be independent of domain ID.
> 
> It's not tracking the domain ID, but rather tracking the lock level of
> each different domain, hence the need for the array in the pcpu
> structure. The lock checker would take a domain id and a level, and
> perform the check as:
> 
> if ( mm_lock_level[domid] > level )
>     panic

But this would open things up for deadlocks because of intermixed
lock usage between the calling domain's and the subject one's.
There needs to be a linear sequence of locks (of all involved
domains) describing the one and only order in which they may be
acquired.

Jan




* Re: [PATCH] x86: fix paging_log_dirty_op to work with paging guests
  2018-12-13 14:52         ` Jan Beulich
@ 2018-12-13 15:34           ` Roger Pau Monné
  2018-12-13 15:53             ` Jan Beulich
From: Roger Pau Monné @ 2018-12-13 15:34 UTC (permalink / raw)
  To: Jan Beulich; +Cc: George Dunlap, Andrew Cooper, Tim Deegan, Wei Liu, xen-devel

On Thu, Dec 13, 2018 at 07:52:16AM -0700, Jan Beulich wrote:
> >>> On 13.12.18 at 15:14, <roger.pau@citrix.com> wrote:
> > On Thu, Dec 13, 2018 at 05:51:51AM -0700, Jan Beulich wrote:
> >> >>> On 13.12.18 at 12:39, <roger.pau@citrix.com> wrote:
> >> > Well, Just keeping correct order between each domain locks should be
> >> > enough?
> >> > 
> >> > Ie: exactly the same that Xen currently does but on a per-domain
> >> > basis. This is feasible, but each CPU would need to store the lock
> >> > order of each possible domain:
> >> > 
> >> > DECLARE_PER_CPU(uint8_t, mm_lock_level[DOMID_FIRST_RESERVED]);
> >> > 
> >> > This would consume ~32KB per CPU, which is not that much but seems a
> >> > waste when most of the time a single entry will be used.
> >> 
> >> Well, tracking by domain ID wouldn't help you - the controlling
> >> domain may well have a higher ID than the being controlled one,
> >> i.e. the nesting you want needs to be independent of domain ID.
> > 
> > It's not tracking the domain ID, but rather tracking the lock level of
> > each different domain, hence the need for the array in the pcpu
> > structure. The lock checker would take a domain id and a level, and
> > perform the check as:
> > 
> > if ( mm_lock_level[domid] > level )
> >     panic
> 
> But this would open things up for deadlocks because of intermixed
> lock usage between the calling domain's and the subject one's.
> There needs to be a linear sequence of locks (of all involved
> domains) describing the one and only order in which they may be
> acquired.

Well, my plan was to only check for deadlocks between the locks of the
same domain, without taking into account intermixed domain locking.

I guess at this point I will need some input from Tim and George about
how to proceed, because I'm not sure how to weight locks when using
intermixed domain locks, nor what the correct order is. The order
in paging_log_dirty_op looks like a valid order that we want to
support, but are there any others?

Is it possible to have multiple valid interdomain lock orders that
cannot be expressed using the current weighted lock ordering?

Thanks, Roger.


* Re: [PATCH] x86: fix paging_log_dirty_op to work with paging guests
  2018-12-13 15:34           ` Roger Pau Monné
@ 2018-12-13 15:53             ` Jan Beulich
  2018-12-14 10:03               ` Roger Pau Monné
From: Jan Beulich @ 2018-12-13 15:53 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: George Dunlap, Andrew Cooper, Tim Deegan, Wei Liu, xen-devel

>>> On 13.12.18 at 16:34, <roger.pau@citrix.com> wrote:
> On Thu, Dec 13, 2018 at 07:52:16AM -0700, Jan Beulich wrote:
>> >>> On 13.12.18 at 15:14, <roger.pau@citrix.com> wrote:
>> > On Thu, Dec 13, 2018 at 05:51:51AM -0700, Jan Beulich wrote:
>> >> >>> On 13.12.18 at 12:39, <roger.pau@citrix.com> wrote:
>> >> > Well, Just keeping correct order between each domain locks should be
>> >> > enough?
>> >> > 
>> >> > Ie: exactly the same that Xen currently does but on a per-domain
>> >> > basis. This is feasible, but each CPU would need to store the lock
>> >> > order of each possible domain:
>> >> > 
>> >> > DECLARE_PER_CPU(uint8_t, mm_lock_level[DOMID_FIRST_RESERVED]);
>> >> > 
>> >> > This would consume ~32KB per CPU, which is not that much but seems a
>> >> > waste when most of the time a single entry will be used.
>> >> 
>> >> Well, tracking by domain ID wouldn't help you - the controlling
>> >> domain may well have a higher ID than the being controlled one,
>> >> i.e. the nesting you want needs to be independent of domain ID.
>> > 
>> > It's not tracking the domain ID, but rather tracking the lock level of
>> > each different domain, hence the need for the array in the pcpu
>> > structure. The lock checker would take a domain id and a level, and
>> > perform the check as:
>> > 
>> > if ( mm_lock_level[domid] > level )
>> >     panic
>> 
>> But this would open things up for deadlocks because of intermixed
>> lock usage between the calling domain's and the subject one's.
>> There needs to be a linear sequence of locks (of all involved
>> domains) describing the one and only order in which they may be
>> acquired.
> 
> Well, my plan was to only check for deadlocks between the locks of the
> same domain, without taking into account intermixed domain locking.
> 
> I guess at this point I will need some input from Tim and George about
> how to proceed, because I'm not sure how to weight locks when using
> intermixed domain locks, neither what is the correct order. The order
> in paging_log_dirty_op looks like a valid order that we want to
> support, but are there any others?
> 
> Is it possible to have multiple valid interdomain lock orders that
> cannot be expressed using the current weighted lock ordering?

Well, first of all I'm afraid I didn't look closely enough at your
original mail: We're not talking about the paging lock of two
domains here, but about the paging lock of the subject domain
and dom0's p2m lock.

Second I then notice that

(XEN) mm locking order violation: 64 > 16

indicates that it might not have complained when two similar
locks of different domains were acquired in a nested fashion,
which I'd call a shortcoming that would be nice to eliminate on
this same occasion.

And third, to answer your question, I can't see anything
conceptually wrong with an arbitrary intermix of locks from
two different domains, as long their inner-domain ordering is
correct. E.g. a Dom0 hypercall may find a need to acquire
- the subject domain's p2m lock
- its own p2m lock
- the subject domain's PoD lock
- its own paging lock
Of course it may be possible to determine that "own" locks
are not supposed to be acquired outside of any "subject
domain" ones, in which case we'd have a workable hierarchy
(along the lines of what you had earlier suggested). But I'm
not sure how expensive (in terms of code auditing) such
determination is going be, which is why for the moment I'm
trying to think of a solution (ordering criteria) for the general
case.

Jan




* Re: [PATCH] x86: fix paging_log_dirty_op to work with paging guests
  2018-12-13 15:53             ` Jan Beulich
@ 2018-12-14 10:03               ` Roger Pau Monné
  2018-12-14 10:45                 ` Jan Beulich
From: Roger Pau Monné @ 2018-12-14 10:03 UTC (permalink / raw)
  To: Jan Beulich; +Cc: George Dunlap, Andrew Cooper, Tim Deegan, Wei Liu, xen-devel

On Thu, Dec 13, 2018 at 08:53:22AM -0700, Jan Beulich wrote:
> >>> On 13.12.18 at 16:34, <roger.pau@citrix.com> wrote:
> > On Thu, Dec 13, 2018 at 07:52:16AM -0700, Jan Beulich wrote:
> >> >>> On 13.12.18 at 15:14, <roger.pau@citrix.com> wrote:
> >> > On Thu, Dec 13, 2018 at 05:51:51AM -0700, Jan Beulich wrote:
> >> >> >>> On 13.12.18 at 12:39, <roger.pau@citrix.com> wrote:
> >> >> > Well, Just keeping correct order between each domain locks should be
> >> >> > enough?
> >> >> > 
> >> >> > Ie: exactly the same that Xen currently does but on a per-domain
> >> >> > basis. This is feasible, but each CPU would need to store the lock
> >> >> > order of each possible domain:
> >> >> > 
> >> >> > DECLARE_PER_CPU(uint8_t, mm_lock_level[DOMID_FIRST_RESERVED]);
> >> >> > 
> >> >> > This would consume ~32KB per CPU, which is not that much but seems a
> >> >> > waste when most of the time a single entry will be used.
> >> >> 
> >> >> Well, tracking by domain ID wouldn't help you - the controlling
> >> >> domain may well have a higher ID than the being controlled one,
> >> >> i.e. the nesting you want needs to be independent of domain ID.
> >> > 
> >> > It's not tracking the domain ID, but rather tracking the lock level of
> >> > each different domain, hence the need for the array in the pcpu
> >> > structure. The lock checker would take a domain id and a level, and
> >> > perform the check as:
> >> > 
> >> > if ( mm_lock_level[domid] > level )
> >> >     panic
> >> 
> >> But this would open things up for deadlocks because of intermixed
> >> lock usage between the calling domain's and the subject one's.
> >> There needs to be a linear sequence of locks (of all involved
> >> domains) describing the one and only order in which they may be
> >> acquired.
> > 
> > Well, my plan was to only check for deadlocks between the locks of the
> > same domain, without taking into account intermixed domain locking.
> > 
> > I guess at this point I will need some input from Tim and George about
> > how to proceed, because I'm not sure how to weight locks when using
> > intermixed domain locks, neither what is the correct order. The order
> > in paging_log_dirty_op looks like a valid order that we want to
> > support, but are there any others?
> > 
> > Is it possible to have multiple valid interdomain lock orders that
> > cannot be expressed using the current weighted lock ordering?
> 
> Well, first of all I'm afraid I didn't look closely enough at your
> original mail: We're not talking about the paging lock of two
> domains here, but about the paging lock of the subject domain
> and dom0's p2m lock.
> 
> Second I then notice that
> 
> (XEN) mm locking order violation: 64 > 16
> 
> indicates that it might not have complained when two similar
> locks of different domains were acquired in a nested fashion,
> which I'd call a shortcoming that would be nice to eliminate at
> this same occasion.

Yes, that's a current shortcoming, but then I'm not sure whether such
a case would be a violation of the lock ordering if the locks belong to
different domains and arbitrary interdomain locking is allowed.

> And third, to answer your question, I can't see anything
> conceptually wrong with an arbitrary intermix of locks from
> two different domains, as long their inner-domain ordering is
> correct. E.g. a Dom0 hypercall may find a need to acquire
> - the subject domain's p2m lock
> - its own p2m lock
> - the subject domain's PoD lock
> - its own paging lock

OK, so if the plan is to allow an arbitrary intermix of locks from
different domains, then per-domain lock level tracking seems like the
best option, along the lines of what I was proposing earlier:

if ( mm_lock_level[domid] > level )
    panic

> Of course it may be possible to determine that "own" locks
> are not supposed to be acquired outside of any "subject
> domain" ones, in which case we'd have a workable hierarchy
> (along the lines of what you had earlier suggested).

I expect the interdomain locking resulting from the use of a paging
caller domain to be restricted to the p2m lock of the caller domain,
due to the usage of the copy to/from helpers.

Maybe the least intrusive change would be to just allow taking the
caller's p2m lock (and only that lock) regardless of the subject
domain's lock level?
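
I.e. something like the below (untested) in the order check, keyed on
the lock's owner:

static inline void check_lock_level(const struct domain *owner, int level)
{
    /* The caller's own p2m lock may nest inside any subject lock. */
    if ( level == MM_LOCK_ORDER_p2m && owner == current->domain )
        return;
    if ( unlikely(__get_lock_level() > level) )
    {
        printk("mm locking order violation: %i > %i\n",
               __get_lock_level(), level);
        BUG();
    }
}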

> But I'm
> not sure how expensive (in terms of code auditing) such
> determination is going be, which is why for the moment I'm
> trying to think of a solution (ordering criteria) for the general
> case.

Yes, I think auditing the current interdomain locking (if there are
more cases apart from the paging logdirty hypercall) will be
expensive.

Thanks, Roger.


* Re: [PATCH] x86: fix paging_log_dirty_op to work with paging guests
  2018-12-14 10:03               ` Roger Pau Monné
@ 2018-12-14 10:45                 ` Jan Beulich
  2018-12-14 11:45                   ` Roger Pau Monné
From: Jan Beulich @ 2018-12-14 10:45 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: George Dunlap, Andrew Cooper, Tim Deegan, Wei Liu, xen-devel

>>> On 14.12.18 at 11:03, <roger.pau@citrix.com> wrote:
> I expect the interdomain locking as a result of using a paging caller
> domain is going to be restricted to the p2m lock of the caller domain,
> as a result of the usage of copy to/from helpers.
> 
> Maybe the less intrusive change would be to just allow locking the
> caller p2m lock (that only lock) regardless of the subject domain lock
> level?

With caller != subject, and with the lock level then being higher
than all "normal" ones, this might be an option. But from the
very beginning we should keep the transitive aspect here in
mind: If Dom0 controls a PVH domain which controls a HVM one,
the Dom0 p2m lock may also need allowing to nest inside the PVH
DomU's one, so it'll be a total of two extra new lock levels even
if we restrict this to the p2m locks.

Furthermore, if we limited this to the p2m locks, it would become
illegal to obtain any of the higher lock level locks (i.e. ones that
nest inside the p2m lock) for the caller domain. I'm not sure
that's a good idea. As mentioned before, if caller locks collectively
all get obtained inside any subject domain ones, applying a bias
to all lock levels would perhaps be preferable. The question is
whether this criteria holds.

I also as of yet can't see how this could be expressed
transparently to existing code. Yet obviously having to change
various p2m_lock() invocations would be rather undesirable.

Jan




* Re: [PATCH] x86: fix paging_log_dirty_op to work with paging guests
  2018-12-14 10:45                 ` Jan Beulich
@ 2018-12-14 11:45                   ` Roger Pau Monné
  2018-12-14 11:52                     ` Jan Beulich
From: Roger Pau Monné @ 2018-12-14 11:45 UTC (permalink / raw)
  To: Jan Beulich; +Cc: George Dunlap, Andrew Cooper, Tim Deegan, Wei Liu, xen-devel

On Fri, Dec 14, 2018 at 03:45:21AM -0700, Jan Beulich wrote:
> >>> On 14.12.18 at 11:03, <roger.pau@citrix.com> wrote:
> > I expect the interdomain locking as a result of using a paging caller
> > domain is going to be restricted to the p2m lock of the caller domain,
> > as a result of the usage of copy to/from helpers.
> > 
> > Maybe the less intrusive change would be to just allow locking the
> > caller p2m lock (that only lock) regardless of the subject domain lock
> > level?
> 
> With caller != subject, and with the lock level then being higher
> than all "normal" ones, this might be an option. But from the
> very beginning we should keep the transitive aspect here in
> mind: If Dom0 controls a PVH domain which controls a HVM one,
> the Dom0 p2m lock may also need allowing to nest inside the PVH
> DomU's one, so it'll be a total of two extra new lock levels even
> if we restrict this to the p2m locks.

I'm not sure I follow here: so far we have spoken about a subject and
a caller domain (subject being the target of the operation). In the
above scenario I see a relation between Dom0 and the PVH domain, and
between the PVH domain and the HVM one, but not a relation that
encompasses the three domains in terms of mm locking.

Dom0 (caller) mm lock could be nested inside the PVH DomU (subject) mm
locks, and the PVH DomU (caller) locks could be nested inside the HVM
domain (subject) locks, but I don't see an operation where Dom0 mm
lock could be nested inside of a PVH DomU lock that's already nested
inside of the HVM DomU lock.

> 
> Furthermore, if we limited this to the p2m locks, it would become
> illegal to obtain any of the higher lock level locks (i.e. ones that
> nest inside the p2m lock) for the caller domain. I'm not sure
> that's a good idea. As mentioned before, if caller locks collectively
> all get obtained inside any subject domain ones, applying a bias
> to all lock levels would perhaps be preferable. The question is
> whether this criteria holds.

So far I can only think of paging callers requiring the p2m lock after
having locked an arbitrary number of subject domain mm locks in order
to perform the copy to/from of the operation data.

IMO a bias applied to caller locks when nested inside the subject
locks looks like the best solution.

> I also as of yet can't see how this could be expressed
> transparently to existing code. Yet obviously having to change
> various p2m_lock() invocations would be rather undesirable.

Most of the lock callers can be easily adjusted. Take p2m_lock for
example: the lock owner domain can be obtained from the p2m_domain
parameter (p2m_domain->domain) and checked against current->domain in
order to figure out whether the lock belongs to the subject or the
caller domain.
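
E.g. (hypothetical; assumes the mm-locks.h wrappers gain a domid_t
parameter):

#define p2m_lock(p) \
    mm_write_lock_p2m((p)->domain->domain_id, &(p)->lock, __func__)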

Thanks, Roger.


* Re: [PATCH] x86: fix paging_log_dirty_op to work with paging guests
  2018-12-14 11:45                   ` Roger Pau Monné
@ 2018-12-14 11:52                     ` Jan Beulich
  2018-12-17 15:42                       ` Roger Pau Monné
From: Jan Beulich @ 2018-12-14 11:52 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: George Dunlap, Andrew Cooper, Tim Deegan, Wei Liu, xen-devel

>>> On 14.12.18 at 12:45, <roger.pau@citrix.com> wrote:
> On Fri, Dec 14, 2018 at 03:45:21AM -0700, Jan Beulich wrote:
>> >>> On 14.12.18 at 11:03, <roger.pau@citrix.com> wrote:
>> > I expect the interdomain locking as a result of using a paging caller
>> > domain is going to be restricted to the p2m lock of the caller domain,
>> > as a result of the usage of copy to/from helpers.
>> > 
>> > Maybe the less intrusive change would be to just allow locking the
>> > caller p2m lock (that only lock) regardless of the subject domain lock
>> > level?
>> 
>> With caller != subject, and with the lock level then being higher
>> than all "normal" ones, this might be an option. But from the
>> very beginning we should keep the transitive aspect here in
>> mind: If Dom0 controls a PVH domain which controls a HVM one,
>> the Dom0 p2m lock may also need allowing to nest inside the PVH
>> DomU's one, so it'll be a total of two extra new lock levels even
>> if we restrict this to the p2m locks.
> 
> I'm not sure I follow here, so far we have spoken about a subject and
> a caller domain (subject being the target of the operation). In the
> above scenario I see a relation between Dom0 and the PVH domain, and
> between the PVH domain and the HVM one, but not a relation that
> encompasses the three domains in terms of mm locking.
> 
> Dom0 (caller) mm lock could be nested inside the PVH DomU (subject) mm
> locks, and the PVH DomU (caller) locks could be nested inside the HVM
> domain (subject) locks, but I don't see an operation where Dom0 mm
> lock could be nested inside of a PVH DomU lock that's already nested
> inside of the HVM DomU lock.

Well, if we're _sure_ this can't happen, then of course we also
don't need to deal with it.

Jan




* Re: [PATCH] x86: fix paging_log_dirty_op to work with paging guests
  2018-12-14 11:52                     ` Jan Beulich
@ 2018-12-17 15:42                       ` Roger Pau Monné
       [not found]                         ` <4B192692020000B80063616D@prv1-mh.provo.novell.com>
From: Roger Pau Monné @ 2018-12-17 15:42 UTC (permalink / raw)
  To: Jan Beulich; +Cc: George Dunlap, Andrew Cooper, Tim Deegan, Wei Liu, xen-devel

On Fri, Dec 14, 2018 at 04:52:00AM -0700, Jan Beulich wrote:
> >>> On 14.12.18 at 12:45, <roger.pau@citrix.com> wrote:
> > On Fri, Dec 14, 2018 at 03:45:21AM -0700, Jan Beulich wrote:
> >> >>> On 14.12.18 at 11:03, <roger.pau@citrix.com> wrote:
> >> > I expect the interdomain locking as a result of using a paging caller
> >> > domain is going to be restricted to the p2m lock of the caller domain,
> >> > as a result of the usage of copy to/from helpers.
> >> > 
> >> > Maybe the less intrusive change would be to just allow locking the
> >> > caller p2m lock (that only lock) regardless of the subject domain lock
> >> > level?
> >> 
> >> With caller != subject, and with the lock level then being higher
> >> than all "normal" ones, this might be an option. But from the
> >> very beginning we should keep the transitive aspect here in
> >> mind: If Dom0 controls a PVH domain which controls a HVM one,
> >> the Dom0 p2m lock may also need allowing to nest inside the PVH
> >> DomU's one, so it'll be a total of two extra new lock levels even
> >> if we restrict this to the p2m locks.
> > 
> > I'm not sure I follow here, so far we have spoken about a subject and
> > a caller domain (subject being the target of the operation). In the
> > above scenario I see a relation between Dom0 and the PVH domain, and
> > between the PVH domain and the HVM one, but not a relation that
> > encompasses the three domains in terms of mm locking.
> > 
> > Dom0 (caller) mm lock could be nested inside the PVH DomU (subject) mm
> > locks, and the PVH DomU (caller) locks could be nested inside the HVM
> > domain (subject) locks, but I don't see an operation where Dom0 mm
> > lock could be nested inside of a PVH DomU lock that's already nested
> > inside of the HVM DomU lock.
> 
> Well, if we're _sure_ this can't happen, then of course we also
> don't need to deal with it.

So I've got a patch that adds a positive offset to the lock priority
when the lock owner is the current domain. This allows taking other
domains' (subject) mm locks and then taking the current domain's mm
locks.

There's one slight issue with the page sharing lock, which will
always be locked as belonging to a subject domain, because there's no
clear owner of that lock. AFAICT some page-sharing routines are even
run from the idle thread.

I'm attaching the patch below. Note it keeps the same lock order as
before, but allows taking the caller's locks _only_ after having taken
some of the subject's mm locks. Note this doesn't allow interleaved mm
locking between the subject and the caller.

The nice part about this patch is that changes are mostly confined to
mm-locks.h; there's only an extra change to xen/arch/x86/mm/p2m-pod.c.

I am however not sure how to prove there are no valid paths that
could require a different lock ordering; am I expected to check all
possible code paths for mm lock ordering?

I think this solution introduces the minimum amount of modifications
needed to fix the current issue, i.e. a paging caller being able to
use the copy to/from helpers after taking subject mm locks.

Thanks, Roger.

---8<---
diff --git a/xen/arch/x86/mm/mm-locks.h b/xen/arch/x86/mm/mm-locks.h
index 95295b62d2..a47beaf893 100644
--- a/xen/arch/x86/mm/mm-locks.h
+++ b/xen/arch/x86/mm/mm-locks.h
@@ -46,29 +46,36 @@ static inline int mm_locked_by_me(mm_lock_t *l)
     return (l->lock.recurse_cpu == current->processor);
 }
 
+#define MM_LOCK_ORDER_MAX                    64
+
 /*
  * If you see this crash, the numbers printed are order levels defined
  * in this file.
  */
-#define __check_lock_level(l)                           \
-do {                                                    \
-    if ( unlikely(__get_lock_level() > (l)) )           \
-    {                                                   \
-        printk("mm locking order violation: %i > %i\n", \
-               __get_lock_level(), (l));                \
-        BUG();                                          \
-    }                                                   \
+#define __check_lock_level(d, l)                                \
+do {                                                            \
+    if ( unlikely(__get_lock_level() >                          \
+         (l) + ((d) == current->domain->domain_id ?             \
+                MM_LOCK_ORDER_MAX : 0)) )                       \
+    {                                                           \
+        printk("mm locking order violation: %i > %i\n",         \
+               __get_lock_level(),                              \
+               (l) + ((d) == current->domain->domain_id ?       \
+                      MM_LOCK_ORDER_MAX : 0));                  \
+        BUG();                                                  \
+    }                                                           \
 } while(0)
 
-#define __set_lock_level(l)         \
-do {                                \
-    __get_lock_level() = (l);       \
+#define __set_lock_level(l)                                     \
+do {                                                            \
+    __get_lock_level() = (l);                                   \
 } while(0)
 
-static inline void _mm_lock(mm_lock_t *l, const char *func, int level, int rec)
+static inline void _mm_lock(domid_t d, mm_lock_t *l, const char *func,
+                            int level, int rec)
 {
-    if ( !((mm_locked_by_me(l)) && rec) ) 
-        __check_lock_level(level);
+    if ( !((mm_locked_by_me(l)) && rec) )
+        __check_lock_level(d, level);
     spin_lock_recursive(&l->lock);
     if ( l->lock.recurse_cnt == 1 )
     {
@@ -77,16 +84,18 @@ static inline void _mm_lock(mm_lock_t *l, const char *func, int level, int rec)
     }
     else if ( (unlikely(!rec)) )
         panic("mm lock already held by %s\n", l->locker_function);
-    __set_lock_level(level);
+    __set_lock_level(level + (d == current->domain->domain_id ?
+                              MM_LOCK_ORDER_MAX : 0));
 }
 
-static inline void _mm_enforce_order_lock_pre(int level)
+static inline void _mm_enforce_order_lock_pre(domid_t d, int level)
 {
-    __check_lock_level(level);
+    __check_lock_level(d, level);
 }
 
-static inline void _mm_enforce_order_lock_post(int level, int *unlock_level,
-                                                unsigned short *recurse_count)
+static inline void _mm_enforce_order_lock_post(domid_t d, int level,
+                                               int *unlock_level,
+                                               unsigned short *recurse_count)
 {
     if ( recurse_count )
     {
@@ -97,7 +106,8 @@ static inline void _mm_enforce_order_lock_post(int level, int *unlock_level,
     } else {
         *unlock_level = __get_lock_level();
     }
-    __set_lock_level(level);
+    __set_lock_level(level + (d == current->domain->domain_id ?
+                              MM_LOCK_ORDER_MAX : 0));
 }
 
 
@@ -114,16 +124,18 @@ static inline int mm_write_locked_by_me(mm_rwlock_t *l)
     return (l->locker == get_processor_id());
 }
 
-static inline void _mm_write_lock(mm_rwlock_t *l, const char *func, int level)
+static inline void _mm_write_lock(domid_t d, mm_rwlock_t *l, const char *func,
+                                  int level)
 {
     if ( !mm_write_locked_by_me(l) )
     {
-        __check_lock_level(level);
+        __check_lock_level(d, level);
         percpu_write_lock(p2m_percpu_rwlock, &l->lock);
         l->locker = get_processor_id();
         l->locker_function = func;
         l->unlock_level = __get_lock_level();
-        __set_lock_level(level);
+        __set_lock_level(level + (d == current->domain->domain_id ?
+                                  MM_LOCK_ORDER_MAX : 0));
     }
     l->recurse_count++;
 }
@@ -138,9 +150,9 @@ static inline void mm_write_unlock(mm_rwlock_t *l)
     percpu_write_unlock(p2m_percpu_rwlock, &l->lock);
 }
 
-static inline void _mm_read_lock(mm_rwlock_t *l, int level)
+static inline void _mm_read_lock(domid_t d, mm_rwlock_t *l, int level)
 {
-    __check_lock_level(level);
+    __check_lock_level(d, level);
     percpu_read_lock(p2m_percpu_rwlock, &l->lock);
     /* There's nowhere to store the per-CPU unlock level so we can't
      * set the lock level. */
@@ -153,28 +165,28 @@ static inline void mm_read_unlock(mm_rwlock_t *l)
 
 /* This wrapper uses the line number to express the locking order below */
 #define declare_mm_lock(name)                                                 \
-    static inline void mm_lock_##name(mm_lock_t *l, const char *func, int rec)\
-    { _mm_lock(l, func, MM_LOCK_ORDER_##name, rec); }
+    static inline void mm_lock_##name(domid_t d, mm_lock_t *l, const char *func, int rec)\
+    { _mm_lock(d, l, func, MM_LOCK_ORDER_##name, rec); }
 #define declare_mm_rwlock(name)                                               \
-    static inline void mm_write_lock_##name(mm_rwlock_t *l, const char *func) \
-    { _mm_write_lock(l, func, MM_LOCK_ORDER_##name); }                                    \
-    static inline void mm_read_lock_##name(mm_rwlock_t *l)                    \
-    { _mm_read_lock(l, MM_LOCK_ORDER_##name); }
+    static inline void mm_write_lock_##name(domid_t d, mm_rwlock_t *l, const char *func) \
+    { _mm_write_lock(d, l, func, MM_LOCK_ORDER_##name); }                                    \
+    static inline void mm_read_lock_##name(domid_t d, mm_rwlock_t *l)                    \
+    { _mm_read_lock(d, l, MM_LOCK_ORDER_##name); }
 /* These capture the name of the calling function */
-#define mm_lock(name, l) mm_lock_##name(l, __func__, 0)
-#define mm_lock_recursive(name, l) mm_lock_##name(l, __func__, 1)
-#define mm_write_lock(name, l) mm_write_lock_##name(l, __func__)
-#define mm_read_lock(name, l) mm_read_lock_##name(l)
+#define mm_lock(name, d, l) mm_lock_##name(d, l, __func__, 0)
+#define mm_lock_recursive(name, d, l) mm_lock_##name(d, l, __func__, 1)
+#define mm_write_lock(name, d, l) mm_write_lock_##name(d, l, __func__)
+#define mm_read_lock(name, d, l) mm_read_lock_##name(d, l)
 
 /* This wrapper is intended for "external" locks which do not use
  * the mm_lock_t types. Such locks inside the mm code are also subject
  * to ordering constraints. */
 #define declare_mm_order_constraint(name)                                   \
-    static inline void mm_enforce_order_lock_pre_##name(void)               \
-    { _mm_enforce_order_lock_pre(MM_LOCK_ORDER_##name); }                               \
+    static inline void mm_enforce_order_lock_pre_##name(domid_t d)               \
+    { _mm_enforce_order_lock_pre(d, MM_LOCK_ORDER_##name); }                               \
     static inline void mm_enforce_order_lock_post_##name(                   \
-                        int *unlock_level, unsigned short *recurse_count)   \
-    { _mm_enforce_order_lock_post(MM_LOCK_ORDER_##name, unlock_level, recurse_count); } \
+                        domid_t d, int *unlock_level, unsigned short *recurse_count)   \
+    { _mm_enforce_order_lock_post(d, MM_LOCK_ORDER_##name, unlock_level, recurse_count); } \
 
 static inline void mm_unlock(mm_lock_t *l)
 {
@@ -186,8 +198,8 @@ static inline void mm_unlock(mm_lock_t *l)
     spin_unlock_recursive(&l->lock);
 }
 
-static inline void mm_enforce_order_unlock(int unlock_level, 
-                                            unsigned short *recurse_count)
+static inline void mm_enforce_order_unlock(int unlock_level,
+                                           unsigned short *recurse_count)
 {
     if ( recurse_count )
     {
@@ -218,7 +230,7 @@ static inline void mm_enforce_order_unlock(int unlock_level,
 
 #define MM_LOCK_ORDER_nestedp2m               8
 declare_mm_lock(nestedp2m)
-#define nestedp2m_lock(d)   mm_lock(nestedp2m, &(d)->arch.nested_p2m_lock)
+#define nestedp2m_lock(d)   mm_lock(nestedp2m, (d)->domain_id, &(d)->arch.nested_p2m_lock)
 #define nestedp2m_unlock(d) mm_unlock(&(d)->arch.nested_p2m_lock)
 
 /* P2M lock (per-non-alt-p2m-table)
@@ -257,9 +269,9 @@ declare_mm_rwlock(p2m);
 
 #define MM_LOCK_ORDER_per_page_sharing       24
 declare_mm_order_constraint(per_page_sharing)
-#define page_sharing_mm_pre_lock()   mm_enforce_order_lock_pre_per_page_sharing()
+#define page_sharing_mm_pre_lock()   mm_enforce_order_lock_pre_per_page_sharing(DOMID_INVALID)
 #define page_sharing_mm_post_lock(l, r) \
-        mm_enforce_order_lock_post_per_page_sharing((l), (r))
+        mm_enforce_order_lock_post_per_page_sharing(DOMID_INVALID, (l), (r))
 #define page_sharing_mm_unlock(l, r) mm_enforce_order_unlock((l), (r))
 
 /* Alternate P2M list lock (per-domain)
@@ -272,7 +284,8 @@ declare_mm_order_constraint(per_page_sharing)
 
 #define MM_LOCK_ORDER_altp2mlist             32
 declare_mm_lock(altp2mlist)
-#define altp2m_list_lock(d)   mm_lock(altp2mlist, &(d)->arch.altp2m_list_lock)
+#define altp2m_list_lock(d)   mm_lock(altp2mlist, (d)->domain_id, \
+                                      &(d)->arch.altp2m_list_lock)
 #define altp2m_list_unlock(d) mm_unlock(&(d)->arch.altp2m_list_lock)
 
 /* P2M lock (per-altp2m-table)
@@ -290,9 +303,9 @@ declare_mm_rwlock(altp2m);
 #define p2m_lock(p)                             \
     do {                                        \
         if ( p2m_is_altp2m(p) )                 \
-            mm_write_lock(altp2m, &(p)->lock);  \
+            mm_write_lock(altp2m, (p)->domain->domain_id, &(p)->lock);  \
         else                                    \
-            mm_write_lock(p2m, &(p)->lock);     \
+            mm_write_lock(p2m, (p)->domain->domain_id, &(p)->lock);     \
         (p)->defer_flush++;                     \
     } while (0)
 #define p2m_unlock(p)                           \
@@ -304,7 +317,8 @@ declare_mm_rwlock(altp2m);
     } while (0)
 #define gfn_lock(p,g,o)       p2m_lock(p)
 #define gfn_unlock(p,g,o)     p2m_unlock(p)
-#define p2m_read_lock(p)      mm_read_lock(p2m, &(p)->lock)
+#define p2m_read_lock(p)      mm_read_lock(p2m, (p)->domain->domain_id,\
+                                           &(p)->lock)
 #define p2m_read_unlock(p)    mm_read_unlock(&(p)->lock)
 #define p2m_locked_by_me(p)   mm_write_locked_by_me(&(p)->lock)
 #define gfn_locked_by_me(p,g) p2m_locked_by_me(p)
@@ -316,7 +330,8 @@ declare_mm_rwlock(altp2m);
 
 #define MM_LOCK_ORDER_pod                    48
 declare_mm_lock(pod)
-#define pod_lock(p)           mm_lock(pod, &(p)->pod.lock)
+#define pod_lock(p)           mm_lock(pod, (p)->domain->domain_id, \
+                                      &(p)->pod.lock)
 #define pod_unlock(p)         mm_unlock(&(p)->pod.lock)
 #define pod_locked_by_me(p)   mm_locked_by_me(&(p)->pod.lock)
 
@@ -329,9 +344,9 @@ declare_mm_lock(pod)
 
 #define MM_LOCK_ORDER_page_alloc             56
 declare_mm_order_constraint(page_alloc)
-#define page_alloc_mm_pre_lock()   mm_enforce_order_lock_pre_page_alloc()
-#define page_alloc_mm_post_lock(l) mm_enforce_order_lock_post_page_alloc(&(l), NULL)
-#define page_alloc_mm_unlock(l)    mm_enforce_order_unlock((l), NULL)
+#define page_alloc_mm_pre_lock(d)   mm_enforce_order_lock_pre_page_alloc((d)->domain_id)
+#define page_alloc_mm_post_lock(d, l) mm_enforce_order_lock_post_page_alloc((d)->domain_id, &(l), NULL)
+#define page_alloc_mm_unlock(d, l)    mm_enforce_order_unlock((l), NULL)
 
 /* Paging lock (per-domain)
  *
@@ -350,9 +365,10 @@ declare_mm_order_constraint(page_alloc)
 
 #define MM_LOCK_ORDER_paging                 64
 declare_mm_lock(paging)
-#define paging_lock(d)         mm_lock(paging, &(d)->arch.paging.lock)
-#define paging_lock_recursive(d) \
-                    mm_lock_recursive(paging, &(d)->arch.paging.lock)
+#define paging_lock(d)         mm_lock(paging, (d)->domain_id,\
+                                       &(d)->arch.paging.lock)
+#define paging_lock_recursive(d) mm_lock_recursive(paging, (d)->domain_id, \
+                                                   &(d)->arch.paging.lock)
 #define paging_unlock(d)       mm_unlock(&(d)->arch.paging.lock)
 #define paging_locked_by_me(d) mm_locked_by_me(&(d)->arch.paging.lock)
 
diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
index 4c56cb58c6..7cac396187 100644
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -34,14 +34,16 @@
 /* Enforce lock ordering when grabbing the "external" page_alloc lock */
 static inline void lock_page_alloc(struct p2m_domain *p2m)
 {
-    page_alloc_mm_pre_lock();
+    page_alloc_mm_pre_lock(p2m->domain);
     spin_lock(&(p2m->domain->page_alloc_lock));
-    page_alloc_mm_post_lock(p2m->domain->arch.page_alloc_unlock_level);
+    page_alloc_mm_post_lock(p2m->domain,
+                            p2m->domain->arch.page_alloc_unlock_level);
 }
 
 static inline void unlock_page_alloc(struct p2m_domain *p2m)
 {
-    page_alloc_mm_unlock(p2m->domain->arch.page_alloc_unlock_level);
+    page_alloc_mm_unlock(p2m->domain,
+                         p2m->domain->arch.page_alloc_unlock_level);
     spin_unlock(&(p2m->domain->page_alloc_lock));
 }
 


* Re: [PATCH] x86: fix paging_log_dirty_op to work with paging guests
       [not found]                         ` <4B192692020000B80063616D@prv1-mh.provo.novell.com>
@ 2018-12-17 16:47                           ` Jan Beulich
  2018-12-17 16:58                             ` Roger Pau Monné
  0 siblings, 1 reply; 15+ messages in thread
From: Jan Beulich @ 2018-12-17 16:47 UTC (permalink / raw)
  To: Roger Pau Monne, George Dunlap
  Cc: Andrew Cooper, Tim Deegan, Wei Liu, xen-devel

>>> On 17.12.18 at 16:42, <roger.pau@citrix.com> wrote:
> On Fri, Dec 14, 2018 at 04:52:00AM -0700, Jan Beulich wrote:
>> >>> On 14.12.18 at 12:45, <roger.pau@citrix.com> wrote:
>> > On Fri, Dec 14, 2018 at 03:45:21AM -0700, Jan Beulich wrote:
>> >> >>> On 14.12.18 at 11:03, <roger.pau@citrix.com> wrote:
>> >> > I expect the interdomain locking as a result of using a paging caller
>> >> > domain is going to be restricted to the p2m lock of the caller domain,
>> >> > as a result of the usage of copy to/from helpers.
>> >> > 
>> >> > Maybe the less intrusive change would be to just allow locking the
>> >> > caller p2m lock (that only lock) regardless of the subject domain lock
>> >> > level?
>> >> 
>> >> With caller != subject, and with the lock level then being higher
>> >> than all "normal" ones, this might be an option. But from the
>> >> very beginning we should keep the transitive aspect here in
>> >> mind: If Dom0 controls a PVH domain which controls a HVM one,
>> >> the Dom0 p2m lock may also need allowing to nest inside the PVH
>> >> DomU's one, so it'll be a total of two extra new lock levels even
>> >> if we restrict this to the p2m locks.
>> > 
>> > I'm not sure I follow here, so far we have spoken about a subject and
>> > a caller domain (subject being the target of the operation). In the
>> > above scenario I see a relation between Dom0 and the PVH domain, and
>> > between the PVH domain and the HVM one, but not a relation that
>> > encompasses the three domains in terms of mm locking.
>> > 
>> > Dom0 (caller) mm lock could be nested inside the PVH DomU (subject) mm
>> > locks, and the PVH DomU (caller) locks could be nested inside the HVM
>> > domain (subject) locks, but I don't see an operation where Dom0 mm
>> > lock could be nested inside of a PVH DomU lock that's already nested
>> > inside of the HVM DomU lock.
>> 
>> Well, if we're _sure_ this can't happen, then of course we also
>> don't need to deal with it.
> 
> So I've got a patch that adds a positive offset to lock priority when
> the lock owner is the current domain. This allows locking other
> domains' (subject) mm locks and then taking the current domain's mm locks.
> 
> There's one slight issue with the page sharing lock, which will always
> be locked as belonging to a subject domain, due to the fact that
> there's no clear owner of such lock. AFAICT some page-sharing routines
> are even run from the idle thread.
> 
> I'm attaching such a patch below; note that it keeps the same lock
> order as before, but allows taking the caller locks _only_ after
> having taken some of the subject mm locks. Note this doesn't allow
> interleaved mm locking between the subject and the caller.
> 
> The nice part about this patch is that changes are mostly confined to
> mm-locks.h; there's only an extra change to xen/arch/x86/mm/p2m-pod.c.

Yes, this looks quite reasonable at first glance. One thing I
don't understand is why you compare domain IDs instead of
struct domain pointers. The other thing is that perhaps it would
be a good idea to factor the biasing construct out into a macro
or inline function.
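
Along the lines of the below, I would assume (just a sketch; the
helper name is made up, and this variant already uses the pointer
comparison suggested above):

    static inline int _mm_lock_level_bias(const struct domain *d, int level)
    {
        /* Bias the order level when the lock belongs to the caller's
         * own domain, so caller locks sort above all subject locks
         * while keeping the relative order within each domain. */
        return level + (d == current->domain ? MM_LOCK_ORDER_MAX : 0);
    }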

> I am, however, not sure how to prove there are no valid paths that
> could require a different lock ordering; am I expected to check all
> possible code paths for mm lock ordering?

Well, this is a really good question. To do such an audit properly
would require quite a bit of time, so I think it would be unfair to
demand this of you. Yet without such a proof we risk running into
unforeseen issues with this sooner or later. I have no good idea.
George?

Jan




* Re: [PATCH] x86: fix paging_log_dirty_op to work with paging guests
  2018-12-17 16:47                           ` Jan Beulich
@ 2018-12-17 16:58                             ` Roger Pau Monné
  0 siblings, 0 replies; 15+ messages in thread
From: Roger Pau Monné @ 2018-12-17 16:58 UTC (permalink / raw)
  To: Jan Beulich; +Cc: George Dunlap, Andrew Cooper, Tim Deegan, Wei Liu, xen-devel

On Mon, Dec 17, 2018 at 09:47:30AM -0700, Jan Beulich wrote:
> >>> On 17.12.18 at 16:42, <roger.pau@citrix.com> wrote:
> > On Fri, Dec 14, 2018 at 04:52:00AM -0700, Jan Beulich wrote:
> >> >>> On 14.12.18 at 12:45, <roger.pau@citrix.com> wrote:
> >> > On Fri, Dec 14, 2018 at 03:45:21AM -0700, Jan Beulich wrote:
> >> >> >>> On 14.12.18 at 11:03, <roger.pau@citrix.com> wrote:
> >> >> > I expect the interdomain locking as a result of using a paging caller
> >> >> > domain is going to be restricted to the p2m lock of the caller domain,
> >> >> > as a result of the usage of copy to/from helpers.
> >> >> > 
> >> >> > Maybe the less intrusive change would be to just allow locking the
> >> >> > caller p2m lock (that only lock) regardless of the subject domain lock
> >> >> > level?
> >> >> 
> >> >> With caller != subject, and with the lock level then being higher
> >> >> than all "normal" ones, this might be an option. But from the
> >> >> very beginning we should keep the transitive aspect here in
> >> >> mind: If Dom0 controls a PVH domain which controls a HVM one,
> >> >> the Dom0 p2m lock may also need allowing to nest inside the PVH
> >> >> DomU's one, so it'll be a total of two extra new lock levels even
> >> >> if we restrict this to the p2m locks.
> >> > 
> >> > I'm not sure I follow here, so far we have spoken about a subject and
> >> > a caller domain (subject being the target of the operation). In the
> >> > above scenario I see a relation between Dom0 and the PVH domain, and
> >> > between the PVH domain and the HVM one, but not a relation that
> >> > encompasses the three domains in terms of mm locking.
> >> > 
> >> > Dom0 (caller) mm lock could be nested inside the PVH DomU (subject) mm
> >> > locks, and the PVH DomU (caller) locks could be nested inside the HVM
> >> > domain (subject) locks, but I don't see an operation where Dom0 mm
> >> > lock could be nested inside of a PVH DomU lock that's already nested
> >> > inside of the HVM DomU lock.
> >> 
> >> Well, if we're _sure_ this can't happen, then of course we also
> >> don't need to deal with it.
> > 
> > So I've got a patch that adds a positive offset to lock priority when
> > the lock owner is the current domain. This allows locking other
> > domains' (subject) mm locks and then taking the current domain's mm locks.
> > 
> > There's one slight issue with the page sharing lock, which will always
> > be locked as belonging to a subject domain, due to the fact that
> > there's no clear owner of such lock. AFAICT some page-sharing routines
> > are even run from the idle thread.
> > 
> > I'm attaching such a patch below; note that it keeps the same lock
> > order as before, but allows taking the caller locks _only_ after
> > having taken some of the subject mm locks. Note this doesn't allow
> > interleaved mm locking between the subject and the caller.
> > 
> > The nice part about this patch is that changes are mostly confined to
> > mm-locks.h; there's only an extra change to xen/arch/x86/mm/p2m-pod.c.
> 
> Yes, this looks quite reasonable at first glance. One thing I
> don't understand is why you compare domain IDs instead of
> struct domain pointers.

This patch started as something else, where I planned to store the
domid, and I just reused it. Comparing domain pointers is indeed
better.

> The other thing is that perhaps it would
> be a good idea to factor the biasing construct out into a macro
> or inline function.

Sure.

While there I might also add a pre-patch to convert __check_lock_level
and __set_lock_level into functions instead of macros.
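
Something like the below, I'd expect (only a sketch, still using the
domid comparison from the draft above; the final pre-patch may well
differ):

    static inline void _check_lock_level(domid_t d, int l)
    {
        /* Apply the caller-domain bias, then enforce the ordering. */
        int level = l + (d == current->domain->domain_id ?
                         MM_LOCK_ORDER_MAX : 0);

        if ( unlikely(__get_lock_level() > level) )
        {
            printk("mm locking order violation: %i > %i\n",
                   __get_lock_level(), level);
            BUG();
        }
    }

    static inline void _set_lock_level(int l)
    {
        __get_lock_level() = l;
    }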

Thanks, Roger.

