From: "Jan Beulich" <JBeulich@suse.com>
To: xen-devel <xen-devel@lists.xenproject.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
	Keir Fraser <keir@xen.org>,
	Paul Durrant <paul.durrant@citrix.com>,
	Chang Jianzhong <changjzh@gmail.com>
Subject: [PATCH 1/3] x86/HVM: fix forwarding of internally cached requests
Date: Thu, 24 Mar 2016 05:28:56 -0600	[thread overview]
Message-ID: <56F3DD8802000078000E007F@prv-mh.provo.novell.com> (raw)
In-Reply-To: <56F3DBC502000078000E0060@prv-mh.provo.novell.com>

Forwarding an entire batch to the device model when an individual
iteration of it gets rejected by an internal device emulation handler
with X86EMUL_UNHANDLEABLE is wrong: the device model would then handle
all iterations, without the internal handler getting to see any past
the one it returned failure for. This causes misbehavior in at least
the MSI-X and VGA code, which need to see all such requests for
internal tracking/caching purposes. Note that this does not apply to
buffered I/O requests.
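
As a toy illustration of the failure mode (all names hypothetical, not
the actual Xen code): an internal handler rejects iteration 2 of an
8-iteration batch, yet the request is forwarded to the device model
with its original count, so the handler never sees iterations 3..7:

    #include <stdio.h>

    #define OKAY          0
    #define UNHANDLEABLE  1

    /* Toy internal handler: rejects iteration 2 of the batch. */
    static int internal_handler(unsigned long iter)
    {
        return iter == 2 ? UNHANDLEABLE : OKAY;
    }

    int main(void)
    {
        unsigned long count = 8, i;

        for ( i = 0; i < count; i++ )
            if ( internal_handler(i) == UNHANDLEABLE )
                break;

        /* Pre-patch behavior: forward with the full original count,
         * so the device model (re)handles all 8 iterations, including
         * those the internal handler still needed to observe. */
        printf("forwarded count=%lu, handler completed %lu\n", count, i);
        return 0;
    }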

This in turn means that the condition in hvm_process_io_intercept()
governing when to crash the domain was wrong: since
X86EMUL_UNHANDLEABLE can validly be returned by the individual device
handlers, we mustn't blindly crash the domain when it occurs on an
iteration other than the initial one. Instead we need to distinguish
hvm_copy_*_guest_phys() failures from device-specific ones; the former
must always be fatal to the domain (i.e. also on the first iteration),
since otherwise we would again end up forwarding to qemu a request
which the internal handler didn't get to see.
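
A minimal sketch of that distinction (a standalone toy model; only the
X86EMUL_* naming is taken from Xen): copy failures are mapped to a
value no X86EMUL_* code uses, so the exit path crashes the domain for
exactly those:

    #include <stdio.h>

    #define X86EMUL_OKAY          0
    #define X86EMUL_UNHANDLEABLE  2

    /* Map copy failures to a sentinel != any X86EMUL_* value, so a
     * handler-reported UNHANDLEABLE cannot be mistaken for one. */
    static int classify_copy_result(int hvmcopy_rc)
    {
        return hvmcopy_rc == 0 ? X86EMUL_OKAY : -1;
    }

    int main(void)
    {
        int rc = classify_copy_result(1);   /* simulate a copy failure */

        if ( rc < 0 )
            printf("fatal on any iteration: domain_crash()\n");
        else if ( rc == X86EMUL_UNHANDLEABLE )
            printf("forwardable to the device model\n");
        return 0;
    }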

Additionally, commit 4faffc41d ("x86/hvm: limit reps to avoid the need
to handle retry") went too far in removing code from
hvm_process_io_intercept(): when some iterations were handled
successfully, the function should continue to return success with a
clipped repeat count.
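
Putting the pieces together, a self-contained toy model (hypothetical
names and address range, not the Xen code) of the restored behavior:
the successfully handled prefix is reported as success with a reduced
repeat count, and an unhandled first iteration is forwarded alone:

    #include <stdio.h>

    #define X86EMUL_OKAY          0
    #define X86EMUL_UNHANDLEABLE  2

    /* Toy internal handler: only claims addresses below 0x100. */
    static int internal_write(unsigned long addr)
    {
        return addr < 0x100 ? X86EMUL_OKAY : X86EMUL_UNHANDLEABLE;
    }

    /* Mirrors the patched exit logic of hvm_process_io_intercept():
     * clip to the handled prefix, or forward a single iteration. */
    static int process_batch(unsigned long addr, unsigned int size,
                             unsigned long *count)
    {
        unsigned long i;
        int rc = X86EMUL_OKAY;

        for ( i = 0; i < *count; i++ )
        {
            rc = internal_write(addr + i * size);
            if ( rc != X86EMUL_OKAY )
                break;
        }

        if ( i )                  /* some iterations succeeded */
        {
            *count = i;
            rc = X86EMUL_OKAY;
        }
        else if ( rc == X86EMUL_UNHANDLEABLE )
            *count = 1;           /* forward only one iteration */

        return rc;
    }

    int main(void)
    {
        unsigned long count = 8;
        int rc = process_batch(0xf8, 4, &count);

        /* Expect rc == X86EMUL_OKAY, count clipped to 2. */
        printf("rc=%d count=%lu\n", rc, count);
        return 0;
    }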

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Chang Jianzhong <changjzh@gmail.com>
---
I assume this also addresses the issue which
http://lists.xenproject.org/archives/html/xen-devel/2016-01/msg03189.html
attempted to deal with in a way that was not really acceptable.

--- a/xen/arch/x86/hvm/emulate.c
+++ b/xen/arch/x86/hvm/emulate.c
@@ -95,7 +95,7 @@ static const struct hvm_io_handler null_
 };
 
 static int hvmemul_do_io(
-    bool_t is_mmio, paddr_t addr, unsigned long reps, unsigned int size,
+    bool_t is_mmio, paddr_t addr, unsigned long *reps, unsigned int size,
     uint8_t dir, bool_t df, bool_t data_is_addr, uintptr_t data)
 {
     struct vcpu *curr = current;
@@ -104,7 +104,7 @@ static int hvmemul_do_io(
         .type = is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO,
         .addr = addr,
         .size = size,
-        .count = reps,
+        .count = *reps,
         .dir = dir,
         .df = df,
         .data = data,
@@ -136,7 +136,7 @@ static int hvmemul_do_io(
         if ( (p.type != is_mmio ? IOREQ_TYPE_COPY : IOREQ_TYPE_PIO) ||
              (p.addr != addr) ||
              (p.size != size) ||
-             (p.count != reps) ||
+             (p.count != *reps) ||
              (p.dir != dir) ||
              (p.df != df) ||
              (p.data_is_ptr != data_is_addr) )
@@ -214,7 +214,7 @@ static int hvmemul_do_io_buffer(
 
     BUG_ON(buffer == NULL);
 
-    rc = hvmemul_do_io(is_mmio, addr, *reps, size, dir, df, 0,
+    rc = hvmemul_do_io(is_mmio, addr, reps, size, dir, df, 0,
                        (uintptr_t)buffer);
     if ( rc == X86EMUL_UNHANDLEABLE && dir == IOREQ_READ )
         memset(buffer, 0xff, size);
@@ -305,13 +305,13 @@ static int hvmemul_do_io_addr(
         count = 1;
     }
 
-    rc = hvmemul_do_io(is_mmio, addr, count, size, dir, df, 1,
+    rc = hvmemul_do_io(is_mmio, addr, &count, size, dir, df, 1,
                        ram_gpa);
+
     if ( rc == X86EMUL_OKAY )
-    {
         v->arch.hvm_vcpu.hvm_io.mmio_retry = (count < *reps);
-        *reps = count;
-    }
+
+    *reps = count;
 
  out:
     while ( nr_pages )
--- a/xen/arch/x86/hvm/intercept.c
+++ b/xen/arch/x86/hvm/intercept.c
@@ -148,7 +148,7 @@ int hvm_process_io_intercept(const struc
                     ASSERT_UNREACHABLE();
                     /* fall through */
                 default:
-                    rc = X86EMUL_UNHANDLEABLE;
+                    rc = -1; /* != any X86EMUL_* value. */
                     break;
                 }
                 if ( rc != X86EMUL_OKAY )
@@ -178,7 +178,7 @@ int hvm_process_io_intercept(const struc
                     ASSERT_UNREACHABLE();
                     /* fall through */
                 default:
-                    rc = X86EMUL_UNHANDLEABLE;
+                    rc = -1; /* != any X86EMUL_* value. */
                     break;
                 }
                 if ( rc != X86EMUL_OKAY )
@@ -196,8 +196,22 @@ int hvm_process_io_intercept(const struc
         }
     }
 
-    if ( i != 0 && rc == X86EMUL_UNHANDLEABLE )
+    if ( unlikely(rc < 0) )
         domain_crash(current->domain);
+    else if ( i )
+    {
+        p->count = i;
+        rc = X86EMUL_OKAY;
+    }
+    else if ( rc == X86EMUL_UNHANDLEABLE )
+    {
+        /*
+         * Don't forward entire batches to the device model: this would
+         * prevent the internal handlers from seeing subsequent
+         * iterations of the request.
+         */
+        p->count = 1;
+    }
 
     return rc;
 }


