All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] xSplice v1 implementation.
@ 2016-01-14 21:46 Konrad Rzeszutek Wilk
  2016-01-14 21:46 ` [PATCH v2 01/13] xsplice: Design document (v5) Konrad Rzeszutek Wilk
                   ` (13 more replies)
  0 siblings, 14 replies; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-14 21:46 UTC (permalink / raw)
  To: xen-devel, ross.lagerwall, mpohlack, andrew.cooper3,
	stefano.stabellini, jbeulich, ian.jackson, ian.campbell,
	wei.liu2, sasha.levin

Changelog (since the RFC and the Seattle Xen presentation)
 - Finished off some of the work around the build-id.
 - Settled on the preemption mechanism.
 - Cleaned the patches a lot up, broke them up to easy
   review for maintainers.
v1 (http://lists.xenproject.org/archives/html/xen-devel/2015-09/msg02116.html)
  - Put all the design comments in the code
Prototype: (http://lists.xenproject.org/archives/html/xen-devel/2015-10/msg02595.html)
[Posting by Ross]
 - Took all reviews into account.
 - Redid the patches per review comments, added some extra code, etc.


*What is xSplice?*

A mechanism to binarily patch the running hypervisor with new
opcodes that have come about due to primarily security updates.

*What will this patchset do once I've it*

Patch the hypervisor.

*Why are you emailing me?*

Please please review the patches. The first three are the foundation of the
design and everything else depends on them.

*OK, what do you have?*

They are located at a git tree:
  git://xenbits.xen.org/people/konradwilk/xen.git xsplice.v2

There are a lot more patches after this - that implement more code
(see design document and the v2 milestone) but I do not want to
overwhelm the reviewers with 40+ patches so taking it easy and doing
it in waves.

(Copying from Ross's email):

Much of the work is implementing a basic version of the Linux kernel module
loader. The code:
* Loading of xSplice ELF payloads.
* Copying allocated sections into a new executable region of memory.
* Resolving symbols.
* Applying relocations.
* Patching of altinstructions.
* Special handling of bug frames and exception tables.
* Unloading of xSplice ELF payloads.
* Compiling a sample xSplice ELF payload (*NEW*)

The other main bit of this work is applying and reverting the patches safely.
As implemented, the code is patched with each CPU waiting in the
return-to-guest path (i.e. with no stack) or on the cpu-idle path
which appears to be the safest way of patching. While it is safe we should
still (in the next wave of patches) to verify to not patch cetain critical
sections (say the code doing the patching)

All of the following should work:
* Applying patches safely.
* Reverting patches safely.
* Replacing patches safely (e.g. reverting any applied patches and applying
   a new patch).
* Bug frames as part of modules. This means adding or
  changing WARN, ASSERT, BUG, and run_in_exception_handler works correctly.
  Line number only changes _are ignored_.
* Exception tables as part of modules. E.g. wrmsr_safe and copy_to_user work
  correctly when used in a patch module.


*Limitations*

The above is enough to fully implement an update system where multiple source
patches are combined (using combinediff) and built into a single binary
which then atomically replaces any existing loaded patches
(this is why Ross added a REPLACE operation). This is the approach used
by kPatch and kGraft.

Multiple completely independent patches can also be loaded but unexpected
interactions may occur.

As it stands, the patches are statically linked which means that independent
patches cannot be linked against one another (e.g. if one introduces a
new symbol). Using the combinediff approach above fixes this.

Backtraces containing functions from a patch module do not show the symbol name.

There is no checking that a patch which is loaded is built for the
correct hypervisor (need to use build-id). That would be done in another
"wave" of patches.

Binary patching works at the function level.

*Testing*

You can use the example code included in this patchset:

# xl info | grep extra
xen_extra              : -unstable
# xen-xsplice load /usr/lib/xen/bin/xen_hello_world.xsplice
/usr/lib/xen/bin/xen_hello_world.xsplice xen_hello_world /usr/lib/xen/bin/xen_hello_world.xsplice
Uploading /usr/lib/xen/bin/xen_hello_world.xsplice (2071 bytes)
Performing check: completed
Performing apply:. completed
# xl info | grep extra
xen_extra              : Hello World
# xen-xsplice revert xen_hello_world
Performing revert:. completed
# xen-xsplice unload xen_hello_world
Performing unload: completed
# xl info | grep extra
xen_extra              : -unstable


Or you can use git://github.com/rosslagerwall/xsplice-build.git tool
(it will need an extra patch, will send that shortly) - which
generates the ELF payloads.

This link has a nice description of how to use the tool:
http://lists.xenproject.org/archives/html/xen-devel/2015-10/msg02595.html


Thank you for reading down to here :-)

 .gitignore                                   |    1 +
 docs/misc/xsplice.markdown                   | 1043 ++++++++++++++++++++++++
 tools/flask/policy/policy/modules/xen/xen.te |    1 +
 tools/libxc/include/xenctrl.h                |   18 +
 tools/libxc/xc_misc.c                        |  284 +++++++
 tools/misc/Makefile                          |   29 +-
 tools/misc/xen-xsplice.c                     |  475 +++++++++++
 tools/misc/xen_hello_world.c                 |   15 +
 tools/misc/xsplice.h                         |   12 +
 tools/misc/xsplice.lds                       |   11 +
 xen/arch/arm/Kconfig                         |    1 +
 xen/arch/arm/Makefile                        |    1 +
 xen/arch/arm/xsplice.c                       |   31 +
 xen/arch/x86/Kconfig                         |    1 +
 xen/arch/x86/Makefile                        |    3 +-
 xen/arch/x86/alternative.c                   |   12 +-
 xen/arch/x86/domain.c                        |    4 +
 xen/arch/x86/extable.c                       |   36 +-
 xen/arch/x86/hvm/svm/svm.c                   |    2 +
 xen/arch/x86/hvm/vmx/vmcs.c                  |    2 +
 xen/arch/x86/setup.c                         |    7 +
 xen/arch/x86/traps.c                         |   30 +-
 xen/arch/x86/xsplice.c                       |  125 +++
 xen/common/Kconfig                           |   14 +
 xen/common/Makefile                          |    3 +
 xen/common/symbols.c                         |    7 +
 xen/common/sysctl.c                          |    8 +
 xen/common/xsplice.c                         | 1089 ++++++++++++++++++++++++++
 xen/common/xsplice_elf.c                     |  285 +++++++
 xen/include/asm-arm/bug.h                    |    2 +
 xen/include/asm-arm/config.h                 |    2 +
 xen/include/asm-arm/nmi.h                    |   13 +
 xen/include/asm-x86/alternative.h            |    1 +
 xen/include/asm-x86/bug.h                    |    2 +
 xen/include/asm-x86/uaccess.h                |    5 +
 xen/include/asm-x86/x86_64/page.h            |    2 +
 xen/include/public/sysctl.h                  |  156 ++++
 xen/include/xen/elfstructs.h                 |    8 +
 xen/include/xen/kernel.h                     |    1 +
 xen/include/xen/keyhandler.h                 |    1 +
 xen/include/xen/xsplice.h                    |   63 ++
 xen/include/xen/xsplice_elf.h                |   42 +
 xen/xsm/flask/hooks.c                        |    6 +
 xen/xsm/flask/policy/access_vectors          |    2 +
 44 files changed, 3824 insertions(+), 32 deletions(-)


Konrad Rzeszutek Wilk (6):
      xsplice: Design document (v5).
      hypervisor/arm/keyhandler: Declare struct cpu_user_regs;
      xen/xsplice: Hypervisor implementation of XEN_XSPLICE_op (v7)
      libxc: Implementation of XEN_XSPLICE_op in libxc (v4).
      xen-xsplice: Tool to manipulate xsplice payloads (v3)
      xen_hello_world.xsplice: Test payload for patching 'xen_extra_version'.

Ross Lagerwall (7):
      elf: Add relocation types to elfstructs.h
      xsplice: Add helper elf routines (v2)
      xsplice: Implement payload loading (v2)
      xsplice: Implement support for applying/reverting/replacing patches. (v2)
      xsplice: Add support for bug frames. (v2)
      xsplice: Add support for exception tables. (v2)
      xsplice: Add support for alternatives

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v2 01/13] xsplice: Design document (v5).
  2016-01-14 21:46 [PATCH v2] xSplice v1 implementation Konrad Rzeszutek Wilk
@ 2016-01-14 21:46 ` Konrad Rzeszutek Wilk
  2016-01-19 11:14   ` Wei Liu
                     ` (2 more replies)
  2016-01-14 21:47 ` [PATCH v2 02/13] hypervisor/arm/keyhandler: Declare struct cpu_user_regs; Konrad Rzeszutek Wilk
                   ` (12 subsequent siblings)
  13 siblings, 3 replies; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-14 21:46 UTC (permalink / raw)
  To: xen-devel, ross.lagerwall, mpohlack, andrew.cooper3,
	stefano.stabellini, jbeulich, ian.jackson, ian.campbell,
	wei.liu2, sasha.levin
  Cc: Konrad Rzeszutek Wilk

A mechanism is required to binarily patch the running hypervisor with new
opcodes that have come about due to primarily security updates.

This document describes the design of the API that would allow us to
upload to the hypervisor binary patches.

This document has been shaped by the input from:
  Martin Pohlack <mpohlack@amazon.de>
  Jan Beulich <jbeulich@suse.com>

Thank you!

Input-from: Martin Pohlack <mpohlack@amazon.de>
Input-from: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
---
v1-2: review
v3: Split document in v1 and v2 (todo) to simplify implementation goals.
v4: Add const on some structures. Truncate size to uint16_t where it makes sense.
v5: Convert 'id' to 'name', Add Ross's comments about what is implemented.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 docs/misc/xsplice.markdown | 993 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 993 insertions(+)
 create mode 100644 docs/misc/xsplice.markdown

diff --git a/docs/misc/xsplice.markdown b/docs/misc/xsplice.markdown
new file mode 100644
index 0000000..beb452e
--- /dev/null
+++ b/docs/misc/xsplice.markdown
@@ -0,0 +1,993 @@
+# xSplice Design v1
+
+## Rationale
+
+A mechanism is required to binarily patch the running hypervisor with new
+opcodes that have come about due to primarily security updates.
+
+This document describes the design of the API that would allow us to
+upload to the hypervisor binary patches.
+
+The document is split in four sections:
+
+ * Detailed descriptions of the problem statement.
+ * Design of the data structures.
+ * Design of the hypercalls.
+ * Implementation notes that should be taken into consideration.
+
+
+## Glossary
+
+ * splice - patch in the binary code with new opcodes
+ * trampoline - a jump to a new instruction.
+ * payload - telemetries of the old code along with binary blob of the new
+   function (if needed).
+ * reloc - telemetries contained in the payload to construct proper trampoline.
+
+## History
+
+The document has gone under various reviews and only covers v1 design.
+
+The end of the document has a section titled `Not Yet Done` which
+outlines ideas and design for the v2 version of this work.
+
+## Multiple ways to patch
+
+The mechanism needs to be flexible to patch the hypervisor in multiple ways
+and be as simple as possible. The compiled code is contiguous in memory with
+no gaps - so we have no luxury of 'moving' existing code and must either
+insert a trampoline to the new code to be executed - or only modify in-place
+the code if there is sufficient space. The placement of new code has to be done
+by hypervisor and the virtual address for the new code is allocated dynamically.
+
+This implies that the hypervisor must compute the new offsets when splicing
+in the new trampoline code. Where the trampoline is added (inside
+the function we are patching or just the callers?) is also important.
+
+To lessen the amount of code in hypervisor, the consumer of the API
+is responsible for identifying which mechanism to employ and how many locations
+to patch. Combinations of modifying in-place code, adding trampoline, etc
+has to be supported. The API should allow read/write any memory within
+the hypervisor virtual address space.
+
+We must also have a mechanism to query what has been applied and a mechanism
+to revert it if needed.
+
+## Workflow
+
+The expected workflows of higher-level tools that manage multiple patches
+on production machines would be:
+
+ * The first obvious task is loading all available / suggested
+   hotpatches around system start.
+ * Whenever new hotpatches are installed, they should be loaded too.
+ * One wants to query which modules have been loaded at runtime.
+ * If unloading is deemed safe (see unloading below), one may want to
+   support a workflow where a specific hotpatch is marked as bad and
+   unloaded.
+ * If we do no restrict module activation order and want to report tboot
+   state on sequences, we might have a complexity explosion problem, in
+   what system hashes should be considered acceptable.
+
+## Patching code
+
+The first mechanism to patch that comes in mind is in-place replacement.
+That is replace the affected code with new code. Unfortunately the x86
+ISA is variable size which places limits on how much space we have available
+to replace the instructions. That is not a problem if the change is smaller
+than the original opcode and we can fill it with nops. Problems will
+appear if the replacement code is longer.
+
+The second mechanism is by replacing the call or jump to the
+old function with the address of the new function.
+
+A third mechanism is to add a jump to the new function at the
+start of the old function. N.B. The Xen hypervisor implements the third
+mechanism.
+
+### Example of trampoline and in-place splicing
+
+As example we will assume the hypervisor does not have XSA-132 (see
+*domctl/sysctl: don't leak hypervisor stack to toolstacks*
+4ff3449f0e9d175ceb9551d3f2aecb59273f639d) and we would like to binary patch
+the hypervisor with it. The original code looks as so:
+
+<pre>
+   48 89 e0                  mov    %rsp,%rax  
+   48 25 00 80 ff ff         and    $0xffffffffffff8000,%rax  
+</pre>
+
+while the new patched hypervisor would be:
+
+<pre>
+   48 c7 45 b8 00 00 00 00   movq   $0x0,-0x48(%rbp)  
+   48 c7 45 c0 00 00 00 00   movq   $0x0,-0x40(%rbp)  
+   48 c7 45 c8 00 00 00 00   movq   $0x0,-0x38(%rbp)  
+   48 89 e0                  mov    %rsp,%rax  
+   48 25 00 80 ff ff         and    $0xffffffffffff8000,%rax  
+</pre>
+
+This is inside the arch_do_domctl. This new change adds 21 extra
+bytes of code which alters all the offsets inside the function. To alter
+these offsets and add the extra 21 bytes of code we might not have enough
+space in .text to squeeze this in.
+
+As such we could simplify this problem by only patching the site
+which calls arch_do_domctl:
+
+<pre>
+<do_domctl>:  
+ e8 4b b1 05 00          callq  ffff82d08015fbb9 <arch_do_domctl>  
+</pre>
+
+with a new address for where the new `arch_do_domctl` would be (this
+area would be allocated dynamically).
+
+Astute readers will wonder what we need to do if we were to patch `do_domctl`
+- which is not called directly by hypervisor but on behalf of the guests via
+the `compat_hypercall_table` and `hypercall_table`.
+Patching the offset in `hypercall_table` for `do_domctl:
+(ffff82d080103079 <do_domctl>:)
+<pre>
+
+ ffff82d08024d490:   79 30  
+ ffff82d08024d492:   10 80 d0 82 ff ff   
+
+</pre>
+with the new address where the new `do_domctl` is possible. The other
+place where it is used is in `hvm_hypercall64_table` which would need
+to be patched in a similar way. This would require an in-place splicing
+of the new virtual address of `arch_do_domctl`.
+
+In summary this example patched the callee of the affected function by
+ * allocating memory for the new code to live in,
+ * changing the virtual address in all the functions which called the old
+   code (computing the new offset, patching the callq with a new callq).
+ * changing the function pointer tables with the new virtual address of
+   the function (splicing in the new virtual address). Since this table
+   resides in the .rodata section we would need to temporarily change the
+   page table permissions during this part.
+
+
+However it has severe drawbacks - the safety checks which have to make sure
+the function is not on the stack - must also check every caller. For some
+patches this could mean - if there were an sufficient large amount of
+callers - that we would never be able to apply the update.
+
+### Example of different trampoline patching.
+
+An alternative mechanism exists where we can insert a trampoline in the
+existing function to be patched to jump directly to the new code. This
+lessens the locations to be patched to one but it puts pressure on the
+CPU branching logic (I-cache, but it is just one unconditional jump).
+
+For this example we will assume that the hypervisor has not been compiled
+with fe2e079f642effb3d24a6e1a7096ef26e691d93e (XSA-125: *pre-fill structures
+for certain HYPERVISOR_xen_version sub-ops*) which mem-sets an structure
+in `xen_version` hypercall. This function is not called **anywhere** in
+the hypervisor (it is called by the guest) but referenced in the
+`compat_hypercall_table` and `hypercall_table` (and indirectly called
+from that). Patching the offset in `hypercall_table` for the old
+`do_xen_version` (ffff82d080112f9e <do_xen_version>)
+
+</pre>
+ ffff82d08024b270 <hypercall_table>  
+ ...  
+ ffff82d08024b2f8:   9e 2f 11 80 d0 82 ff ff  
+
+</pre>
+with the new address where the new `do_xen_version` is possible. The other
+place where it is used is in `hvm_hypercall64_table` which would need
+to be patched in a similar way. This would require an in-place splicing
+of the new virtual address of `do_xen_version`.
+
+An alternative solution would be to patch insert a trampoline in the
+old `do_xen_version' function to directly jump to the new `do_xen_version`.
+
+<pre>
+ ffff82d080112f9e <do_xen_version>:  
+ ffff82d080112f9e:       48 c7 c0 da ff ff ff    mov    $0xffffffffffffffda,%rax  
+ ffff82d080112fa5:       83 ff 09                cmp    $0x9,%edi  
+ ffff82d080112fa8:       0f 87 24 05 00 00       ja     ffff82d0801134d2 <do_xen_version+0x534>  
+</pre>
+
+with:
+
+<pre>
+ ffff82d080112f9e <do_xen_version>:  
+ ffff82d080112f9e:       e9 XX YY ZZ QQ          jmpq   [new do_xen_version]  
+</pre>
+
+which would lessen the amount of patching to just one location.
+
+In summary this example patched the affected function to jump to the
+new replacement function which required:
+ * allocating memory for the new code to live in,
+ * inserting trampoline with new offset in the old function to point to the
+   new function.
+ * Optionally we can insert in the old function a trampoline jump to an function
+   providing an BUG_ON to catch errant code.
+
+The disadvantage of this are that the unconditional jump will consume a small
+I-cache penalty. However the simplicity of the patching and higher chance
+of passing safety checks make this a worthwhile option.
+
+### Security
+
+With this method we can re-write the hypervisor - and as such we **MUST** be
+diligent in only allowing certain guests to perform this operation.
+
+Furthermore with SecureBoot or tboot, we **MUST** also verify the signature
+of the payload to be certain it came from a trusted source and integrity
+was intact.
+
+As such the hypercall **MUST** support an XSM policy to limit what the guest
+is allowed to invoke. If the system is booted with signature checking the
+signature checking will be enforced.
+
+## Design of payload format
+
+The payload **MUST** contain enough data to allow us to apply the update
+and also safely reverse it. As such we **MUST** know:
+
+ * The locations in memory to be patched. This can be determined dynamically
+   via symbols or via virtual addresses.
+ * The new code that will be patched in.
+ * Signature to verify the payload.
+
+This binary format can be constructed using an custom binary format but
+there are severe disadvantages of it:
+
+ * The format might need to be changed and we need an mechanism to accommodate
+   that.
+ * It has to be platform agnostic.
+ * Easily constructed using existing tools.
+
+As such having the payload in an ELF file is the sensible way. We would be
+carrying the various sets of structures (and data) in the ELF sections under
+different names and with definitions. The prefix for the ELF section name
+would always be: *.xsplice* to match up to the names of the structures.
+
+Note that every structure has padding. This is added so that the hypervisor
+can re-use those fields as it sees fit.
+
+Earlier design attempted to ineptly explain the relations of the ELF sections
+to each other without using proper ELF mechanism (sh_info, sh_link, data
+structures using Elf types, etc). This design will explain the structures
+and how they are used together and not dig in the ELF format - except mention
+that the section names should match the structure names.
+
+The xSplice payload is a relocatable ELF binary. A typical binary would have:
+
+ * One or more .text sections.
+ * Zero or more read-only data sections.
+ * Zero or more data sections.
+ * Relocations for each of these sections.
+
+It may also have some architecture-specific sections. For example:
+
+ * Alternatives instructions.
+ * Bug frames.
+ * Exception tables.
+ * Relocations for each of these sections.
+
+The xSplice core code loads the payload as a standard ELF binary, relocates it
+and handles the architecture-specifc sections as needed. This process is much
+like what the Linux kernel module loader does.
+
+The payload contains a section (xsplice_patch_func) with an array of structures
+describing the functions to be patched:
+<pre>
+struct xsplice_patch_func {  
+    const char *name;  
+    unsigned long new_addr;  
+    const unsigned long old_addr;  
+    uint32_t new_size;  
+    const uint32_t long old_size;  
+    uint8_t pad[32];  
+};  
+</pre>
+
+The size of the structure is 64 bytes.
+
+* `name` is the symbol name of the old function. Only used if `old_addr` is
+   zero, otherwise will be used during dynamic linking (when hypervisor loads
+   the payload).
+
+* `old_addr` is the address of the function to be patched and is filled in at
+  payload generation time if hypervisor function address is known. If unknown,
+  the value *MUST* be zero and the hypervisor will attempt to resolve the address.
+
+* `new_addr` is the address of the function that is replacing the old
+  function. The address is filled in during relocation. The value **MUST** be
+  the address of the new function in the file.
+
+* `old_size` and `new_size` contain the sizes of the respective functions in bytes.
+   The value **MUST** not be zero.
+
+* `pad` **MUST** be zero.
+
+The size of the `xsplice_patch_func` array is determined from the ELF section
+size.
+
+When applying the patch the hypervisor iterates over each `xsplice_patch_func`
+structure and the core code inserts a trampoline at `old_addr` to `new_addr`.
+
+When reverting a patch, the hypervisor iterates over each `xsplice_patch_func`
+and the core code copies the data from the undo buffer (private internal copy)
+to `old_addr`.
+
+## Hypercalls
+
+We will employ the sub operations of the system management hypercall (sysctl).
+There are to be four sub-operations:
+
+ * upload the payloads.
+ * listing of payloads summary uploaded and their state.
+ * getting an particular payload summary and its state.
+ * command to apply, delete, or revert the payload.
+
+Most of the actions are asynchronous therefore the caller is responsible
+to verify that it has been applied properly by retrieving the summary of it
+and verifying that there are no error codes associated with the payload.
+
+We **MUST** make some of them asynchronous due to the nature of patching
+it requires every physical CPU to be lock-step with each other.
+The patching mechanism while an implementation detail, is not an short
+operation and as such the design **MUST** assume it will be an long-running
+operation.
+
+The sub-operations will spell out how preemption is to be handled (if at all).
+
+Furthermore it is possible to have multiple different payloads for the same
+function. As such an unique id per payload has to be visible to allow proper manipulation.
+
+The hypercall is part of the `xen_sysctl`. The top level structure contains
+one uint32_t to determine the sub-operations and one padding field which
+*MUST* always be zero.
+
+<pre>
+struct xen_sysctl_xsplice_op {  
+    uint32_t cmd;                   /* IN: XEN_SYSCTL_XSPLICE_*. */  
+    uint32_t pad;                   /* IN: Always zero. */  
+	union {  
+          ... see below ...  
+        } u;  
+};  
+
+</pre>
+while the rest of hypercall specific structures are part of the this structure.
+
+### Basic type: struct xen_xsplice_id
+
+Most of the hypercalls employ an shared structure called `struct xen_xsplice_id`
+which contains:
+
+ * `name` - pointer where the string for the id is located.
+ * `size` - the size of the string
+ * `pad` - padding - to be zero.
+
+The structure is as follow:
+
+<pre>
+#define XEN_XSPLICE_NAME_SIZE 128  
+struct xen_xsplice_id {  
+    XEN_GUEST_HANDLE_64(char) name;         /* IN, pointer to name. */  
+    uint16_t size;                          /* IN, size of name. May be upto   
+                                               XEN_XSPLICE_NAME_SIZE. */  
+    uint16_t pad[3];                        /* IN: MUST be zero. */ 
+};  
+</pre>
+
+### XEN_SYSCTL_XSPLICE_UPLOAD (0)
+
+Upload a payload to the hypervisor. The payload is verified
+against basic checks and if there are any issues the proper return code
+will be returned. The payload is not applied at this time - that is
+controlled by *XEN_SYSCTL_XSPLICE_ACTION*.
+
+The caller provides:
+
+ * A `struct xen_xsplice_id` called `id` which has the unique id.
+ * `size` the size of the ELF payload (in bytes).
+ * `payload` the virtual address of where the ELF payload is.
+
+The `id` could be an UUID that stays fixed forever for a given
+payload. It can be embedded into the ELF payload at creation time
+and extracted by tools.
+
+The return value is zero if the payload was succesfully uploaded.
+Otherwise an XEN_EXX return value is provided. Duplicate `id` are not supported.
+
+The `payload` is the ELF payload as mentioned in the `Payload format` section.
+
+The structure is as follow:
+
+<pre>
+struct xen_sysctl_xsplice_upload {  
+    xen_xsplice_id_t id;                /* IN, name of the patch. */  
+    uint64_t size;                      /* IN, size of the ELF file. */  
+    XEN_GUEST_HANDLE_64(uint8) payload; /* IN: ELF file. */  
+};  
+</pre>
+
+### XEN_SYSCTL_XSPLICE_GET (1)
+
+Retrieve an status of an specific payload. This caller provides:
+
+ * A `struct xen_xsplice_id` called `id` which has the unique id.
+ * A `struct xen_xsplice_status` structure which has all members
+   set to zero: That is:
+   * `status` *MUST* be set to zero.
+   * `rc` *MUST* be set to zero.
+
+Upon completion the `struct xen_xsplice_status` is updated.
+
+ * `status` - whether it has been:
+   * *XSPLICE_STATUS_LOADED* (1) has been loaded.
+   * *XSPLICE_STATUS_CHECKED*  (2) the ELF payload safety checks passed.
+   * *XSPLICE_STATUS_APPLIED* (3) loaded, checked, and applied.
+   *  No other value is possible.
+ * `rc` - XEN_EXX type errors encountered while performing the `status`
+   operation. The normal values can be zero or XEN_EAGAIN which
+   respectively mean: success or operation in progress. Other values
+   imply an error occurred.
+
+The return value of the hypercall is zero on success and XEN_EXX on failure.
+(Note that the `rc`` value can be different from the return value, as in
+rc=XEN_EAGAIN and return value can be 0).
+
+This operation is synchronous and does not require preemption.
+
+The structure is as follow:
+
+<pre>
+struct xen_xsplice_status {  
+#define XSPLICE_STATUS_LOADED       1  
+#define XSPLICE_STATUS_CHECKED      2  
+#define XSPLICE_STATUS_APPLIED      3  
+    int32_t state;                  /* OUT: XSPLICE_STATE_*. IN: MUST be zero. */  
+    int32_t rc;                     /* OUT: 0 if no error, otherwise -XEN_EXX. */  
+                                    /* IN: MUST be zero. */
+};  
+
+struct xen_sysctl_xsplice_summary {  
+    xen_xsplice_id_t id;            /* IN, the name of the payload. */  
+    xen_xsplice_status_t status;    /* IN/OUT: status of the payload. */  
+};  
+</pre>
+
+### XEN_SYSCTL_XSPLICE_LIST (2)
+
+Retrieve an array of abbreviated status and names of payloads that are loaded in the
+hypervisor.
+
+The caller provides:
+
+ * `version`. Initially (on first hypercall) *MUST* be zero.
+ * `idx` index iterator. On first call *MUST* be zero, subsequent calls varies.
+ * `nr` the max number of entries to populate.
+ * `pad` - *MUST* be zero.
+ * `status` virtual address of where to write `struct xen_xsplice_status`
+   structures. Caller *MUST* allocate up to `nr` of them.
+ * `id` - virtual address of where to write the unique id of the payload.
+   Caller *MUST* allocate up to `nr` of them. Each *MUST* be of
+   **XEN_XSPLICE_NAME_SIZE** size.
+ * `len` - virtual address of where to write the length of each unique id
+   of the payload. Caller *MUST* allocate up to `nr` of them. Each *MUST* be
+   of sizeof(uint32_t) (4 bytes).
+
+If the hypercall returns an positive number, it is the number (up to `nr`)
+of the payloads returned, along with `nr` updated with the number of remaining
+payloads, `version` updated (it may be the same across hypercalls. If it
+varies the data is stale and further calls could fail). The `status`,
+`id`, and `len`' are updated at their designed index value (`idx`) with
+the returned value of data.
+
+If the hypercall returns E2BIG the `count` is too big and should be
+lowered.
+
+This operation can be preempted by the hypercall returning XEN_EAGAIN.
+Retry.
+
+Note that due to the asynchronous nature of hypercalls the control domain might
+have added or removed a number of payloads making this information stale. It is
+the responsibility of the toolstack to use the `version` field to check
+between each invocation. if the version differs it should discard the stale
+data and start from scratch. It is OK for the toolstack to use the new
+`version` field.
+
+The `struct xen_xsplice_status` structure contains an status of payload which includes:
+
+ * `status` - whether it has been:
+   * *XSPLICE_STATUS_LOADED* (1) has been loaded.
+   * *XSPLICE_STATUS_CHECKED*  (2) the ELF payload safety checks passed.
+   * *XSPLICE_STATUS_APPLIED* (3) loaded, checked, and applied.
+ * `rc` - XEN_EXX type errors encountered while performing the `status`
+   operation. The expected values are zero or XEN_EAGAIN which
+   respectively mean: success or operation in progress.
+
+The structure is as follow:
+
+<pre>
+struct xen_sysctl_xsplice_list {  
+    uint32_t version;                       /* IN/OUT: Initially *MUST* be zero.  
+                                               On subsequent calls reuse value.  
+                                               If varies between calls, we are  
+                                             * getting stale data. */  
+    uint32_t idx;                           /* IN/OUT: Index into array. */  
+    uint32_t nr;                            /* IN: How many status, id, and len  
+                                               should fill out.  
+                                               OUT: How many payloads left. */  
+    uint32_t pad;                           /* IN: Must be zero. */  
+    XEN_GUEST_HANDLE_64(xen_xsplice_status_t) status;  /* OUT. Must have enough  
+                                               space allocate for n of them. */  
+    XEN_GUEST_HANDLE_64(char) id;           /* OUT: Array of ids. Each member  
+                                               MUST XEN_XSPLICE_NAME_SIZE in size.  
+                                               Must have n of them. */  
+    XEN_GUEST_HANDLE_64(uint32) len;        /* OUT: Array of lengths of ids.  
+                                               Must have n of them. */  
+};  
+</pre>
+
+### XEN_SYSCTL_XSPLICE_ACTION (3)
+
+Perform an operation on the payload structure referenced by the `id` field.
+The operation request is asynchronous and the status should be retrieved
+by using either **XEN_SYSCTL_XSPLICE_GET** or **XEN_SYSCTL_XSPLICE_LIST** hypercall.
+
+The caller provides:
+
+ * A 'struct xen_xsplice_id` `id` containing the unique id.
+ * `cmd` the command requested:
+  * *XSPLICE_ACTION_CHECK* (1) check that the payload will apply properly.
+    This also verfies the payload - which may require SecureBoot firmware
+    calls.
+  * *XSPLICE_ACTION_UNLOAD* (2) unload the payload.
+   Any further hypercalls against the `id` will result in failure unless
+   **XEN_SYSCTL_XSPLICE_UPLOAD** hypercall is perfomed with same `id`.
+  * *XSPLICE_ACTION_REVERT* (3) revert the payload. If the operation takes
+  more time than the upper bound of time the `status` will XEN_EBUSY.
+  * *XSPLICE_ACTION_APPLY* (4) apply the payload. If the operation takes
+  more time than the upper bound of time the `status` will be XEN_EBUSY.
+  * *XSPLICE_ACTION_REPLACE* (5) revert all applied payloads and apply this
+  payload.
+  * *XSPLICE_ACTION_LOADED* is an initial state and cannot be requested.
+ * `time` the upper bound of time (ms) the cmd should take. Zero means infinite.
+   If within the time the operation does not succeed the operation would go in
+   error state.
+ * `pad` - *MUST* be zero.
+
+The return value will be zero unless the provided fields are incorrect.
+
+The structure is as follow:
+
+<pre>
+#define XSPLICE_ACTION_CHECK   1  
+#define XSPLICE_ACTION_UNLOAD  2  
+#define XSPLICE_ACTION_REVERT  3  
+#define XSPLICE_ACTION_APPLY   4  
+#define XSPLICE_ACTION_REPLACE 5  
+struct xen_sysctl_xsplice_action {  
+    xen_xsplice_id_t id;                    /* IN, name of the patch. */  
+    uint32_t cmd;                           /* IN: XSPLICE_ACTION_* */  
+    uint32_t time;                          /* IN: Zero if no timeout. */   
+                                            /* Or upper bound of time (ms) */   
+                                            /* for operation to take. */  
+};  
+
+</pre>
+
+## State diagrams of XSPLICE_ACTION commands.
+
+There is a strict ordering state of what the commands can be.
+The XSPLICE_ACTION prefix has been dropped to easy reading and
+does not include the XSPLICE_STATES:
+
+<pre>
+              /->\  
+              \  /  
+ UNLOAD <--- CHECK ---> REPLACE|APPLY --> REVERT --\  
+                \                                  |  
+                 \-------------------<-------------/  
+
+</pre>
+## State transition table of XSPLICE_ACTION commands and XSPLICE_STATUS.
+
+Note that:
+
+ - The LOADED state is the starting one achieved with *XEN_SYSCTL_XSPLICE_UPLOAD* hypercall.
+ - The REVERT operation on success will automatically move to CHECK state.
+ - There are three STATES: LOADED, CHECKED and APPLIED.
+ - There are five actions (aka commands): CHECK, APPLY, REPLACE, REVERT, and UNLOAD.
+
+The state transition table of valid states and action states:
+
+<pre>
+
++---------+---------+--------------------------------+-------+-------+--------+
+| ACTION  | Current | Result                         |       Next STATE:      |
+| ACTION  | STATE   |                                | LOADED|CHECKED|APPLIED |
++---------+----------+-------------------------------+-------+-------+--------+
+| CHECK   | LOADED  | Check payload (success).       |       |   x   |        |
++---------+---------+--------------------------------+-------+-------+--------+
+| CHECK   | LOADED  | Check payload (error).         |  x    |       |        |
++---------+---------+--------------------------------+-------+-------+--------+
+| CHECK   | CHECKED | Check payload (once more, no)  |       |   x   |        |
+|         |         | errors)                        |       |       |        |
++---------+---------+--------------------------------+-------+-------+--------+
+| CHECK   | CHECKED | Check payload (once more, with |   x   |       |        |
+|         |         | errors)                        |       |       |        |
++---------+---------+--------------------------------+-------+-------+--------+
+| UNLOAD  | CHECKED | Unload payload. Always works.  |       |       |        |
+|         |         | No next states.                |       |       |        |
++---------+---------+--------------------------------+-------+-------+--------+
+| UNLOAD  | LOADED  | Unload payload. Always works.  |       |       |        |
+|         |         | No next states.                |       |       |        |
++---------+---------+--------------------------------+-------+-------+--------+
+| APPLY   | CHECKED | Apply payload (success).       |       |       |   x    |
++---------+---------+--------------------------------+-------+-------+--------+
+| APPLY   | CHECKED | Apply payload (error|timeout)  |       |   x   |        |
++---------+---------+--------------------------------+-------+-------+--------+
+| REPLACE | CHECKED | Revert payloads and apply new  |       |       |   x    |
+|         |         | payload with success.          |       |       |        |
++---------+---------+--------------------------------+-------+-------+--------+
+| REPLACE | CHECKED | Revert payloads and apply new  |       |   x   |        |
+|         |         | payload with error.            |       |       |        |
++---------+---------+--------------------------------+-------+-------+--------+
+| REVERT  | APPLIED | Revert payload (success).      |       |   x   |        |
++---------+---------+--------------------------------+-------+-------+--------+
+| REVERT  | APPLIED | Revert payload (error|timeout) |       |       |   x    |
++---------+---------+--------------------------------+-------+-------+--------+
+</pre>
+
+All the other state transitions are invalid.
+
+## Sequence of events.
+
+The normal sequence of events is to:
+
+ 1. *XEN_SYSCTL_XSPLICE_UPLOAD* to upload the payload. If there are errors *STOP* here.
+ 2. *XEN_SYSCTL_XSPLICE_GET* to check the `->rc`. If *XEN_EAGAIN* spin. If zero go to next step.
+ 3. *XEN_SYSCTL_XSPLICE_ACTION* with *XSPLICE_ACTION_CHECK* command to verify that the payload can be succesfully applied.
+ 4. *XEN_SYSCTL_XSPLICE_GET* to check the `->rc`. If *XEN_EAGAIN* spin. If zero go to next step.
+ 5. *XEN_SYSCTL_XSPLICE_ACTION* with *XSPLICE_ACTION_APPLY* to apply the patch.
+ 6. *XEN_SYSCTL_XSPLICE_GET* to check the `->rc`. If in *XEN_EAGAIN* spin. If zero exit with success.
+
+
+## Addendum
+
+Implementation quirks should not be discussed in a design document.
+
+However these observations can provide aid when developing against this
+document.
+
+
+### Alternative assembler
+
+Alternative assembler is a mechanism to use different instructions depending
+on what the CPU supports. This is done by providing multiple streams of code
+that can be patched in - or if the CPU does not support it - padded with
+`nop` operations. The alternative assembler macros cause the compiler to
+expand the code to place a most generic code in place - emit a special
+ELF .section header to tag this location. During run-time the hypervisor
+can leave the areas alone or patch them with an better suited opcodes.
+
+Note that patching functions that copy to or from guest memory requires
+to support alternative support. This is due to SMAP (specifically *stac*
+and *clac* operations) which is enabled on Broadwell and later architectures.
+
+### When to patch
+
+During the discussion on the design two candidates bubbled where
+the call stack for each CPU would be deterministic. This would
+minimize the chance of the patch not being applied due to safety
+checks failing. Safety checks such as not patching code which
+is on the stack - which can lead to corruption.
+
+#### Rendezvous code instead of stop_machine for patching
+
+The hypervisor's time rendezvous code runs synchronously across all CPUs
+every second. Using the stop_machine to patch can stall the time rendezvous
+code and result in NMI. As such having the patching be done at the tail
+of rendezvous code should avoid this problem.
+
+However the entrance point for that code is
+do_softirq->timer_softirq_action->time_calibration
+which ends up calling on_selected_cpus on remote CPUs.
+
+The remote CPUs receive CALL_FUNCTION_VECTOR IPI and execute the
+desired function.
+
+#### Before entering the guest code.
+
+Before we call VMXResume we check whether any soft IRQs need to be executed.
+This is a good spot because all Xen stacks are effectively empty at
+that point.
+
+To randezvous all the CPUs an barrier with an maximum timeout (which
+could be adjusted), combined with forcing all other CPUs through the
+hypervisor with IPIs, can be utilized to have all the CPUs be lockstep.
+
+The approach is similar in concept to stop_machine and the time rendezvous
+but is time-bound. However the local CPU stack is much shorter and
+a lot more deterministic.
+
+This is implemented in the Xen Project hypervisor.
+
+### Compiling the hypervisor code
+
+Hotpatch generation often requires support for compiling the target
+with -ffunction-sections / -fdata-sections.  Changes would have to
+be done to the linker scripts to support this.
+
+### Generation of xSplice ELF payloads
+
+The design of that is not discussed in this design.
+
+This is implemented in a seperate tool which lives in a seperate
+GIT repo.
+
+Currently it resides at https://github.com/rosslagerwall/xsplice-build
+
+### Exception tables and symbol tables growth
+
+We may need support for adapting or augmenting exception tables if
+patching such code.  Hotpatches may need to bring their own small
+exception tables (similar to how Linux modules support this).
+
+If supporting hotpatches that introduce additional exception-locations
+is not important, one could also change the exception table in-place
+and reorder it afterwards.
+
+As found almost every patch (XSA) to a non-trivial function requires
+additional entries in the exception table and/or the bug frames.
+
+This is implemented in the Xen Project hypervisor.
+
+### Security
+
+Only the privileged domain should be allowed to do this operation.
+
+
+# v2: Not Yet Done
+
+
+## Goals
+
+The v2 design must also have a mechanism for:
+
+ *  An dependency mechanism for the payloads. To use that information to load:
+    - The appropiate payload. To verify that payload is built against the
+      hypervisor. This can be done via the `build-id`
+      or via providing an copy of the old code - so that the hypervisor can
+       verify it against the code in memory.
+    - To construct an appropiate order of payloads to load in case they
+      depend on each other.
+ * Be able to lookup in the Xen hypervisor the symbol names of functions from the ELF payload.
+ * Be able to patch .rodata, .bss, and .data sections.
+ * Further safety checks (blacklist of which functions cannot be patched, check
+   the stack, etc).
+
+### xSplice interdependencies
+
+xSplice patches interdependencies are tricky.
+
+There are the ways this can be addressed:
+ * A single large patch that subsumes and replaces all previous ones.
+   Over the life-time of patching the hypervisor this large patch
+   grows to accumulate all the code changes.
+ * Hotpatch stack - where an mechanism exists that loads the hotpatches
+   in the same order they were built in. We would need an build-id
+   of the hypevisor to make sure the hot-patches are build against the
+   correct build.
+ * Payload containing the old code to check against that. That allows
+   the hotpatches to be loaded indepedently (if they don't overlap) - or
+   if the old code also containst previously patched code - even if they
+   overlap.
+
+The disadvantage of the first large patch is that it can grow over
+time and not provide an bisection mechanism to identify faulty patches.
+
+The hot-patch stack puts stricts requirements on the order of the patches
+being loaded and requires an hypervisor build-id to match against.
+
+The old code allows much more flexibility and an additional guard,
+but is more complex to implement.
+
+### Handle inlined __LINE__
+
+This problem is related to hotpatch construction
+and potentially has influence on the design of the hotpatching
+infrastructure in Xen.
+
+For example:
+
+We have file1.c with functions f1 and f2 (in that order).  f2 contains a
+BUG() (or WARN()) macro and at that point embeds the source line number
+into the generated code for f2.
+
+Now we want to hotpatch f1 and the hotpatch source-code patch adds 2
+lines to f1 and as a consequence shifts out f2 by two lines.  The newly
+constructed file1.o will now contain differences in both binary
+functions f1 (because we actually changed it with the applied patch) and
+f2 (because the contained BUG macro embeds the new line number).
+
+Without additional information, an algorithm comparing file1.o before
+and after hotpatch application will determine both functions to be
+changed and will have to include both into the binary hotpatch.
+
+Options:
+
+1. Transform source code patches for hotpatches to be line-neutral for
+   each chunk.  This can be done in almost all cases with either
+   reformatting of the source code or by introducing artificial
+   preprocessor "#line n" directives to adjust for the introduced
+   differences.
+
+   This approach is low-tech and simple.  Potentially generated
+   backtraces and existing debug information refers to the original
+   build and does not reflect hotpatching state except for actually
+   hotpatched functions but should be mostly correct.
+
+2. Ignoring the problem and living with artificially large hotpatches
+   that unnecessarily patch many functions.
+
+   This approach might lead to some very large hotpatches depending on
+   content of specific source file.  It may also trigger pulling in
+   functions into the hotpatch that cannot reasonable be hotpatched due
+   to limitations of a hotpatching framework (init-sections, parts of
+   the hotpatching framework itself, ...) and may thereby prevent us
+   from patching a specific problem.
+
+   The decision between 1. and 2. can be made on a patch--by-patch
+   basis.
+
+3. Introducing an indirection table for storing line numbers and
+   treating that specially for binary diffing. Linux may follow
+   this approach.
+
+   We might either use this indirection table for runtime use and patch
+   that with each hotpatch (similarly to exception tables) or we might
+   purely use it when building hotpatches to ignore functions that only
+   differ at exactly the location where a line-number is embedded.
+
+   For BUG(), WARN(), etc., the line number is embedded into the bug frame, not
+   the function itself.
+
+Similar considerations are true to a lesser extent for __FILE__, but it
+could be argued that file renaming should be done outside of hotpatches.
+
+## Signature checking requirements.
+
+The signature checking requires that the layout of the data in memory
+**MUST** be same for signature to be verified. This means that the payload
+data layout in ELF format **MUST** match what the hypervisor would be
+expecting such that it can properly do signature verification.
+
+The signature is based on the all of the payloads continuously laid out
+in memory. The signature is to be appended at the end of the ELF payload
+prefixed with the string '~Module signature appended~\n', followed by
+an signature header then followed by the signature, key identifier, and signers
+name.
+
+Specifically the signature header would be:
+
+<pre>
+#define PKEY_ALGO_DSA       0  
+#define PKEY_ALGO_RSA       1  
+
+#define PKEY_ID_PGP         0 /* OpenPGP generated key ID */  
+#define PKEY_ID_X509        1 /* X.509 arbitrary subjectKeyIdentifier */  
+
+#define HASH_ALGO_MD4          0  
+#define HASH_ALGO_MD5          1  
+#define HASH_ALGO_SHA1         2  
+#define HASH_ALGO_RIPE_MD_160  3  
+#define HASH_ALGO_SHA256       4  
+#define HASH_ALGO_SHA384       5  
+#define HASH_ALGO_SHA512       6  
+#define HASH_ALGO_SHA224       7  
+#define HASH_ALGO_RIPE_MD_128  8  
+#define HASH_ALGO_RIPE_MD_256  9  
+#define HASH_ALGO_RIPE_MD_320 10  
+#define HASH_ALGO_WP_256      11  
+#define HASH_ALGO_WP_384      12  
+#define HASH_ALGO_WP_512      13  
+#define HASH_ALGO_TGR_128     14  
+#define HASH_ALGO_TGR_160     15  
+#define HASH_ALGO_TGR_192     16  
+
+
+struct elf_payload_signature {  
+	u8	algo;		/* Public-key crypto algorithm PKEY_ALGO_*. */  
+	u8	hash;		/* Digest algorithm: HASH_ALGO_*. */  
+	u8	id_type;	/* Key identifier type PKEY_ID*. */  
+	u8	signer_len;	/* Length of signer's name */  
+	u8	key_id_len;	/* Length of key identifier */  
+	u8	__pad[3];  
+	__be32	sig_len;	/* Length of signature data */  
+};
+
+</pre>
+(Note that this has been borrowed from Linux module signature code.).
+
+
+### .rodata sections
+
+The patching might require strings to be updated as well. As such we must be
+also able to patch the strings as needed. This sounds simple - but the compiler
+has a habit of coalescing strings that are the same - which means if we in-place
+alter the strings - other users will be inadvertently affected as well.
+
+This is also where pointers to functions live - and we may need to patch this
+as well. And switch-style jump tables.
+
+To guard against that we must be prepared to do patching similar to
+trampoline patching or in-line depending on the flavour. If we can
+do in-line patching we would need to:
+
+ * alter `.rodata` to be writeable.
+ * inline patch.
+ * alter `.rodata` to be read-only.
+
+If are doing trampoline patching we would need to:
+
+ * allocate a new memory location for the string.
+ * all locations which use this string will have to be updated to use the
+   offset to the string.
+ * mark the region RO when we are done.
+
+### .bss and .data sections.
+
+In place patching writable data is not suitable as it is unclear what should be done
+depending on the current state of data. As such it should not be attempted.
+
+However, functions which are being patched can bring in changes to strings
+(.data or .rodata section changes), or even to .bss sections.
+
+As such the ELF payload can introduce new .rodata, .bss, and .data sections.
+Patching in the new function will end up also patching in the new .rodata
+section and the new function will reference the new string in the new
+.rodata section.
+
+
+### Inline patching
+
+The hypervisor should verify that the in-place patching would fit within
+the code or data.
+
+### Trampoline (e9 opcode)
+
+The e9 opcode used for jmpq uses a 32-bit signed displacement. That means
+we are limited to up to 2GB of virtual address to place the new code
+from the old code. That should not be a problem since Xen hypervisor has
+a very small footprint.
+
+However if we need - we can always add two trampolines. One at the 2GB
+limit that calls the next trampoline.
+
+Please note there is a small limitation for trampolines in
+function entries: The target function (+ trailing padding) must be able
+to accomodate the trampoline. On x86 with +-2 GB relative jumps,
+this means 5 bytes are  required.
+
+Depending on compiler settings, there are several functions in Xen that
+are smaller (without inter-function padding).
+
+<pre> 
+readelf -sW xen-syms | grep " FUNC " | \
+    awk '{ if ($3 < 5) print $3, $4, $5, $8 }'
+
+...
+3 FUNC LOCAL wbinvd_ipi
+3 FUNC LOCAL shadow_l1_index
+...
+</pre>
+A compile-time check for, e.g., a minimum alignment of functions or a
+runtime check that verifies symbol size (+ padding to next symbols) for
+that in the hypervisor is advised.
+
+The tool for generating payloads currently does perform a compile-time
+check to ensure that the function to be replaced is large enough.
+
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 02/13] hypervisor/arm/keyhandler: Declare struct cpu_user_regs;
  2016-01-14 21:46 [PATCH v2] xSplice v1 implementation Konrad Rzeszutek Wilk
  2016-01-14 21:46 ` [PATCH v2 01/13] xsplice: Design document (v5) Konrad Rzeszutek Wilk
@ 2016-01-14 21:47 ` Konrad Rzeszutek Wilk
  2016-01-14 21:47 ` [PATCH v2 03/13] xen/xsplice: Hypervisor implementation of XEN_XSPLICE_op (v7) Konrad Rzeszutek Wilk
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-14 21:47 UTC (permalink / raw)
  To: xen-devel, ross.lagerwall, mpohlack, andrew.cooper3,
	stefano.stabellini, jbeulich, ian.jackson, ian.campbell,
	wei.liu2, sasha.levin
  Cc: Konrad Rzeszutek Wilk

in the keyhandler.h file. Otherwise on ARM builds if we
just use the keyhandler file - the compile will fail.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 xen/include/xen/keyhandler.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/xen/include/xen/keyhandler.h b/xen/include/xen/keyhandler.h
index 39052b5..c79671f 100644
--- a/xen/include/xen/keyhandler.h
+++ b/xen/include/xen/keyhandler.h
@@ -19,6 +19,7 @@
  */
 typedef void (keyhandler_fn_t)(unsigned char key);
 
+struct cpu_user_regs;
 /*
  * Callback type for irq_keyhandler.
  *
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 03/13] xen/xsplice: Hypervisor implementation of XEN_XSPLICE_op (v7)
  2016-01-14 21:46 [PATCH v2] xSplice v1 implementation Konrad Rzeszutek Wilk
  2016-01-14 21:46 ` [PATCH v2 01/13] xsplice: Design document (v5) Konrad Rzeszutek Wilk
  2016-01-14 21:47 ` [PATCH v2 02/13] hypervisor/arm/keyhandler: Declare struct cpu_user_regs; Konrad Rzeszutek Wilk
@ 2016-01-14 21:47 ` Konrad Rzeszutek Wilk
  2016-01-19 14:30   ` Ross Lagerwall
  2016-02-06 22:35   ` Doug Goldstein
  2016-01-14 21:47 ` [PATCH v2 04/13] libxc: Implementation of XEN_XSPLICE_op in libxc (v4) Konrad Rzeszutek Wilk
                   ` (10 subsequent siblings)
  13 siblings, 2 replies; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-14 21:47 UTC (permalink / raw)
  To: xen-devel, ross.lagerwall, mpohlack, andrew.cooper3,
	stefano.stabellini, jbeulich, ian.jackson, ian.campbell,
	wei.liu2, sasha.levin
  Cc: Konrad Rzeszutek Wilk

The implementation does not actually do any patching.

It just adds the framework for doing the hypercalls,
keeping track of ELF payloads, and the basic operations:
 - query which payloads exist,
 - query for specific payloads,
 - check*1, apply*1, replace*1, and unload payloads.

*1: Which of course in this patch are nops.

Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>

---
v2: Rebased on keyhandler: rework keyhandler infrastructure
v3: Fixed XSM.
v4: Removed REVERTED state.
    Split status and error code.
    Add REPLACE action.
    Separate payload data from the payload structure.
    s/XSPLICE_ID_../XSPLICE_NAME_../
v5: Add xsplice and CONFIG_XSPLICE build toption.
    Fix code per Jan's review.
    Update the sysctl.h (change bits to enum like)
v6: Rebase on Kconfig changes.
v7: Add missing pad checks. Re-order keyhandler.h to build on ARM.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 tools/flask/policy/policy/modules/xen/xen.te |   1 +
 xen/arch/arm/Kconfig                         |   1 +
 xen/arch/x86/Kconfig                         |   1 +
 xen/common/Kconfig                           |  14 +
 xen/common/Makefile                          |   2 +
 xen/common/sysctl.c                          |   8 +
 xen/common/xsplice.c                         | 386 +++++++++++++++++++++++++++
 xen/include/public/sysctl.h                  | 156 +++++++++++
 xen/include/xen/xsplice.h                    |   7 +
 xen/xsm/flask/hooks.c                        |   6 +
 xen/xsm/flask/policy/access_vectors          |   2 +
 11 files changed, 584 insertions(+)
 create mode 100644 xen/common/xsplice.c
 create mode 100644 xen/include/xen/xsplice.h

diff --git a/tools/flask/policy/policy/modules/xen/xen.te b/tools/flask/policy/policy/modules/xen/xen.te
index d35ae22..542c3e1 100644
--- a/tools/flask/policy/policy/modules/xen/xen.te
+++ b/tools/flask/policy/policy/modules/xen/xen.te
@@ -72,6 +72,7 @@ allow dom0_t xen_t:xen2 {
 allow dom0_t xen_t:xen2 {
     pmu_ctrl
     get_symbol
+    xsplice_op
 };
 allow dom0_t xen_t:mmu memorymap;
 
diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 60e923c..3780949 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -23,6 +23,7 @@ config ARM
 	select HAS_PASSTHROUGH
 	select HAS_PDX
 	select HAS_VIDEO
+	select HAS_XSPLICE
 
 config ARCH_DEFCONFIG
 	string
diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
index 4781b34..2b6c832 100644
--- a/xen/arch/x86/Kconfig
+++ b/xen/arch/x86/Kconfig
@@ -18,6 +18,7 @@ config X86
 	select HAS_PCI
 	select HAS_PDX
 	select HAS_VGA
+	select HAS_XSPLICE
 
 config ARCH_DEFCONFIG
 	string
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index eadfc3b..aaf4053 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -51,6 +51,9 @@ config HAS_GDBSX
 config HAS_IOPORTS
 	bool
 
+config HAS_XSPLICE
+	bool
+
 # Enable/Disable kexec support
 config KEXEC
 	bool "kexec support"
@@ -97,4 +100,15 @@ config XSM
 
 	  If unsure, say N.
 
+# Enable/Disable xsplice support
+config XSPLICE
+	bool "xsplice support"
+	default y
+	depends on HAS_XSPLICE
+	---help---
+	  Allows a running Xen hypervisor to be patched without rebooting.
+	  This is primarily used to patch an hypervisor with XSA fixes.
+
+	  If unsure, say Y.
+
 endmenu
diff --git a/xen/common/Makefile b/xen/common/Makefile
index 9f8b214..6fdeccf 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -71,3 +71,5 @@ subdir-$(coverage) += gcov
 
 subdir-y += libelf
 subdir-$(CONFIG_HAS_DEVICE_TREE) += libfdt
+
+obj-$(CONFIG_XSPLICE) += xsplice.o
diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index a3007b8..55e6cfa 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -28,6 +28,7 @@
 #include <xsm/xsm.h>
 #include <xen/pmstat.h>
 #include <xen/gcov.h>
+#include <xen/xsplice.h>
 
 long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
 {
@@ -460,6 +461,13 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
         ret = tmem_control(&op->u.tmem_op);
         break;
 
+#ifdef CONFIG_XSPLICE
+    case XEN_SYSCTL_xsplice_op:
+        ret = xsplice_control(&op->u.xsplice);
+        copyback = 1;
+        break;
+#endif
+
     default:
         ret = arch_do_sysctl(op, u_sysctl);
         copyback = 0;
diff --git a/xen/common/xsplice.c b/xen/common/xsplice.c
new file mode 100644
index 0000000..3c6acc3
--- /dev/null
+++ b/xen/common/xsplice.c
@@ -0,0 +1,386 @@
+/*
+ * Copyright (c) 2016 Oracle and/or its affiliates. All rights reserved.
+ *
+ */
+
+#include <xen/guest_access.h>
+#include <xen/keyhandler.h>
+#include <xen/lib.h>
+#include <xen/list.h>
+#include <xen/mm.h>
+#include <xen/sched.h>
+#include <xen/smp.h>
+#include <xen/spinlock.h>
+#include <xen/xsplice.h>
+
+#include <asm/event.h>
+#include <public/sysctl.h>
+
+static DEFINE_SPINLOCK(payload_list_lock);
+static LIST_HEAD(payload_list);
+
+static unsigned int payload_cnt;
+static unsigned int payload_version = 1;
+
+struct payload {
+    int32_t state;                       /* One of the XSPLICE_STATE_*. */
+    int32_t rc;                          /* 0 or -XEN_EXX. */
+    struct list_head list;               /* Linked to 'payload_list'. */
+    char name[XEN_XSPLICE_NAME_SIZE + 1];/* Name of it. */
+};
+
+static const char *state2str(int32_t state)
+{
+#define STATE(x) [XSPLICE_STATE_##x] = #x
+    static const char *const names[] = {
+            STATE(LOADED),
+            STATE(CHECKED),
+            STATE(APPLIED),
+    };
+#undef STATE
+
+    if (state >= ARRAY_SIZE(names))
+        return "unknown";
+
+    if (state < 0)
+        return "-EXX";
+
+    if (!names[state])
+        return "unknown";
+
+    return names[state];
+}
+
+static void xsplice_printall(unsigned char key)
+{
+    struct payload *data;
+
+    spin_lock(&payload_list_lock);
+
+    list_for_each_entry ( data, &payload_list, list )
+        printk(" name=%s state=%s(%d)\n", data->name,
+               state2str(data->state), data->state);
+
+    spin_unlock(&payload_list_lock);
+}
+
+static int verify_name(xen_xsplice_name_t *name)
+{
+    if ( name->size == 0 || name->size > XEN_XSPLICE_NAME_SIZE )
+        return -EINVAL;
+
+    if ( name->pad[0] || name->pad[1] || name->pad[2] )
+        return -EINVAL;
+
+    if ( !guest_handle_okay(name->name, name->size) )
+        return -EINVAL;
+
+    return 0;
+}
+
+static int find_payload(xen_xsplice_name_t *name, bool_t need_lock,
+                        struct payload **f)
+{
+    struct payload *data;
+    XEN_GUEST_HANDLE_PARAM(char) str;
+    char n[XEN_XSPLICE_NAME_SIZE + 1] = { 0 };
+    int rc = -EINVAL;
+
+    rc = verify_name(name);
+    if ( rc )
+        return rc;
+
+    str = guest_handle_cast(name->name, char);
+    if ( copy_from_guest(n, str, name->size) )
+        return -EFAULT;
+
+    if ( need_lock )
+        spin_lock(&payload_list_lock);
+
+    rc = -ENOENT;
+    list_for_each_entry ( data, &payload_list, list )
+    {
+        if ( !strcmp(data->name, n) )
+        {
+            *f = data;
+            rc = 0;
+            break;
+        }
+    }
+
+    if ( need_lock )
+        spin_unlock(&payload_list_lock);
+
+    return rc;
+}
+
+static int verify_payload(xen_sysctl_xsplice_upload_t *upload)
+{
+    if ( verify_name(&upload->name) )
+        return -EINVAL;
+
+    if ( upload->size == 0 )
+        return -EINVAL;
+
+    if ( !guest_handle_okay(upload->payload, upload->size) )
+        return -EFAULT;
+
+    return 0;
+}
+
+/*
+ * We MUST be holding the payload_list_lock spinlock.
+ */
+static void free_payload(struct payload *data)
+{
+    list_del(&data->list);
+    payload_cnt--;
+    payload_version++;
+    xfree(data);
+}
+
+static int xsplice_upload(xen_sysctl_xsplice_upload_t *upload)
+{
+    struct payload *data = NULL;
+    uint8_t *raw_data;
+    int rc;
+
+    rc = verify_payload(upload);
+    if ( rc )
+        return rc;
+
+    rc = find_payload(&upload->name, 1 /* true. */, &data);
+    if ( rc == 0 /* Found. */ )
+        return -EEXIST;
+
+    if ( rc != -ENOENT )
+        return rc;
+
+    data = xzalloc(struct payload);
+    if ( !data )
+        return -ENOMEM;
+
+    memset(data, 0, sizeof *data);
+    rc = -EFAULT;
+    if ( copy_from_guest(data->name, upload->name.name, upload->name.size) )
+        goto err_data;
+
+    rc = -ENOMEM;
+    raw_data = alloc_xenheap_pages(get_order_from_bytes(upload->size), 0);
+    if ( !raw_data )
+        goto err_data;
+
+    rc = -EFAULT;
+    if ( copy_from_guest(raw_data, upload->payload, upload->size) )
+        goto err_raw;
+
+    data->state = XSPLICE_STATE_LOADED;
+    data->rc = 0;
+    INIT_LIST_HEAD(&data->list);
+
+    spin_lock(&payload_list_lock);
+    list_add_tail(&data->list, &payload_list);
+    payload_cnt++;
+    payload_version++;
+    spin_unlock(&payload_list_lock);
+
+    free_xenheap_pages(raw_data, get_order_from_bytes(upload->size));
+    return 0;
+
+ err_raw:
+    free_xenheap_pages(raw_data, get_order_from_bytes(upload->size));
+ err_data:
+    xfree(data);
+    return rc;
+}
+
+static int xsplice_get(xen_sysctl_xsplice_summary_t *summary)
+{
+    struct payload *data;
+    int rc;
+
+    if ( summary->status.state )
+        return -EINVAL;
+
+    if ( summary->status.rc != 0 )
+        return -EINVAL;
+
+    rc = verify_name(&summary->name);
+    if ( rc )
+        return rc;
+
+    rc = find_payload(&summary->name, 1 /* true. */, &data);
+    if ( rc )
+        return rc;
+
+    summary->status.state = data->state;
+    summary->status.rc = data->rc;
+
+    return 0;
+}
+
+static int xsplice_list(xen_sysctl_xsplice_list_t *list)
+{
+    xen_xsplice_status_t status;
+    struct payload *data;
+    unsigned int idx = 0, i = 0;
+    int rc = 0;
+
+    if ( list->nr > 1024 )
+        return -E2BIG;
+
+    if ( list->pad != 0 )
+        return -EINVAL;
+
+    if ( !guest_handle_okay(list->status, sizeof(status) * list->nr) ||
+         !guest_handle_okay(list->name, XEN_XSPLICE_NAME_SIZE * list->nr) ||
+         !guest_handle_okay(list->len, sizeof(uint32_t) * list->nr) )
+        return -EINVAL;
+
+    spin_lock(&payload_list_lock);
+    if ( list->idx > payload_cnt || !list->nr )
+    {
+        spin_unlock(&payload_list_lock);
+        return -EINVAL;
+    }
+
+    list_for_each_entry( data, &payload_list, list )
+    {
+        uint32_t len;
+
+        if ( list->idx > i++ )
+            continue;
+
+        status.state = data->state;
+        status.rc = data->rc;
+        len = strlen(data->name);
+
+        /* N.B. 'idx' != 'i'. */
+        if ( __copy_to_guest_offset(list->name, idx * XEN_XSPLICE_NAME_SIZE,
+                                    data->name, len) ||
+             __copy_to_guest_offset(list->len, idx, &len, 1) ||
+             __copy_to_guest_offset(list->status, idx, &status, 1) )
+        {
+            rc = -EFAULT;
+            break;
+        }
+        idx++;
+        if ( hypercall_preempt_check() || (idx + 1 > list->nr) )
+            break;
+    }
+    list->nr = payload_cnt - i; /* Remaining amount. */
+    list->version = payload_version;
+    spin_unlock(&payload_list_lock);
+
+    /* And how many we have processed. */
+    return rc ? : idx;
+}
+
+static int xsplice_action(xen_sysctl_xsplice_action_t *action)
+{
+    struct payload *data;
+    int rc;
+
+    rc = verify_name(&action->name);
+    if ( rc )
+        return rc;
+
+    spin_lock(&payload_list_lock);
+    rc = find_payload(&action->name, 0 /* We are holding the lock. */, &data);
+    if ( rc )
+        goto out;
+
+    switch ( action->cmd )
+    {
+    case XSPLICE_ACTION_CHECK:
+        if ( (data->state == XSPLICE_STATE_LOADED) ||
+             (data->state == XSPLICE_STATE_CHECKED) )
+        {
+            /* No implementation yet. */
+            data->state = XSPLICE_STATE_CHECKED;
+            data->rc = 0;
+            rc = 0;
+        }
+        break;
+    case XSPLICE_ACTION_UNLOAD:
+        if ( (data->state == XSPLICE_STATE_LOADED) ||
+             (data->state == XSPLICE_STATE_CHECKED) )
+        {
+            free_payload(data);
+            /* No touching 'data' from here on! */
+            rc = 0;
+        }
+        break;
+    case XSPLICE_ACTION_REVERT:
+        if ( data->state == XSPLICE_STATE_APPLIED )
+        {
+            /* No implementation yet. */
+            data->state = XSPLICE_STATE_CHECKED;
+            data->rc = 0;
+            rc = 0;
+        }
+        break;
+    case XSPLICE_ACTION_APPLY:
+        if ( (data->state == XSPLICE_STATE_CHECKED) )
+        {
+            /* No implementation yet. */
+            data->state = XSPLICE_STATE_APPLIED;
+            data->rc = 0;
+            rc = 0;
+        }
+        break;
+    case XSPLICE_ACTION_REPLACE:
+        if ( data->state == XSPLICE_STATE_CHECKED )
+        {
+            /* No implementation yet. */
+            data->state = XSPLICE_STATE_CHECKED;
+            data->rc = 0;
+            rc = 0;
+        }
+        break;
+    default:
+        rc = -EOPNOTSUPP;
+        break;
+    }
+
+ out:
+    spin_unlock(&payload_list_lock);
+
+    return rc;
+}
+
+int xsplice_control(xen_sysctl_xsplice_op_t *xsplice)
+{
+    int rc;
+
+    if ( xsplice->pad != 0 )
+        return -EINVAL;
+
+    switch ( xsplice->cmd )
+    {
+    case XEN_SYSCTL_XSPLICE_UPLOAD:
+        rc = xsplice_upload(&xsplice->u.upload);
+        break;
+    case XEN_SYSCTL_XSPLICE_GET:
+        rc = xsplice_get(&xsplice->u.get);
+        break;
+    case XEN_SYSCTL_XSPLICE_LIST:
+        rc = xsplice_list(&xsplice->u.list);
+        break;
+    case XEN_SYSCTL_XSPLICE_ACTION:
+        rc = xsplice_action(&xsplice->u.action);
+        break;
+    default:
+        rc = -EOPNOTSUPP;
+        break;
+   }
+
+    return rc;
+}
+
+static int __init xsplice_init(void)
+{
+    register_keyhandler('x', xsplice_printall, "print xsplicing info", 1);
+    return 0;
+}
+__initcall(xsplice_init);
diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
index 96680eb..0b0b879 100644
--- a/xen/include/public/sysctl.h
+++ b/xen/include/public/sysctl.h
@@ -766,6 +766,160 @@ struct xen_sysctl_tmem_op {
 typedef struct xen_sysctl_tmem_op xen_sysctl_tmem_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_sysctl_tmem_op_t);
 
+/*
+ * XEN_SYSCTL_XSPLICE_op
+ *
+ * Refer to the http://xenbits.xenproject.org/docs/unstable/misc/xsplice.html
+ * for the design details of this hyprcall.
+ */
+
+/*
+ * Structure describing an ELF payload. Uniquely identifies the
+ * payload. Should be human readable.
+ * Recommended length is upto XEN_XSPLICE_NAME_SIZE.
+ */
+#define XEN_XSPLICE_NAME_SIZE 128
+struct xen_xsplice_name {
+    XEN_GUEST_HANDLE_64(char) name;         /* IN: pointer to name. */
+    uint16_t size;                          /* IN: size of name. May be upto
+                                               XEN_XSPLICE_NAME_SIZE. */
+    uint16_t pad[3];                        /* IN: MUST be zero. */
+};
+typedef struct xen_xsplice_name xen_xsplice_name_t;
+DEFINE_XEN_GUEST_HANDLE(xen_xsplice_name_t);
+
+/*
+ * Upload a payload to the hypervisor. The payload is verified
+ * against basic checks and if there are any issues the proper return code
+ * will be returned. The payload is not applied at this time - that is
+ * controlled by XEN_SYSCTL_XSPLICE_ACTION.
+ *
+ * The return value is zero if the payload was succesfully uploaded.
+ * Otherwise an EXX return value is provided. Duplicate `name` are not
+ * supported.
+ *
+ * The payload at this point is verified against the basic checks.
+ *
+ * The `payload` is the ELF payload as mentioned in the `Payload format`
+ * section in the xSplice design document.
+ */
+#define XEN_SYSCTL_XSPLICE_UPLOAD 0
+struct xen_sysctl_xsplice_upload {
+    xen_xsplice_name_t name;                /* IN, name of the patch. */
+    uint64_t size;                          /* IN, size of the ELF file. */
+    XEN_GUEST_HANDLE_64(uint8) payload;     /* IN, the ELF file. */
+};
+typedef struct xen_sysctl_xsplice_upload xen_sysctl_xsplice_upload_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_xsplice_upload_t);
+
+/*
+ * Retrieve an status of an specific payload.
+ *
+ * Upon completion the `struct xen_xsplice_status` is updated.
+ *
+ * The return value is zero on success and XEN_EXX on failure. This operation
+ * is synchronous and does not require preemption.
+ */
+#define XEN_SYSCTL_XSPLICE_GET 1
+
+struct xen_xsplice_status {
+#define XSPLICE_STATE_LOADED       1
+#define XSPLICE_STATE_CHECKED      2
+#define XSPLICE_STATE_APPLIED      3
+    int32_t state;                 /* OUT: XSPLICE_STATE_*. IN: MUST be zero. */
+    int32_t rc;                    /* OUT: 0 if no error, otherwise -XEN_EXX. */
+                                   /* IN: MUST be zero. */
+};
+typedef struct xen_xsplice_status xen_xsplice_status_t;
+DEFINE_XEN_GUEST_HANDLE(xen_xsplice_status_t);
+
+struct xen_sysctl_xsplice_summary {
+    xen_xsplice_name_t name;                /* IN, name of the payload. */
+    xen_xsplice_status_t status;            /* IN/OUT, state of it. */
+};
+typedef struct xen_sysctl_xsplice_summary xen_sysctl_xsplice_summary_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_xsplice_summary_t);
+
+/*
+ * Retrieve an array of abbreviated status and names of payloads that are
+ * loaded in the hypervisor.
+ *
+ * If the hypercall returns an positive number, it is the number (up to `nr`)
+ * of the payloads returned, along with `nr` updated with the number of remaining
+ * payloads, `version` updated (it may be the same across hypercalls. If it
+ * varies the data is stale and further calls could fail). The `status`,
+ * `name`, and `len`' are updated at their designed index value (`idx`) with
+ * the returned value of data.
+ *
+ * If the hypercall returns E2BIG the `nr` is too big and should be
+ * lowered.
+ *
+ * This operation can be preempted by the hypercall returning EAGAIN.
+ * Retry.
+ *
+ * Note that due to the asynchronous nature of hypercalls the domain might have
+ * added or removed the number of payloads making this information stale. It is
+ * the responsibility of the toolstack to use the `version` field to check
+ * between each invocation. if the version differs it should discard the stale
+ * data and start from scratch. It is OK for the toolstack to use the new
+ * `version` field.
+ */
+#define XEN_SYSCTL_XSPLICE_LIST 2
+struct xen_sysctl_xsplice_list {
+    uint32_t version;                       /* IN/OUT: Initially *MUST* be zero.
+                                               On subsequent calls reuse value.
+                                               If varies between calls, we are
+                                             * getting stale data. */
+    uint32_t idx;                           /* IN/OUT: Index into array. */
+    uint32_t nr;                            /* IN: How many status, id, and len
+                                               should fill out.
+                                               OUT: How many payloads left. */
+    uint32_t pad;                           /* IN: Must be zero. */
+    XEN_GUEST_HANDLE_64(xen_xsplice_status_t) status;  /* OUT. Must have enough
+                                               space allocate for nr of them. */
+    XEN_GUEST_HANDLE_64(char) name;         /* OUT: Array of ids. Each member
+                                               MUST XEN_XSPLICE_NAME_SIZE in size.
+                                               Must have nr of them. */
+    XEN_GUEST_HANDLE_64(uint32) len;        /* OUT: Array of lengths of ids.
+                                               Must have nr of them. */
+};
+typedef struct xen_sysctl_xsplice_list xen_sysctl_xsplice_list_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_xsplice_list_t);
+
+/*
+ * Perform an operation on the payload structure referenced by the `name` field.
+ * The operation request is asynchronous and the status should be retrieved
+ * by using either XEN_SYSCTL_XSPLICE_GET or XEN_SYSCTL_XSPLICE_LIST hypercall.
+ */
+#define XEN_SYSCTL_XSPLICE_ACTION 3
+struct xen_sysctl_xsplice_action {
+    xen_xsplice_name_t name;                /* IN, name of the patch. */
+#define XSPLICE_ACTION_CHECK        1
+#define XSPLICE_ACTION_UNLOAD       2
+#define XSPLICE_ACTION_REVERT       3
+#define XSPLICE_ACTION_APPLY        4
+#define XSPLICE_ACTION_REPLACE      5
+    uint32_t cmd;                           /* IN: XSPLICE_ACTION_*. */
+    uint32_t timeout;                       /* IN: Zero if no timeout. */
+                                            /* Or upper bound of time (ms) */
+                                            /* for operation to take. */
+};
+typedef struct xen_sysctl_xsplice_action xen_sysctl_xsplice_action_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_xsplice_action_t);
+
+struct xen_sysctl_xsplice_op {
+    uint32_t cmd;                           /* IN: XEN_SYSCTL_XSPLICE_*. */
+    uint32_t pad;                           /* IN: Always zero. */
+    union {
+        xen_sysctl_xsplice_upload_t upload;
+        xen_sysctl_xsplice_list_t list;
+        xen_sysctl_xsplice_summary_t get;
+        xen_sysctl_xsplice_action_t action;
+    } u;
+};
+typedef struct xen_sysctl_xsplice_op xen_sysctl_xsplice_op_t;
+DEFINE_XEN_GUEST_HANDLE(xen_sysctl_xsplice_op_t);
+
 struct xen_sysctl {
     uint32_t cmd;
 #define XEN_SYSCTL_readconsole                    1
@@ -791,6 +945,7 @@ struct xen_sysctl {
 #define XEN_SYSCTL_pcitopoinfo                   22
 #define XEN_SYSCTL_psr_cat_op                    23
 #define XEN_SYSCTL_tmem_op                       24
+#define XEN_SYSCTL_xsplice_op                    25
     uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
     union {
         struct xen_sysctl_readconsole       readconsole;
@@ -816,6 +971,7 @@ struct xen_sysctl {
         struct xen_sysctl_psr_cmt_op        psr_cmt_op;
         struct xen_sysctl_psr_cat_op        psr_cat_op;
         struct xen_sysctl_tmem_op           tmem_op;
+        struct xen_sysctl_xsplice_op        xsplice;
         uint8_t                             pad[128];
     } u;
 };
diff --git a/xen/include/xen/xsplice.h b/xen/include/xen/xsplice.h
new file mode 100644
index 0000000..2cb2035
--- /dev/null
+++ b/xen/include/xen/xsplice.h
@@ -0,0 +1,7 @@
+#ifndef __XEN_XSPLICE_H__
+#define __XEN_XSPLICE_H__
+
+struct xen_sysctl_xsplice_op;
+int xsplice_control(struct xen_sysctl_xsplice_op *);
+
+#endif /* __XEN_XSPLICE_H__ */
diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
index 9b7de30..5346dcf 100644
--- a/xen/xsm/flask/hooks.c
+++ b/xen/xsm/flask/hooks.c
@@ -807,6 +807,12 @@ static int flask_sysctl(int cmd)
     case XEN_SYSCTL_tmem_op:
         return domain_has_xen(current->domain, XEN__TMEM_CONTROL);
 
+#ifdef CONFIG_XSPLICE
+    case XEN_SYSCTL_xsplice_op:
+        return avc_current_has_perm(SECINITSID_XEN, SECCLASS_XEN2,
+                                    XEN2__XSPLICE_OP, NULL);
+#endif
+
     default:
         printk("flask_sysctl: Unknown op %d\n", cmd);
         return -EPERM;
diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
index effb59f..5f08d05 100644
--- a/xen/xsm/flask/policy/access_vectors
+++ b/xen/xsm/flask/policy/access_vectors
@@ -93,6 +93,8 @@ class xen2
     pmu_ctrl
 # PMU use (domains, including unprivileged ones, will be using this operation)
     pmu_use
+# XEN_SYSCTL_xsplice_op
+    xsplice_op
 }
 
 # Classes domain and domain2 consist of operations that a domain performs on
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 04/13] libxc: Implementation of XEN_XSPLICE_op in libxc (v4).
  2016-01-14 21:46 [PATCH v2] xSplice v1 implementation Konrad Rzeszutek Wilk
                   ` (2 preceding siblings ...)
  2016-01-14 21:47 ` [PATCH v2 03/13] xen/xsplice: Hypervisor implementation of XEN_XSPLICE_op (v7) Konrad Rzeszutek Wilk
@ 2016-01-14 21:47 ` Konrad Rzeszutek Wilk
  2016-01-19 11:14   ` Wei Liu
  2016-01-14 21:47 ` [PATCH v2 05/13] xen-xsplice: Tool to manipulate xsplice payloads (v3) Konrad Rzeszutek Wilk
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-14 21:47 UTC (permalink / raw)
  To: xen-devel, ross.lagerwall, mpohlack, andrew.cooper3,
	stefano.stabellini, jbeulich, ian.jackson, ian.campbell,
	wei.liu2, sasha.levin
  Cc: Konrad Rzeszutek Wilk

The underlaying toolstack code to do the basic
operations when using the XEN_XSPLICE_op syscalls:
 - upload the payload,
 - get status of an payload,
 - list all the payloads,
 - apply, check, replace, and revert the payload.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
---
v2: Actually set zero for the _pad entries.
v3: Split status into state and error code.
    Add REPLACE action.
v4: Use timeout and utilize pads.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 tools/libxc/include/xenctrl.h |  18 +++
 tools/libxc/xc_misc.c         | 284 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 302 insertions(+)

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 01a6dda..bb80c8d 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -2851,6 +2851,24 @@ int xc_psr_cat_get_l3_info(xc_interface *xch, uint32_t socket,
                            bool *cdp_enabled);
 #endif
 
+int xc_xsplice_upload(xc_interface *xch,
+                      char *name, char *payload, uint32_t size);
+
+int xc_xsplice_get(xc_interface *xch,
+                   char *name,
+                   xen_xsplice_status_t *status);
+
+int xc_xsplice_list(xc_interface *xch, unsigned int max, unsigned int start,
+                    xen_xsplice_status_t *info, char *name,
+                    uint32_t *len, unsigned int *done,
+                    unsigned int *left);
+
+int xc_xsplice_apply(xc_interface *xch, char *name, uint32_t timeout);
+int xc_xsplice_revert(xc_interface *xch, char *name, uint32_t timeout);
+int xc_xsplice_unload(xc_interface *xch, char *name, uint32_t timeout);
+int xc_xsplice_check(xc_interface *xch, char *name, uint32_t timeout);
+int xc_xsplice_replace(xc_interface *xch, char *name, uint32_t timeout);
+
 #endif /* XENCTRL_H */
 
 /*
diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
index c613545..beadcb1 100644
--- a/tools/libxc/xc_misc.c
+++ b/tools/libxc/xc_misc.c
@@ -718,6 +718,290 @@ int xc_hvm_inject_trap(
     return rc;
 }
 
+int xc_xsplice_upload(xc_interface *xch,
+                      char *name,
+                      char *payload,
+                      uint32_t size)
+{
+    int rc;
+    DECLARE_SYSCTL;
+    DECLARE_HYPERCALL_BOUNCE(payload, size, XC_HYPERCALL_BUFFER_BOUNCE_IN);
+    DECLARE_HYPERCALL_BOUNCE(name, 0 /* adjust later */, XC_HYPERCALL_BUFFER_BOUNCE_IN);
+    xen_xsplice_name_t def_name = { .pad = { 0, 0, 0 } };
+
+    if ( !name || !payload )
+        return -1;
+
+    def_name.size = strlen(name);
+    if ( def_name.size > XEN_XSPLICE_NAME_SIZE )
+        return -1;
+
+    HYPERCALL_BOUNCE_SET_SIZE(name, def_name.size );
+
+    if ( xc_hypercall_bounce_pre(xch, name) )
+        return -1;
+
+    if ( xc_hypercall_bounce_pre(xch, payload) )
+        return -1;
+
+    sysctl.cmd = XEN_SYSCTL_xsplice_op;
+    sysctl.u.xsplice.cmd = XEN_SYSCTL_XSPLICE_UPLOAD;
+    sysctl.u.xsplice.pad = 0;
+    sysctl.u.xsplice.u.upload.size = size;
+    set_xen_guest_handle(sysctl.u.xsplice.u.upload.payload, payload);
+
+    sysctl.u.xsplice.u.upload.name = def_name;
+    set_xen_guest_handle(sysctl.u.xsplice.u.upload.name.name, name);
+
+    rc = do_sysctl(xch, &sysctl);
+
+    xc_hypercall_bounce_post(xch, payload);
+    xc_hypercall_bounce_post(xch, name);
+
+    return rc;
+}
+
+int xc_xsplice_get(xc_interface *xch,
+                   char *name,
+                   xen_xsplice_status_t *status)
+{
+    int rc;
+    DECLARE_SYSCTL;
+    DECLARE_HYPERCALL_BOUNCE(name, 0 /*adjust later */, XC_HYPERCALL_BUFFER_BOUNCE_IN);
+    xen_xsplice_name_t def_name = { .pad = { 0, 0, 0 } };
+
+    if ( !name )
+        return -1;
+
+    def_name.size = strlen(name);
+    if ( def_name.size > XEN_XSPLICE_NAME_SIZE )
+        return -1;
+
+    HYPERCALL_BOUNCE_SET_SIZE(name, def_name.size );
+
+    if ( xc_hypercall_bounce_pre(xch, name) )
+        return -1;
+
+    sysctl.cmd = XEN_SYSCTL_xsplice_op;
+    sysctl.u.xsplice.cmd = XEN_SYSCTL_XSPLICE_GET;
+    sysctl.u.xsplice.pad = 0;
+
+    sysctl.u.xsplice.u.get.status.state = 0;
+    sysctl.u.xsplice.u.get.status.rc = 0;
+
+    sysctl.u.xsplice.u.get.name = def_name;
+    set_xen_guest_handle(sysctl.u.xsplice.u.get.name.name, name);
+
+    rc = do_sysctl(xch, &sysctl);
+
+    xc_hypercall_bounce_post(xch, name);
+
+    memcpy(status, &sysctl.u.xsplice.u.get.status, sizeof(*status));
+
+    return rc;
+}
+
+int xc_xsplice_list(xc_interface *xch, unsigned int max, unsigned int start,
+                    xen_xsplice_status_t *info,
+                    char *name, uint32_t *len,
+                    unsigned int *done,
+                    unsigned int *left)
+{
+    int rc;
+    DECLARE_SYSCTL;
+    DECLARE_HYPERCALL_BOUNCE(info, 0 /* adjust later. */, XC_HYPERCALL_BUFFER_BOUNCE_OUT);
+    DECLARE_HYPERCALL_BOUNCE(name, 0 /* adjust later. */, XC_HYPERCALL_BUFFER_BOUNCE_OUT);
+    DECLARE_HYPERCALL_BOUNCE(len, 0 /* adjust later. */, XC_HYPERCALL_BUFFER_BOUNCE_OUT);
+    uint32_t max_batch_sz, nr;
+    uint32_t version = 0, retries = 0;
+    uint32_t adjust = 0;
+
+    if ( !max || !info || !name || !len )
+        return -1;
+
+    sysctl.cmd = XEN_SYSCTL_xsplice_op;
+    sysctl.u.xsplice.cmd = XEN_SYSCTL_XSPLICE_LIST;
+    sysctl.u.xsplice.pad = 0;
+    sysctl.u.xsplice.u.list.version = 0;
+    sysctl.u.xsplice.u.list.idx = start;
+    sysctl.u.xsplice.u.list.pad = 0;
+
+    max_batch_sz = max;
+
+    *done = 0;
+    *left = 0;
+    do {
+        if ( adjust )
+            adjust = 0; /* Used when adjusting the 'max_batch_sz' or 'retries'. */
+
+        nr = min(max - *done, max_batch_sz);
+
+        sysctl.u.xsplice.u.list.nr = nr;
+        /* Fix the size (may vary between hypercalls). */
+        HYPERCALL_BOUNCE_SET_SIZE(info, nr * sizeof(*info));
+        HYPERCALL_BOUNCE_SET_SIZE(name, nr * sizeof(*name) * XEN_XSPLICE_NAME_SIZE);
+        HYPERCALL_BOUNCE_SET_SIZE(len, nr * sizeof(*len));
+        /* Move the pointer to proper offset into 'info'. */
+        (HYPERCALL_BUFFER(info))->ubuf = info + *done;
+        (HYPERCALL_BUFFER(name))->ubuf = name + (sizeof(*name) * XEN_XSPLICE_NAME_SIZE * *done);
+        (HYPERCALL_BUFFER(len))->ubuf = len + *done;
+        /* Allocate memory. */
+        rc = xc_hypercall_bounce_pre(xch, info);
+        if ( rc )
+            return rc;
+
+        rc = xc_hypercall_bounce_pre(xch, name);
+        if ( rc )
+        {
+            xc_hypercall_bounce_post(xch, info);
+            return rc;
+        }
+        rc = xc_hypercall_bounce_pre(xch, len);
+        if ( rc )
+        {
+            xc_hypercall_bounce_post(xch, info);
+            xc_hypercall_bounce_post(xch, name);
+            return rc;
+        }
+        set_xen_guest_handle(sysctl.u.xsplice.u.list.status, info);
+        set_xen_guest_handle(sysctl.u.xsplice.u.list.name, name);
+        set_xen_guest_handle(sysctl.u.xsplice.u.list.len, len);
+
+        rc = do_sysctl(xch, &sysctl);
+        /*
+         * From here on we MUST call xc_hypercall_bounce. If rc < 0 we
+         * end up doing it (outside the loop), so using a break is OK.
+         */
+        if ( rc < 0 && errno == E2BIG )
+        {
+            if ( max_batch_sz <= 1 )
+                break;
+            max_batch_sz >>= 1;
+            adjust = 1; /* For the loop conditional to let us loop again. */
+            /* No memory leaks! */
+            xc_hypercall_bounce_post(xch, info);
+            xc_hypercall_bounce_post(xch, name);
+            xc_hypercall_bounce_post(xch, len);
+            continue;
+        }
+        else if ( rc < 0 ) /* For all other errors we bail out. */
+            break;
+
+        if ( !version )
+            version = sysctl.u.xsplice.u.list.version;
+
+        if ( sysctl.u.xsplice.u.list.version != version )
+        {
+            /* TODO: retries should be configurable? */
+            if ( retries++ > 3 )
+            {
+                rc = -1;
+                errno = EBUSY;
+                break;
+            }
+            *done = 0; /* Retry from scratch. */
+            version = sysctl.u.xsplice.u.list.version;
+            adjust = 1; /* And make sure we continue in the loop. */
+            /* No memory leaks. */
+            xc_hypercall_bounce_post(xch, info);
+            xc_hypercall_bounce_post(xch, name);
+            xc_hypercall_bounce_post(xch, len);
+            continue;
+        }
+
+        /* We should never hit this, but just in case. */
+        if ( rc > nr )
+        {
+            errno = EINVAL; /* Overflow! */
+            rc = -1;
+            break;
+        }
+        *left = sysctl.u.xsplice.u.list.nr; /* Total remaining count. */
+        /* Copy only up 'rc' of data' - we could add 'min(rc,nr) if desired. */
+        HYPERCALL_BOUNCE_SET_SIZE(info, (rc * sizeof(*info)));
+        HYPERCALL_BOUNCE_SET_SIZE(name, (rc * sizeof(*name) * XEN_XSPLICE_NAME_SIZE));
+        HYPERCALL_BOUNCE_SET_SIZE(len, (rc * sizeof(*len)));
+        /* Bounce the data and free the bounce buffer. */
+        xc_hypercall_bounce_post(xch, info);
+        xc_hypercall_bounce_post(xch, name);
+        xc_hypercall_bounce_post(xch, len);
+        /* And update how many elements of info we have copied into. */
+        *done += rc;
+        /* Update idx. */
+        sysctl.u.xsplice.u.list.idx = *done;
+    } while ( adjust || (*done < max && *left != 0) );
+
+    if ( rc < 0 )
+    {
+        xc_hypercall_bounce_post(xch, len);
+        xc_hypercall_bounce_post(xch, name);
+        xc_hypercall_bounce_post(xch, info);
+    }
+
+    return rc > 0 ? 0 : rc;
+}
+
+static int _xc_xsplice_action(xc_interface *xch,
+                              char *name,
+                              unsigned int action,
+                              uint32_t timeout)
+{
+    int rc;
+    DECLARE_SYSCTL;
+    DECLARE_HYPERCALL_BOUNCE(name, 0 /* adjust later */, XC_HYPERCALL_BUFFER_BOUNCE_IN);
+    xen_xsplice_name_t def_name = { .pad = { 0, 0, 0 } };
+
+    def_name.size = strlen(name);
+
+    if ( def_name.size > XEN_XSPLICE_NAME_SIZE )
+        return -1;
+
+    HYPERCALL_BOUNCE_SET_SIZE(name, def_name.size);
+
+    if ( xc_hypercall_bounce_pre(xch, name) )
+        return -1;
+
+    sysctl.cmd = XEN_SYSCTL_xsplice_op;
+    sysctl.u.xsplice.cmd = XEN_SYSCTL_XSPLICE_ACTION;
+    sysctl.u.xsplice.pad = 0;
+    sysctl.u.xsplice.u.action.cmd = action;
+    sysctl.u.xsplice.u.action.timeout = timeout;
+
+    sysctl.u.xsplice.u.action.name = def_name;
+    set_xen_guest_handle(sysctl.u.xsplice.u.action.name.name, name);
+
+    rc = do_sysctl(xch, &sysctl);
+
+    xc_hypercall_bounce_post(xch, name);
+
+    return rc;
+}
+
+int xc_xsplice_apply(xc_interface *xch, char *name, uint32_t timeout)
+{
+    return _xc_xsplice_action(xch, name, XSPLICE_ACTION_APPLY, timeout);
+}
+
+int xc_xsplice_revert(xc_interface *xch, char *name, uint32_t timeout)
+{
+    return _xc_xsplice_action(xch, name, XSPLICE_ACTION_REVERT, timeout);
+}
+
+int xc_xsplice_unload(xc_interface *xch, char *name, uint32_t timeout)
+{
+    return _xc_xsplice_action(xch, name, XSPLICE_ACTION_UNLOAD, timeout);
+}
+
+int xc_xsplice_check(xc_interface *xch, char *name, uint32_t timeout)
+{
+    return _xc_xsplice_action(xch, name, XSPLICE_ACTION_CHECK, timeout);
+}
+
+int xc_xsplice_replace(xc_interface *xch, char *name, uint32_t timeout)
+{
+    return _xc_xsplice_action(xch, name, XSPLICE_ACTION_REPLACE, timeout);
+}
+
 /*
  * Local variables:
  * mode: C
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 05/13] xen-xsplice: Tool to manipulate xsplice payloads (v3)
  2016-01-14 21:46 [PATCH v2] xSplice v1 implementation Konrad Rzeszutek Wilk
                   ` (3 preceding siblings ...)
  2016-01-14 21:47 ` [PATCH v2 04/13] libxc: Implementation of XEN_XSPLICE_op in libxc (v4) Konrad Rzeszutek Wilk
@ 2016-01-14 21:47 ` Konrad Rzeszutek Wilk
  2016-01-19 11:14   ` Wei Liu
  2016-01-19 14:30   ` Ross Lagerwall
  2016-01-14 21:47 ` [PATCH v2 06/13] elf: Add relocation types to elfstructs.h Konrad Rzeszutek Wilk
                   ` (8 subsequent siblings)
  13 siblings, 2 replies; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-14 21:47 UTC (permalink / raw)
  To: xen-devel, ross.lagerwall, mpohlack, andrew.cooper3,
	stefano.stabellini, jbeulich, ian.jackson, ian.campbell,
	wei.liu2, sasha.levin
  Cc: Konrad Rzeszutek Wilk

A simple tool that allows an system admin to perform
basic xsplice operations:

 - Upload a xsplice file (with an unique id)
 - List all the xsplice payloads loaded.
 - Apply, revert, replace, unload, or check the payload using the
   unique id.
 - Do all three - upload, check, and apply the
   payload in one go (load). Also will use the name of the
   file as the <id>

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
---
v2:
 - Removed REVERTED state.
 - Fixed bugs handling XSPLICE_STATUS_PROGRESS.
 - Split status into state and error.
   Add REPLACE action.
v3:
 - Utilize the timeout and use the default one (let the hypervisor
   pick it).
 - Change the s/all/load and infer the <id> from name of file.
---
 .gitignore               |   1 +
 tools/misc/Makefile      |   4 +
 tools/misc/xen-xsplice.c | 475 +++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 480 insertions(+)
 create mode 100644 tools/misc/xen-xsplice.c

diff --git a/.gitignore b/.gitignore
index e0df903..2528d8f 100644
--- a/.gitignore
+++ b/.gitignore
@@ -169,6 +169,7 @@ tools/misc/xc_shadow
 tools/misc/xen_cpuperf
 tools/misc/xen-detect
 tools/misc/xen-tmem-list-parse
+tools/misc/xen-xsplice
 tools/misc/xenperf
 tools/misc/xenpm
 tools/misc/xen-hvmctx
diff --git a/tools/misc/Makefile b/tools/misc/Makefile
index c4490f3..c46873e 100644
--- a/tools/misc/Makefile
+++ b/tools/misc/Makefile
@@ -30,6 +30,7 @@ INSTALL_SBIN                   += xenlockprof
 INSTALL_SBIN                   += xenperf
 INSTALL_SBIN                   += xenpm
 INSTALL_SBIN                   += xenwatchdogd
+INSTALL_SBIN                   += xen-xsplice
 INSTALL_SBIN += $(INSTALL_SBIN-y)
 
 # Everything to be installed in a private bin/
@@ -98,6 +99,9 @@ xen-mfndump: xen-mfndump.o
 xenwatchdogd: xenwatchdogd.o
 	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
 
+xen-xsplice: xen-xsplice.o
+	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
+
 xen-lowmemd: xen-lowmemd.o
 	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(LDLIBS_libxenstore) $(APPEND_LDFLAGS)
 
diff --git a/tools/misc/xen-xsplice.c b/tools/misc/xen-xsplice.c
new file mode 100644
index 0000000..0c7f4da
--- /dev/null
+++ b/tools/misc/xen-xsplice.c
@@ -0,0 +1,475 @@
+/*
+ * Copyright (c) 2016 Oracle and/or its affiliates. All rights reserved.
+ */
+
+#include <fcntl.h>
+#include <libgen.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <sys/stat.h>
+#include <unistd.h>
+#include <xenctrl.h>
+#include <xenstore.h>
+
+static xc_interface *xch;
+
+void show_help(void)
+{
+    fprintf(stderr,
+            "xen-xsplice: Xsplice test tool\n"
+            "Usage: xen-xsplice <command> [args]\n"
+            " <id> An unique name of payload. Up to %d characters.\n"
+            "Commands:\n"
+            "  help                 display this help\n"
+            "  upload <id> <file>   upload file <file> with <id> name\n"
+            "  list                 list payloads uploaded.\n"
+            "  apply <id>           apply <id> patch.\n"
+            "  revert <id>          revert id <id> patch.\n"
+            "  replace <id>         apply <id> patch and revert all others.\n"
+            "  unload <id>          unload id <id> patch.\n"
+            "  check <id>           check id <id> patch.\n"
+            "  load  <file>         upload, check and apply <file>.\n"
+            "                       id is the <file> name\n",
+            XEN_XSPLICE_NAME_SIZE);
+}
+
+/* wrapper function */
+static int help_func(int argc, char *argv[])
+{
+    show_help();
+    return 0;
+}
+
+#define ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))
+
+static const char *state2str(long state)
+{
+#define STATE(x) [XSPLICE_STATE_##x] = #x
+    static const char *const names[] = {
+            STATE(LOADED),
+            STATE(CHECKED),
+            STATE(APPLIED),
+    };
+#undef STATE
+    if (state >= ARRAY_SIZE(names))
+        return "unknown";
+
+    if (state < 0)
+        return "-EXX";
+
+    if (!names[state])
+        return "unknown";
+
+    return names[state];
+}
+
+/* This value was choosen adhoc. It could be 42 too. */
+#define MAX_LEN 11
+static int list_func(int argc, char *argv[])
+{
+    unsigned int idx, done, left, i;
+    xen_xsplice_status_t *info = NULL;
+    char *id = NULL;
+    uint32_t *len = NULL;
+    int rc = ENOMEM;
+
+    if ( argc )
+    {
+        show_help();
+        return -1;
+    }
+    idx = left = 0;
+    info = malloc(sizeof(*info) * MAX_LEN);
+    if ( !info )
+        goto out;
+    id = malloc(sizeof(*id) * XEN_XSPLICE_NAME_SIZE * MAX_LEN);
+    if ( !id )
+        goto out;
+    len = malloc(sizeof(*len) * MAX_LEN);
+    if ( !len )
+        goto out;
+
+    fprintf(stdout," ID                                     | status\n"
+                   "----------------------------------------+------------\n");
+    do {
+        done = 0;
+        memset(info, 'A', sizeof(*info) * MAX_LEN); /* Optional. */
+        memset(id, 'i', sizeof(*id) * MAX_LEN * XEN_XSPLICE_NAME_SIZE); /* Optional. */
+        memset(len, 'l', sizeof(*len) * MAX_LEN); /* Optional. */
+        rc = xc_xsplice_list(xch, MAX_LEN, idx, info, id, len, &done, &left);
+        if ( rc )
+        {
+            fprintf(stderr, "Failed to list %d/%d: %d(%s)!\n", idx, left, errno, strerror(errno));
+            break;
+        }
+        for ( i = 0; i < done; i++ )
+        {
+            unsigned int j;
+            uint32_t sz;
+            char *str;
+
+            sz = len[i];
+            str = id + (i * XEN_XSPLICE_NAME_SIZE);
+            for ( j = sz; j < XEN_XSPLICE_NAME_SIZE; j++ )
+                str[j] = '\0';
+
+            printf("%-40s| %s", str, state2str(info[i].state));
+            if ( info[i].rc )
+                printf(" (%d, %s)\n", -info[i].rc, strerror(-info[i].rc));
+            else
+                puts("");
+        }
+        idx += done;
+    } while ( left );
+
+out:
+    free(id);
+    free(info);
+    free(len);
+    return rc;
+}
+#undef MAX_LEN
+
+static int get_id(int argc, char *argv[], char *id)
+{
+    ssize_t len = strlen(argv[0]);
+    if ( len > XEN_XSPLICE_NAME_SIZE )
+    {
+        fprintf(stderr, "ID MUST be %d characters!\n", XEN_XSPLICE_NAME_SIZE);
+        errno = EINVAL;
+        return errno;
+    }
+    /* Don't want any funny strings from the stack. */
+    memset(id, 0, XEN_XSPLICE_NAME_SIZE);
+    strncpy(id, argv[0], len);
+    return 0;
+}
+
+static int upload_func(int argc, char *argv[])
+{
+    char *filename;
+    char id[XEN_XSPLICE_NAME_SIZE];
+    int fd = 0, rc;
+    struct stat buf;
+    unsigned char *fbuf;
+    ssize_t len;
+    DECLARE_HYPERCALL_BUFFER(char, payload);
+
+    if ( argc != 2 )
+    {
+        show_help();
+        return -1;
+    }
+
+    if ( get_id(argc, argv, id) )
+        return EINVAL;
+
+    filename = argv[1];
+    fd = open(filename, O_RDONLY);
+    if ( fd < 0 )
+    {
+        fprintf(stderr, "Could not open %s, error: %d(%s)\n",
+                filename, errno, strerror(errno));
+        return errno;
+    }
+    if ( stat(filename, &buf) != 0 )
+    {
+        fprintf(stderr, "Could not get right size %s, error: %d(%s)\n",
+                filename, errno, strerror(errno));
+        close(fd);
+        return errno;
+    }
+
+    len = buf.st_size;
+    fbuf = mmap(0, len, PROT_READ, MAP_PRIVATE, fd, 0);
+    if ( fbuf == MAP_FAILED )
+    {
+        fprintf(stderr,"Could not map: %s, error: %d(%s)\n",
+                filename, errno, strerror(errno));
+        close (fd);
+        return errno;
+    }
+    printf("Uploading %s (%zu bytes)\n", filename, len);
+    payload = xc_hypercall_buffer_alloc(xch, payload, len);
+    memcpy(payload, fbuf, len);
+
+    rc = xc_xsplice_upload(xch, id, payload, len);
+    if ( rc )
+    {
+        fprintf(stderr, "Upload failed: %s, error: %d(%s)!\n",
+                filename, errno, strerror(errno));
+        goto out;
+    }
+    xc_hypercall_buffer_free(xch, payload);
+
+out:
+    if ( munmap( fbuf, len) )
+    {
+        fprintf(stderr, "Could not unmap!? error: %d(%s)!\n",
+                errno, strerror(errno));
+        rc = errno;
+    }
+    close(fd);
+
+    return rc;
+}
+
+/* These MUST match to the 'action_options[]' array slots. */
+enum {
+    ACTION_APPLY = 0,
+    ACTION_REVERT = 1,
+    ACTION_UNLOAD = 2,
+    ACTION_CHECK = 3,
+    ACTION_REPLACE = 4,
+};
+
+struct {
+    int allow; /* State it must be in to call function. */
+    int expected; /* The state to be in after the function. */
+    const char *name;
+    int (*function)(xc_interface *xch, char *id, uint32_t timeout);
+    unsigned int executed; /* Has the function been called?. */
+} action_options[] = {
+    {   .allow = XSPLICE_STATE_CHECKED,
+        .expected = XSPLICE_STATE_APPLIED,
+        .name = "apply",
+        .function = xc_xsplice_apply,
+    },
+    {   .allow = XSPLICE_STATE_APPLIED,
+        .expected = XSPLICE_STATE_CHECKED,
+        .name = "revert",
+        .function = xc_xsplice_revert,
+    },
+    {   .allow = XSPLICE_STATE_CHECKED | XSPLICE_STATE_LOADED,
+        .expected = -ENOENT,
+        .name = "unload",
+        .function = xc_xsplice_unload,
+    },
+    {   .allow = XSPLICE_STATE_CHECKED | XSPLICE_STATE_LOADED,
+        .expected = XSPLICE_STATE_CHECKED,
+        .name = "check",
+        .function = xc_xsplice_check
+    },
+    {   .allow = XSPLICE_STATE_CHECKED,
+        .expected = XSPLICE_STATE_APPLIED,
+        .name = "replace",
+        .function = xc_xsplice_replace,
+    },
+};
+
+/* Go around 300 * 0.1 seconds = 30 seconds. */
+#define RETRIES 300
+/* aka 0.1 second */
+#define DELAY 100000
+
+int action_func(int argc, char *argv[], unsigned int idx)
+{
+    char id[XEN_XSPLICE_NAME_SIZE];
+    int rc, original_state;
+    xen_xsplice_status_t status;
+    unsigned int retry = 0;
+
+    if ( argc != 1 )
+    {
+        show_help();
+        return -1;
+    }
+
+    if ( idx >= ARRAY_SIZE(action_options) )
+        return -1;
+
+    if ( get_id(argc, argv, id) )
+        return EINVAL;
+
+    /* Check initial status. */
+    rc = xc_xsplice_get(xch, id, &status);
+    if ( rc )
+        goto err;
+
+    if ( status.rc == -EAGAIN )
+    {
+        printf("%s failed. Operation already in progress\n", id);
+        return -1;
+    }
+
+    if ( status.state == action_options[idx].expected )
+    {
+        printf("No action needed\n");
+        return 0;
+    }
+
+    /* Perform action. */
+    if ( action_options[idx].allow & status.state )
+    {
+        printf("Performing %s:", action_options[idx].name);
+        rc = action_options[idx].function(xch, id, 0);
+        if ( rc )
+            goto err;
+    }
+    else
+    {
+        printf("%s: in wrong state (%s), expected (%s)\n",
+               id, state2str(status.state),
+               state2str(action_options[idx].expected));
+        return -1;
+    }
+
+    original_state = status.state;
+    do {
+        rc = xc_xsplice_get(xch, id, &status);
+        if ( rc )
+        {
+            rc = -errno;
+            break;
+        }
+
+        if ( status.state != original_state )
+            break;
+        if ( status.rc && status.rc != -EAGAIN )
+        {
+            rc = status.rc;
+            break;
+        }
+
+        printf(".");
+        fflush(stdout);
+        usleep(DELAY);
+    } while ( ++retry < RETRIES );
+
+    if ( retry >= RETRIES )
+    {
+        printf("%s: Operation didn't complete after 30 seconds.\n", id);
+        return -1;
+    }
+    else
+    {
+        if ( rc == 0 )
+            rc = status.state;
+
+        if ( action_options[idx].expected == rc )
+            printf(" completed\n");
+        else if ( rc < 0 )
+        {
+            printf("%s failed with %d(%s)\n", id, -rc, strerror(-rc));
+            return -1;
+        }
+        else
+        {
+            printf("%s: in wrong state (%s), expected (%s)\n",
+               id, state2str(rc),
+               state2str(action_options[idx].expected));
+            return -1;
+        }
+    }
+
+    return 0;
+
+ err:
+    printf("%s failed with %d(%s)\n", id, -rc, strerror(-rc));
+    return rc;
+}
+
+static int load_func(int argc, char *argv[])
+{
+    int rc;
+    char *new_argv[2];
+    char *id, *name, *lastdot;
+
+    if ( argc != 1 )
+    {
+        show_help();
+        return -1;
+    }
+    /* <file> */
+    new_argv[1] = argv[0];
+
+    /* Synthesize the <id> */
+    name = strdup(argv[0]);
+
+    id = basename(name);
+    lastdot = strrchr(id, '.');
+    if (lastdot != NULL)
+        *lastdot = '\0';
+    new_argv[0] = id;
+    printf("%s %s %s\n", argv[0], new_argv[0], new_argv[1]);
+
+    rc = upload_func(2 /* <id> <file> */, new_argv);
+    if ( rc )
+        return rc;
+
+    rc = action_func(1 /* only <id> */, new_argv, ACTION_CHECK);
+    if ( rc )
+        goto unload;
+
+    rc = action_func(1 /* only <id> */, new_argv, ACTION_APPLY);
+    if ( rc )
+        goto unload;
+
+    free(name);
+    return 0;
+unload:
+    action_func(1, new_argv, ACTION_UNLOAD);
+    free(name);
+    return rc;
+}
+
+/*
+ * These are also functions in action_options that are called in case
+ * none of these match.
+ */
+struct {
+    const char *name;
+    int (*function)(int argc, char *argv[]);
+} main_options[] = {
+    { "help", help_func },
+    { "list", list_func },
+    { "upload", upload_func },
+    { "load", load_func },
+};
+
+int main(int argc, char *argv[])
+{
+    int i, j, ret;
+
+    if ( argc  <= 1 )
+    {
+        show_help();
+        return 0;
+    }
+    for ( i = 0; i < ARRAY_SIZE(main_options); i++ )
+        if (!strncmp(main_options[i].name, argv[1], strlen(argv[1])))
+            break;
+
+    if ( i == ARRAY_SIZE(main_options) )
+    {
+        for ( j = 0; j < ARRAY_SIZE(action_options); j++ )
+            if (!strncmp(action_options[j].name, argv[1], strlen(argv[1])))
+                break;
+
+        if ( j == ARRAY_SIZE(action_options) )
+        {
+            fprintf(stderr, "Unrecognised command '%s' -- try "
+                   "'xen-xsplice help'\n", argv[1]);
+            return 1;
+        }
+    } else
+        j = ARRAY_SIZE(action_options);
+
+    xch = xc_interface_open(0,0,0);
+    if ( !xch )
+    {
+        fprintf(stderr, "failed to get the handler\n");
+        return 0;
+    }
+
+    if ( i == ARRAY_SIZE(main_options) )
+        ret = action_func(argc -2, argv + 2, j);
+    else
+        ret = main_options[i].function(argc -2, argv + 2);
+
+    xc_interface_close(xch);
+
+    return !!ret;
+}
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 06/13] elf: Add relocation types to elfstructs.h
  2016-01-14 21:46 [PATCH v2] xSplice v1 implementation Konrad Rzeszutek Wilk
                   ` (4 preceding siblings ...)
  2016-01-14 21:47 ` [PATCH v2 05/13] xen-xsplice: Tool to manipulate xsplice payloads (v3) Konrad Rzeszutek Wilk
@ 2016-01-14 21:47 ` Konrad Rzeszutek Wilk
  2016-01-14 21:47 ` [PATCH v2 07/13] xsplice: Add helper elf routines (v2) Konrad Rzeszutek Wilk
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-14 21:47 UTC (permalink / raw)
  To: xen-devel, ross.lagerwall, mpohlack, andrew.cooper3,
	stefano.stabellini, jbeulich, ian.jackson, ian.campbell,
	wei.liu2, sasha.levin
  Cc: Konrad Rzeszutek Wilk

From: Ross Lagerwall <ross.lagerwall@citrix.com>

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
v2: Slim the list as we do not use all of them.
---
 xen/include/xen/elfstructs.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/xen/include/xen/elfstructs.h b/xen/include/xen/elfstructs.h
index 12ffb82..4ff3258 100644
--- a/xen/include/xen/elfstructs.h
+++ b/xen/include/xen/elfstructs.h
@@ -348,6 +348,14 @@ typedef struct {
 #define	ELF64_R_TYPE(info)	((info) & 0xFFFFFFFF)
 #define ELF64_R_INFO(s,t) 	(((s) << 32) + (u_int32_t)(t))
 
+/* x86-64 relocation types. We list only the ones we implement. */
+#define R_X86_64_NONE		0	/* No reloc */
+#define R_X86_64_64		1	/* Direct 64 bit  */
+#define R_X86_64_PC32		2	/* PC relative 32 bit signed */
+#define R_X86_64_PLT32		4	/* 32 bit PLT address */
+#define R_X86_64_32		10	/* Direct 32 bit zero extended */
+#define R_X86_64_32S		11	/* Direct 32 bit sign extended */
+
 /* Program Header */
 typedef struct {
 	Elf32_Word	p_type;		/* segment type */
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 07/13] xsplice: Add helper elf routines (v2)
  2016-01-14 21:46 [PATCH v2] xSplice v1 implementation Konrad Rzeszutek Wilk
                   ` (5 preceding siblings ...)
  2016-01-14 21:47 ` [PATCH v2 06/13] elf: Add relocation types to elfstructs.h Konrad Rzeszutek Wilk
@ 2016-01-14 21:47 ` Konrad Rzeszutek Wilk
  2016-01-19 14:33   ` Ross Lagerwall
  2016-01-14 21:47 ` [PATCH v2 08/13] xsplice: Implement payload loading (v2) Konrad Rzeszutek Wilk
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-14 21:47 UTC (permalink / raw)
  To: xen-devel, ross.lagerwall, mpohlack, andrew.cooper3,
	stefano.stabellini, jbeulich, ian.jackson, ian.campbell,
	wei.liu2, sasha.levin
  Cc: Konrad Rzeszutek Wilk

From: Ross Lagerwall <ross.lagerwall@citrix.com>

Add Elf routines and data structures in preparation for loading an
xSplice payload.

We also add an macro that will print where we failed during
the ELF parsing.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
v2: - With the #define ELFSIZE in the ARM file we can use the common
     #defines instead of using #ifdef CONFIG_ARM_32.
    - Add checks for ELF file.
    - Add name to be printed.
    - Add len for easier ELF checks.
    - Expand on the checks. Add macro.
---
 xen/common/Makefile           |   1 +
 xen/common/xsplice_elf.c      | 201 ++++++++++++++++++++++++++++++++++++++++++
 xen/include/xen/xsplice_elf.h |  37 ++++++++
 3 files changed, 239 insertions(+)
 create mode 100644 xen/common/xsplice_elf.c
 create mode 100644 xen/include/xen/xsplice_elf.h

diff --git a/xen/common/Makefile b/xen/common/Makefile
index 6fdeccf..0c9d527 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -73,3 +73,4 @@ subdir-y += libelf
 subdir-$(CONFIG_HAS_DEVICE_TREE) += libfdt
 
 obj-$(CONFIG_XSPLICE) += xsplice.o
+obj-$(CONFIG_XSPLICE) += xsplice_elf.o
diff --git a/xen/common/xsplice_elf.c b/xen/common/xsplice_elf.c
new file mode 100644
index 0000000..a5e9d63
--- /dev/null
+++ b/xen/common/xsplice_elf.c
@@ -0,0 +1,201 @@
+#include <xen/lib.h>
+#include <xen/errno.h>
+#include <xen/xsplice.h>
+#include <xen/xsplice_elf.h>
+
+#define return_(x) { printk(XENLOG_DEBUG "%s:%d rc: %d\n",  \
+                            __func__,__LINE__, x); return x; }
+
+struct xsplice_elf_sec *xsplice_elf_sec_by_name(const struct xsplice_elf *elf,
+                                                const char *name)
+{
+    unsigned int i;
+
+    for ( i = 0; i < elf->hdr->e_shnum; i++ )
+    {
+        if ( !strcmp(name, elf->sec[i].name) )
+            return &elf->sec[i];
+    }
+
+    return NULL;
+}
+
+static int elf_resolve_sections(struct xsplice_elf *elf, uint8_t *data)
+{
+    struct xsplice_elf_sec *sec;
+    unsigned int i;
+
+    sec = xmalloc_array(struct xsplice_elf_sec, elf->hdr->e_shnum);
+    if ( !sec )
+    {
+        printk(XENLOG_ERR "Could not allocate memory for section table!\n");
+        return_(-ENOMEM);
+    }
+
+    /* N.B. We also will ingest SHN_UNDEF sections. */
+    for ( i = 0; i < elf->hdr->e_shnum; i++ )
+    {
+        ssize_t delta = elf->hdr->e_shoff + i * elf->hdr->e_shentsize;
+
+        if ( delta + sizeof(Elf_Shdr) > elf->len )
+            return_(-EINVAL);
+
+        sec[i].sec = (Elf_Shdr *)(data + delta);
+        delta = sec[i].sec->sh_offset;
+
+        if ( delta > elf->len )
+            return_(-EINVAL);
+
+        sec[i].data = data + delta;
+        /* Name is populated in xsplice_elf_sections_name. */
+        sec[i].name = NULL;
+
+        if ( sec[i].sec->sh_type == SHT_SYMTAB )
+        {
+                if ( elf->symtab )
+                    return_(-EINVAL);
+                elf->symtab = &sec[i];
+                /* elf->symtab->sec->sh_link would point to the right section
+                 * but we hadn't finished parsing all the sections. */
+                if ( elf->symtab->sec->sh_link > elf->hdr->e_shnum )
+                    return_(-EINVAL);
+        }
+    }
+    elf->sec = sec;
+    if ( !elf->symtab )
+        return_(-EINVAL);
+
+    /* There can be multiple SHT_STRTAB so pick the right one. */
+    elf->strtab = &sec[elf->symtab->sec->sh_link];
+
+    if ( elf->symtab->sec->sh_size == 0 || elf->symtab->sec->sh_entsize == 0 )
+        return_(-EINVAL);
+
+    if ( elf->symtab->sec->sh_entsize != sizeof(Elf_Sym) )
+        return_(-EINVAL);
+
+    return 0;
+}
+
+static int elf_resolve_section_names(struct xsplice_elf *elf, uint8_t *data)
+{
+    const char *shstrtab;
+    unsigned int i;
+    unsigned int offset, delta;
+
+    /* The elf->sec[0 -> e_shnum] structures have been verified by elf_resolve_sections */
+    /* Find file offset for section string table. */
+    offset =  elf->sec[elf->hdr->e_shstrndx].sec->sh_offset;
+
+    if ( offset > elf->len )
+        return_(-EINVAL);
+
+    shstrtab = (const char *)(data + offset);
+
+    /* We could ignore the first as it is reserved.. */
+    for ( i = 0; i < elf->hdr->e_shnum; i++ )
+    {
+        delta = elf->sec[i].sec->sh_name;
+
+        if ( offset + delta > elf->len )
+            return_(-EINVAL);
+
+        elf->sec[i].name = shstrtab + delta;
+    }
+    return 0;
+}
+
+static int elf_get_sym(struct xsplice_elf *elf, uint8_t *data)
+{
+    struct xsplice_elf_sec *symtab_sec, *strtab_sec;
+    struct xsplice_elf_sym *sym;
+    unsigned int i, delta, offset;
+
+    symtab_sec = elf->symtab;
+
+    strtab_sec = elf->strtab;
+
+    /* Pointers arithmetic to get file offset. */
+    offset = strtab_sec->data - data;
+
+    ASSERT( offset == strtab_sec->sec->sh_offset );
+    /* symtab_sec->data was computed in elf_resolve_sections. */
+    ASSERT((symtab_sec->sec->sh_offset + data) == symtab_sec->data );
+
+    /* No need to check values as elf_resolve_sections did it. */
+    elf->nsym = symtab_sec->sec->sh_size / symtab_sec->sec->sh_entsize;
+
+    sym = xmalloc_array(struct xsplice_elf_sym, elf->nsym);
+    if ( !sym )
+    {
+        printk(XENLOG_ERR "%s: Could not allocate memory for symbols\n", elf->name);
+        return_(-ENOMEM);
+    }
+
+    for ( i = 0; i < elf->nsym; i++ )
+    {
+        Elf_Sym *s;
+
+        if ( i * sizeof(Elf_Sym) > elf->len )
+            return_(-EINVAL);
+
+        s = &((Elf_Sym *)symtab_sec->data)[i];
+
+        /* If st->name is STN_UNDEF it is zero, so the check will always be true. */
+        delta = s->st_name;
+        /* Offset has been computed earlier. */
+        if ( offset + delta > elf->len )
+            return_(-EINVAL);
+
+        sym[i].sym = s;
+        if ( s->st_name == STN_UNDEF )
+            sym[i].name = NULL;
+        else
+            sym[i].name = (const char *)data + ( delta + offset );
+    }
+    elf->sym = sym;
+
+    return 0;
+}
+
+int xsplice_elf_load(struct xsplice_elf *elf, uint8_t *data)
+{
+    int rc;
+
+    elf->hdr = (Elf_Ehdr *)data;
+
+    if ( sizeof(*elf->hdr) >= elf->len )
+        return_(-EINVAL);
+
+    if ( elf->hdr->e_shstrndx == SHN_UNDEF )
+        return_(-EINVAL);
+
+    /* Check that section name index is within the sections. */
+    if ( elf->hdr->e_shstrndx > elf->hdr->e_shnum )
+        return_(-EINVAL);
+
+    rc = elf_resolve_sections(elf, data);
+    if ( rc )
+        return rc;
+
+    rc = elf_resolve_section_names(elf, data);
+    if ( rc )
+        return rc;
+
+    rc = elf_get_sym(elf, data);
+    if ( rc )
+        return rc;
+
+    return 0;
+}
+
+void xsplice_elf_free(struct xsplice_elf *elf)
+{
+    xfree(elf->sec);
+    elf->sec = NULL;
+    xfree(elf->sym);
+    elf->sym = NULL;
+    elf->nsym = 0;
+    elf->name = NULL;
+    elf->len = 0;
+}
diff --git a/xen/include/xen/xsplice_elf.h b/xen/include/xen/xsplice_elf.h
new file mode 100644
index 0000000..60c932b
--- /dev/null
+++ b/xen/include/xen/xsplice_elf.h
@@ -0,0 +1,37 @@
+#ifndef __XEN_XSPLICE_ELF_H__
+#define __XEN_XSPLICE_ELF_H__
+
+#include <xen/types.h>
+#include <xen/elfstructs.h>
+
+/* The following describes an Elf file as consumed by xSplice. */
+struct xsplice_elf_sec {
+    Elf_Shdr *sec;                 /* Hooked up in elf_resolve_sections. */
+    const char *name;              /* Human readable name hooked in
+                                      elf_resolve_section_names. */
+    const char uint8_t *data;      /* Pointer to the section (done by
+                                      elf_resolve_sections). */
+};
+
+struct xsplice_elf_sym {
+    Elf_Sym *sym;
+    const char *name;
+};
+
+struct xsplice_elf {
+    const char *name;              /* Pointer to payload->name. */
+    ssize_t len;                   /* Length of the ELF file. */
+    Elf_Ehdr *hdr;                 /* ELF file. */
+    struct xsplice_elf_sec *sec;   /* Array of sections, allocated by us. */
+    struct xsplice_elf_sym *sym;   /* Array of symbols , allocated by us. */
+    unsigned int nsym;
+    struct xsplice_elf_sec *symtab;/* Pointer to .symtab section - aka to sec[x]. */
+    struct xsplice_elf_sec *strtab;/* Pointer to .strtab section - aka to sec[y]. */
+};
+
+struct xsplice_elf_sec *xsplice_elf_sec_by_name(const struct xsplice_elf *elf,
+                                                const char *name);
+int xsplice_elf_load(struct xsplice_elf *elf, uint8_t *data);
+void xsplice_elf_free(struct xsplice_elf *elf);
+
+#endif /* __XEN_XSPLICE_ELF_H__ */
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 08/13] xsplice: Implement payload loading (v2)
  2016-01-14 21:46 [PATCH v2] xSplice v1 implementation Konrad Rzeszutek Wilk
                   ` (6 preceding siblings ...)
  2016-01-14 21:47 ` [PATCH v2 07/13] xsplice: Add helper elf routines (v2) Konrad Rzeszutek Wilk
@ 2016-01-14 21:47 ` Konrad Rzeszutek Wilk
  2016-01-19 14:34   ` Ross Lagerwall
  2016-01-19 16:45   ` Ross Lagerwall
  2016-01-14 21:47 ` [PATCH v2 09/13] xsplice: Implement support for applying/reverting/replacing patches. (v2) Konrad Rzeszutek Wilk
                   ` (5 subsequent siblings)
  13 siblings, 2 replies; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-14 21:47 UTC (permalink / raw)
  To: xen-devel, ross.lagerwall, mpohlack, andrew.cooper3,
	stefano.stabellini, jbeulich, ian.jackson, ian.campbell,
	wei.liu2, sasha.levin
  Cc: Konrad Rzeszutek Wilk

From: Ross Lagerwall <ross.lagerwall@citrix.com>

Add support for loading xsplice payloads. This is somewhat similar to
the Linux kernel module loader, implementing the following steps:
- Verify the elf file.
- Parse the elf file.
- Allocate a region of memory mapped within a free area of
  [xen_virt_end, XEN_VIRT_END].
- Copy allocated sections into the new region.
- Resolve section symbols. All other symbols must be absolute addresses.
- Perform relocations.

Note that the structure 'xsplice_patch_func' differs a bit from the design
by usurping 8 bytes from the padding. We use that for our own uses.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
v2: - Change the 'xsplice_patch_func' structure layout/size.
    - Add more error checking. Fix memory leak.
    - Move elf_resolve and elf_perform relocs in elf file.
    - Print the payload address and pages in keyhandler.
v3:
    - Make it build under ARM
---
 xen/arch/arm/Makefile             |   1 +
 xen/arch/arm/xsplice.c            |  23 ++++
 xen/arch/x86/Makefile             |   1 +
 xen/arch/x86/setup.c              |   7 ++
 xen/arch/x86/xsplice.c            | 106 +++++++++++++++++
 xen/common/xsplice.c              | 239 +++++++++++++++++++++++++++++++++++++-
 xen/common/xsplice_elf.c          |  84 ++++++++++++++
 xen/include/asm-arm/config.h      |   2 +
 xen/include/asm-x86/x86_64/page.h |   2 +
 xen/include/xen/xsplice.h         |  12 ++
 xen/include/xen/xsplice_elf.h     |   7 +-
 11 files changed, 481 insertions(+), 3 deletions(-)
 create mode 100644 xen/arch/arm/xsplice.c
 create mode 100644 xen/arch/x86/xsplice.c

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 2f050f5..c0f16b0 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -40,6 +40,7 @@ obj-y += device.o
 obj-y += decode.o
 obj-y += processor.o
 obj-y += smc.o
+obj-$(CONFIG_XSPLICE) += xsplice.o
 
 #obj-bin-y += ....o
 
diff --git a/xen/arch/arm/xsplice.c b/xen/arch/arm/xsplice.c
new file mode 100644
index 0000000..8d85fa9
--- /dev/null
+++ b/xen/arch/arm/xsplice.c
@@ -0,0 +1,23 @@
+#include <xen/lib.h>
+#include <xen/errno.h>
+#include <xen/xsplice_elf.h>
+#include <xen/xsplice.h>
+
+int xsplice_verify_elf(uint8_t *data, ssize_t len)
+{
+    return -ENOSYS;
+}
+
+int xsplice_perform_rel(struct xsplice_elf *elf,
+                        struct xsplice_elf_sec *base,
+                        struct xsplice_elf_sec *rela)
+{
+    return -ENOSYS;
+}
+
+int xsplice_perform_rela(struct xsplice_elf *elf,
+                         struct xsplice_elf_sec *base,
+                         struct xsplice_elf_sec *rela)
+{
+    return -ENOSYS;
+}
diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index 8e6e901..f7d3e39 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -63,6 +63,7 @@ obj-y += vm_event.o
 obj-y += xstate.o
 
 obj-$(crash_debug) += gdbstub.o
+obj-$(CONFIG_XSPLICE) += xsplice.o
 
 x86_emulate.o: x86_emulate/x86_emulate.c x86_emulate/x86_emulate.h
 
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 76c7b0f..fb35005 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -99,6 +99,9 @@ unsigned long __read_mostly xen_phys_start;
 
 unsigned long __read_mostly xen_virt_end;
 
+unsigned long __read_mostly module_virt_start;
+unsigned long __read_mostly module_virt_end;
+
 DEFINE_PER_CPU(struct tss_struct, init_tss);
 
 char __section(".bss.stack_aligned") cpu0_stack[STACK_SIZE];
@@ -1146,6 +1149,10 @@ void __init noreturn __start_xen(unsigned long mbi_p)
                    ~((1UL << L2_PAGETABLE_SHIFT) - 1);
     destroy_xen_mappings(xen_virt_end, XEN_VIRT_START + BOOTSTRAP_MAP_BASE);
 
+    module_virt_start = xen_virt_end;
+    module_virt_end = XEN_VIRT_END - NR_CPUS * PAGE_SIZE;
+    BUG_ON(module_virt_end <= module_virt_start);
+
     memguard_init();
 
     nr_pages = 0;
diff --git a/xen/arch/x86/xsplice.c b/xen/arch/x86/xsplice.c
new file mode 100644
index 0000000..7b13511
--- /dev/null
+++ b/xen/arch/x86/xsplice.c
@@ -0,0 +1,106 @@
+#include <xen/errno.h>
+#include <xen/lib.h>
+#include <xen/xsplice_elf.h>
+#include <xen/xsplice.h>
+
+int xsplice_verify_elf(struct xsplice_elf *elf, uint8_t *data)
+{
+
+    Elf_Ehdr *hdr = (Elf_Ehdr *)data;
+
+    if ( elf->len < (sizeof *hdr) ||
+         !IS_ELF(*hdr) ||
+         hdr->e_ident[EI_CLASS] != ELFCLASS64 ||
+         hdr->e_ident[EI_DATA] != ELFDATA2LSB ||
+         hdr->e_ident[EI_OSABI] != ELFOSABI_SYSV ||
+         hdr->e_machine != EM_X86_64 ||
+         hdr->e_type != ET_REL ||
+         hdr->e_phnum != 0 )
+    {
+        printk(XENLOG_ERR "%s: Invalid ELF file.\n", elf->name);
+        return -EOPNOTSUPP;
+    }
+
+    return 0;
+}
+
+int xsplice_perform_rel(struct xsplice_elf *elf,
+                        struct xsplice_elf_sec *base,
+                        struct xsplice_elf_sec *rela)
+{
+    printk(XENLOG_ERR "%s: SHR_REL relocation unsupported\n", elf->name);
+    return -ENOSYS;
+}
+
+int xsplice_perform_rela(struct xsplice_elf *elf,
+                         struct xsplice_elf_sec *base,
+                         struct xsplice_elf_sec *rela)
+{
+    Elf_RelA *r;
+    unsigned int symndx, i;
+    uint64_t val;
+    uint8_t *dest;
+
+    if ( !rela->sec->sh_entsize || !rela->sec->sh_size )
+        return -EINVAL;
+
+    if ( rela->sec->sh_entsize != sizeof(Elf_RelA) )
+        return -EINVAL;
+
+    for ( i = 0; i < (rela->sec->sh_size / rela->sec->sh_entsize); i++ )
+    {
+        r = (Elf_RelA *)(rela->data + i * rela->sec->sh_entsize);
+        if ( (unsigned long)r > (unsigned long)(elf->hdr + elf->len) )
+            return -EINVAL;
+
+        symndx = ELF64_R_SYM(r->r_info);
+        if ( symndx > elf->nsym )
+            return -EINVAL;
+
+        dest = base->load_addr + r->r_offset;
+        val = r->r_addend + elf->sym[symndx].sym->st_value;
+
+        switch ( ELF64_R_TYPE(r->r_info) )
+        {
+            case R_X86_64_NONE:
+                break;
+            case R_X86_64_64:
+                *(uint64_t *)dest = val;
+                break;
+            case R_X86_64_32:
+                *(uint32_t *)dest = val;
+                if (val != *(uint32_t *)dest)
+                    goto overflow;
+                break;
+            case R_X86_64_32S:
+                *(int32_t *)dest = val;
+                if ((int64_t)val != *(int32_t *)dest)
+                    goto overflow;
+                break;
+            case R_X86_64_PLT32:
+                /*
+                 * Xen uses -fpic which normally uses PLT relocations
+                 * except that it sets visibility to hidden which means
+                 * that they are not used.  However, when gcc cannot
+                 * inline memcpy it emits memcpy with default visibility
+                 * which then creates a PLT relocation.  It can just be
+                 * treated the same as R_X86_64_PC32.
+                 */
+                /* Fall through */
+            case R_X86_64_PC32:
+                *(uint32_t *)dest = val - (uint64_t)dest;
+                break;
+            default:
+                printk(XENLOG_ERR "%s: Unhandled relocation %lu\n",
+                       elf->name, ELF64_R_TYPE(r->r_info));
+                return -EINVAL;
+        }
+    }
+
+    return 0;
+
+ overflow:
+    printk(XENLOG_ERR "%s: Overflow in relocation %d in %s for %s\n",
+           elf->name, i, rela->name, base->name);
+    return -EOVERFLOW;
+}
diff --git a/xen/common/xsplice.c b/xen/common/xsplice.c
index 3c6acc3..67f6fc7 100644
--- a/xen/common/xsplice.c
+++ b/xen/common/xsplice.c
@@ -11,6 +11,7 @@
 #include <xen/sched.h>
 #include <xen/smp.h>
 #include <xen/spinlock.h>
+#include <xen/xsplice_elf.h>
 #include <xen/xsplice.h>
 
 #include <asm/event.h>
@@ -26,9 +27,15 @@ struct payload {
     int32_t state;                       /* One of the XSPLICE_STATE_*. */
     int32_t rc;                          /* 0 or -XEN_EXX. */
     struct list_head list;               /* Linked to 'payload_list'. */
+    void *payload_address;               /* Virtual address mapped. */
+    size_t payload_pages;                /* Nr of the pages. */
+
     char name[XEN_XSPLICE_NAME_SIZE + 1];/* Name of it. */
 };
 
+static int load_payload_data(struct payload *payload, uint8_t *raw, ssize_t len);
+static void free_payload_data(struct payload *payload);
+
 static const char *state2str(int32_t state)
 {
 #define STATE(x) [XSPLICE_STATE_##x] = #x
@@ -58,8 +65,9 @@ static void xsplice_printall(unsigned char key)
     spin_lock(&payload_list_lock);
 
     list_for_each_entry ( data, &payload_list, list )
-        printk(" name=%s state=%s(%d)\n", data->name,
-               state2str(data->state), data->state);
+        printk(" name=%s state=%s(%d) %p using %zu pages.\n", data->name,
+               state2str(data->state), data->state, data->payload_address,
+               data->payload_pages);
 
     spin_unlock(&payload_list_lock);
 }
@@ -136,6 +144,7 @@ static void free_payload(struct payload *data)
     list_del(&data->list);
     payload_cnt--;
     payload_version++;
+    free_payload_data(data);
     xfree(data);
 }
 
@@ -174,6 +183,10 @@ static int xsplice_upload(xen_sysctl_xsplice_upload_t *upload)
     if ( copy_from_guest(raw_data, upload->payload, upload->size) )
         goto err_raw;
 
+    rc = load_payload_data(data, raw_data, upload->size);
+    if ( rc )
+        goto err_raw;
+
     data->state = XSPLICE_STATE_LOADED;
     data->rc = 0;
     INIT_LIST_HEAD(&data->list);
@@ -378,6 +391,228 @@ int xsplice_control(xen_sysctl_xsplice_op_t *xsplice)
     return rc;
 }
 
+static void find_hole(ssize_t pages, unsigned long *hole_start,
+                      unsigned long *hole_end)
+{
+    struct payload *data, *data2;
+
+    spin_lock(&payload_list_lock);
+    list_for_each_entry ( data, &payload_list, list )
+    {
+        list_for_each_entry ( data2, &payload_list, list )
+        {
+            unsigned long start, end;
+
+            start = (unsigned long)data2->payload_address;
+            end = start + data2->payload_pages * PAGE_SIZE;
+            if ( *hole_end > start && *hole_start < end )
+            {
+                *hole_start = end;
+                *hole_end = *hole_start + pages * PAGE_SIZE;
+                break;
+            }
+        }
+        if ( &data2->list == &payload_list )
+            break;
+    }
+    spin_unlock(&payload_list_lock);
+}
+
+/*
+ * The following functions prepare an xSplice payload to be executed by
+ * allocating space, loading the allocated sections, resolving symbols,
+ * performing relocations, etc.
+ */
+#ifdef CONFIG_X86
+static void *alloc_payload(size_t size)
+{
+    mfn_t *mfn, *mfn_ptr;
+    size_t pages, i;
+    struct page_info *pg;
+    unsigned long hole_start, hole_end, cur;
+
+    ASSERT(size);
+
+    /*
+     * Copied from vmalloc which allocates pages and then maps them to an
+     * arbitrary virtual address with PAGE_HYPERVISOR. We need specific
+     * virtual address with PAGE_HYPERVISOR_RWX.
+     */
+    pages = PFN_UP(size);
+    mfn = xmalloc_array(mfn_t, pages);
+    if ( mfn == NULL )
+        return NULL;
+
+    for ( i = 0; i < pages; i++ )
+    {
+        pg = alloc_domheap_page(NULL, 0);
+        if ( pg == NULL )
+            goto error;
+        mfn[i] = _mfn(page_to_mfn(pg));
+    }
+
+    hole_start = (unsigned long)module_virt_start;
+    hole_end = hole_start + pages * PAGE_SIZE;
+    find_hole(pages, &hole_start, &hole_end);
+
+    if ( hole_end >= module_virt_end )
+        goto error;
+
+    for ( cur = hole_start, mfn_ptr = mfn; pages--; ++mfn_ptr, cur += PAGE_SIZE )
+    {
+        if ( map_pages_to_xen(cur, mfn_x(*mfn_ptr), 1, PAGE_HYPERVISOR_RWX) )
+        {
+            if ( cur != hole_start )
+                destroy_xen_mappings(hole_start, cur);
+            goto error;
+        }
+    }
+    xfree(mfn);
+    return (void *)hole_start;
+
+ error:
+    while ( i-- )
+        free_domheap_page(mfn_to_page(mfn_x(mfn[i])));
+    xfree(mfn);
+    return NULL;
+}
+#else
+static void *alloc_payload(size_t size)
+{
+    return NULL;
+}
+#endif
+
+static void free_payload_data(struct payload *payload)
+{
+    unsigned int i;
+    struct page_info *pg;
+    PAGE_LIST_HEAD(pg_list);
+    void *va = payload->payload_address;
+    unsigned long addr = (unsigned long)va;
+
+    if ( !va )
+        return;
+
+    payload->payload_address = NULL;
+
+    for ( i = 0; i < payload->payload_pages; i++ )
+        page_list_add(vmap_to_page(va + i * PAGE_SIZE), &pg_list);
+
+    destroy_xen_mappings(addr, addr + payload->payload_pages * PAGE_SIZE);
+
+    while ( (pg = page_list_remove_head(&pg_list)) != NULL )
+        free_domheap_page(pg);
+
+    payload->payload_pages = 0;
+}
+
+static void calc_section(struct xsplice_elf_sec *sec, size_t *core_size)
+{
+    size_t align_size = ROUNDUP(*core_size, sec->sec->sh_addralign);
+    sec->sec->sh_entsize = align_size;
+    *core_size = sec->sec->sh_size + align_size;
+}
+
+static int move_payload(struct payload *payload, struct xsplice_elf *elf)
+{
+    uint8_t *buf;
+    unsigned int i;
+    size_t core_size = 0;
+
+    /* Compute text regions */
+    for ( i = 0; i < elf->hdr->e_shnum; i++ )
+    {
+        if ( (elf->sec[i].sec->sh_flags & (SHF_ALLOC|SHF_EXECINSTR)) ==
+             (SHF_ALLOC|SHF_EXECINSTR) )
+            calc_section(&elf->sec[i], &core_size);
+    }
+
+    /* Compute rw data */
+    for ( i = 0; i < elf->hdr->e_shnum; i++ )
+    {
+        if ( (elf->sec[i].sec->sh_flags & SHF_ALLOC) &&
+             !(elf->sec[i].sec->sh_flags & SHF_EXECINSTR) &&
+             (elf->sec[i].sec->sh_flags & SHF_WRITE) )
+            calc_section(&elf->sec[i], &core_size);
+    }
+
+    /* Compute ro data */
+    for ( i = 0; i < elf->hdr->e_shnum; i++ )
+    {
+        if ( (elf->sec[i].sec->sh_flags & SHF_ALLOC) &&
+             !(elf->sec[i].sec->sh_flags & SHF_EXECINSTR) &&
+             !(elf->sec[i].sec->sh_flags & SHF_WRITE) )
+            calc_section(&elf->sec[i], &core_size);
+    }
+
+    buf = alloc_payload(core_size);
+    if ( !buf ) {
+        printk(XENLOG_ERR "%s: Could not allocate memory for module\n",
+               elf->name);
+        return -ENOMEM;
+    }
+    memset(buf, 0, core_size);
+
+    for ( i = 0; i < elf->hdr->e_shnum; i++ )
+    {
+        if ( elf->sec[i].sec->sh_flags & SHF_ALLOC )
+        {
+            elf->sec[i].load_addr = buf + elf->sec[i].sec->sh_entsize;
+            memcpy(elf->sec[i].load_addr, elf->sec[i].data,
+                   elf->sec[i].sec->sh_size);
+            printk(XENLOG_DEBUG "%s: Loaded %s at 0x%p\n",
+                   elf->name, elf->sec[i].name, elf->sec[i].load_addr);
+        }
+    }
+
+    payload->payload_address = buf;
+    payload->payload_pages = PFN_UP(core_size);
+
+    return 0;
+}
+
+static int load_payload_data(struct payload *payload, uint8_t *raw, ssize_t len)
+{
+    struct xsplice_elf elf;
+    int rc = 0;
+
+    memset(&elf, 0, sizeof(elf));
+    elf.name = payload->name;
+    elf.len = len;
+
+    rc = xsplice_verify_elf(&elf, raw);
+    if ( rc )
+        return rc;
+
+    rc = xsplice_elf_load(&elf, raw);
+    if ( rc )
+        goto err_elf;
+
+    rc = move_payload(payload, &elf);
+    if ( rc )
+        goto err_elf;
+
+    rc = xsplice_elf_resolve_symbols(&elf);
+    if ( rc )
+        goto err_payload;
+
+    rc = xsplice_elf_perform_relocs(&elf);
+    if ( rc )
+        goto err_payload;
+
+    /* Free our temporary data structure. */
+    xsplice_elf_free(&elf);
+    return 0;
+
+ err_payload:
+    free_payload_data(payload);
+ err_elf:
+    xsplice_elf_free(&elf);
+
+    return rc;
+}
+
 static int __init xsplice_init(void)
 {
     register_keyhandler('x', xsplice_printall, "print xsplicing info", 1);
diff --git a/xen/common/xsplice_elf.c b/xen/common/xsplice_elf.c
index a5e9d63..ea7eb73 100644
--- a/xen/common/xsplice_elf.c
+++ b/xen/common/xsplice_elf.c
@@ -199,3 +199,87 @@ void xsplice_elf_free(struct xsplice_elf *elf)
     elf->name = NULL;
     elf->len = 0;
 }
+
+int xsplice_elf_resolve_symbols(struct xsplice_elf *elf)
+{
+    unsigned int i;
+
+    /*
+     * The first entry of an ELF symbol table is the "undefined symbol index".
+     * aka reserved so we skip it.
+     */
+    ASSERT( elf->sym );
+    for ( i = 1; i < elf->nsym; i++ )
+    {
+        switch ( elf->sym[i].sym->st_shndx )
+        {
+            case SHN_COMMON:
+                printk(XENLOG_ERR "%s: Unexpected common symbol: %s\n",
+                       elf->name, elf->sym[i].name);
+                return_(-EINVAL);
+                break;
+            case SHN_UNDEF:
+                printk(XENLOG_ERR "%s: Unknown symbol: %s\n", elf->name,
+                       elf->sym[i].name);
+                return_(-ENOENT);
+                break;
+            case SHN_ABS:
+                printk(XENLOG_DEBUG "%s: Absolute symbol: %s => 0x%p\n",
+                      elf->name, elf->sym[i].name,
+                      (void *)elf->sym[i].sym->st_value);
+                break;
+            default:
+                if ( elf->sec[elf->sym[i].sym->st_shndx].sec->sh_flags & SHF_ALLOC )
+                {
+                    elf->sym[i].sym->st_value +=
+                        (unsigned long)elf->sec[elf->sym[i].sym->st_shndx].load_addr;
+                    printk(XENLOG_DEBUG "%s: Symbol resolved: %s => 0x%p\n",
+                           elf->name, elf->sym[i].name,
+                           (void *)elf->sym[i].sym->st_value);
+                }
+        }
+    }
+
+    return 0;
+}
+
+int xsplice_elf_perform_relocs(struct xsplice_elf *elf)
+{
+    struct xsplice_elf_sec *rela, *base;
+    unsigned int i;
+    int rc;
+
+    /*
+     * The first entry of an ELF symbol table is the "undefined symbol index".
+     * aka reserved so we skip it.
+     */
+    ASSERT( elf->sym );
+    for ( i = 1; i < elf->hdr->e_shnum; i++ )
+    {
+        rela = &elf->sec[i];
+
+        if ( (rela->sec->sh_type != SHT_RELA ) &&
+             (rela->sec->sh_type != SHT_REL ) )
+            continue;
+
+         /* Is it a valid relocation section? */
+         if ( rela->sec->sh_info >= elf->hdr->e_shnum )
+            continue;
+
+         base = &elf->sec[rela->sec->sh_info];
+
+         /* Don't relocate non-allocated sections. */
+         if ( !(base->sec->sh_flags & SHF_ALLOC) )
+            continue;
+
+        if ( elf->sec[i].sec->sh_type == SHT_RELA )
+            rc = xsplice_perform_rela(elf, base, rela);
+        else /* SHT_REL */
+            rc = xsplice_perform_rel(elf, base, rela);
+
+        if ( rc )
+            return rc;
+    }
+
+    return 0;
+}
diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h
index bd832df..4ea66bf 100644
--- a/xen/include/asm-arm/config.h
+++ b/xen/include/asm-arm/config.h
@@ -15,8 +15,10 @@
 
 #if defined(CONFIG_ARM_64)
 # define LONG_BYTEORDER 3
+# define ELFSIZE 64
 #else
 # define LONG_BYTEORDER 2
+# define ELFSIZE 32
 #endif
 
 #define BYTES_PER_LONG (1 << LONG_BYTEORDER)
diff --git a/xen/include/asm-x86/x86_64/page.h b/xen/include/asm-x86/x86_64/page.h
index 19ab4d0..e6f08e9 100644
--- a/xen/include/asm-x86/x86_64/page.h
+++ b/xen/include/asm-x86/x86_64/page.h
@@ -38,6 +38,8 @@
 #include <xen/pdx.h>
 
 extern unsigned long xen_virt_end;
+extern unsigned long module_virt_start;
+extern unsigned long module_virt_end;
 
 #define spage_to_pdx(spg) (((spg) - spage_table)<<(SUPERPAGE_SHIFT-PAGE_SHIFT))
 #define pdx_to_spage(pdx) (spage_table + ((pdx)>>(SUPERPAGE_SHIFT-PAGE_SHIFT)))
diff --git a/xen/include/xen/xsplice.h b/xen/include/xen/xsplice.h
index 2cb2035..b90742f 100644
--- a/xen/include/xen/xsplice.h
+++ b/xen/include/xen/xsplice.h
@@ -1,7 +1,19 @@
 #ifndef __XEN_XSPLICE_H__
 #define __XEN_XSPLICE_H__
 
+struct xsplice_elf;
+struct xsplice_elf_sec;
+struct xsplice_elf_sym;
 struct xen_sysctl_xsplice_op;
+
 int xsplice_control(struct xen_sysctl_xsplice_op *);
 
+/* Arch hooks */
+int xsplice_verify_elf(struct xsplice_elf *elf, uint8_t *data);
+int xsplice_perform_rel(struct xsplice_elf *elf,
+                        struct xsplice_elf_sec *base,
+                        struct xsplice_elf_sec *rela);
+int xsplice_perform_rela(struct xsplice_elf *elf,
+                         struct xsplice_elf_sec *base,
+                         struct xsplice_elf_sec *rela);
 #endif /* __XEN_XSPLICE_H__ */
diff --git a/xen/include/xen/xsplice_elf.h b/xen/include/xen/xsplice_elf.h
index 60c932b..229c11f 100644
--- a/xen/include/xen/xsplice_elf.h
+++ b/xen/include/xen/xsplice_elf.h
@@ -9,8 +9,10 @@ struct xsplice_elf_sec {
     Elf_Shdr *sec;                 /* Hooked up in elf_resolve_sections. */
     const char *name;              /* Human readable name hooked in
                                       elf_resolve_section_names. */
-    const char uint8_t *data;      /* Pointer to the section (done by
+    const uint8_t *data;           /* Pointer to the section (done by
                                       elf_resolve_sections). */
+    uint8_t *load_addr;            /* A pointer to the allocated destination.
+                                      Done by load_payload_data. */
 };
 
 struct xsplice_elf_sym {
@@ -34,4 +36,7 @@ struct xsplice_elf_sec *xsplice_elf_sec_by_name(const struct xsplice_elf *elf,
 int xsplice_elf_load(struct xsplice_elf *elf, uint8_t *data);
 void xsplice_elf_free(struct xsplice_elf *elf);
 
+int xsplice_elf_resolve_symbols(struct xsplice_elf *elf);
+int xsplice_elf_perform_relocs(struct xsplice_elf *elf);
+
 #endif /* __XEN_XSPLICE_ELF_H__ */
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 09/13] xsplice: Implement support for applying/reverting/replacing patches. (v2)
  2016-01-14 21:46 [PATCH v2] xSplice v1 implementation Konrad Rzeszutek Wilk
                   ` (7 preceding siblings ...)
  2016-01-14 21:47 ` [PATCH v2 08/13] xsplice: Implement payload loading (v2) Konrad Rzeszutek Wilk
@ 2016-01-14 21:47 ` Konrad Rzeszutek Wilk
  2016-01-19 14:39   ` Ross Lagerwall
  2016-01-14 21:47 ` [PATCH v2 10/13] xen_hello_world.xsplice: Test payload for patching 'xen_extra_version' Konrad Rzeszutek Wilk
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-14 21:47 UTC (permalink / raw)
  To: xen-devel, ross.lagerwall, mpohlack, andrew.cooper3,
	stefano.stabellini, jbeulich, ian.jackson, ian.campbell,
	wei.liu2, sasha.levin
  Cc: Konrad Rzeszutek Wilk

From: Ross Lagerwall <ross.lagerwall@citrix.com>

Implement support for the apply, revert and replace actions.

To perform and action on a payload, the hypercall sets up a data
structure to schedule the work.  A hook is added in all the
return-to-guest paths to check for work to do and execute it if needed.
In this way, patches can be applied with all CPUs idle and without
stacks.  The first CPU to do_xsplice() becomes the master and triggers a
reschedule softirq to trigger all the other CPUs to enter do_xsplice()
with no stack.  Once all CPUs have rendezvoused, all CPUs disable IRQs
and NMIs are ignored. The system is then quiscient and the master
performs the action.  After this, all CPUs enable IRQs and NMIs are
re-enabled.

The action to perform is one of:
- APPLY: For each function in the module, store the first 5 bytes of the
  old function and replace it with a jump to the new function.
- REVERT: Copy the previously stored bytes into the first 5 bytes of the
  old function.
- REPLACE: Revert each applied module and then apply the new module.

To prevent a deadlock with any other barrier in the system, the master
will wait for up to 30ms before timing out.  I've taken some
measurements and found the patch application to take about 100 μs on a
72 CPU system, whether idle or fully loaded.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
--
 v2: - Pluck the 'struct xsplice_patch_func' in this patch.
     - Modify code per review comments.
     - Add more data in the keyboard handler.
     - Redo the patching code, split it in functions.
---
 xen/arch/arm/xsplice.c      |  10 +-
 xen/arch/x86/domain.c       |   4 +
 xen/arch/x86/hvm/svm/svm.c  |   2 +
 xen/arch/x86/hvm/vmx/vmcs.c |   2 +
 xen/arch/x86/xsplice.c      |  19 +++
 xen/common/xsplice.c        | 389 ++++++++++++++++++++++++++++++++++++++++----
 xen/include/asm-arm/nmi.h   |  13 ++
 xen/include/xen/xsplice.h   |  24 +++
 8 files changed, 432 insertions(+), 31 deletions(-)

diff --git a/xen/arch/arm/xsplice.c b/xen/arch/arm/xsplice.c
index 8d85fa9..06f6875 100644
--- a/xen/arch/arm/xsplice.c
+++ b/xen/arch/arm/xsplice.c
@@ -3,7 +3,15 @@
 #include <xen/xsplice_elf.h>
 #include <xen/xsplice.h>
 
-int xsplice_verify_elf(uint8_t *data, ssize_t len)
+void xsplice_apply_jmp(struct xsplice_patch_func *func)
+{
+}
+
+void xsplice_revert_jmp(struct xsplice_patch_func *func)
+{
+}
+
+int xsplice_verify_elf(struct xsplice_elf *elf, uint8_t *data)
 {
     return -ENOSYS;
 }
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index e70c125..03ac0d7 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -36,6 +36,7 @@
 #include <xen/cpu.h>
 #include <xen/wait.h>
 #include <xen/guest_access.h>
+#include <xen/xsplice.h>
 #include <public/sysctl.h>
 #include <public/hvm/hvm_vcpu.h>
 #include <asm/regs.h>
@@ -121,6 +122,7 @@ static void idle_loop(void)
         (*pm_idle)();
         do_tasklet();
         do_softirq();
+        do_xsplice(); /* Must be last. */
     }
 }
 
@@ -137,6 +139,7 @@ void startup_cpu_idle_loop(void)
 
 static void noreturn continue_idle_domain(struct vcpu *v)
 {
+    do_xsplice();
     reset_stack_and_jump(idle_loop);
 }
 
@@ -144,6 +147,7 @@ static void noreturn continue_nonidle_domain(struct vcpu *v)
 {
     check_wakeup_from_wait();
     mark_regs_dirty(guest_cpu_user_regs());
+    do_xsplice();
     reset_stack_and_jump(ret_from_intr);
 }
 
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index a66d854..3ea5b97 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -26,6 +26,7 @@
 #include <xen/hypercall.h>
 #include <xen/domain_page.h>
 #include <xen/xenoprof.h>
+#include <xen/xsplice.h>
 #include <asm/current.h>
 #include <asm/io.h>
 #include <asm/paging.h>
@@ -1096,6 +1097,7 @@ static void noreturn svm_do_resume(struct vcpu *v)
 
     hvm_do_resume(v);
 
+    do_xsplice();
     reset_stack_and_jump(svm_asm_do_resume);
 }
 
diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 5bc3c74..1008163 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -25,6 +25,7 @@
 #include <xen/kernel.h>
 #include <xen/keyhandler.h>
 #include <xen/vm_event.h>
+#include <xen/xsplice.h>
 #include <asm/current.h>
 #include <asm/cpufeature.h>
 #include <asm/processor.h>
@@ -1716,6 +1717,7 @@ void vmx_do_resume(struct vcpu *v)
     }
 
     hvm_do_resume(v);
+    do_xsplice();
     reset_stack_and_jump(vmx_asm_do_vmentry);
 }
 
diff --git a/xen/arch/x86/xsplice.c b/xen/arch/x86/xsplice.c
index 7b13511..012ddfe 100644
--- a/xen/arch/x86/xsplice.c
+++ b/xen/arch/x86/xsplice.c
@@ -3,6 +3,25 @@
 #include <xen/xsplice_elf.h>
 #include <xen/xsplice.h>
 
+#define PATCH_INSN_SIZE 5
+
+void xsplice_apply_jmp(struct xsplice_patch_func *func)
+{
+    uint32_t val;
+    uint8_t *old_ptr;
+
+    old_ptr = (uint8_t *)func->old_addr;
+    memcpy(func->undo, old_ptr, PATCH_INSN_SIZE);
+    *old_ptr++ = 0xe9; /* Relative jump */
+    val = func->new_addr - func->old_addr - PATCH_INSN_SIZE;
+    memcpy(old_ptr, &val, sizeof val);
+}
+
+void xsplice_revert_jmp(struct xsplice_patch_func *func)
+{
+    memcpy((void *)func->old_addr, func->undo, PATCH_INSN_SIZE);
+}
+
 int xsplice_verify_elf(struct xsplice_elf *elf, uint8_t *data)
 {
 
diff --git a/xen/common/xsplice.c b/xen/common/xsplice.c
index 67f6fc7..5abeb28 100644
--- a/xen/common/xsplice.c
+++ b/xen/common/xsplice.c
@@ -3,6 +3,7 @@
  *
  */
 
+#include <xen/cpu.h>
 #include <xen/guest_access.h>
 #include <xen/keyhandler.h>
 #include <xen/lib.h>
@@ -10,25 +11,38 @@
 #include <xen/mm.h>
 #include <xen/sched.h>
 #include <xen/smp.h>
+#include <xen/softirq.h>
 #include <xen/spinlock.h>
+#include <xen/wait.h>
 #include <xen/xsplice_elf.h>
 #include <xen/xsplice.h>
 
 #include <asm/event.h>
+#include <asm/nmi.h>
 #include <public/sysctl.h>
 
-static DEFINE_SPINLOCK(payload_list_lock);
+/*
+ * Protects against payload_list operations and also allows only one
+ * caller in schedule_work.
+ */
+static DEFINE_SPINLOCK(payload_lock);
 static LIST_HEAD(payload_list);
 
+static LIST_HEAD(applied_list);
+
 static unsigned int payload_cnt;
 static unsigned int payload_version = 1;
 
 struct payload {
     int32_t state;                       /* One of the XSPLICE_STATE_*. */
     int32_t rc;                          /* 0 or -XEN_EXX. */
+    uint32_t timeout;                    /* Timeout to do the operation. */
     struct list_head list;               /* Linked to 'payload_list'. */
     void *payload_address;               /* Virtual address mapped. */
     size_t payload_pages;                /* Nr of the pages. */
+    struct list_head applied_list;       /* Linked to 'applied_list'. */
+    struct xsplice_patch_func *funcs;    /* The array of functions to patch. */
+    unsigned int nfuncs;                 /* Nr of functions to patch. */
 
     char name[XEN_XSPLICE_NAME_SIZE + 1];/* Name of it. */
 };
@@ -36,6 +50,23 @@ struct payload {
 static int load_payload_data(struct payload *payload, uint8_t *raw, ssize_t len);
 static void free_payload_data(struct payload *payload);
 
+/* Defines an outstanding patching action. */
+struct xsplice_work
+{
+    atomic_t semaphore;          /* Used for rendezvous. First to grab it will
+                                    do the patching. */
+    atomic_t irq_semaphore;      /* Used to signal all IRQs disabled. */
+    struct payload *data;        /* The payload on which to act. */
+    volatile bool_t do_work;     /* Signals work to do. */
+    volatile bool_t ready;       /* Signals all CPUs synchronized. */
+    uint32_t cmd;                /* Action request: XSPLICE_ACTION_* */
+};
+
+/* There can be only one outstanding patching action. */
+static struct xsplice_work xsplice_work;
+
+static int schedule_work(struct payload *data, uint32_t cmd);
+
 static const char *state2str(int32_t state)
 {
 #define STATE(x) [XSPLICE_STATE_##x] = #x
@@ -61,15 +92,24 @@ static const char *state2str(int32_t state)
 static void xsplice_printall(unsigned char key)
 {
     struct payload *data;
+    unsigned int i;
 
-    spin_lock(&payload_list_lock);
+    spin_lock(&payload_lock);
 
     list_for_each_entry ( data, &payload_list, list )
-        printk(" name=%s state=%s(%d) %p using %zu pages.\n", data->name,
+    {
+        printk(" name=%s state=%s(%d) %p using %zu pages:\n", data->name,
                state2str(data->state), data->state, data->payload_address,
                data->payload_pages);
 
-    spin_unlock(&payload_list_lock);
+        for ( i = 0; i < data->nfuncs; i ++ )
+        {
+            struct xsplice_patch_func *f = &(data->funcs[i]);
+            printk("    %s patch 0x%lx(%u) with 0x%lx(%u)\n",
+                   f->name, f->old_addr, f->old_size, f->new_addr, f->new_size);
+        }
+    }
+    spin_unlock(&payload_lock);
 }
 
 static int verify_name(xen_xsplice_name_t *name)
@@ -103,7 +143,7 @@ static int find_payload(xen_xsplice_name_t *name, bool_t need_lock,
         return -EFAULT;
 
     if ( need_lock )
-        spin_lock(&payload_list_lock);
+        spin_lock(&payload_lock);
 
     rc = -ENOENT;
     list_for_each_entry ( data, &payload_list, list )
@@ -117,7 +157,7 @@ static int find_payload(xen_xsplice_name_t *name, bool_t need_lock,
     }
 
     if ( need_lock )
-        spin_unlock(&payload_list_lock);
+        spin_unlock(&payload_lock);
 
     return rc;
 }
@@ -137,7 +177,7 @@ static int verify_payload(xen_sysctl_xsplice_upload_t *upload)
 }
 
 /*
- * We MUST be holding the payload_list_lock spinlock.
+ * We MUST be holding the payload_lock spinlock.
  */
 static void free_payload(struct payload *data)
 {
@@ -191,11 +231,11 @@ static int xsplice_upload(xen_sysctl_xsplice_upload_t *upload)
     data->rc = 0;
     INIT_LIST_HEAD(&data->list);
 
-    spin_lock(&payload_list_lock);
+    spin_lock(&payload_lock);
     list_add_tail(&data->list, &payload_list);
     payload_cnt++;
     payload_version++;
-    spin_unlock(&payload_list_lock);
+    spin_unlock(&payload_lock);
 
     free_xenheap_pages(raw_data, get_order_from_bytes(upload->size));
     return 0;
@@ -250,10 +290,10 @@ static int xsplice_list(xen_sysctl_xsplice_list_t *list)
          !guest_handle_okay(list->len, sizeof(uint32_t) * list->nr) )
         return -EINVAL;
 
-    spin_lock(&payload_list_lock);
+    spin_lock(&payload_lock);
     if ( list->idx > payload_cnt || !list->nr )
     {
-        spin_unlock(&payload_list_lock);
+        spin_unlock(&payload_lock);
         return -EINVAL;
     }
 
@@ -283,7 +323,7 @@ static int xsplice_list(xen_sysctl_xsplice_list_t *list)
     }
     list->nr = payload_cnt - i; /* Remaining amount. */
     list->version = payload_version;
-    spin_unlock(&payload_list_lock);
+    spin_unlock(&payload_lock);
 
     /* And how many we have processed. */
     return rc ? : idx;
@@ -298,7 +338,7 @@ static int xsplice_action(xen_sysctl_xsplice_action_t *action)
     if ( rc )
         return rc;
 
-    spin_lock(&payload_list_lock);
+    spin_lock(&payload_lock);
     rc = find_payload(&action->name, 0 /* We are holding the lock. */, &data);
     if ( rc )
         goto out;
@@ -327,28 +367,25 @@ static int xsplice_action(xen_sysctl_xsplice_action_t *action)
     case XSPLICE_ACTION_REVERT:
         if ( data->state == XSPLICE_STATE_APPLIED )
         {
-            /* No implementation yet. */
-            data->state = XSPLICE_STATE_CHECKED;
-            data->rc = 0;
-            rc = 0;
+            data->rc = -EAGAIN;
+            data->timeout = action->timeout;
+            rc = schedule_work(data, action->cmd);
         }
         break;
     case XSPLICE_ACTION_APPLY:
         if ( (data->state == XSPLICE_STATE_CHECKED) )
         {
-            /* No implementation yet. */
-            data->state = XSPLICE_STATE_APPLIED;
-            data->rc = 0;
-            rc = 0;
+            data->rc = -EAGAIN;
+            data->timeout = action->timeout;
+            rc = schedule_work(data, action->cmd);
         }
         break;
     case XSPLICE_ACTION_REPLACE:
         if ( data->state == XSPLICE_STATE_CHECKED )
         {
-            /* No implementation yet. */
-            data->state = XSPLICE_STATE_CHECKED;
-            data->rc = 0;
-            rc = 0;
+            data->rc = -EAGAIN;
+            data->timeout = action->timeout;
+            rc = schedule_work(data, action->cmd);
         }
         break;
     default:
@@ -357,7 +394,7 @@ static int xsplice_action(xen_sysctl_xsplice_action_t *action)
     }
 
  out:
-    spin_unlock(&payload_list_lock);
+    spin_unlock(&payload_lock);
 
     return rc;
 }
@@ -391,12 +428,13 @@ int xsplice_control(xen_sysctl_xsplice_op_t *xsplice)
     return rc;
 }
 
+#ifdef CONFIG_X86
 static void find_hole(ssize_t pages, unsigned long *hole_start,
                       unsigned long *hole_end)
 {
     struct payload *data, *data2;
 
-    spin_lock(&payload_list_lock);
+    spin_lock(&payload_lock);
     list_for_each_entry ( data, &payload_list, list )
     {
         list_for_each_entry ( data2, &payload_list, list )
@@ -415,7 +453,7 @@ static void find_hole(ssize_t pages, unsigned long *hole_start,
         if ( &data2->list == &payload_list )
             break;
     }
-    spin_unlock(&payload_list_lock);
+    spin_unlock(&payload_lock);
 }
 
 /*
@@ -423,7 +461,6 @@ static void find_hole(ssize_t pages, unsigned long *hole_start,
  * allocating space, loading the allocated sections, resolving symbols,
  * performing relocations, etc.
  */
-#ifdef CONFIG_X86
 static void *alloc_payload(size_t size)
 {
     mfn_t *mfn, *mfn_ptr;
@@ -572,6 +609,47 @@ static int move_payload(struct payload *payload, struct xsplice_elf *elf)
     return 0;
 }
 
+static int find_special_sections(struct payload *payload,
+                                 struct xsplice_elf *elf)
+{
+    struct xsplice_elf_sec *sec;
+    unsigned int i;
+    struct xsplice_patch_func *f;
+
+    sec = xsplice_elf_sec_by_name(elf, ".xsplice.funcs");
+    if ( !sec )
+    {
+        printk(XENLOG_ERR "%s: .xsplice.funcs is missing!\n", elf->name);
+        return -EINVAL;
+    }
+
+    if ( ( !sec->sec->sh_size ) ||
+         ( sec->sec->sh_size % sizeof *payload->funcs ) )
+        return -EINVAL;
+
+    payload->funcs = (struct xsplice_patch_func *)sec->load_addr;
+    payload->nfuncs = sec->sec->sh_size / (sizeof *payload->funcs);
+
+    for ( i = 0; i < payload->nfuncs; i++ )
+    {
+        unsigned int j;
+
+        f = &(payload->funcs[i]);
+
+        if ( !f->new_addr || !f->old_addr || !f->old_size || !f->new_size )
+            return -EINVAL;
+
+        for ( j = 0; j < 8; j ++ )
+            if ( f->undo[j] )
+                return -EINVAL;
+
+        for ( j = 0; j < 24; j ++ )
+            if ( f->pad[j] )
+                return -EINVAL;
+    }
+    return 0;
+}
+
 static int load_payload_data(struct payload *payload, uint8_t *raw, ssize_t len)
 {
     struct xsplice_elf elf;
@@ -601,7 +679,10 @@ static int load_payload_data(struct payload *payload, uint8_t *raw, ssize_t len)
     if ( rc )
         goto err_payload;
 
-    /* Free our temporary data structure. */
+    rc = find_special_sections(payload, &elf);
+    if ( rc )
+        goto err_payload;
+
     xsplice_elf_free(&elf);
     return 0;
 
@@ -613,6 +694,254 @@ static int load_payload_data(struct payload *payload, uint8_t *raw, ssize_t len)
     return rc;
 }
 
+
+/*
+ * The following functions get the CPUs into an appropriate state and
+ * apply (or revert) each of the module's functions.
+ */
+
+/*
+ * This function is executed having all other CPUs with no stack (we may
+ * have cpu_idle on it) and IRQs disabled. We guard against NMI by temporarily
+ * installing our NOP NMI handler.
+ */
+static int apply_payload(struct payload *data)
+{
+    unsigned int i;
+
+    printk(XENLOG_DEBUG "%s: Applying %u functions.\n", data->name,
+           data->nfuncs);
+
+    for ( i = 0; i < data->nfuncs; i++ )
+        xsplice_apply_jmp(data->funcs + i);
+
+    list_add_tail(&data->applied_list, &applied_list);
+
+    return 0;
+}
+
+/*
+ * This function is executed having all other CPUs with no stack (we may
+ * have cpu_idle on it) and IRQs disabled.
+ */
+static int revert_payload(struct payload *data)
+{
+    unsigned int i;
+
+    printk(XENLOG_DEBUG "%s: Reverting.\n", data->name);
+
+    for ( i = 0; i < data->nfuncs; i++ )
+        xsplice_revert_jmp(data->funcs + i);
+
+    list_del(&data->applied_list);
+
+    return 0;
+}
+
+/* Must be holding the payload_list lock. */
+static int schedule_work(struct payload *data, uint32_t cmd)
+{
+    /* Fail if an operation is already scheduled. */
+    if ( xsplice_work.do_work )
+        return -EAGAIN;
+
+    xsplice_work.cmd = cmd;
+    xsplice_work.data = data;
+    atomic_set(&xsplice_work.semaphore, -1);
+    atomic_set(&xsplice_work.irq_semaphore, -1);
+
+    xsplice_work.ready = 0;
+    smp_wmb();
+    xsplice_work.do_work = 1;
+    smp_wmb();
+
+    return 0;
+}
+
+/*
+ * Note that because of this NOP code the do_nmi is not safely patchable.
+ * Also if we do receive 'real' NMIs we have lost them.
+ */
+static int mask_nmi_callback(const struct cpu_user_regs *regs, int cpu)
+{
+    return 1;
+}
+
+static void reschedule_fn(void *unused)
+{
+    smp_mb(); /* Synchronize with setting do_work */
+    raise_softirq(SCHEDULE_SOFTIRQ);
+}
+
+static int xsplice_do_wait(atomic_t *counter, s_time_t timeout,
+                           unsigned int total_cpus, const char *s)
+{
+    int rc = 0;
+
+    while ( atomic_read(counter) != total_cpus && NOW() < timeout )
+        cpu_relax();
+
+    /* Log & abort. */
+    if ( atomic_read(counter) != total_cpus )
+    {
+        printk(XENLOG_DEBUG "%s: %s %u/%u\n", xsplice_work.data->name,
+               s, atomic_read(counter), total_cpus);
+        rc = -EBUSY;
+        xsplice_work.data->rc = rc;
+        xsplice_work.do_work = 0;
+        smp_wmb();
+        return rc;
+    }
+    return rc;
+}
+
+static void xsplice_do_single(unsigned int total_cpus)
+{
+    nmi_callback_t saved_nmi_callback;
+    s_time_t timeout;
+    struct payload *data, *tmp;
+    int rc;
+
+    data = xsplice_work.data;
+    timeout = data->timeout ? data->timeout : MILLISECS(30);
+    printk(XENLOG_DEBUG "%s: timeout is %"PRI_stime"ms\n", data->name,
+           timeout / MILLISECS(1));
+
+    timeout += NOW();
+
+    if ( xsplice_do_wait(&xsplice_work.semaphore, timeout, total_cpus,
+                         "Timed out on CPU semaphore") )
+        return;
+
+    /* "Mask" NMIs. */
+    saved_nmi_callback = set_nmi_callback(mask_nmi_callback);
+
+    /* All CPUs are waiting, now signal to disable IRQs. */
+    xsplice_work.ready = 1;
+    smp_wmb();
+
+    atomic_inc(&xsplice_work.irq_semaphore);
+    if ( xsplice_do_wait(&xsplice_work.irq_semaphore, timeout, total_cpus,
+                         "Timed out on IRQ semaphore.") )
+        return;
+
+    local_irq_disable();
+    /* Now this function should be the only one on any stack.
+     * No need to lock the payload list or applied list. */
+    switch ( xsplice_work.cmd )
+    {
+    case XSPLICE_ACTION_APPLY:
+        rc = apply_payload(data);
+        if ( rc == 0 )
+            data->state = XSPLICE_STATE_APPLIED;
+        break;
+    case XSPLICE_ACTION_REVERT:
+        rc = revert_payload(data);
+        if ( rc == 0 )
+            data->state = XSPLICE_STATE_CHECKED;
+        break;
+    case XSPLICE_ACTION_REPLACE:
+        list_for_each_entry_safe_reverse ( data, tmp, &applied_list, list )
+        {
+            data->rc = revert_payload(data);
+            if ( data->rc == 0 )
+                data->state = XSPLICE_STATE_CHECKED;
+            else
+            {
+                rc = -EINVAL;
+                break;
+            }
+        }
+        if ( rc != -EINVAL )
+        {
+            rc = apply_payload(xsplice_work.data);
+            if ( rc == 0 )
+                xsplice_work.data->state = XSPLICE_STATE_APPLIED;
+        }
+        break;
+    default:
+        rc = -EINVAL;
+        break;
+    }
+
+    xsplice_work.data->rc = rc;
+
+    local_irq_enable();
+    set_nmi_callback(saved_nmi_callback);
+
+    xsplice_work.do_work = 0;
+    smp_wmb(); /* Synchronize with waiting CPUs. */
+}
+
+/*
+ * The main function which manages the work of quiescing the system and
+ * patching code.
+ */
+void do_xsplice(void)
+{
+    struct payload *p = xsplice_work.data;
+    unsigned int cpu = smp_processor_id();
+
+    /* Fast path: no work to do. */
+    if ( likely(!xsplice_work.do_work) )
+        return;
+
+    ASSERT(local_irq_is_enabled());
+
+    /* Set at -1, so will go up to num_online_cpus - 1 */
+    if ( atomic_inc_and_test(&xsplice_work.semaphore) )
+    {
+        unsigned int total_cpus;
+
+        if ( !get_cpu_maps() )
+        {
+            printk(XENLOG_DEBUG "%s: CPU%u - unable to get cpu_maps lock.\n",
+                   p->name, cpu);
+            xsplice_work.data->rc = -EBUSY;
+            xsplice_work.do_work = 0;
+            return;
+        }
+
+        barrier(); /* MUST do it after get_cpu_maps. */
+        total_cpus = num_online_cpus() - 1;
+
+        if ( total_cpus )
+        {
+            printk(XENLOG_DEBUG "%s: CPU%u - IPIing the %u CPUs.\n", p->name,
+                   cpu, total_cpus);
+            smp_call_function(reschedule_fn, NULL, 0);
+        }
+        (void)xsplice_do_single(total_cpus);
+
+        ASSERT(local_irq_is_enabled());
+
+        put_cpu_maps();
+
+        printk(XENLOG_DEBUG "%s finished with rc=%d\n", p->name, p->rc);
+    }
+    else
+    {
+        /* Wait for all CPUs to rendezvous. */
+        while ( xsplice_work.do_work && !xsplice_work.ready )
+        {
+            cpu_relax();
+            smp_rmb();
+        }
+
+        /* Disable IRQs and signal. */
+        local_irq_disable();
+        atomic_inc(&xsplice_work.irq_semaphore);
+
+        /* Wait for patching to complete. */
+        while ( xsplice_work.do_work )
+        {
+            cpu_relax();
+            smp_rmb();
+        }
+        local_irq_enable();
+    }
+}
+
 static int __init xsplice_init(void)
 {
     register_keyhandler('x', xsplice_printall, "print xsplicing info", 1);
diff --git a/xen/include/asm-arm/nmi.h b/xen/include/asm-arm/nmi.h
index a60587e..82aff35 100644
--- a/xen/include/asm-arm/nmi.h
+++ b/xen/include/asm-arm/nmi.h
@@ -4,6 +4,19 @@
 #define register_guest_nmi_callback(a)  (-ENOSYS)
 #define unregister_guest_nmi_callback() (-ENOSYS)
 
+typedef int (*nmi_callback_t)(const struct cpu_user_regs *regs, int cpu);
+
+/**
+ * set_nmi_callback
+ *
+ * Set a handler for an NMI. Only one handler may be
+ * set. Return the old nmi callback handler.
+ */
+static inline nmi_callback_t set_nmi_callback(nmi_callback_t callback)
+{
+    return NULL;
+}
+
 #endif /* ASM_NMI_H */
 /*
  * Local variables:
diff --git a/xen/include/xen/xsplice.h b/xen/include/xen/xsplice.h
index b90742f..61a9ec6 100644
--- a/xen/include/xen/xsplice.h
+++ b/xen/include/xen/xsplice.h
@@ -6,8 +6,30 @@ struct xsplice_elf_sec;
 struct xsplice_elf_sym;
 struct xen_sysctl_xsplice_op;
 
+/*
+ * The structure which defines the patching. This is what the hypervisor
+ * expects in the '.xsplice.func' section of the ELF file.
+ *
+ * This MUST be in sync with what the tools generate.
+ */
+struct xsplice_patch_func {
+    const char *name;
+    unsigned long new_addr;
+    const unsigned long old_addr;
+    uint32_t new_size;
+    const uint32_t old_size;
+    uint8_t undo[8];
+    uint8_t pad[24];
+};
+
 int xsplice_control(struct xen_sysctl_xsplice_op *);
 
+#ifdef CONFIG_XSPLICE
+void do_xsplice(void);
+#else
+static inline void do_xsplice(void) { };
+#endif
+
 /* Arch hooks */
 int xsplice_verify_elf(struct xsplice_elf *elf, uint8_t *data);
 int xsplice_perform_rel(struct xsplice_elf *elf,
@@ -16,4 +38,6 @@ int xsplice_perform_rel(struct xsplice_elf *elf,
 int xsplice_perform_rela(struct xsplice_elf *elf,
                          struct xsplice_elf_sec *base,
                          struct xsplice_elf_sec *rela);
+void xsplice_apply_jmp(struct xsplice_patch_func *func);
+void xsplice_revert_jmp(struct xsplice_patch_func *func);
 #endif /* __XEN_XSPLICE_H__ */
-- 
2.1.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 10/13] xen_hello_world.xsplice: Test payload for patching 'xen_extra_version'.
  2016-01-14 21:46 [PATCH v2] xSplice v1 implementation Konrad Rzeszutek Wilk
                   ` (8 preceding siblings ...)
  2016-01-14 21:47 ` [PATCH v2 09/13] xsplice: Implement support for applying/reverting/replacing patches. (v2) Konrad Rzeszutek Wilk
@ 2016-01-14 21:47 ` Konrad Rzeszutek Wilk
  2016-01-19 11:14   ` Wei Liu
                     ` (2 more replies)
  2016-01-14 21:47 ` [PATCH v2 11/13] xsplice: Add support for bug frames. (v2) Konrad Rzeszutek Wilk
                   ` (3 subsequent siblings)
  13 siblings, 3 replies; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-14 21:47 UTC (permalink / raw)
  To: xen-devel, ross.lagerwall, mpohlack, andrew.cooper3,
	stefano.stabellini, jbeulich, ian.jackson, ian.campbell,
	wei.liu2, sasha.levin
  Cc: Konrad Rzeszutek Wilk

This change demonstrates how to generate an xSplice ELF payload.

The idea here is that we want to patch in the hypervisor
the 'xen_version_extra' function with an function that will
return 'Hello World'. The 'xl info | grep extraversion'
will reflect the new value after the patching.

To generate this ELF payload file we need:
 - C code of the new code.
 - C code generating the .xsplice.func structure.
 - The address of the old code (xen_extra_version). We
   do it by using 'nm' but that is a bit of hack.

The linker script file:
 - Discards .debug* and .comments* sections.
 - Changes the name of .data.local.xsplice_hello_world to
   .xsplice.func
 - Figures out the size of the new code.

Also if you are curious on the input/output sections
magic the linker does, add these to the GCC line:
  -Wl,-M  -Wl,-t -Wl,-verbose
which are: print linking map, provide trace and be verbose.

The use-case is simple:

$xen-xsplice load /usr/lib/xen/bin/xen_hello_world.xsplice
$xen-xsplice list
 ID                                     | status
----------------------------------------+------------
xen_hello_world                           APPLIED
$xl info | grep extra
xen_extra              : Hello World
$xen-xsplice revert xen_hello_world
Performing revert: completed
$xen-xsplice unload xen_hello_world
Performing unload: completed
$xl info | grep extra
xen_extra              : -unstable

Note that it does not build under a 32-bit toolstack as
there is no access to the hypervisor (xen-syms).

We also force it to be built every time - as the hypervisor
may have been rebuilt.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 docs/misc/xsplice.markdown   | 50 ++++++++++++++++++++++++++++++++++++++++++++
 tools/misc/Makefile          | 25 +++++++++++++++++++++-
 tools/misc/xen_hello_world.c | 15 +++++++++++++
 tools/misc/xsplice.h         | 12 +++++++++++
 tools/misc/xsplice.lds       | 11 ++++++++++
 5 files changed, 112 insertions(+), 1 deletion(-)
 create mode 100644 tools/misc/xen_hello_world.c
 create mode 100644 tools/misc/xsplice.h
 create mode 100644 tools/misc/xsplice.lds

diff --git a/docs/misc/xsplice.markdown b/docs/misc/xsplice.markdown
index beb452e..e2cdcff 100644
--- a/docs/misc/xsplice.markdown
+++ b/docs/misc/xsplice.markdown
@@ -312,11 +312,61 @@ size.
 
 When applying the patch the hypervisor iterates over each `xsplice_patch_func`
 structure and the core code inserts a trampoline at `old_addr` to `new_addr`.
+The `new_addr` is altered when the ELF payload is loaded.
 
 When reverting a patch, the hypervisor iterates over each `xsplice_patch_func`
 and the core code copies the data from the undo buffer (private internal copy)
 to `old_addr`.
 
+### Example
+
+A simple example of what a payload file can be:
+
+<pre>
+/* MUST be in sync with hypervisor. */  
+struct xsplice_patch_func {  
+    const char *name;  
+    unsigned long new_addr;  
+    const unsigned long old_addr;  
+    uint32_t new_size;  
+    const uint32_t old_size;  
+    uint8_t pad[32];  
+};  
+
+/* Our replacement function for xen_extra_version. */  
+const char *xen_hello_world(void)  
+{  
+    return "Hello World";  
+}  
+
+struct xsplice_patch_func xsplice_hello_world = {  
+    .name = "xen_extra_version",  
+    .new_addr = &xen_hello_world,  
+    .old_addr = 0xffff82d08013963c, /* Extracted from xen-syms. */  
+    .new_size = 13, /* To be be computed by scripts. */  
+    .old_size = 13, /* -----------""---------------  */  
+};  
+</pre>
+
+With the linker script as follow to change the `xsplice_hello_world`
+do be `.xsplice.funcs` :
+
+<pre>
+OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64", "elf64-x86-64")  
+OUTPUT_ARCH(i386:x86-64)  
+ENTRY(xsplice_hello_world)  
+SECTIONS  
+{  
+    /* The hypervisor expects ".xsplice.func", so change  
+     * the ".data.xsplice_hello_world" to it. */  
+
+    .xsplice.funcs : { *(*.xsplice_hello_world) }  
+    }  
+}  
+</pre>
+
+Code must be compiled with -fPIC.
+
 ## Hypercalls
 
 We will employ the sub operations of the system management hypercall (sysctl).
diff --git a/tools/misc/Makefile b/tools/misc/Makefile
index c46873e..8385830 100644
--- a/tools/misc/Makefile
+++ b/tools/misc/Makefile
@@ -36,6 +36,10 @@ INSTALL_SBIN += $(INSTALL_SBIN-y)
 # Everything to be installed in a private bin/
 INSTALL_PRIVBIN                += xenpvnetboot
 
+# We need the hypervisor - and only 64-bit builds have it.
+ifeq ($(XEN_COMPILE_ARCH),x86_64)
+INSTALL_PRIVBIN                += xen_hello_world.xsplice
+endif
 # Everything to be installed
 TARGETS_ALL := $(INSTALL_BIN) $(INSTALL_SBIN) $(INSTALL_PRIVBIN)
 
@@ -49,7 +53,7 @@ TARGETS_COPY += xenpvnetboot
 # Everything which needs to be built
 TARGETS_BUILD := $(filter-out $(TARGETS_COPY),$(TARGETS_ALL))
 
-.PHONY: all build
+.PHONY: all build xsplice
 all build: $(TARGETS_BUILD)
 
 .PHONY: install
@@ -111,4 +115,23 @@ gtraceview: gtraceview.o
 xencov: xencov.o
 	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
 
+.PHONY: xsplice
+xsplice:
+ifeq ($(XEN_COMPILE_ARCH),x86_64)
+	# We MUST regenerate the file everytime we build - in case the hypervisor
+	# is rebuilt too.
+	$(RM) *.xplice
+	$(MAKE) xen_hello_world.xsplice
+endif
+
+XEN_EXTRA_VERSION_ADDR=$(shell nm --defined $(XEN_ROOT)/xen/xen-syms | grep xen_extra_version | awk '{print "0x"$$1}')
+
+xen_hello_world.xsplice: xen_hello_world.c
+	$(CC) -DOLD_CODE=$(XEN_EXTRA_VERSION_ADDR) -I$(XEN_ROOT)/tools/include \
+		-fPIC -Wl,--emit-relocs \
+		-Wl,-r -Wl,--entry=xsplice_hello_world \
+		-fdata-sections -ffunction-sections \
+		-nostdlib -Txsplice.lds \
+		-o $@ $<
+	@objdump -x --section=.xsplice.funcs $@
 -include $(DEPS)
diff --git a/tools/misc/xen_hello_world.c b/tools/misc/xen_hello_world.c
new file mode 100644
index 0000000..8c24d8f
--- /dev/null
+++ b/tools/misc/xen_hello_world.c
@@ -0,0 +1,15 @@
+#include "xsplice.h"
+
+/* Our replacement function for xen_extra_version. */
+const char *xen_hello_world(void)
+{
+    return "Hello World";
+}
+
+struct xsplice_patch_func xsplice_hello_world = {
+    .name = "xen_extra_version",
+    .new_addr = &xen_hello_world,
+    .old_addr = OLD_CODE,
+    .new_size = 13, /* TODO: Compute. */
+    .old_size = 13, /* TODO: Compute. */
+};
diff --git a/tools/misc/xsplice.h b/tools/misc/xsplice.h
new file mode 100644
index 0000000..6ce8bae
--- /dev/null
+++ b/tools/misc/xsplice.h
@@ -0,0 +1,12 @@
+#include <stdint.h>
+#include <sys/types.h>
+
+/* MUST be in sync with hypervisor. */
+struct xsplice_patch_func {
+    const char *name;
+    unsigned long new_addr;
+    const unsigned long old_addr;
+    uint32_t new_size;
+    const uint32_t old_size;
+    uint8_t pad[32];
+};
diff --git a/tools/misc/xsplice.lds b/tools/misc/xsplice.lds
new file mode 100644
index 0000000..f52eb8c
--- /dev/null
+++ b/tools/misc/xsplice.lds
@@ -0,0 +1,11 @@
+OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64", "elf64-x86-64")
+OUTPUT_ARCH(i386:x86-64)
+ENTRY(xsplice_hello_world)
+SECTIONS
+{
+    /* The hypervisor expects ".xsplice.func", so change
+     * the ".data.xsplice_hello_world" to it. */
+
+    .xsplice.funcs : { *(*.xsplice_hello_world) }
+
+}
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 11/13] xsplice: Add support for bug frames. (v2)
  2016-01-14 21:46 [PATCH v2] xSplice v1 implementation Konrad Rzeszutek Wilk
                   ` (9 preceding siblings ...)
  2016-01-14 21:47 ` [PATCH v2 10/13] xen_hello_world.xsplice: Test payload for patching 'xen_extra_version' Konrad Rzeszutek Wilk
@ 2016-01-14 21:47 ` Konrad Rzeszutek Wilk
  2016-01-19 14:42   ` Ross Lagerwall
  2016-01-14 21:47 ` [PATCH v2 12/13] xsplice: Add support for exception tables. (v2) Konrad Rzeszutek Wilk
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-14 21:47 UTC (permalink / raw)
  To: xen-devel, ross.lagerwall, mpohlack, andrew.cooper3,
	stefano.stabellini, jbeulich, ian.jackson, ian.campbell,
	wei.liu2, sasha.levin
  Cc: Konrad Rzeszutek Wilk

From: Ross Lagerwall <ross.lagerwall@citrix.com>

Add support for handling bug frames contained with xsplice modules. If a
trap occurs search either the kernel bug table or an applied payload's
bug table depending on the instruction pointer.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
v2:- s/module/payload/
   - add build time check in case amount of bug frames expands.
   - add define for the number of bug-frames.
---
 xen/arch/x86/traps.c      |  30 ++++++++-----
 xen/common/symbols.c      |   7 +++
 xen/common/xsplice.c      | 109 +++++++++++++++++++++++++++++++++++++++++-----
 xen/include/asm-arm/bug.h |   2 +
 xen/include/asm-x86/bug.h |   2 +
 xen/include/xen/kernel.h  |   1 +
 xen/include/xen/xsplice.h |  15 +++++++
 7 files changed, 145 insertions(+), 21 deletions(-)

diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index e105b95..6e80607 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -48,6 +48,7 @@
 #include <xen/kexec.h>
 #include <xen/trace.h>
 #include <xen/paging.h>
+#include <xen/xsplice.h>
 #include <xen/watchdog.h>
 #include <asm/system.h>
 #include <asm/io.h>
@@ -1080,20 +1081,29 @@ void do_invalid_op(struct cpu_user_regs *regs)
         return;
     }
 
-    if ( !is_active_kernel_text(regs->eip) ||
+    if ( !is_active_text(regs->eip) ||
          __copy_from_user(bug_insn, eip, sizeof(bug_insn)) ||
          memcmp(bug_insn, "\xf\xb", sizeof(bug_insn)) )
         goto die;
 
-    for ( bug = __start_bug_frames, id = 0; stop_frames[id]; ++bug )
+    if ( likely(is_active_kernel_text(regs->eip)) )
     {
-        while ( unlikely(bug == stop_frames[id]) )
-            ++id;
-        if ( bug_loc(bug) == eip )
-            break;
+        for ( bug = __start_bug_frames, id = 0; stop_frames[id]; ++bug )
+        {
+            while ( unlikely(bug == stop_frames[id]) )
+                ++id;
+            if ( bug_loc(bug) == eip )
+                break;
+        }
+        if ( !stop_frames[id] )
+            goto die;
+    }
+    else
+    {
+        bug = xsplice_find_bug(eip, &id);
+        if ( !bug )
+            goto die;
     }
-    if ( !stop_frames[id] )
-        goto die;
 
     eip += sizeof(bug_insn);
     if ( id == BUGFRAME_run_fn )
@@ -1107,7 +1117,7 @@ void do_invalid_op(struct cpu_user_regs *regs)
 
     /* WARN, BUG or ASSERT: decode the filename pointer and line number. */
     filename = bug_ptr(bug);
-    if ( !is_kernel(filename) )
+    if ( !is_kernel(filename) && !is_module(filename) )
         goto die;
     fixup = strlen(filename);
     if ( fixup > 50 )
@@ -1134,7 +1144,7 @@ void do_invalid_op(struct cpu_user_regs *regs)
     case BUGFRAME_assert:
         /* ASSERT: decode the predicate string pointer. */
         predicate = bug_msg(bug);
-        if ( !is_kernel(predicate) )
+        if ( !is_kernel(predicate) && !is_module(predicate) )
             predicate = "<unknown>";
 
         printk("Assertion '%s' failed at %s%s:%d\n",
diff --git a/xen/common/symbols.c b/xen/common/symbols.c
index a59c59d..bf5623f 100644
--- a/xen/common/symbols.c
+++ b/xen/common/symbols.c
@@ -17,6 +17,7 @@
 #include <xen/lib.h>
 #include <xen/string.h>
 #include <xen/spinlock.h>
+#include <xen/xsplice.h>
 #include <public/platform.h>
 #include <xen/guest_access.h>
 
@@ -101,6 +102,12 @@ bool_t is_active_kernel_text(unsigned long addr)
             (system_state < SYS_STATE_active && is_kernel_inittext(addr)));
 }
 
+bool_t is_active_text(unsigned long addr)
+{
+    return is_active_kernel_text(addr) ||
+           is_active_module_text(addr);
+}
+
 const char *symbols_lookup(unsigned long addr,
                            unsigned long *symbolsize,
                            unsigned long *offset,
diff --git a/xen/common/xsplice.c b/xen/common/xsplice.c
index 5abeb28..02cb4a8 100644
--- a/xen/common/xsplice.c
+++ b/xen/common/xsplice.c
@@ -43,7 +43,10 @@ struct payload {
     struct list_head applied_list;       /* Linked to 'applied_list'. */
     struct xsplice_patch_func *funcs;    /* The array of functions to patch. */
     unsigned int nfuncs;                 /* Nr of functions to patch. */
-
+    size_t core_size;                    /* Only .text size. */
+    size_t core_text_size;               /* Everything else - .data,.rodata, etc. */
+    struct bug_frame *start_bug_frames[BUGFRAME_NR]; /* .bug.frame patching. */
+    struct bug_frame *stop_bug_frames[BUGFRAME_NR];
     char name[XEN_XSPLICE_NAME_SIZE + 1];/* Name of it. */
 };
 
@@ -544,26 +547,27 @@ static void free_payload_data(struct payload *payload)
     payload->payload_pages = 0;
 }
 
-static void calc_section(struct xsplice_elf_sec *sec, size_t *core_size)
+static void calc_section(struct xsplice_elf_sec *sec, size_t *size)
 {
-    size_t align_size = ROUNDUP(*core_size, sec->sec->sh_addralign);
+    size_t align_size = ROUNDUP(*size, sec->sec->sh_addralign);
     sec->sec->sh_entsize = align_size;
-    *core_size = sec->sec->sh_size + align_size;
+    *size = sec->sec->sh_size + align_size;
 }
 
 static int move_payload(struct payload *payload, struct xsplice_elf *elf)
 {
     uint8_t *buf;
     unsigned int i;
-    size_t core_size = 0;
+    size_t size = 0;
 
     /* Compute text regions */
     for ( i = 0; i < elf->hdr->e_shnum; i++ )
     {
         if ( (elf->sec[i].sec->sh_flags & (SHF_ALLOC|SHF_EXECINSTR)) ==
              (SHF_ALLOC|SHF_EXECINSTR) )
-            calc_section(&elf->sec[i], &core_size);
+            calc_section(&elf->sec[i], &size);
     }
+    payload->core_text_size = size;
 
     /* Compute rw data */
     for ( i = 0; i < elf->hdr->e_shnum; i++ )
@@ -571,7 +575,7 @@ static int move_payload(struct payload *payload, struct xsplice_elf *elf)
         if ( (elf->sec[i].sec->sh_flags & SHF_ALLOC) &&
              !(elf->sec[i].sec->sh_flags & SHF_EXECINSTR) &&
              (elf->sec[i].sec->sh_flags & SHF_WRITE) )
-            calc_section(&elf->sec[i], &core_size);
+            calc_section(&elf->sec[i], &size);
     }
 
     /* Compute ro data */
@@ -580,16 +584,17 @@ static int move_payload(struct payload *payload, struct xsplice_elf *elf)
         if ( (elf->sec[i].sec->sh_flags & SHF_ALLOC) &&
              !(elf->sec[i].sec->sh_flags & SHF_EXECINSTR) &&
              !(elf->sec[i].sec->sh_flags & SHF_WRITE) )
-            calc_section(&elf->sec[i], &core_size);
+            calc_section(&elf->sec[i], &size);
     }
+    payload->core_size = size;
 
-    buf = alloc_payload(core_size);
+    buf = alloc_payload(size);
     if ( !buf ) {
         printk(XENLOG_ERR "%s: Could not allocate memory for module\n",
                elf->name);
         return -ENOMEM;
     }
-    memset(buf, 0, core_size);
+    memset(buf, 0, size);
 
     for ( i = 0; i < elf->hdr->e_shnum; i++ )
     {
@@ -604,7 +609,7 @@ static int move_payload(struct payload *payload, struct xsplice_elf *elf)
     }
 
     payload->payload_address = buf;
-    payload->payload_pages = PFN_UP(core_size);
+    payload->payload_pages = PFN_UP(size);
 
     return 0;
 }
@@ -647,6 +652,22 @@ static int find_special_sections(struct payload *payload,
             if ( f->pad[j] )
                 return -EINVAL;
     }
+    for ( i = 0; i < BUGFRAME_NR; i++ )
+    {
+        char str[14];
+
+        snprintf(str, sizeof str, ".bug_frames.%d", i);
+        sec = xsplice_elf_sec_by_name(elf, str);
+        if ( !sec )
+            continue;
+
+        if ( ( !sec->sec->sh_size ) ||
+             ( sec->sec->sh_size % sizeof (struct bug_frame) ) )
+            return -EINVAL;
+
+        payload->start_bug_frames[i] = (struct bug_frame *)sec->load_addr;
+        payload->stop_bug_frames[i] = (struct bug_frame *)(sec->load_addr + sec->sec->sh_size);
+    }
     return 0;
 }
 
@@ -942,6 +963,72 @@ void do_xsplice(void)
     }
 }
 
+
+/*
+ * Functions for handling special sections.
+ */
+struct bug_frame *xsplice_find_bug(const char *eip, int *id)
+{
+    struct payload *data;
+    struct bug_frame *bug;
+    int i;
+
+    /* No locking since this list is only ever changed during apply or revert
+     * context. */
+    list_for_each_entry ( data, &applied_list, applied_list )
+    {
+        for (i = 0; i < 4; i++) {
+            if (!data->start_bug_frames[i])
+                continue;
+            if ( !((void *)eip >= data->payload_address &&
+                   (void *)eip < (data->payload_address + data->core_text_size)))
+                continue;
+
+            for ( bug = data->start_bug_frames[i]; bug != data->stop_bug_frames[i]; ++bug ) {
+                if ( bug_loc(bug) == eip )
+                {
+                    *id = i;
+                    return bug;
+                }
+            }
+        }
+    }
+
+    return NULL;
+}
+
+bool_t is_module(const void *ptr)
+{
+    struct payload *data;
+
+    /* No locking since this list is only ever changed during apply or revert
+     * context. */
+    list_for_each_entry ( data, &applied_list, applied_list )
+    {
+        if ( ptr >= data->payload_address &&
+             ptr < (data->payload_address + data->core_size))
+            return 1;
+    }
+
+    return 0;
+}
+
+bool_t is_active_module_text(unsigned long addr)
+{
+    struct payload *data;
+
+    /* No locking since this list is only ever changed during apply or revert
+     * context. */
+    list_for_each_entry ( data, &applied_list, applied_list )
+    {
+        if ( (void *)addr >= data->payload_address &&
+             (void *)addr < (data->payload_address + data->core_text_size))
+            return 1;
+    }
+
+    return 0;
+}
+
 static int __init xsplice_init(void)
 {
     register_keyhandler('x', xsplice_printall, "print xsplicing info", 1);
diff --git a/xen/include/asm-arm/bug.h b/xen/include/asm-arm/bug.h
index ab9e811..4df6b2a 100644
--- a/xen/include/asm-arm/bug.h
+++ b/xen/include/asm-arm/bug.h
@@ -31,6 +31,7 @@ struct bug_frame {
 #define BUGFRAME_warn   0
 #define BUGFRAME_bug    1
 #define BUGFRAME_assert 2
+#define BUGFRAME_NR     3
 
 /* Many versions of GCC doesn't support the asm %c parameter which would
  * be preferable to this unpleasantness. We use mergeable string
@@ -39,6 +40,7 @@ struct bug_frame {
  */
 #define BUG_FRAME(type, line, file, has_msg, msg) do {                      \
     BUILD_BUG_ON((line) >> 16);                                             \
+    BUILD_BUG_ON(type >= BUGFRAME_NR);                                      \
     asm ("1:"BUG_INSTR"\n"                                                  \
          ".pushsection .rodata.str, \"aMS\", %progbits, 1\n"                \
          "2:\t.asciz " __stringify(file) "\n"                               \
diff --git a/xen/include/asm-x86/bug.h b/xen/include/asm-x86/bug.h
index e868e85..5443191 100644
--- a/xen/include/asm-x86/bug.h
+++ b/xen/include/asm-x86/bug.h
@@ -9,6 +9,7 @@
 #define BUGFRAME_warn   1
 #define BUGFRAME_bug    2
 #define BUGFRAME_assert 3
+#define BUGFRAME_NR     4
 
 #ifndef __ASSEMBLY__
 
@@ -51,6 +52,7 @@ struct bug_frame {
 
 #define BUG_FRAME(type, line, ptr, second_frame, msg) do {                   \
     BUILD_BUG_ON((line) >> (BUG_LINE_LO_WIDTH + BUG_LINE_HI_WIDTH));         \
+    BUILD_BUG_ON((type) >= (BUGFRAME_NR));                                   \
     asm volatile ( _ASM_BUGFRAME_TEXT(second_frame)                          \
                    :: _ASM_BUGFRAME_INFO(type, line, ptr, msg) );            \
 } while (0)
diff --git a/xen/include/xen/kernel.h b/xen/include/xen/kernel.h
index 548b64d..df57754 100644
--- a/xen/include/xen/kernel.h
+++ b/xen/include/xen/kernel.h
@@ -99,6 +99,7 @@ extern enum system_state {
 } system_state;
 
 bool_t is_active_kernel_text(unsigned long addr);
+bool_t is_active_text(unsigned long addr);
 
 #endif /* _LINUX_KERNEL_H */
 
diff --git a/xen/include/xen/xsplice.h b/xen/include/xen/xsplice.h
index 61a9ec6..41b738a 100644
--- a/xen/include/xen/xsplice.h
+++ b/xen/include/xen/xsplice.h
@@ -26,8 +26,23 @@ int xsplice_control(struct xen_sysctl_xsplice_op *);
 
 #ifdef CONFIG_XSPLICE
 void do_xsplice(void);
+struct bug_frame *xsplice_find_bug(const char *eip, int *id);
+bool_t is_module(const void *addr);
+bool_t is_active_module_text(unsigned long addr);
 #else
 static inline void do_xsplice(void) { };
+static inline struct bug_frame *xsplice_find_bug(const char *eip, int *id)
+{
+	return NULL;
+}
+static inline bool_t is_module(const void *addr)
+{
+	return 0;
+}
+static inline bool_t is_active_module_text(unsigned long addr)
+{
+	return 0;
+}
 #endif
 
 /* Arch hooks */
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 12/13] xsplice: Add support for exception tables. (v2)
  2016-01-14 21:46 [PATCH v2] xSplice v1 implementation Konrad Rzeszutek Wilk
                   ` (10 preceding siblings ...)
  2016-01-14 21:47 ` [PATCH v2 11/13] xsplice: Add support for bug frames. (v2) Konrad Rzeszutek Wilk
@ 2016-01-14 21:47 ` Konrad Rzeszutek Wilk
  2016-01-14 21:47 ` [PATCH v2 13/13] xsplice: Add support for alternatives Konrad Rzeszutek Wilk
  2016-01-15 16:58 ` [PATCH v2] xSplice v1 implementation Konrad Rzeszutek Wilk
  13 siblings, 0 replies; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-14 21:47 UTC (permalink / raw)
  To: xen-devel, ross.lagerwall, mpohlack, andrew.cooper3,
	stefano.stabellini, jbeulich, ian.jackson, ian.campbell,
	wei.liu2, sasha.levin
  Cc: Konrad Rzeszutek Wilk

From: Ross Lagerwall <ross.lagerwall@citrix.com>

Add support for exception tables contained within xSplice payloads. If an
exception occurs search either the main exception table or a particular
active payload's exception table depending on the instruction pointer.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
v2:
 - s/module/payload/
 - sanity checks.
---
 xen/arch/x86/extable.c        | 36 +++++++++++++++++++++--------------
 xen/common/xsplice.c          | 44 +++++++++++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/uaccess.h |  5 +++++
 xen/include/xen/xsplice.h     |  5 +++++
 4 files changed, 76 insertions(+), 14 deletions(-)

diff --git a/xen/arch/x86/extable.c b/xen/arch/x86/extable.c
index 89b5bcb..2787a92 100644
--- a/xen/arch/x86/extable.c
+++ b/xen/arch/x86/extable.c
@@ -4,6 +4,7 @@
 #include <xen/perfc.h>
 #include <xen/sort.h>
 #include <xen/spinlock.h>
+#include <xen/xsplice.h>
 #include <asm/uaccess.h>
 
 #define EX_FIELD(ptr, field) ((unsigned long)&(ptr)->field + (ptr)->field)
@@ -18,7 +19,7 @@ static inline unsigned long ex_cont(const struct exception_table_entry *x)
 	return EX_FIELD(x, cont);
 }
 
-static int __init cmp_ex(const void *a, const void *b)
+static int cmp_ex(const void *a, const void *b)
 {
 	const struct exception_table_entry *l = a, *r = b;
 	unsigned long lip = ex_addr(l);
@@ -33,7 +34,7 @@ static int __init cmp_ex(const void *a, const void *b)
 }
 
 #ifndef swap_ex
-static void __init swap_ex(void *a, void *b, int size)
+static void swap_ex(void *a, void *b, int size)
 {
 	struct exception_table_entry *l = a, *r = b, tmp;
 	long delta = b - a;
@@ -46,19 +47,23 @@ static void __init swap_ex(void *a, void *b, int size)
 }
 #endif
 
-void __init sort_exception_tables(void)
+void sort_exception_table(struct exception_table_entry *start,
+                          struct exception_table_entry *stop)
 {
-    sort(__start___ex_table, __stop___ex_table - __start___ex_table,
-         sizeof(struct exception_table_entry), cmp_ex, swap_ex);
-    sort(__start___pre_ex_table,
-         __stop___pre_ex_table - __start___pre_ex_table,
+    sort(start, stop - start,
          sizeof(struct exception_table_entry), cmp_ex, swap_ex);
 }
 
-static inline unsigned long
-search_one_table(const struct exception_table_entry *first,
-                 const struct exception_table_entry *last,
-                 unsigned long value)
+void __init sort_exception_tables(void)
+{
+    sort_exception_table(__start___ex_table, __stop___ex_table);
+    sort_exception_table(__start___pre_ex_table, __stop___pre_ex_table);
+}
+
+unsigned long
+search_one_extable(const struct exception_table_entry *first,
+                   const struct exception_table_entry *last,
+                   unsigned long value)
 {
     const struct exception_table_entry *mid;
     long diff;
@@ -80,15 +85,18 @@ search_one_table(const struct exception_table_entry *first,
 unsigned long
 search_exception_table(unsigned long addr)
 {
-    return search_one_table(
-        __start___ex_table, __stop___ex_table-1, addr);
+    if ( likely(is_kernel(addr)) )
+        return search_one_extable(
+            __start___ex_table, __stop___ex_table-1, addr);
+    else
+        return search_module_extables(addr);
 }
 
 unsigned long
 search_pre_exception_table(struct cpu_user_regs *regs)
 {
     unsigned long addr = (unsigned long)regs->eip;
-    unsigned long fixup = search_one_table(
+    unsigned long fixup = search_one_extable(
         __start___pre_ex_table, __stop___pre_ex_table-1, addr);
     if ( fixup )
     {
diff --git a/xen/common/xsplice.c b/xen/common/xsplice.c
index 02cb4a8..53a67a9 100644
--- a/xen/common/xsplice.c
+++ b/xen/common/xsplice.c
@@ -47,6 +47,10 @@ struct payload {
     size_t core_text_size;               /* Everything else - .data,.rodata, etc. */
     struct bug_frame *start_bug_frames[BUGFRAME_NR]; /* .bug.frame patching. */
     struct bug_frame *stop_bug_frames[BUGFRAME_NR];
+#ifdef CONFIG_X86
+    struct exception_table_entry *start_ex_table;
+    struct exception_table_entry *stop_ex_table;
+#endif
     char name[XEN_XSPLICE_NAME_SIZE + 1];/* Name of it. */
 };
 
@@ -668,6 +672,20 @@ static int find_special_sections(struct payload *payload,
         payload->start_bug_frames[i] = (struct bug_frame *)sec->load_addr;
         payload->stop_bug_frames[i] = (struct bug_frame *)(sec->load_addr + sec->sec->sh_size);
     }
+#ifdef CONFIG_X86
+    sec = xsplice_elf_sec_by_name(elf, ".ex_table");
+    if ( sec )
+    {
+        if ( ( !sec->sec->sh_size ) ||
+             ( sec->sec->sh_size % sizeof *sec->load_addr ) )
+            return -EINVAL;
+
+        payload->start_ex_table = (struct exception_table_entry *)sec->load_addr;
+        payload->stop_ex_table = (struct exception_table_entry *)(sec->load_addr + sec->sec->sh_size);
+
+        sort_exception_table(payload->start_ex_table, payload->stop_ex_table);
+    }
+#endif
     return 0;
 }
 
@@ -1029,6 +1047,32 @@ bool_t is_active_module_text(unsigned long addr)
     return 0;
 }
 
+#ifdef CONFIG_X86
+unsigned long search_module_extables(unsigned long addr)
+{
+    struct payload *data;
+    unsigned long ret;
+
+    /* No locking since this list is only ever changed during apply or revert
+     * context. */
+    list_for_each_entry ( data, &applied_list, applied_list )
+    {
+        if ( !data->start_ex_table )
+            continue;
+        if ( !((void *)addr >= data->payload_address &&
+               (void *)addr < (data->payload_address + data->core_text_size)))
+            continue;
+
+        ret = search_one_extable(data->start_ex_table, data->stop_ex_table - 1,
+                                 addr);
+        if ( ret )
+            return ret;
+    }
+
+    return 0;
+}
+#endif
+
 static int __init xsplice_init(void)
 {
     register_keyhandler('x', xsplice_printall, "print xsplicing info", 1);
diff --git a/xen/include/asm-x86/uaccess.h b/xen/include/asm-x86/uaccess.h
index 947470d..9e67bf0 100644
--- a/xen/include/asm-x86/uaccess.h
+++ b/xen/include/asm-x86/uaccess.h
@@ -276,6 +276,11 @@ extern struct exception_table_entry __start___pre_ex_table[];
 extern struct exception_table_entry __stop___pre_ex_table[];
 
 extern unsigned long search_exception_table(unsigned long);
+extern unsigned long search_one_extable(const struct exception_table_entry *first,
+                                        const struct exception_table_entry *last,
+                                        unsigned long value);
 extern void sort_exception_tables(void);
+extern void sort_exception_table(struct exception_table_entry *start,
+                                 struct exception_table_entry *stop);
 
 #endif /* __X86_UACCESS_H__ */
diff --git a/xen/include/xen/xsplice.h b/xen/include/xen/xsplice.h
index 41b738a..ec929fd 100644
--- a/xen/include/xen/xsplice.h
+++ b/xen/include/xen/xsplice.h
@@ -29,6 +29,7 @@ void do_xsplice(void);
 struct bug_frame *xsplice_find_bug(const char *eip, int *id);
 bool_t is_module(const void *addr);
 bool_t is_active_module_text(unsigned long addr);
+unsigned long search_module_extables(unsigned long addr);
 #else
 static inline void do_xsplice(void) { };
 static inline struct bug_frame *xsplice_find_bug(const char *eip, int *id)
@@ -43,6 +44,10 @@ static inline bool_t is_active_module_text(unsigned long addr)
 {
 	return 0;
 }
+static inline unsigned long search_module_extables(unsigned long addr)
+{
+	return 0;
+}
 #endif
 
 /* Arch hooks */
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v2 13/13] xsplice: Add support for alternatives
  2016-01-14 21:46 [PATCH v2] xSplice v1 implementation Konrad Rzeszutek Wilk
                   ` (11 preceding siblings ...)
  2016-01-14 21:47 ` [PATCH v2 12/13] xsplice: Add support for exception tables. (v2) Konrad Rzeszutek Wilk
@ 2016-01-14 21:47 ` Konrad Rzeszutek Wilk
  2016-01-15 16:58 ` [PATCH v2] xSplice v1 implementation Konrad Rzeszutek Wilk
  13 siblings, 0 replies; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-14 21:47 UTC (permalink / raw)
  To: xen-devel, ross.lagerwall, mpohlack, andrew.cooper3,
	stefano.stabellini, jbeulich, ian.jackson, ian.campbell,
	wei.liu2, sasha.levin
  Cc: Konrad Rzeszutek Wilk

From: Ross Lagerwall <ross.lagerwall@citrix.com>

Add support for applying alternative sections within xsplice modules. At
module load time, apply an alternative sections that are found.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 xen/arch/x86/Makefile             |  2 +-
 xen/arch/x86/alternative.c        | 12 ++++++------
 xen/common/xsplice.c              | 10 +++++++++-
 xen/include/asm-x86/alternative.h |  1 +
 4 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index f7d3e39..60249ab 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -6,7 +6,7 @@ subdir-y += mm
 subdir-y += oprofile
 subdir-y += x86_64
 
-obj-bin-y += alternative.init.o
+obj-bin-y += alternative.o
 obj-y += apic.o
 obj-y += bitops.o
 obj-bin-y += bzimage.init.o
diff --git a/xen/arch/x86/alternative.c b/xen/arch/x86/alternative.c
index 46ac0fd..8d895ad 100644
--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -28,7 +28,7 @@
 extern struct alt_instr __alt_instructions[], __alt_instructions_end[];
 
 #ifdef K8_NOP1
-static const unsigned char k8nops[] __initconst = {
+static const unsigned char k8nops[] = {
     K8_NOP1,
     K8_NOP2,
     K8_NOP3,
@@ -52,7 +52,7 @@ static const unsigned char * const k8_nops[ASM_NOP_MAX+1] = {
 #endif
 
 #ifdef P6_NOP1
-static const unsigned char p6nops[] __initconst = {
+static const unsigned char p6nops[] = {
     P6_NOP1,
     P6_NOP2,
     P6_NOP3,
@@ -75,7 +75,7 @@ static const unsigned char * const p6_nops[ASM_NOP_MAX+1] = {
 };
 #endif
 
-static const unsigned char * const *ideal_nops __initdata = k8_nops;
+static const unsigned char * const *ideal_nops = k8_nops;
 
 static int __init mask_nmi_callback(const struct cpu_user_regs *regs, int cpu)
 {
@@ -100,7 +100,7 @@ static void __init arch_init_ideal_nops(void)
 }
 
 /* Use this to add nops to a buffer, then text_poke the whole buffer. */
-static void __init add_nops(void *insns, unsigned int len)
+static void add_nops(void *insns, unsigned int len)
 {
     while ( len > 0 )
     {
@@ -127,7 +127,7 @@ static void __init add_nops(void *insns, unsigned int len)
  *
  * This routine is called with local interrupt disabled.
  */
-static void *__init text_poke_early(void *addr, const void *opcode, size_t len)
+static void *text_poke_early(void *addr, const void *opcode, size_t len)
 {
     memcpy(addr, opcode, len);
     sync_core();
@@ -142,7 +142,7 @@ static void *__init text_poke_early(void *addr, const void *opcode, size_t len)
  * APs have less capabilities than the boot processor are not handled.
  * Tough. Make sure you disable such features by hand.
  */
-static void __init apply_alternatives(struct alt_instr *start, struct alt_instr *end)
+void apply_alternatives(struct alt_instr *start, struct alt_instr *end)
 {
     struct alt_instr *a;
     u8 *instr, *replacement;
diff --git a/xen/common/xsplice.c b/xen/common/xsplice.c
index 53a67a9..7866162 100644
--- a/xen/common/xsplice.c
+++ b/xen/common/xsplice.c
@@ -677,7 +677,7 @@ static int find_special_sections(struct payload *payload,
     if ( sec )
     {
         if ( ( !sec->sec->sh_size ) ||
-             ( sec->sec->sh_size % sizeof *sec->load_addr ) )
+             ( sec->sec->sh_size % sizeof (struct exception_table_entry) ) )
             return -EINVAL;
 
         payload->start_ex_table = (struct exception_table_entry *)sec->load_addr;
@@ -685,6 +685,14 @@ static int find_special_sections(struct payload *payload,
 
         sort_exception_table(payload->start_ex_table, payload->stop_ex_table);
     }
+    sec = xsplice_elf_sec_by_name(elf, ".altinstructions");
+    if ( sec )
+    {
+        local_irq_disable();
+        apply_alternatives((struct alt_instr *)sec->load_addr,
+                           (struct alt_instr *)(sec->load_addr + sec->sec->sh_size));
+        local_irq_enable();
+    }
 #endif
     return 0;
 }
diff --git a/xen/include/asm-x86/alternative.h b/xen/include/asm-x86/alternative.h
index 7d11354..95c2f7e 100644
--- a/xen/include/asm-x86/alternative.h
+++ b/xen/include/asm-x86/alternative.h
@@ -23,6 +23,7 @@ struct alt_instr {
     u8  replacementlen;     /* length of new instruction, <= instrlen */
 };
 
+extern void apply_alternatives(struct alt_instr *start, struct alt_instr *end);
 extern void alternative_instructions(void);
 
 #define OLDINSTR(oldinstr)      "661:\n\t" oldinstr "\n662:\n"
-- 
2.1.0

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH v2] xSplice v1 implementation.
  2016-01-14 21:46 [PATCH v2] xSplice v1 implementation Konrad Rzeszutek Wilk
                   ` (12 preceding siblings ...)
  2016-01-14 21:47 ` [PATCH v2 13/13] xsplice: Add support for alternatives Konrad Rzeszutek Wilk
@ 2016-01-15 16:58 ` Konrad Rzeszutek Wilk
  2016-01-25 11:57   ` Ross Lagerwall
  13 siblings, 1 reply; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-15 16:58 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: wei.liu2, Ian Campbell, andrew.cooper3, xen.org, Martin Pohlack,
	ross.lagerwall, stefano.stabellini, Jan Beulich, sasha.levin,
	xen-devel

[-- Attachment #1: Type: text/plain, Size: 313 bytes --]

> Or you can use git://github.com/rosslagerwall/xsplice-build.git tool
> (it will need an extra patch, will send that shortly) - which
> generates the ELF payloads.
>
> This link has a nice description of how to use the tool:
> http://lists.xenproject.org/archives/html/xen-devel/2015-10/msg02595.html

Attached.

[-- Attachment #2: 0001-Add-64-bytes-of-padding-to-xsplice_patch_funcs-struc.patch --]
[-- Type: application/octet-stream, Size: 2736 bytes --]

From 8de23b4ac778e37a693c247c861f85fb3ee5f56e Mon Sep 17 00:00:00 2001
From: Ross Lagerwall <ross.lagerwall@citrix.com>
Date: Tue, 3 Nov 2015 14:44:49 +0000
Subject: [PATCH] Add 64 bytes of padding to xsplice_patch_funcs structure and
 shrink the size

This gives the hypervisor scratch space for undo buffers and possibly
other uses.

Also we change the size to be uint32_t instead of unsigned long.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 common.h             | 8 ++++----
 create-diff-object.c | 5 +++--
 lookup.h             | 2 +-
 prelink.c            | 2 +-
 4 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/common.h b/common.h
index d78275c..c16eb38 100644
--- a/common.h
+++ b/common.h
@@ -117,12 +117,12 @@ struct xsplice_elf {
 #define PATCH_INSN_SIZE 5
 
 struct xsplice_patch_func {
+	char *name;
 	unsigned long new_addr;
-	unsigned long new_size;
 	unsigned long old_addr;
-	unsigned long old_size;
-	char *name;
-	unsigned char undo[8];
+	uint32_t new_size;
+	uint32_t old_size;
+	unsigned char pad[32];
 };
 
 struct special_section {
diff --git a/create-diff-object.c b/create-diff-object.c
index 15c4115..be7e90a 100644
--- a/create-diff-object.c
+++ b/create-diff-object.c
@@ -1497,7 +1497,7 @@ static void xsplice_create_patches_sections(struct xsplice_elf *kelf,
 					ERROR("lookup_global_symbol %s",
 					      sym->name);
 			}
-			log_debug("lookup for %s @ 0x%016lx len %lu\n",
+			log_debug("lookup for %s @ 0x%016lx len %u\n",
 			          sym->name, result.value, result.size);
 
 			if (result.size < PATCH_INSN_SIZE)
@@ -1508,7 +1508,8 @@ static void xsplice_create_patches_sections(struct xsplice_elf *kelf,
 			funcs[index].old_size = result.size;
 			funcs[index].new_addr = 0;
 			funcs[index].new_size = sym->sym.st_size;
-			memset(funcs[index].undo, 0, sizeof funcs[index].undo);
+			funcs[index].name = NULL;
+			memset(funcs[index].pad, 0, sizeof funcs[index].pad);
 
 			/*
 			 * Add a relocation that will populate
diff --git a/lookup.h b/lookup.h
index cbb3dae..73fff52 100644
--- a/lookup.h
+++ b/lookup.h
@@ -5,7 +5,7 @@ struct lookup_table;
 
 struct lookup_result {
 	unsigned long value;
-	unsigned long size;
+	uint16_t size;
 };
 
 struct lookup_table *lookup_open(char *path);
diff --git a/prelink.c b/prelink.c
index f922871..9e6234c 100644
--- a/prelink.c
+++ b/prelink.c
@@ -73,7 +73,7 @@ void xsplice_resolve_symbols(struct xsplice_elf *kelf,
 				ERROR("lookup_global_symbol %s",
 				      sym->name);
 		}
-		log_debug("lookup for %s @ 0x%016lx len %lu\n",
+		log_debug("lookup for %s @ 0x%016lx len %u\n",
 			  sym->name, result.value, result.size);
 		sym->sym.st_value = result.value;
 		sym->sym.st_shndx = SHN_ABS;
-- 
2.4.3


[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 01/13] xsplice: Design document (v5).
  2016-01-14 21:46 ` [PATCH v2 01/13] xsplice: Design document (v5) Konrad Rzeszutek Wilk
@ 2016-01-19 11:14   ` Wei Liu
  2016-01-19 14:31   ` Ross Lagerwall
  2016-02-05 15:25   ` Jan Beulich
  2 siblings, 0 replies; 45+ messages in thread
From: Wei Liu @ 2016-01-19 11:14 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: wei.liu2, ian.campbell, andrew.cooper3, ian.jackson, mpohlack,
	ross.lagerwall, stefano.stabellini, jbeulich, sasha.levin,
	xen-devel

I skimmed this document and managed to do some non-technical nitpicks.
:-)

On Thu, Jan 14, 2016 at 04:46:59PM -0500, Konrad Rzeszutek Wilk wrote:
[...]
> +## Patching code
> +
> +The first mechanism to patch that comes in mind is in-place replacement.
> +That is replace the affected code with new code. Unfortunately the x86

"replacing" or "to replace"

> +ISA is variable size which places limits on how much space we have available
> +to replace the instructions. That is not a problem if the change is smaller
> +than the original opcode and we can fill it with nops. Problems will
> +appear if the replacement code is longer.
> +
> +The second mechanism is by replacing the call or jump to the
> +old function with the address of the new function.
> +
> +A third mechanism is to add a jump to the new function at the
> +start of the old function. N.B. The Xen hypervisor implements the third
> +mechanism.
> +
> +### Example of trampoline and in-place splicing
> +
> +As example we will assume the hypervisor does not have XSA-132 (see
> +*domctl/sysctl: don't leak hypervisor stack to toolstacks*
> +4ff3449f0e9d175ceb9551d3f2aecb59273f639d) and we would like to binary patch
> +the hypervisor with it. The original code looks as so:
> +
> +<pre>
> +   48 89 e0                  mov    %rsp,%rax  
> +   48 25 00 80 ff ff         and    $0xffffffffffff8000,%rax  
> +</pre>
> +
> +while the new patched hypervisor would be:
> +
> +<pre>
> +   48 c7 45 b8 00 00 00 00   movq   $0x0,-0x48(%rbp)  
> +   48 c7 45 c0 00 00 00 00   movq   $0x0,-0x40(%rbp)  
> +   48 c7 45 c8 00 00 00 00   movq   $0x0,-0x38(%rbp)  
> +   48 89 e0                  mov    %rsp,%rax  
> +   48 25 00 80 ff ff         and    $0xffffffffffff8000,%rax  
> +</pre>
> +
> +This is inside the arch_do_domctl. This new change adds 21 extra
> +bytes of code which alters all the offsets inside the function. To alter
> +these offsets and add the extra 21 bytes of code we might not have enough
> +space in .text to squeeze this in.
> +
> +As such we could simplify this problem by only patching the site
> +which calls arch_do_domctl:
> +
> +<pre>
> +<do_domctl>:  
> + e8 4b b1 05 00          callq  ffff82d08015fbb9 <arch_do_domctl>  
> +</pre>
> +
> +with a new address for where the new `arch_do_domctl` would be (this
> +area would be allocated dynamically).
> +
> +Astute readers will wonder what we need to do if we were to patch `do_domctl`
> +- which is not called directly by hypervisor but on behalf of the guests via
> +the `compat_hypercall_table` and `hypercall_table`.
> +Patching the offset in `hypercall_table` for `do_domctl:
> +(ffff82d080103079 <do_domctl>:)

Blank line here please.

> +<pre>
> +
> + ffff82d08024d490:   79 30  
> + ffff82d08024d492:   10 80 d0 82 ff ff   
> +
> +</pre>

Blank line.

> +with the new address where the new `do_domctl` is possible. The other
> +place where it is used is in `hvm_hypercall64_table` which would need
> +to be patched in a similar way. This would require an in-place splicing
> +of the new virtual address of `arch_do_domctl`.
> +
> +In summary this example patched the callee of the affected function by
> + * allocating memory for the new code to live in,
> + * changing the virtual address in all the functions which called the old
> +   code (computing the new offset, patching the callq with a new callq).
> + * changing the function pointer tables with the new virtual address of
> +   the function (splicing in the new virtual address). Since this table
> +   resides in the .rodata section we would need to temporarily change the
> +   page table permissions during this part.
> +
> +
> +However it has severe drawbacks - the safety checks which have to make sure
> +the function is not on the stack - must also check every caller. For some
> +patches this could mean - if there were an sufficient large amount of
> +callers - that we would never be able to apply the update.
> +
> +### Example of different trampoline patching.
> +
> +An alternative mechanism exists where we can insert a trampoline in the
> +existing function to be patched to jump directly to the new code. This
> +lessens the locations to be patched to one but it puts pressure on the
> +CPU branching logic (I-cache, but it is just one unconditional jump).
> +
> +For this example we will assume that the hypervisor has not been compiled
> +with fe2e079f642effb3d24a6e1a7096ef26e691d93e (XSA-125: *pre-fill structures
> +for certain HYPERVISOR_xen_version sub-ops*) which mem-sets an structure
> +in `xen_version` hypercall. This function is not called **anywhere** in
> +the hypervisor (it is called by the guest) but referenced in the
> +`compat_hypercall_table` and `hypercall_table` (and indirectly called
> +from that). Patching the offset in `hypercall_table` for the old
> +`do_xen_version` (ffff82d080112f9e <do_xen_version>)
> +
> +</pre>
> + ffff82d08024b270 <hypercall_table>  
> + ...  
> + ffff82d08024b2f8:   9e 2f 11 80 d0 82 ff ff  
> +
> +</pre>

Blank line.

> +with the new address where the new `do_xen_version` is possible. The other
> +place where it is used is in `hvm_hypercall64_table` which would need
> +to be patched in a similar way. This would require an in-place splicing
> +of the new virtual address of `do_xen_version`.
> +
[...]
> +#### Before entering the guest code.
> +
> +Before we call VMXResume we check whether any soft IRQs need to be executed.
> +This is a good spot because all Xen stacks are effectively empty at
> +that point.
> +
> +To randezvous all the CPUs an barrier with an maximum timeout (which

"rendezvous"

> +could be adjusted), combined with forcing all other CPUs through the
> +hypervisor with IPIs, can be utilized to have all the CPUs be lockstep.
> +

I couldn't parse the last part of this sentence. But I'm not a native
speaker.

Wei.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 04/13] libxc: Implementation of XEN_XSPLICE_op in libxc (v4).
  2016-01-14 21:47 ` [PATCH v2 04/13] libxc: Implementation of XEN_XSPLICE_op in libxc (v4) Konrad Rzeszutek Wilk
@ 2016-01-19 11:14   ` Wei Liu
  0 siblings, 0 replies; 45+ messages in thread
From: Wei Liu @ 2016-01-19 11:14 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: wei.liu2, ian.campbell, andrew.cooper3, ian.jackson, mpohlack,
	ross.lagerwall, stefano.stabellini, jbeulich, sasha.levin,
	xen-devel

On Thu, Jan 14, 2016 at 04:47:02PM -0500, Konrad Rzeszutek Wilk wrote:
[...]
> +int xc_xsplice_upload(xc_interface *xch,
> +                      char *name,
> +                      char *payload,
> +                      uint32_t size)
> +{
> +    int rc;
> +    DECLARE_SYSCTL;
> +    DECLARE_HYPERCALL_BOUNCE(payload, size, XC_HYPERCALL_BUFFER_BOUNCE_IN);
> +    DECLARE_HYPERCALL_BOUNCE(name, 0 /* adjust later */, XC_HYPERCALL_BUFFER_BOUNCE_IN);
> +    xen_xsplice_name_t def_name = { .pad = { 0, 0, 0 } };
> +
> +    if ( !name || !payload )
> +        return -1;
> +
> +    def_name.size = strlen(name);
> +    if ( def_name.size > XEN_XSPLICE_NAME_SIZE )
> +        return -1;
> +
> +    HYPERCALL_BOUNCE_SET_SIZE(name, def_name.size );
> +
> +    if ( xc_hypercall_bounce_pre(xch, name) )
> +        return -1;
> +
> +    if ( xc_hypercall_bounce_pre(xch, payload) )
> +        return -1;
> +

xc_hypercall_bounce_pre can allocate memory so please clean up after
failure instead of returning -1 directly.

> +    sysctl.cmd = XEN_SYSCTL_xsplice_op;
> +    sysctl.u.xsplice.cmd = XEN_SYSCTL_XSPLICE_UPLOAD;
> +    sysctl.u.xsplice.pad = 0;
> +    sysctl.u.xsplice.u.upload.size = size;
> +    set_xen_guest_handle(sysctl.u.xsplice.u.upload.payload, payload);
> +
> +    sysctl.u.xsplice.u.upload.name = def_name;
> +    set_xen_guest_handle(sysctl.u.xsplice.u.upload.name.name, name);
> +
> +    rc = do_sysctl(xch, &sysctl);
> +
> +    xc_hypercall_bounce_post(xch, payload);
> +    xc_hypercall_bounce_post(xch, name);
> +
> +    return rc;
> +}
> +
> +int xc_xsplice_get(xc_interface *xch,
> +                   char *name,
> +                   xen_xsplice_status_t *status)
> +{
> +    int rc;
> +    DECLARE_SYSCTL;
> +    DECLARE_HYPERCALL_BOUNCE(name, 0 /*adjust later */, XC_HYPERCALL_BUFFER_BOUNCE_IN);
> +    xen_xsplice_name_t def_name = { .pad = { 0, 0, 0 } };
> +
> +    if ( !name )
> +        return -1;
> +
> +    def_name.size = strlen(name);
> +    if ( def_name.size > XEN_XSPLICE_NAME_SIZE )
> +        return -1;
> +
> +    HYPERCALL_BOUNCE_SET_SIZE(name, def_name.size );
> +
> +    if ( xc_hypercall_bounce_pre(xch, name) )
> +        return -1;
> +
> +    sysctl.cmd = XEN_SYSCTL_xsplice_op;
> +    sysctl.u.xsplice.cmd = XEN_SYSCTL_XSPLICE_GET;
> +    sysctl.u.xsplice.pad = 0;
> +
> +    sysctl.u.xsplice.u.get.status.state = 0;
> +    sysctl.u.xsplice.u.get.status.rc = 0;
> +
> +    sysctl.u.xsplice.u.get.name = def_name;
> +    set_xen_guest_handle(sysctl.u.xsplice.u.get.name.name, name);
> +
> +    rc = do_sysctl(xch, &sysctl);
> +
> +    xc_hypercall_bounce_post(xch, name);
> +
> +    memcpy(status, &sysctl.u.xsplice.u.get.status, sizeof(*status));
> +
> +    return rc;
> +}
> +
> +int xc_xsplice_list(xc_interface *xch, unsigned int max, unsigned int start,
> +                    xen_xsplice_status_t *info,
> +                    char *name, uint32_t *len,
> +                    unsigned int *done,
> +                    unsigned int *left)


Can you please add some comment before this function to document what
each of the parameters means? I have to admit I fail to grok the
algorithm of this function.

> +{
> +    int rc;
> +    DECLARE_SYSCTL;
> +    DECLARE_HYPERCALL_BOUNCE(info, 0 /* adjust later. */, XC_HYPERCALL_BUFFER_BOUNCE_OUT);
> +    DECLARE_HYPERCALL_BOUNCE(name, 0 /* adjust later. */, XC_HYPERCALL_BUFFER_BOUNCE_OUT);
> +    DECLARE_HYPERCALL_BOUNCE(len, 0 /* adjust later. */, XC_HYPERCALL_BUFFER_BOUNCE_OUT);

Lines too long.

> +    uint32_t max_batch_sz, nr;
> +    uint32_t version = 0, retries = 0;
> +    uint32_t adjust = 0;
> +
> +    if ( !max || !info || !name || !len )
> +        return -1;
> +
> +    sysctl.cmd = XEN_SYSCTL_xsplice_op;
> +    sysctl.u.xsplice.cmd = XEN_SYSCTL_XSPLICE_LIST;
> +    sysctl.u.xsplice.pad = 0;
> +    sysctl.u.xsplice.u.list.version = 0;
> +    sysctl.u.xsplice.u.list.idx = start;
> +    sysctl.u.xsplice.u.list.pad = 0;
> +
> +    max_batch_sz = max;
> +
> +    *done = 0;
> +    *left = 0;
> +    do {
> +        if ( adjust )
> +            adjust = 0; /* Used when adjusting the 'max_batch_sz' or 'retries'. */
> +
> +        nr = min(max - *done, max_batch_sz);
> +
> +        sysctl.u.xsplice.u.list.nr = nr;
> +        /* Fix the size (may vary between hypercalls). */
> +        HYPERCALL_BOUNCE_SET_SIZE(info, nr * sizeof(*info));
> +        HYPERCALL_BOUNCE_SET_SIZE(name, nr * sizeof(*name) * XEN_XSPLICE_NAME_SIZE);

Line too long.

> +        HYPERCALL_BOUNCE_SET_SIZE(len, nr * sizeof(*len));
> +        /* Move the pointer to proper offset into 'info'. */
> +        (HYPERCALL_BUFFER(info))->ubuf = info + *done;
> +        (HYPERCALL_BUFFER(name))->ubuf = name + (sizeof(*name) * XEN_XSPLICE_NAME_SIZE * *done);
> +        (HYPERCALL_BUFFER(len))->ubuf = len + *done;
> +        /* Allocate memory. */
> +        rc = xc_hypercall_bounce_pre(xch, info);
> +        if ( rc )
> +            return rc;
> +
> +        rc = xc_hypercall_bounce_pre(xch, name);
> +        if ( rc )
> +        {
> +            xc_hypercall_bounce_post(xch, info);
> +            return rc;
> +        }
> +        rc = xc_hypercall_bounce_pre(xch, len);
> +        if ( rc )
> +        {
> +            xc_hypercall_bounce_post(xch, info);
> +            xc_hypercall_bounce_post(xch, name);
> +            return rc;
> +        }

Can you just use "break" instead of three "return"s? That should
simplify code a lot.

> +        set_xen_guest_handle(sysctl.u.xsplice.u.list.status, info);
> +        set_xen_guest_handle(sysctl.u.xsplice.u.list.name, name);
> +        set_xen_guest_handle(sysctl.u.xsplice.u.list.len, len);
> +
> +        rc = do_sysctl(xch, &sysctl);
> +        /*
> +         * From here on we MUST call xc_hypercall_bounce. If rc < 0 we
> +         * end up doing it (outside the loop), so using a break is OK.
> +         */
> +        if ( rc < 0 && errno == E2BIG )
> +        {
> +            if ( max_batch_sz <= 1 )
> +                break;
> +            max_batch_sz >>= 1;
> +            adjust = 1; /* For the loop conditional to let us loop again. */
> +            /* No memory leaks! */
> +            xc_hypercall_bounce_post(xch, info);
> +            xc_hypercall_bounce_post(xch, name);
> +            xc_hypercall_bounce_post(xch, len);
> +            continue;
> +        }
> +        else if ( rc < 0 ) /* For all other errors we bail out. */
> +            break;
> +
> +        if ( !version )
> +            version = sysctl.u.xsplice.u.list.version;
> +
> +        if ( sysctl.u.xsplice.u.list.version != version )
> +        {
> +            /* TODO: retries should be configurable? */
> +            if ( retries++ > 3 )
> +            {
> +                rc = -1;
> +                errno = EBUSY;
> +                break;
> +            }
> +            *done = 0; /* Retry from scratch. */
> +            version = sysctl.u.xsplice.u.list.version;
> +            adjust = 1; /* And make sure we continue in the loop. */
> +            /* No memory leaks. */
> +            xc_hypercall_bounce_post(xch, info);
> +            xc_hypercall_bounce_post(xch, name);
> +            xc_hypercall_bounce_post(xch, len);
> +            continue;
> +        }
> +
> +        /* We should never hit this, but just in case. */
> +        if ( rc > nr )
> +        {
> +            errno = EINVAL; /* Overflow! */
> +            rc = -1;
> +            break;
> +        }
> +        *left = sysctl.u.xsplice.u.list.nr; /* Total remaining count. */
> +        /* Copy only up 'rc' of data' - we could add 'min(rc,nr) if desired. */
> +        HYPERCALL_BOUNCE_SET_SIZE(info, (rc * sizeof(*info)));
> +        HYPERCALL_BOUNCE_SET_SIZE(name, (rc * sizeof(*name) * XEN_XSPLICE_NAME_SIZE));

Line too long.

> +        HYPERCALL_BOUNCE_SET_SIZE(len, (rc * sizeof(*len)));
> +        /* Bounce the data and free the bounce buffer. */
> +        xc_hypercall_bounce_post(xch, info);
> +        xc_hypercall_bounce_post(xch, name);
> +        xc_hypercall_bounce_post(xch, len);
> +        /* And update how many elements of info we have copied into. */
> +        *done += rc;
> +        /* Update idx. */
> +        sysctl.u.xsplice.u.list.idx = *done;
> +    } while ( adjust || (*done < max && *left != 0) );
> +
> +    if ( rc < 0 )
> +    {
> +        xc_hypercall_bounce_post(xch, len);
> +        xc_hypercall_bounce_post(xch, name);
> +        xc_hypercall_bounce_post(xch, info);
> +    }
> +
> +    return rc > 0 ? 0 : rc;
> +}
> +
> +static int _xc_xsplice_action(xc_interface *xch,
> +                              char *name,
> +                              unsigned int action,
> +                              uint32_t timeout)
> +{
> +    int rc;
> +    DECLARE_SYSCTL;
> +    DECLARE_HYPERCALL_BOUNCE(name, 0 /* adjust later */, XC_HYPERCALL_BUFFER_BOUNCE_IN);

Line too long.

Wei.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 05/13] xen-xsplice: Tool to manipulate xsplice payloads (v3)
  2016-01-14 21:47 ` [PATCH v2 05/13] xen-xsplice: Tool to manipulate xsplice payloads (v3) Konrad Rzeszutek Wilk
@ 2016-01-19 11:14   ` Wei Liu
  2016-01-19 14:30   ` Ross Lagerwall
  1 sibling, 0 replies; 45+ messages in thread
From: Wei Liu @ 2016-01-19 11:14 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: wei.liu2, ian.campbell, andrew.cooper3, ian.jackson, mpohlack,
	ross.lagerwall, stefano.stabellini, jbeulich, sasha.levin,
	xen-devel

On Thu, Jan 14, 2016 at 04:47:03PM -0500, Konrad Rzeszutek Wilk wrote:
[...]
> +/* This value was choosen adhoc. It could be 42 too. */
> +#define MAX_LEN 11
> +static int list_func(int argc, char *argv[])
> +{
> +    unsigned int idx, done, left, i;
> +    xen_xsplice_status_t *info = NULL;
> +    char *id = NULL;
> +    uint32_t *len = NULL;
> +    int rc = ENOMEM;
> +
> +    if ( argc )
> +    {
> +        show_help();
> +        return -1;
> +    }
> +    idx = left = 0;
> +    info = malloc(sizeof(*info) * MAX_LEN);
> +    if ( !info )
> +        goto out;
> +    id = malloc(sizeof(*id) * XEN_XSPLICE_NAME_SIZE * MAX_LEN);
> +    if ( !id )
> +        goto out;
> +    len = malloc(sizeof(*len) * MAX_LEN);
> +    if ( !len )
> +        goto out;
> +
> +    fprintf(stdout," ID                                     | status\n"
> +                   "----------------------------------------+------------\n");
> +    do {
> +        done = 0;
> +        memset(info, 'A', sizeof(*info) * MAX_LEN); /* Optional. */
> +        memset(id, 'i', sizeof(*id) * MAX_LEN * XEN_XSPLICE_NAME_SIZE); /* Optional. */

Line too long.

[...]
> +static int upload_func(int argc, char *argv[])
> +{
> +    char *filename;
> +    char id[XEN_XSPLICE_NAME_SIZE];
> +    int fd = 0, rc;
> +    struct stat buf;
> +    unsigned char *fbuf;
> +    ssize_t len;
> +    DECLARE_HYPERCALL_BUFFER(char, payload);
> +

I don't think you need to declare hypercall buffer here in the utility.
It should be libxc's responsibility to bounce the buffer accordingly.


Wei.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 10/13] xen_hello_world.xsplice: Test payload for patching 'xen_extra_version'.
  2016-01-14 21:47 ` [PATCH v2 10/13] xen_hello_world.xsplice: Test payload for patching 'xen_extra_version' Konrad Rzeszutek Wilk
@ 2016-01-19 11:14   ` Wei Liu
  2016-01-19 14:57   ` Ross Lagerwall
  2016-01-19 16:47   ` Ross Lagerwall
  2 siblings, 0 replies; 45+ messages in thread
From: Wei Liu @ 2016-01-19 11:14 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: wei.liu2, ian.campbell, andrew.cooper3, ian.jackson, mpohlack,
	ross.lagerwall, stefano.stabellini, jbeulich, sasha.levin,
	xen-devel

On Thu, Jan 14, 2016 at 04:47:08PM -0500, Konrad Rzeszutek Wilk wrote:
> This change demonstrates how to generate an xSplice ELF payload.
> 
> The idea here is that we want to patch in the hypervisor
> the 'xen_version_extra' function with an function that will
> return 'Hello World'. The 'xl info | grep extraversion'
> will reflect the new value after the patching.
> 
> To generate this ELF payload file we need:
>  - C code of the new code.
>  - C code generating the .xsplice.func structure.
>  - The address of the old code (xen_extra_version). We
>    do it by using 'nm' but that is a bit of hack.
> 
> The linker script file:
>  - Discards .debug* and .comments* sections.
>  - Changes the name of .data.local.xsplice_hello_world to
>    .xsplice.func
>  - Figures out the size of the new code.
> 
> Also if you are curious on the input/output sections
> magic the linker does, add these to the GCC line:
>   -Wl,-M  -Wl,-t -Wl,-verbose
> which are: print linking map, provide trace and be verbose.
> 
> The use-case is simple:
> 
> $xen-xsplice load /usr/lib/xen/bin/xen_hello_world.xsplice
> $xen-xsplice list
>  ID                                     | status
> ----------------------------------------+------------
> xen_hello_world                           APPLIED
> $xl info | grep extra
> xen_extra              : Hello World
> $xen-xsplice revert xen_hello_world
> Performing revert: completed
> $xen-xsplice unload xen_hello_world
> Performing unload: completed
> $xl info | grep extra
> xen_extra              : -unstable
> 
> Note that it does not build under a 32-bit toolstack as
> there is no access to the hypervisor (xen-syms).
> 
> We also force it to be built every time - as the hypervisor
> may have been rebuilt.
> 
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  docs/misc/xsplice.markdown   | 50 ++++++++++++++++++++++++++++++++++++++++++++
>  tools/misc/Makefile          | 25 +++++++++++++++++++++-
>  tools/misc/xen_hello_world.c | 15 +++++++++++++
>  tools/misc/xsplice.h         | 12 +++++++++++
>  tools/misc/xsplice.lds       | 11 ++++++++++

Please put the files of this test case into a dedicated directory.

Wei.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 05/13] xen-xsplice: Tool to manipulate xsplice payloads (v3)
  2016-01-14 21:47 ` [PATCH v2 05/13] xen-xsplice: Tool to manipulate xsplice payloads (v3) Konrad Rzeszutek Wilk
  2016-01-19 11:14   ` Wei Liu
@ 2016-01-19 14:30   ` Ross Lagerwall
  1 sibling, 0 replies; 45+ messages in thread
From: Ross Lagerwall @ 2016-01-19 14:30 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, xen-devel, mpohlack, andrew.cooper3,
	stefano.stabellini, jbeulich, ian.jackson, ian.campbell,
	wei.liu2, sasha.levin

On 01/14/2016 09:47 PM, Konrad Rzeszutek Wilk wrote:
> A simple tool that allows an system admin to perform
> basic xsplice operations:
>
>   - Upload a xsplice file (with an unique id)

s/id/name throughout the rest of this patch

-- 
Ross Lagerwall

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 03/13] xen/xsplice: Hypervisor implementation of XEN_XSPLICE_op (v7)
  2016-01-14 21:47 ` [PATCH v2 03/13] xen/xsplice: Hypervisor implementation of XEN_XSPLICE_op (v7) Konrad Rzeszutek Wilk
@ 2016-01-19 14:30   ` Ross Lagerwall
  2016-02-06 22:35   ` Doug Goldstein
  1 sibling, 0 replies; 45+ messages in thread
From: Ross Lagerwall @ 2016-01-19 14:30 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, xen-devel, mpohlack, andrew.cooper3,
	stefano.stabellini, jbeulich, ian.jackson, ian.campbell,
	wei.liu2, sasha.levin

On 01/14/2016 09:47 PM, Konrad Rzeszutek Wilk wrote:
> +struct xen_sysctl_xsplice_summary {
> +    xen_xsplice_name_t name;                /* IN, name of the payload. */
> +    xen_xsplice_status_t status;            /* IN/OUT, state of it. */
> +};
> +typedef struct xen_sysctl_xsplice_summary xen_sysctl_xsplice_summary_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_xsplice_summary_t);
> +
> +/*
> + * Retrieve an array of abbreviated status and names of payloads that are
> + * loaded in the hypervisor.
> + *
> + * If the hypercall returns an positive number, it is the number (up to `nr`)
> + * of the payloads returned, along with `nr` updated with the number of remaining
> + * payloads, `version` updated (it may be the same across hypercalls. If it
> + * varies the data is stale and further calls could fail). The `status`,
> + * `name`, and `len`' are updated at their designed index value (`idx`) with
> + * the returned value of data.
> + *
> + * If the hypercall returns E2BIG the `nr` is too big and should be
> + * lowered.
> + *
> + * This operation can be preempted by the hypercall returning EAGAIN.
> + * Retry.
> + *
> + * Note that due to the asynchronous nature of hypercalls the domain might have
> + * added or removed the number of payloads making this information stale. It is
> + * the responsibility of the toolstack to use the `version` field to check
> + * between each invocation. if the version differs it should discard the stale
> + * data and start from scratch. It is OK for the toolstack to use the new
> + * `version` field.
> + */
> +#define XEN_SYSCTL_XSPLICE_LIST 2
> +struct xen_sysctl_xsplice_list {
> +    uint32_t version;                       /* IN/OUT: Initially *MUST* be zero.
> +                                               On subsequent calls reuse value.
> +                                               If varies between calls, we are
> +                                             * getting stale data. */
> +    uint32_t idx;                           /* IN/OUT: Index into array. */
> +    uint32_t nr;                            /* IN: How many status, id, and len

s/id/name/

> +                                               should fill out.
> +                                               OUT: How many payloads left. */
> +    uint32_t pad;                           /* IN: Must be zero. */
> +    XEN_GUEST_HANDLE_64(xen_xsplice_status_t) status;  /* OUT. Must have enough
> +                                               space allocate for nr of them. */
> +    XEN_GUEST_HANDLE_64(char) name;         /* OUT: Array of ids. Each member

s/id/name/

> +                                               MUST XEN_XSPLICE_NAME_SIZE in size.
> +                                               Must have nr of them. */
> +    XEN_GUEST_HANDLE_64(uint32) len;        /* OUT: Array of lengths of ids.

s/id/name/

> +                                               Must have nr of them. */
> +};
> +typedef struct xen_sysctl_xsplice_list xen_sysctl_xsplice_list_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_xsplice_list_t);
> +
> +/*
> + * Perform an operation on the payload structure referenced by the `name` field.
> + * The operation request is asynchronous and the status should be retrieved
> + * by using either XEN_SYSCTL_XSPLICE_GET or XEN_SYSCTL_XSPLICE_LIST hypercall.
> + */
> +#define XEN_SYSCTL_XSPLICE_ACTION 3
> +struct xen_sysctl_xsplice_action {


-- 
Ross Lagerwall

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 01/13] xsplice: Design document (v5).
  2016-01-14 21:46 ` [PATCH v2 01/13] xsplice: Design document (v5) Konrad Rzeszutek Wilk
  2016-01-19 11:14   ` Wei Liu
@ 2016-01-19 14:31   ` Ross Lagerwall
  2016-02-05 18:27     ` Konrad Rzeszutek Wilk
  2016-02-05 18:34     ` Konrad Rzeszutek Wilk
  2016-02-05 15:25   ` Jan Beulich
  2 siblings, 2 replies; 45+ messages in thread
From: Ross Lagerwall @ 2016-01-19 14:31 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, xen-devel, mpohlack, andrew.cooper3,
	stefano.stabellini, jbeulich, ian.jackson, ian.campbell,
	wei.liu2, sasha.levin

On 01/14/2016 09:46 PM, Konrad Rzeszutek Wilk wrote:
> +## Workflow
> +
> +The expected workflows of higher-level tools that manage multiple patches
> +on production machines would be:
> +
> + * The first obvious task is loading all available / suggested
> +   hotpatches around system start.

I'd expect that the most obvious task apply patches as they are 
installed at runtime. I'd hope that the system would always be booted 
from a fully patched hypervisor. Of course, there's nothing stopping one 
from patching at system start.

> + * Whenever new hotpatches are installed, they should be loaded too.
> + * One wants to query which modules have been loaded at runtime.
> + * If unloading is deemed safe (see unloading below), one may want to
> +   support a workflow where a specific hotpatch is marked as bad and
> +   unloaded.
> + * If we do no restrict module activation order and want to report tboot
> +   state on sequences, we might have a complexity explosion problem, in
> +   what system hashes should be considered acceptable.

This last bullet shouldn't be in the Workflow section.

> +
> +## Patching code
> +
> +The first mechanism to patch that comes in mind is in-place replacement.
> +That is replace the affected code with new code. Unfortunately the x86
> +ISA is variable size which places limits on how much space we have available
> +to replace the instructions. That is not a problem if the change is smaller
> +than the original opcode and we can fill it with nops. Problems will
> +appear if the replacement code is longer.
> +
snip
> +## Hypercalls
> +
> +We will employ the sub operations of the system management hypercall (sysctl).
> +There are to be four sub-operations:
> +
> + * upload the payloads.
> + * listing of payloads summary uploaded and their state.
> + * getting an particular payload summary and its state.
> + * command to apply, delete, or revert the payload.
> +
> +Most of the actions are asynchronous therefore the caller is responsible
> +to verify that it has been applied properly by retrieving the summary of it
> +and verifying that there are no error codes associated with the payload.
> +
> +We **MUST** make some of them asynchronous due to the nature of patching
> +it requires every physical CPU to be lock-step with each other.
> +The patching mechanism while an implementation detail, is not an short
> +operation and as such the design **MUST** assume it will be an long-running
> +operation.
> +
> +The sub-operations will spell out how preemption is to be handled (if at all).
> +
> +Furthermore it is possible to have multiple different payloads for the same
> +function. As such an unique id per payload has to be visible to allow proper manipulation.

s/id/name throughout this patch.

> +
> +The hypercall is part of the `xen_sysctl`. The top level structure contains
> +one uint32_t to determine the sub-operations and one padding field which
> +*MUST* always be zero.
> +
> +<pre>
> +struct xen_sysctl_xsplice_op {
> +    uint32_t cmd;                   /* IN: XEN_SYSCTL_XSPLICE_*. */
> +    uint32_t pad;                   /* IN: Always zero. */
> +	union {
> +          ... see below ...
> +        } u;
> +};
> +
> +</pre>
> +while the rest of hypercall specific structures are part of the this structure.
> +
> +### Basic type: struct xen_xsplice_id
> +
> +Most of the hypercalls employ an shared structure called `struct xen_xsplice_id`
> +which contains:
> +
> + * `name` - pointer where the string for the id is located.
> + * `size` - the size of the string
> + * `pad` - padding - to be zero.
> +
> +The structure is as follow:
> +
> +<pre>
> +#define XEN_XSPLICE_NAME_SIZE 128
> +struct xen_xsplice_id {
> +    XEN_GUEST_HANDLE_64(char) name;         /* IN, pointer to name. */
> +    uint16_t size;                          /* IN, size of name. May be upto
> +                                               XEN_XSPLICE_NAME_SIZE. */
> +    uint16_t pad[3];                        /* IN: MUST be zero. */
> +};
> +</pre>
> +
> +### XEN_SYSCTL_XSPLICE_UPLOAD (0)
> +
> +Upload a payload to the hypervisor. The payload is verified
> +against basic checks and if there are any issues the proper return code
> +will be returned. The payload is not applied at this time - that is
> +controlled by *XEN_SYSCTL_XSPLICE_ACTION*.
> +
> +The caller provides:
> +
> + * A `struct xen_xsplice_id` called `id` which has the unique id.
> + * `size` the size of the ELF payload (in bytes).
> + * `payload` the virtual address of where the ELF payload is.
> +
> +The `id` could be an UUID that stays fixed forever for a given
> +payload. It can be embedded into the ELF payload at creation time
> +and extracted by tools.
> +
> +The return value is zero if the payload was succesfully uploaded.
> +Otherwise an XEN_EXX return value is provided. Duplicate `id` are not supported.
> +
> +The `payload` is the ELF payload as mentioned in the `Payload format` section.
> +
> +The structure is as follow:
> +
> +<pre>
> +struct xen_sysctl_xsplice_upload {
> +    xen_xsplice_id_t id;                /* IN, name of the patch. */
> +    uint64_t size;                      /* IN, size of the ELF file. */
> +    XEN_GUEST_HANDLE_64(uint8) payload; /* IN: ELF file. */
> +};
> +</pre>
> +
> +### XEN_SYSCTL_XSPLICE_GET (1)
> +
> +Retrieve an status of an specific payload. This caller provides:
> +
> + * A `struct xen_xsplice_id` called `id` which has the unique id.
> + * A `struct xen_xsplice_status` structure which has all members
> +   set to zero: That is:
> +   * `status` *MUST* be set to zero.
> +   * `rc` *MUST* be set to zero.
> +
> +Upon completion the `struct xen_xsplice_status` is updated.
> +
> + * `status` - whether it has been:
> +   * *XSPLICE_STATUS_LOADED* (1) has been loaded.
> +   * *XSPLICE_STATUS_CHECKED*  (2) the ELF payload safety checks passed.
> +   * *XSPLICE_STATUS_APPLIED* (3) loaded, checked, and applied.
> +   *  No other value is possible.
> + * `rc` - XEN_EXX type errors encountered while performing the `status`
> +   operation.

This is not quite correct. `rc` indicates errors while performing the 
last XSPLICE_ACTION_* operation. `status` indicates the current status 
of the payload. So usually if there's an error in `rc`, `status` will 
_not_ have changed.

For example, suppose there exists a payload:
status: XSPLICE_STATUS_LOADED
rc: 0

We apply an action, XSPLICE_ACTION_REVERT, to revert it. Afterwards:
status: XSPLICE_STATUS_LOADED
rc: XEN_EINVAL

It has failed (with EINVAL) but it remains loaded.


> +   respectively mean: success or operation in progress. Other values
> +   imply an error occurred.
> +
> +The return value of the hypercall is zero on success and XEN_EXX on failure.
> +(Note that the `rc`` value can be different from the return value, as in
> +rc=XEN_EAGAIN and return value can be 0).
> +
> +This operation is synchronous and does not require preemption.
> +
> +The structure is as follow:
> +
> +<pre>
> +struct xen_xsplice_status {
> +#define XSPLICE_STATUS_LOADED       1
> +#define XSPLICE_STATUS_CHECKED      2
> +#define XSPLICE_STATUS_APPLIED      3
> +    int32_t state;                  /* OUT: XSPLICE_STATE_*. IN: MUST be zero. */
> +    int32_t rc;                     /* OUT: 0 if no error, otherwise -XEN_EXX. */
> +                                    /* IN: MUST be zero. */
> +};
> +
> +struct xen_sysctl_xsplice_summary {
> +    xen_xsplice_id_t id;            /* IN, the name of the payload. */
> +    xen_xsplice_status_t status;    /* IN/OUT: status of the payload. */
> +};
> +</pre>
> +
> +### XEN_SYSCTL_XSPLICE_LIST (2)
> +
> +Retrieve an array of abbreviated status and names of payloads that are loaded in the
> +hypervisor.
> +
> +The caller provides:
> +
> + * `version`. Initially (on first hypercall) *MUST* be zero.
> + * `idx` index iterator. On first call *MUST* be zero, subsequent calls varies.
> + * `nr` the max number of entries to populate.
> + * `pad` - *MUST* be zero.
> + * `status` virtual address of where to write `struct xen_xsplice_status`
> +   structures. Caller *MUST* allocate up to `nr` of them.
> + * `id` - virtual address of where to write the unique id of the payload.
> +   Caller *MUST* allocate up to `nr` of them. Each *MUST* be of
> +   **XEN_XSPLICE_NAME_SIZE** size.
> + * `len` - virtual address of where to write the length of each unique id
> +   of the payload. Caller *MUST* allocate up to `nr` of them. Each *MUST* be
> +   of sizeof(uint32_t) (4 bytes).
> +
> +If the hypercall returns an positive number, it is the number (up to `nr`)
> +of the payloads returned, along with `nr` updated with the number of remaining
> +payloads, `version` updated (it may be the same across hypercalls. If it
> +varies the data is stale and further calls could fail). The `status`,
> +`id`, and `len`' are updated at their designed index value (`idx`) with
> +the returned value of data.
> +
> +If the hypercall returns E2BIG the `count` is too big and should be
> +lowered.
> +
> +This operation can be preempted by the hypercall returning XEN_EAGAIN.
> +Retry.
> +
> +Note that due to the asynchronous nature of hypercalls the control domain might
> +have added or removed a number of payloads making this information stale. It is
> +the responsibility of the toolstack to use the `version` field to check
> +between each invocation. if the version differs it should discard the stale
> +data and start from scratch. It is OK for the toolstack to use the new
> +`version` field.
> +
> +The `struct xen_xsplice_status` structure contains an status of payload which includes:
> +
> + * `status` - whether it has been:
> +   * *XSPLICE_STATUS_LOADED* (1) has been loaded.
> +   * *XSPLICE_STATUS_CHECKED*  (2) the ELF payload safety checks passed.

The ELF safety checks are done during load. At this stage we don't have 
any checks yet so I'm not sure what will go here.


> +   * *XSPLICE_STATUS_APPLIED* (3) loaded, checked, and applied.
> + * `rc` - XEN_EXX type errors encountered while performing the `status`
> +   operation.

Same comment as above.

> +   respectively mean: success or operation in progress.
> +
> +The structure is as follow:
> +
> +<pre>
> +struct xen_sysctl_xsplice_list {
> +    uint32_t version;                       /* IN/OUT: Initially *MUST* be zero.
> +                                               On subsequent calls reuse value.
> +                                               If varies between calls, we are
> +                                             * getting stale data. */
> +    uint32_t idx;                           /* IN/OUT: Index into array. */
> +    uint32_t nr;                            /* IN: How many status, id, and len
> +                                               should fill out.
> +                                               OUT: How many payloads left. */
> +    uint32_t pad;                           /* IN: Must be zero. */
> +    XEN_GUEST_HANDLE_64(xen_xsplice_status_t) status;  /* OUT. Must have enough
> +                                               space allocate for n of them. */
> +    XEN_GUEST_HANDLE_64(char) id;           /* OUT: Array of ids. Each member
> +                                               MUST XEN_XSPLICE_NAME_SIZE in size.
> +                                               Must have n of them. */
> +    XEN_GUEST_HANDLE_64(uint32) len;        /* OUT: Array of lengths of ids.
> +                                               Must have n of them. */
> +};
> +</pre>
> +
> +### XEN_SYSCTL_XSPLICE_ACTION (3)
> +
> +Perform an operation on the payload structure referenced by the `id` field.
> +The operation request is asynchronous and the status should be retrieved
> +by using either **XEN_SYSCTL_XSPLICE_GET** or **XEN_SYSCTL_XSPLICE_LIST** hypercall.
> +
> +The caller provides:
> +
> + * A 'struct xen_xsplice_id` `id` containing the unique id.
> + * `cmd` the command requested:
> +  * *XSPLICE_ACTION_CHECK* (1) check that the payload will apply properly.
> +    This also verfies the payload - which may require SecureBoot firmware
> +    calls.

Again, I'm not sure what checks will go here. We can't necessarily check 
that the patch will apply because some other payload may be applied first.

> +  * *XSPLICE_ACTION_UNLOAD* (2) unload the payload.
> +   Any further hypercalls against the `id` will result in failure unless
> +   **XEN_SYSCTL_XSPLICE_UPLOAD** hypercall is perfomed with same `id`.
> +  * *XSPLICE_ACTION_REVERT* (3) revert the payload. If the operation takes
> +  more time than the upper bound of time the `status` will XEN_EBUSY.
> +  * *XSPLICE_ACTION_APPLY* (4) apply the payload. If the operation takes
> +  more time than the upper bound of time the `status` will be XEN_EBUSY.
> +  * *XSPLICE_ACTION_REPLACE* (5) revert all applied payloads and apply this
> +  payload.
> +  * *XSPLICE_ACTION_LOADED* is an initial state and cannot be requested.
> + * `time` the upper bound of time (ms) the cmd should take. Zero means infinite.
> +   If within the time the operation does not succeed the operation would go in
> +   error state.
> + * `pad` - *MUST* be zero.
> +
> +The return value will be zero unless the provided fields are incorrect.
> +
> +The structure is as follow:
> +
> +<pre>
> +#define XSPLICE_ACTION_CHECK   1
> +#define XSPLICE_ACTION_UNLOAD  2
> +#define XSPLICE_ACTION_REVERT  3
> +#define XSPLICE_ACTION_APPLY   4
> +#define XSPLICE_ACTION_REPLACE 5
> +struct xen_sysctl_xsplice_action {
> +    xen_xsplice_id_t id;                    /* IN, name of the patch. */
> +    uint32_t cmd;                           /* IN: XSPLICE_ACTION_* */
> +    uint32_t time;                          /* IN: Zero if no timeout. */
> +                                            /* Or upper bound of time (ms) */
> +                                            /* for operation to take. */
> +};
> +
> +</pre>
> +
> +## State diagrams of XSPLICE_ACTION commands.
> +
> +There is a strict ordering state of what the commands can be.
> +The XSPLICE_ACTION prefix has been dropped to easy reading and
> +does not include the XSPLICE_STATES:
> +
> +<pre>
> +              /->\
> +              \  /
> + UNLOAD <--- CHECK ---> REPLACE|APPLY --> REVERT --\
> +                \                                  |
> +                 \-------------------<-------------/

This doesn't make much sense to me. The actions need to be represented 
by arrows that move from one state to another.

> +
> +</pre>
> +## State transition table of XSPLICE_ACTION commands and XSPLICE_STATUS.
> +
> +Note that:
> +
> + - The LOADED state is the starting one achieved with *XEN_SYSCTL_XSPLICE_UPLOAD* hypercall.
> + - The REVERT operation on success will automatically move to CHECK state.

... move to the CHECKED state.

Although the second point is not really notable since it's implicit from 
the state transition table below. I guess it's only notable because it's 
different from an earlier design.

> + - There are three STATES: LOADED, CHECKED and APPLIED.
> + - There are five actions (aka commands): CHECK, APPLY, REPLACE, REVERT, and UNLOAD.
> +
> +The state transition table of valid states and action states:
> +
> +<pre>
> +
> ++---------+---------+--------------------------------+-------+-------+--------+
> +| ACTION  | Current | Result                         |       Next STATE:      |
> +| ACTION  | STATE   |                                | LOADED|CHECKED|APPLIED |
> ++---------+----------+-------------------------------+-------+-------+--------+
> +| CHECK   | LOADED  | Check payload (success).       |       |   x   |        |
> ++---------+---------+--------------------------------+-------+-------+--------+
> +| CHECK   | LOADED  | Check payload (error).         |  x    |       |        |
> ++---------+---------+--------------------------------+-------+-------+--------+
> +| CHECK   | CHECKED | Check payload (once more, no)  |       |   x   |        |
> +|         |         | errors)                        |       |       |        |
> ++---------+---------+--------------------------------+-------+-------+--------+
> +| CHECK   | CHECKED | Check payload (once more, with |   x   |       |        |
> +|         |         | errors)                        |       |       |        |
> ++---------+---------+--------------------------------+-------+-------+--------+
> +| UNLOAD  | CHECKED | Unload payload. Always works.  |       |       |        |
> +|         |         | No next states.                |       |       |        |
> ++---------+---------+--------------------------------+-------+-------+--------+
> +| UNLOAD  | LOADED  | Unload payload. Always works.  |       |       |        |
> +|         |         | No next states.                |       |       |        |
> ++---------+---------+--------------------------------+-------+-------+--------+
> +| APPLY   | CHECKED | Apply payload (success).       |       |       |   x    |
> ++---------+---------+--------------------------------+-------+-------+--------+
> +| APPLY   | CHECKED | Apply payload (error|timeout)  |       |   x   |        |
> ++---------+---------+--------------------------------+-------+-------+--------+
> +| REPLACE | CHECKED | Revert payloads and apply new  |       |       |   x    |
> +|         |         | payload with success.          |       |       |        |
> ++---------+---------+--------------------------------+-------+-------+--------+
> +| REPLACE | CHECKED | Revert payloads and apply new  |       |   x   |        |
> +|         |         | payload with error.            |       |       |        |
> ++---------+---------+--------------------------------+-------+-------+--------+
> +| REVERT  | APPLIED | Revert payload (success).      |       |   x   |        |
> ++---------+---------+--------------------------------+-------+-------+--------+
> +| REVERT  | APPLIED | Revert payload (error|timeout) |       |       |   x    |
> ++---------+---------+--------------------------------+-------+-------+--------+
> +</pre>
> +
> +All the other state transitions are invalid.
> +
> +## Sequence of events.
> +
> +The normal sequence of events is to:
> +
> + 1. *XEN_SYSCTL_XSPLICE_UPLOAD* to upload the payload. If there are errors *STOP* here.
> + 2. *XEN_SYSCTL_XSPLICE_GET* to check the `->rc`. If *XEN_EAGAIN* spin. If zero go to next step.
> + 3. *XEN_SYSCTL_XSPLICE_ACTION* with *XSPLICE_ACTION_CHECK* command to verify that the payload can be succesfully applied.
> + 4. *XEN_SYSCTL_XSPLICE_GET* to check the `->rc`. If *XEN_EAGAIN* spin. If zero go to next step.
> + 5. *XEN_SYSCTL_XSPLICE_ACTION* with *XSPLICE_ACTION_APPLY* to apply the patch.
> + 6. *XEN_SYSCTL_XSPLICE_GET* to check the `->rc`. If in *XEN_EAGAIN* spin. If zero exit with success.
> +
> +
snip
> +
> +# v2: Not Yet Done

We need a new word to describe that this isn't V2 of this series, but 
some further development of xSplice.

> +
> +
> +## Goals
> +
> +The v2 design must also have a mechanism for:
> +
> + *  An dependency mechanism for the payloads. To use that information to load:
> +    - The appropiate payload. To verify that payload is built against the
> +      hypervisor. This can be done via the `build-id`
> +      or via providing an copy of the old code - so that the hypervisor can
> +       verify it against the code in memory.
> +    - To construct an appropiate order of payloads to load in case they
> +      depend on each other.
> + * Be able to lookup in the Xen hypervisor the symbol names of functions from the ELF payload.
> + * Be able to patch .rodata, .bss, and .data sections.

See my comments about this below.

> + * Further safety checks (blacklist of which functions cannot be patched, check
> +   the stack, etc).
> +
> +### xSplice interdependencies
> +
> +xSplice patches interdependencies are tricky.
> +
> +There are the ways this can be addressed:
> + * A single large patch that subsumes and replaces all previous ones.
> +   Over the life-time of patching the hypervisor this large patch
> +   grows to accumulate all the code changes.
> + * Hotpatch stack - where an mechanism exists that loads the hotpatches
> +   in the same order they were built in. We would need an build-id
> +   of the hypevisor to make sure the hot-patches are build against the
> +   correct build.
> + * Payload containing the old code to check against that. That allows
> +   the hotpatches to be loaded indepedently (if they don't overlap) - or
> +   if the old code also containst previously patched code - even if they
> +   overlap.
> +
> +The disadvantage of the first large patch is that it can grow over
> +time and not provide an bisection mechanism to identify faulty patches.
> +
> +The hot-patch stack puts stricts requirements on the order of the patches
> +being loaded and requires an hypervisor build-id to match against.
> +
> +The old code allows much more flexibility and an additional guard,
> +but is more complex to implement.
> +
> +### Handle inlined __LINE__
> +
> +This problem is related to hotpatch construction
> +and potentially has influence on the design of the hotpatching
> +infrastructure in Xen.
> +
> +For example:
> +
> +We have file1.c with functions f1 and f2 (in that order).  f2 contains a
> +BUG() (or WARN()) macro and at that point embeds the source line number
> +into the generated code for f2.
> +
> +Now we want to hotpatch f1 and the hotpatch source-code patch adds 2
> +lines to f1 and as a consequence shifts out f2 by two lines.  The newly
> +constructed file1.o will now contain differences in both binary
> +functions f1 (because we actually changed it with the applied patch) and
> +f2 (because the contained BUG macro embeds the new line number).
> +
> +Without additional information, an algorithm comparing file1.o before
> +and after hotpatch application will determine both functions to be
> +changed and will have to include both into the binary hotpatch.
> +
> +Options:
> +
> +1. Transform source code patches for hotpatches to be line-neutral for
> +   each chunk.  This can be done in almost all cases with either
> +   reformatting of the source code or by introducing artificial
> +   preprocessor "#line n" directives to adjust for the introduced
> +   differences.
> +
> +   This approach is low-tech and simple.  Potentially generated
> +   backtraces and existing debug information refers to the original
> +   build and does not reflect hotpatching state except for actually
> +   hotpatched functions but should be mostly correct.
> +
> +2. Ignoring the problem and living with artificially large hotpatches
> +   that unnecessarily patch many functions.
> +
> +   This approach might lead to some very large hotpatches depending on
> +   content of specific source file.  It may also trigger pulling in
> +   functions into the hotpatch that cannot reasonable be hotpatched due
> +   to limitations of a hotpatching framework (init-sections, parts of
> +   the hotpatching framework itself, ...) and may thereby prevent us
> +   from patching a specific problem.
> +
> +   The decision between 1. and 2. can be made on a patch--by-patch
> +   basis.
> +
> +3. Introducing an indirection table for storing line numbers and
> +   treating that specially for binary diffing. Linux may follow
> +   this approach.
> +
> +   We might either use this indirection table for runtime use and patch
> +   that with each hotpatch (similarly to exception tables) or we might
> +   purely use it when building hotpatches to ignore functions that only
> +   differ at exactly the location where a line-number is embedded.
> +
> +   For BUG(), WARN(), etc., the line number is embedded into the bug frame, not
> +   the function itself.

This shouldn't be indented so that it is clear that it is not part of 
Option 3, but is what currently exists in Xen .

> +
> +Similar considerations are true to a lesser extent for __FILE__, but it
> +could be argued that file renaming should be done outside of hotpatches.
> +
snip
> +
> +
> +### .rodata sections
> +
> +The patching might require strings to be updated as well. As such we must be
> +also able to patch the strings as needed. This sounds simple - but the compiler
> +has a habit of coalescing strings that are the same - which means if we in-place
> +alter the strings - other users will be inadvertently affected as well.
> +
> +This is also where pointers to functions live - and we may need to patch this
> +as well. And switch-style jump tables.
> +
> +To guard against that we must be prepared to do patching similar to
> +trampoline patching or in-line depending on the flavour. If we can
> +do in-line patching we would need to:
> +
> + * alter `.rodata` to be writeable.
> + * inline patch.
> + * alter `.rodata` to be read-only.
> +
> +If are doing trampoline patching we would need to:
> +
> + * allocate a new memory location for the string.
> + * all locations which use this string will have to be updated to use the
> +   offset to the string.
> + * mark the region RO when we are done.
> +
> +### .bss and .data sections.
> +
> +In place patching writable data is not suitable as it is unclear what should be done
> +depending on the current state of data. As such it should not be attempted.
> +
> +However, functions which are being patched can bring in changes to strings
> +(.data or .rodata section changes), or even to .bss sections.
> +
> +As such the ELF payload can introduce new .rodata, .bss, and .data sections.
> +Patching in the new function will end up also patching in the new .rodata
> +section and the new function will reference the new string in the new
> +.rodata section.

Payloads including .rodata, .bss, and .data already work with the 
existing build tool and this patch series.

In-place patching is, IMO, a bad idea and shouldn't be in the further 
development :-)

What we should have in the next iteration is hook functions so that the 
existing data can be changed during payload application.

-- 
Ross Lagerwall

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 07/13] xsplice: Add helper elf routines (v2)
  2016-01-14 21:47 ` [PATCH v2 07/13] xsplice: Add helper elf routines (v2) Konrad Rzeszutek Wilk
@ 2016-01-19 14:33   ` Ross Lagerwall
  2016-02-05 18:38     ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 45+ messages in thread
From: Ross Lagerwall @ 2016-01-19 14:33 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, xen-devel, mpohlack, andrew.cooper3,
	stefano.stabellini, jbeulich, ian.jackson, ian.campbell,
	wei.liu2, sasha.levin

On 01/14/2016 09:47 PM, Konrad Rzeszutek Wilk wrote:
> From: Ross Lagerwall <ross.lagerwall@citrix.com>
>
> Add Elf routines and data structures in preparation for loading an
> xSplice payload.
>
> We also add an macro that will print where we failed during
> the ELF parsing.
>
> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
> v2: - With the #define ELFSIZE in the ARM file we can use the common
>       #defines instead of using #ifdef CONFIG_ARM_32.
>      - Add checks for ELF file.
>      - Add name to be printed.
>      - Add len for easier ELF checks.
>      - Expand on the checks. Add macro.
> ---
> diff --git a/xen/common/xsplice_elf.c b/xen/common/xsplice_elf.c
> new file mode 100644
> index 0000000..a5e9d63
> --- /dev/null
> +++ b/xen/common/xsplice_elf.c
> @@ -0,0 +1,201 @@
> +#include <xen/lib.h>
> +#include <xen/errno.h>
> +#include <xen/xsplice.h>
> +#include <xen/xsplice_elf.h>
> +
> +#define return_(x) { printk(XENLOG_DEBUG "%s:%d rc: %d\n",  \
> +                            __func__,__LINE__, x); return x; }
> +
> +struct xsplice_elf_sec *xsplice_elf_sec_by_name(const struct xsplice_elf *elf,
> +                                                const char *name)
> +{
> +    unsigned int i;
> +
> +    for ( i = 0; i < elf->hdr->e_shnum; i++ )
> +    {
> +        if ( !strcmp(name, elf->sec[i].name) )
> +            return &elf->sec[i];
> +    }
> +
> +    return NULL;
> +}
> +
> +static int elf_resolve_sections(struct xsplice_elf *elf, uint8_t *data)
> +{
> +    struct xsplice_elf_sec *sec;
> +    unsigned int i;
> +
> +    sec = xmalloc_array(struct xsplice_elf_sec, elf->hdr->e_shnum);
> +    if ( !sec )
> +    {
> +        printk(XENLOG_ERR "Could not allocate memory for section table!\n");

Shouldn't this printk be removed if you're using return_?

> +        return_(-ENOMEM);
> +    }
> +
> +    /* N.B. We also will ingest SHN_UNDEF sections. */
> +    for ( i = 0; i < elf->hdr->e_shnum; i++ )
> +    {
> +        ssize_t delta = elf->hdr->e_shoff + i * elf->hdr->e_shentsize;
> +
> +        if ( delta + sizeof(Elf_Shdr) > elf->len )
> +            return_(-EINVAL);
> +
> +        sec[i].sec = (Elf_Shdr *)(data + delta);
> +        delta = sec[i].sec->sh_offset;
> +
> +        if ( delta > elf->len )
> +            return_(-EINVAL);
> +
> +        sec[i].data = data + delta;
> +        /* Name is populated in xsplice_elf_sections_name. */
> +        sec[i].name = NULL;
> +
> +        if ( sec[i].sec->sh_type == SHT_SYMTAB )
> +        {
> +                if ( elf->symtab )
> +                    return_(-EINVAL);
> +                elf->symtab = &sec[i];
> +                /* elf->symtab->sec->sh_link would point to the right section
> +                 * but we hadn't finished parsing all the sections. */
> +                if ( elf->symtab->sec->sh_link > elf->hdr->e_shnum )
> +                    return_(-EINVAL);
> +        }
> +    }
> +    elf->sec = sec;
> +    if ( !elf->symtab )
> +        return_(-EINVAL);
> +
> +    /* There can be multiple SHT_STRTAB so pick the right one. */
> +    elf->strtab = &sec[elf->symtab->sec->sh_link];
> +
> +    if ( elf->symtab->sec->sh_size == 0 || elf->symtab->sec->sh_entsize == 0 )
> +        return_(-EINVAL);
> +
> +    if ( elf->symtab->sec->sh_entsize != sizeof(Elf_Sym) )
> +        return_(-EINVAL);
> +
> +    return 0;
> +}
> +
snip
> +
> +static int elf_get_sym(struct xsplice_elf *elf, uint8_t *data)
> +{
> +    struct xsplice_elf_sec *symtab_sec, *strtab_sec;
> +    struct xsplice_elf_sym *sym;
> +    unsigned int i, delta, offset;
> +
> +    symtab_sec = elf->symtab;
> +
> +    strtab_sec = elf->strtab;
> +
> +    /* Pointers arithmetic to get file offset. */
> +    offset = strtab_sec->data - data;
> +
> +    ASSERT( offset == strtab_sec->sec->sh_offset );
> +    /* symtab_sec->data was computed in elf_resolve_sections. */
> +    ASSERT((symtab_sec->sec->sh_offset + data) == symtab_sec->data );
> +
> +    /* No need to check values as elf_resolve_sections did it. */
> +    elf->nsym = symtab_sec->sec->sh_size / symtab_sec->sec->sh_entsize;
> +
> +    sym = xmalloc_array(struct xsplice_elf_sym, elf->nsym);
> +    if ( !sym )
> +    {
> +        printk(XENLOG_ERR "%s: Could not allocate memory for symbols\n", elf->name);

Shouldn't this printk be removed if you're using return_?

-- 
Ross Lagerwall

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 08/13] xsplice: Implement payload loading (v2)
  2016-01-14 21:47 ` [PATCH v2 08/13] xsplice: Implement payload loading (v2) Konrad Rzeszutek Wilk
@ 2016-01-19 14:34   ` Ross Lagerwall
  2016-01-19 16:59     ` Konrad Rzeszutek Wilk
  2016-01-19 16:45   ` Ross Lagerwall
  1 sibling, 1 reply; 45+ messages in thread
From: Ross Lagerwall @ 2016-01-19 14:34 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, xen-devel, mpohlack, andrew.cooper3,
	stefano.stabellini, jbeulich, ian.jackson, ian.campbell,
	wei.liu2, sasha.levin

On 01/14/2016 09:47 PM, Konrad Rzeszutek Wilk wrote:
> From: Ross Lagerwall <ross.lagerwall@citrix.com>
>
> Add support for loading xsplice payloads. This is somewhat similar to
> the Linux kernel module loader, implementing the following steps:
> - Verify the elf file.
> - Parse the elf file.
> - Allocate a region of memory mapped within a free area of
>    [xen_virt_end, XEN_VIRT_END].
> - Copy allocated sections into the new region.
> - Resolve section symbols. All other symbols must be absolute addresses.
> - Perform relocations.
>
> Note that the structure 'xsplice_patch_func' differs a bit from the design
> by usurping 8 bytes from the padding. We use that for our own uses.
>
> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
> v2: - Change the 'xsplice_patch_func' structure layout/size.
>      - Add more error checking. Fix memory leak.
>      - Move elf_resolve and elf_perform relocs in elf file.
>      - Print the payload address and pages in keyhandler.
> v3:
>      - Make it build under ARM
snip
>
> +static void find_hole(ssize_t pages, unsigned long *hole_start,
> +                      unsigned long *hole_end)
> +{
> +    struct payload *data, *data2;
> +
> +    spin_lock(&payload_list_lock);
> +    list_for_each_entry ( data, &payload_list, list )
> +    {
> +        list_for_each_entry ( data2, &payload_list, list )
> +        {
> +            unsigned long start, end;
> +
> +            start = (unsigned long)data2->payload_address;
> +            end = start + data2->payload_pages * PAGE_SIZE;
> +            if ( *hole_end > start && *hole_start < end )
> +            {
> +                *hole_start = end;
> +                *hole_end = *hole_start + pages * PAGE_SIZE;
> +                break;
> +            }
> +        }
> +        if ( &data2->list == &payload_list )
> +            break;
> +    }
> +    spin_unlock(&payload_list_lock);
> +}

This function above should go down into the CONFIG_X86 section below.

> +
> +/*
> + * The following functions prepare an xSplice payload to be executed by
> + * allocating space, loading the allocated sections, resolving symbols,
> + * performing relocations, etc.
> + */
> +#ifdef CONFIG_X86
> +static void *alloc_payload(size_t size)
> +{
> +    mfn_t *mfn, *mfn_ptr;
> +    size_t pages, i;
> +    struct page_info *pg;
> +    unsigned long hole_start, hole_end, cur;
> +
> +    ASSERT(size);
> +
> +    /*
> +     * Copied from vmalloc which allocates pages and then maps them to an
> +     * arbitrary virtual address with PAGE_HYPERVISOR. We need specific
> +     * virtual address with PAGE_HYPERVISOR_RWX.
> +     */
> +    pages = PFN_UP(size);
> +    mfn = xmalloc_array(mfn_t, pages);
> +    if ( mfn == NULL )
> +        return NULL;
> +
> +    for ( i = 0; i < pages; i++ )
> +    {
> +        pg = alloc_domheap_page(NULL, 0);
> +        if ( pg == NULL )
> +            goto error;
> +        mfn[i] = _mfn(page_to_mfn(pg));
snip
> diff --git a/xen/common/xsplice_elf.c b/xen/common/xsplice_elf.c
> index a5e9d63..ea7eb73 100644
> --- a/xen/common/xsplice_elf.c
> +++ b/xen/common/xsplice_elf.c
> @@ -199,3 +199,87 @@ void xsplice_elf_free(struct xsplice_elf *elf)
>       elf->name = NULL;
>       elf->len = 0;
>   }
> +
> +int xsplice_elf_resolve_symbols(struct xsplice_elf *elf)
> +{
> +    unsigned int i;
> +
> +    /*
> +     * The first entry of an ELF symbol table is the "undefined symbol index".
> +     * aka reserved so we skip it.
> +     */
> +    ASSERT( elf->sym );
> +    for ( i = 1; i < elf->nsym; i++ )
> +    {
> +        switch ( elf->sym[i].sym->st_shndx )
> +        {
> +            case SHN_COMMON:
> +                printk(XENLOG_ERR "%s: Unexpected common symbol: %s\n",
> +                       elf->name, elf->sym[i].name);
> +                return_(-EINVAL);
> +                break;
> +            case SHN_UNDEF:
> +                printk(XENLOG_ERR "%s: Unknown symbol: %s\n", elf->name,
> +                       elf->sym[i].name);
> +                return_(-ENOENT);
> +                break;
> +            case SHN_ABS:
> +                printk(XENLOG_DEBUG "%s: Absolute symbol: %s => 0x%p\n",
> +                      elf->name, elf->sym[i].name,
> +                      (void *)elf->sym[i].sym->st_value);
> +                break;
> +            default:
> +                if ( elf->sec[elf->sym[i].sym->st_shndx].sec->sh_flags & SHF_ALLOC )
> +                {
> +                    elf->sym[i].sym->st_value +=
> +                        (unsigned long)elf->sec[elf->sym[i].sym->st_shndx].load_addr;
> +                    printk(XENLOG_DEBUG "%s: Symbol resolved: %s => 0x%p\n",
> +                           elf->name, elf->sym[i].name,
> +                           (void *)elf->sym[i].sym->st_value);
> +                }
> +        }
> +    }
> +
> +    return 0;
> +}
> +
> +int xsplice_elf_perform_relocs(struct xsplice_elf *elf)
> +{
> +    struct xsplice_elf_sec *rela, *base;
> +    unsigned int i;
> +    int rc;
> +
> +    /*
> +     * The first entry of an ELF symbol table is the "undefined symbol index".
> +     * aka reserved so we skip it.
> +     */
> +    ASSERT( elf->sym );
> +    for ( i = 1; i < elf->hdr->e_shnum; i++ )
> +    {
> +        rela = &elf->sec[i];
> +
> +        if ( (rela->sec->sh_type != SHT_RELA ) &&
> +             (rela->sec->sh_type != SHT_REL ) )
> +            continue;
> +
> +         /* Is it a valid relocation section? */
> +         if ( rela->sec->sh_info >= elf->hdr->e_shnum )
> +            continue;
> +
> +         base = &elf->sec[rela->sec->sh_info];
> +
> +         /* Don't relocate non-allocated sections. */
> +         if ( !(base->sec->sh_flags & SHF_ALLOC) )
> +            continue;
> +
> +        if ( elf->sec[i].sec->sh_type == SHT_RELA )
> +            rc = xsplice_perform_rela(elf, base, rela);
> +        else /* SHT_REL */
> +            rc = xsplice_perform_rel(elf, base, rela);
> +
> +        if ( rc )
> +            return rc;
> +    }
> +
> +    return 0;
> +}

Is there a reason the above two functions weren't put in the previous patch?

> diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h
> index bd832df..4ea66bf 100644
> --- a/xen/include/asm-arm/config.h
> +++ b/xen/include/asm-arm/config.h
> @@ -15,8 +15,10 @@
>
>   #if defined(CONFIG_ARM_64)
>   # define LONG_BYTEORDER 3
> +# define ELFSIZE 64
>   #else
>   # define LONG_BYTEORDER 2
> +# define ELFSIZE 32
>   #endif

What does this do?

(And perhaps it should also be in the previous patch since it's 
mentioned in the previous patch's changelog?)

-- 
Ross Lagerwall

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 09/13] xsplice: Implement support for applying/reverting/replacing patches. (v2)
  2016-01-14 21:47 ` [PATCH v2 09/13] xsplice: Implement support for applying/reverting/replacing patches. (v2) Konrad Rzeszutek Wilk
@ 2016-01-19 14:39   ` Ross Lagerwall
  2016-01-19 16:55     ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 45+ messages in thread
From: Ross Lagerwall @ 2016-01-19 14:39 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, xen-devel, mpohlack, andrew.cooper3,
	stefano.stabellini, jbeulich, ian.jackson, ian.campbell,
	wei.liu2, sasha.levin

On 01/14/2016 09:47 PM, Konrad Rzeszutek Wilk wrote:
> From: Ross Lagerwall <ross.lagerwall@citrix.com>
>
> Implement support for the apply, revert and replace actions.
>
snip
> +#include <xen/cpu.h>
>   #include <xen/guest_access.h>
>   #include <xen/keyhandler.h>
>   #include <xen/lib.h>
> @@ -10,25 +11,38 @@
>   #include <xen/mm.h>
>   #include <xen/sched.h>
>   #include <xen/smp.h>
> +#include <xen/softirq.h>
>   #include <xen/spinlock.h>
> +#include <xen/wait.h>
>   #include <xen/xsplice_elf.h>
>   #include <xen/xsplice.h>
>
>   #include <asm/event.h>
> +#include <asm/nmi.h>
>   #include <public/sysctl.h>
>
> -static DEFINE_SPINLOCK(payload_list_lock);
> +/*
> + * Protects against payload_list operations and also allows only one
> + * caller in schedule_work.
> + */
> +static DEFINE_SPINLOCK(payload_lock);

I think it would be cleaner if all the payload_list_lock changes were 
folded into Patch 3.

>   static LIST_HEAD(payload_list);
>
> +static LIST_HEAD(applied_list);
> +
>   static unsigned int payload_cnt;
>   static unsigned int payload_version = 1;
>
>   struct payload {
>       int32_t state;                       /* One of the XSPLICE_STATE_*. */
>       int32_t rc;                          /* 0 or -XEN_EXX. */
> +    uint32_t timeout;                    /* Timeout to do the operation. */

This should go into struct xsplice_work.

>       struct list_head list;               /* Linked to 'payload_list'. */
>       void *payload_address;               /* Virtual address mapped. */
>       size_t payload_pages;                /* Nr of the pages. */
> +    struct list_head applied_list;       /* Linked to 'applied_list'. */
> +    struct xsplice_patch_func *funcs;    /* The array of functions to patch. */
> +    unsigned int nfuncs;                 /* Nr of functions to patch. */
>
>       char name[XEN_XSPLICE_NAME_SIZE + 1];/* Name of it. */
>   };
> @@ -36,6 +50,23 @@ struct payload {
>   static int load_payload_data(struct payload *payload, uint8_t *raw, ssize_t len);
>   static void free_payload_data(struct payload *payload);
>
> +/* Defines an outstanding patching action. */
> +struct xsplice_work
> +{
> +    atomic_t semaphore;          /* Used for rendezvous. First to grab it will
> +                                    do the patching. */
> +    atomic_t irq_semaphore;      /* Used to signal all IRQs disabled. */
> +    struct payload *data;        /* The payload on which to act. */
> +    volatile bool_t do_work;     /* Signals work to do. */
> +    volatile bool_t ready;       /* Signals all CPUs synchronized. */
> +    uint32_t cmd;                /* Action request: XSPLICE_ACTION_* */
> +};
> +
> +/* There can be only one outstanding patching action. */
> +static struct xsplice_work xsplice_work;
> +
> +static int schedule_work(struct payload *data, uint32_t cmd);
> +
snip
> +
> +/*
> + * This function is executed having all other CPUs with no stack (we may
> + * have cpu_idle on it) and IRQs disabled.
> + */
> +static int revert_payload(struct payload *data)
> +{
> +    unsigned int i;
> +
> +    printk(XENLOG_DEBUG "%s: Reverting.\n", data->name);
> +
> +    for ( i = 0; i < data->nfuncs; i++ )
> +        xsplice_revert_jmp(data->funcs + i);
> +
> +    list_del(&data->applied_list);
> +
> +    return 0;
> +}
> +
> +/* Must be holding the payload_list lock. */

payload lock?

> +static int schedule_work(struct payload *data, uint32_t cmd)
> +{
> +    /* Fail if an operation is already scheduled. */
> +    if ( xsplice_work.do_work )
> +        return -EAGAIN;

Hmm, I don't think EAGAIN is correct. It will cause xen-xsplice to poll 
for a status update, but the operation hasn't actually been submitted.

> +
> +    xsplice_work.cmd = cmd;
> +    xsplice_work.data = data;
> +    atomic_set(&xsplice_work.semaphore, -1);
> +    atomic_set(&xsplice_work.irq_semaphore, -1);
> +
> +    xsplice_work.ready = 0;
> +    smp_wmb();
> +    xsplice_work.do_work = 1;
> +    smp_wmb();
> +
> +    return 0;
> +}
> +
> +/*
> + * Note that because of this NOP code the do_nmi is not safely patchable.
> + * Also if we do receive 'real' NMIs we have lost them.
> + */
> +static int mask_nmi_callback(const struct cpu_user_regs *regs, int cpu)
> +{
> +    return 1;
> +}
> +
> +static void reschedule_fn(void *unused)
> +{
> +    smp_mb(); /* Synchronize with setting do_work */
> +    raise_softirq(SCHEDULE_SOFTIRQ);
> +}
> +
> +static int xsplice_do_wait(atomic_t *counter, s_time_t timeout,
> +                           unsigned int total_cpus, const char *s)
> +{
> +    int rc = 0;
> +
> +    while ( atomic_read(counter) != total_cpus && NOW() < timeout )
> +        cpu_relax();
> +
> +    /* Log & abort. */
> +    if ( atomic_read(counter) != total_cpus )
> +    {
> +        printk(XENLOG_DEBUG "%s: %s %u/%u\n", xsplice_work.data->name,
> +               s, atomic_read(counter), total_cpus);
> +        rc = -EBUSY;
> +        xsplice_work.data->rc = rc;
> +        xsplice_work.do_work = 0;
> +        smp_wmb();
> +        return rc;
> +    }
> +    return rc;
> +}
> +
> +static void xsplice_do_single(unsigned int total_cpus)
> +{
> +    nmi_callback_t saved_nmi_callback;
> +    s_time_t timeout;
> +    struct payload *data, *tmp;
> +    int rc;
> +
> +    data = xsplice_work.data;
> +    timeout = data->timeout ? data->timeout : MILLISECS(30);

The design doc says that a timeout of 0 means infinity.

> +    printk(XENLOG_DEBUG "%s: timeout is %"PRI_stime"ms\n", data->name,
> +           timeout / MILLISECS(1));
> +
> +    timeout += NOW();
> +
> +    if ( xsplice_do_wait(&xsplice_work.semaphore, timeout, total_cpus,
> +                         "Timed out on CPU semaphore") )
> +        return;
> +
> +    /* "Mask" NMIs. */
> +    saved_nmi_callback = set_nmi_callback(mask_nmi_callback);
> +
> +    /* All CPUs are waiting, now signal to disable IRQs. */
> +    xsplice_work.ready = 1;
> +    smp_wmb();
> +
> +    atomic_inc(&xsplice_work.irq_semaphore);
> +    if ( xsplice_do_wait(&xsplice_work.irq_semaphore, timeout, total_cpus,
> +                         "Timed out on IRQ semaphore.") )
> +        return;
> +
> +    local_irq_disable();

As far as I can tell, the mechanics of how this works haven't changed, 
the code has just been reorganized. Which means the points that Martin 
raised about this mechanism are still outstanding.

-- 
Ross Lagerwall

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 11/13] xsplice: Add support for bug frames. (v2)
  2016-01-14 21:47 ` [PATCH v2 11/13] xsplice: Add support for bug frames. (v2) Konrad Rzeszutek Wilk
@ 2016-01-19 14:42   ` Ross Lagerwall
  0 siblings, 0 replies; 45+ messages in thread
From: Ross Lagerwall @ 2016-01-19 14:42 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, xen-devel, mpohlack, andrew.cooper3,
	stefano.stabellini, jbeulich, ian.jackson, ian.campbell,
	wei.liu2, sasha.levin

On 01/14/2016 09:47 PM, Konrad Rzeszutek Wilk wrote:
> From: Ross Lagerwall <ross.lagerwall@citrix.com>
>
> Add support for handling bug frames contained with xsplice modules. If a
> trap occurs search either the kernel bug table or an applied payload's
> bug table depending on the instruction pointer.
>
> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
snip

> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

> diff --git a/xen/common/xsplice.c b/xen/common/xsplice.c
> index 5abeb28..02cb4a8 100644
> --- a/xen/common/xsplice.c
> +++ b/xen/common/xsplice.c
> @@ -43,7 +43,10 @@ struct payload {
>       struct list_head applied_list;       /* Linked to 'applied_list'. */
>       struct xsplice_patch_func *funcs;    /* The array of functions to patch. */
>       unsigned int nfuncs;                 /* Nr of functions to patch. */
> -
> +    size_t core_size;                    /* Only .text size. */
> +    size_t core_text_size;               /* Everything else - .data,.rodata, etc. */

These comments are the wrong way around.

> +    struct bug_frame *start_bug_frames[BUGFRAME_NR]; /* .bug.frame patching. */
> +    struct bug_frame *stop_bug_frames[BUGFRAME_NR];
>       char name[XEN_XSPLICE_NAME_SIZE + 1];/* Name of it. */
>   };
>
> @@ -544,26 +547,27 @@ static void free_payload_data(struct payload *payload)
>       payload->payload_pages = 0;
>   }
>
> -static void calc_section(struct xsplice_elf_sec *sec, size_t *core_size)
> +static void calc_section(struct xsplice_elf_sec *sec, size_t *size)
>   {
> -    size_t align_size = ROUNDUP(*core_size, sec->sec->sh_addralign);
> +    size_t align_size = ROUNDUP(*size, sec->sec->sh_addralign);
>       sec->sec->sh_entsize = align_size;
> -    *core_size = sec->sec->sh_size + align_size;
> +    *size = sec->sec->sh_size + align_size;
>   }
>
>   static int move_payload(struct payload *payload, struct xsplice_elf *elf)
>   {
>       uint8_t *buf;
>       unsigned int i;
> -    size_t core_size = 0;
> +    size_t size = 0;
>
>       /* Compute text regions */
>       for ( i = 0; i < elf->hdr->e_shnum; i++ )
>       {
>           if ( (elf->sec[i].sec->sh_flags & (SHF_ALLOC|SHF_EXECINSTR)) ==
>                (SHF_ALLOC|SHF_EXECINSTR) )
> -            calc_section(&elf->sec[i], &core_size);
> +            calc_section(&elf->sec[i], &size);
>       }
> +    payload->core_text_size = size;
>
>       /* Compute rw data */
>       for ( i = 0; i < elf->hdr->e_shnum; i++ )
> @@ -571,7 +575,7 @@ static int move_payload(struct payload *payload, struct xsplice_elf *elf)
>           if ( (elf->sec[i].sec->sh_flags & SHF_ALLOC) &&
>                !(elf->sec[i].sec->sh_flags & SHF_EXECINSTR) &&
>                (elf->sec[i].sec->sh_flags & SHF_WRITE) )
> -            calc_section(&elf->sec[i], &core_size);
> +            calc_section(&elf->sec[i], &size);
>       }
>
>       /* Compute ro data */
> @@ -580,16 +584,17 @@ static int move_payload(struct payload *payload, struct xsplice_elf *elf)
>           if ( (elf->sec[i].sec->sh_flags & SHF_ALLOC) &&
>                !(elf->sec[i].sec->sh_flags & SHF_EXECINSTR) &&
>                !(elf->sec[i].sec->sh_flags & SHF_WRITE) )
> -            calc_section(&elf->sec[i], &core_size);
> +            calc_section(&elf->sec[i], &size);
>       }
> +    payload->core_size = size;
>
> -    buf = alloc_payload(core_size);
> +    buf = alloc_payload(size);
>       if ( !buf ) {
>           printk(XENLOG_ERR "%s: Could not allocate memory for module\n",
>                  elf->name);
>           return -ENOMEM;
>       }
> -    memset(buf, 0, core_size);
> +    memset(buf, 0, size);
>
>       for ( i = 0; i < elf->hdr->e_shnum; i++ )
>       {
> @@ -604,7 +609,7 @@ static int move_payload(struct payload *payload, struct xsplice_elf *elf)
>       }
>
>       payload->payload_address = buf;
> -    payload->payload_pages = PFN_UP(core_size);
> +    payload->payload_pages = PFN_UP(size);

These renames should be folded into the originating patch (patch 8) or 
dropped.

>
>       return 0;
>   }
> @@ -647,6 +652,22 @@ static int find_special_sections(struct payload *payload,
>               if ( f->pad[j] )
>                   return -EINVAL;
>       }
> +    for ( i = 0; i < BUGFRAME_NR; i++ )
> +    {
> +        char str[14];
> +
> +        snprintf(str, sizeof str, ".bug_frames.%d", i);
> +        sec = xsplice_elf_sec_by_name(elf, str);
> +        if ( !sec )
> +            continue;
> +
> +        if ( ( !sec->sec->sh_size ) ||
> +             ( sec->sec->sh_size % sizeof (struct bug_frame) ) )
> +            return -EINVAL;
> +
> +        payload->start_bug_frames[i] = (struct bug_frame *)sec->load_addr;
> +        payload->stop_bug_frames[i] = (struct bug_frame *)(sec->load_addr + sec->sec->sh_size);
> +    }
>       return 0;
>   }
>
> @@ -942,6 +963,72 @@ void do_xsplice(void)
>       }
>   }
>
> +
> +/*
> + * Functions for handling special sections.
> + */
> +struct bug_frame *xsplice_find_bug(const char *eip, int *id)
> +{
> +    struct payload *data;
> +    struct bug_frame *bug;
> +    int i;
> +
> +    /* No locking since this list is only ever changed during apply or revert
> +     * context. */
> +    list_for_each_entry ( data, &applied_list, applied_list )
> +    {
> +        for (i = 0; i < 4; i++) {

BUGFRAME_NR

> +            if (!data->start_bug_frames[i])
> +                continue;
> +            if ( !((void *)eip >= data->payload_address &&
> +                   (void *)eip < (data->payload_address + data->core_text_size)))
> +                continue;
> +
> +            for ( bug = data->start_bug_frames[i]; bug != data->stop_bug_frames[i]; ++bug ) {
> +                if ( bug_loc(bug) == eip )
> +                {
> +                    *id = i;
> +                    return bug;
> +                }
> +            }
> +        }
> +    }
> +
> +    return NULL;
> +}
> +

-- 
Ross Lagerwall

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 10/13] xen_hello_world.xsplice: Test payload for patching 'xen_extra_version'.
  2016-01-14 21:47 ` [PATCH v2 10/13] xen_hello_world.xsplice: Test payload for patching 'xen_extra_version' Konrad Rzeszutek Wilk
  2016-01-19 11:14   ` Wei Liu
@ 2016-01-19 14:57   ` Ross Lagerwall
  2016-01-19 16:47   ` Ross Lagerwall
  2 siblings, 0 replies; 45+ messages in thread
From: Ross Lagerwall @ 2016-01-19 14:57 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, xen-devel, mpohlack, andrew.cooper3,
	stefano.stabellini, jbeulich, ian.jackson, ian.campbell,
	wei.liu2, sasha.levin

On 01/14/2016 09:47 PM, Konrad Rzeszutek Wilk wrote:
> This change demonstrates how to generate an xSplice ELF payload.
>
> The idea here is that we want to patch in the hypervisor
> the 'xen_version_extra' function with an function that will
> return 'Hello World'. The 'xl info | grep extraversion'
> will reflect the new value after the patching.
>
snip
> +### Example
> +
> +A simple example of what a payload file can be:
> +
> +<pre>
> +/* MUST be in sync with hypervisor. */
> +struct xsplice_patch_func {
> +    const char *name;
> +    unsigned long new_addr;
> +    const unsigned long old_addr;
> +    uint32_t new_size;
> +    const uint32_t old_size;
> +    uint8_t pad[32];
> +};
> +
> +/* Our replacement function for xen_extra_version. */
> +const char *xen_hello_world(void)
> +{
> +    return "Hello World";
> +}
> +
> +struct xsplice_patch_func xsplice_hello_world = {
> +    .name = "xen_extra_version",
> +    .new_addr = &xen_hello_world,
> +    .old_addr = 0xffff82d08013963c, /* Extracted from xen-syms. */
> +    .new_size = 13, /* To be be computed by scripts. */
> +    .old_size = 13, /* -----------""---------------  */
> +};
> +</pre>
> +
> +With the linker script as follow to change the `xsplice_hello_world`
> +do be `.xsplice.funcs` :
> +
> +<pre>
> +OUTPUT_FORMAT("elf64-x86-64", "elf64-x86-64", "elf64-x86-64")
> +OUTPUT_ARCH(i386:x86-64)
> +ENTRY(xsplice_hello_world)
> +SECTIONS
> +{
> +    /* The hypervisor expects ".xsplice.func", so change
> +     * the ".data.xsplice_hello_world" to it. */
> +
> +    .xsplice.funcs : { *(*.xsplice_hello_world) }
> +    }
> +}
> +</pre>

You should be able to use __attribute__((__section__(".xsplice.funcs"))) 
on the structure to avoid needing to use a linker script.

> +
> +Code must be compiled with -fPIC.
> +
>   ## Hypercalls
>
>   We will employ the sub operations of the system management hypercall (sysctl).
> diff --git a/tools/misc/Makefile b/tools/misc/Makefile
> index c46873e..8385830 100644
> --- a/tools/misc/Makefile
> +++ b/tools/misc/Makefile
> @@ -36,6 +36,10 @@ INSTALL_SBIN += $(INSTALL_SBIN-y)
>   # Everything to be installed in a private bin/
>   INSTALL_PRIVBIN                += xenpvnetboot
>
> +# We need the hypervisor - and only 64-bit builds have it.
> +ifeq ($(XEN_COMPILE_ARCH),x86_64)
> +INSTALL_PRIVBIN                += xen_hello_world.xsplice
> +endif
>   # Everything to be installed
>   TARGETS_ALL := $(INSTALL_BIN) $(INSTALL_SBIN) $(INSTALL_PRIVBIN)
>
> @@ -49,7 +53,7 @@ TARGETS_COPY += xenpvnetboot
>   # Everything which needs to be built
>   TARGETS_BUILD := $(filter-out $(TARGETS_COPY),$(TARGETS_ALL))
>
> -.PHONY: all build
> +.PHONY: all build xsplice
>   all build: $(TARGETS_BUILD)
>
>   .PHONY: install
> @@ -111,4 +115,23 @@ gtraceview: gtraceview.o
>   xencov: xencov.o
>   	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
>
> +.PHONY: xsplice
> +xsplice:
> +ifeq ($(XEN_COMPILE_ARCH),x86_64)
> +	# We MUST regenerate the file everytime we build - in case the hypervisor
> +	# is rebuilt too.
> +	$(RM) *.xplice
> +	$(MAKE) xen_hello_world.xsplice

Can't you depend on xen-syms to avoid recompiling this every time.

> +endif
> +
> +XEN_EXTRA_VERSION_ADDR=$(shell nm --defined $(XEN_ROOT)/xen/xen-syms | grep xen_extra_version | awk '{print "0x"$$1}')
> +
> +xen_hello_world.xsplice: xen_hello_world.c
> +	$(CC) -DOLD_CODE=$(XEN_EXTRA_VERSION_ADDR) -I$(XEN_ROOT)/tools/include \
> +		-fPIC -Wl,--emit-relocs \
> +		-Wl,-r -Wl,--entry=xsplice_hello_world \
> +		-fdata-sections -ffunction-sections \
> +		-nostdlib -Txsplice.lds \
> +		-o $@ $<
> +	@objdump -x --section=.xsplice.funcs $@

If you use __attribute__((__section__(".xsplice.funcs"))) on the struct, 
you can drop the custom linker script and simplify the command-line to 
something like:
$(CC) -DOLD_CODE=$(XEN_EXTRA_VERSION_ADDR) -I$(XEN_ROOT)/tools/include \
	-c -o $@ $< $(CFLAGS)

Having mostly the same CFLAGS that Xen uses is important because it 
contains things like -mno-red-zone, -fno-asynchronous-unwind-tables, and 
-mno-sse, etc which affect the way the code is compiled.

-- 
Ross Lagerwall

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 08/13] xsplice: Implement payload loading (v2)
  2016-01-14 21:47 ` [PATCH v2 08/13] xsplice: Implement payload loading (v2) Konrad Rzeszutek Wilk
  2016-01-19 14:34   ` Ross Lagerwall
@ 2016-01-19 16:45   ` Ross Lagerwall
  1 sibling, 0 replies; 45+ messages in thread
From: Ross Lagerwall @ 2016-01-19 16:45 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, xen-devel, mpohlack, andrew.cooper3,
	stefano.stabellini, jbeulich, ian.jackson, ian.campbell,
	wei.liu2, sasha.levin

On 01/14/2016 09:47 PM, Konrad Rzeszutek Wilk wrote:
snip
> +static int move_payload(struct payload *payload, struct xsplice_elf *elf)
> +{
> +    uint8_t *buf;
> +    unsigned int i;
> +    size_t core_size = 0;
> +
> +    /* Compute text regions */
> +    for ( i = 0; i < elf->hdr->e_shnum; i++ )
> +    {
> +        if ( (elf->sec[i].sec->sh_flags & (SHF_ALLOC|SHF_EXECINSTR)) ==
> +             (SHF_ALLOC|SHF_EXECINSTR) )
> +            calc_section(&elf->sec[i], &core_size);
> +    }
> +
> +    /* Compute rw data */
> +    for ( i = 0; i < elf->hdr->e_shnum; i++ )
> +    {
> +        if ( (elf->sec[i].sec->sh_flags & SHF_ALLOC) &&
> +             !(elf->sec[i].sec->sh_flags & SHF_EXECINSTR) &&
> +             (elf->sec[i].sec->sh_flags & SHF_WRITE) )
> +            calc_section(&elf->sec[i], &core_size);
> +    }
> +
> +    /* Compute ro data */
> +    for ( i = 0; i < elf->hdr->e_shnum; i++ )
> +    {
> +        if ( (elf->sec[i].sec->sh_flags & SHF_ALLOC) &&
> +             !(elf->sec[i].sec->sh_flags & SHF_EXECINSTR) &&
> +             !(elf->sec[i].sec->sh_flags & SHF_WRITE) )
> +            calc_section(&elf->sec[i], &core_size);
> +    }
> +
> +    buf = alloc_payload(core_size);
> +    if ( !buf ) {
> +        printk(XENLOG_ERR "%s: Could not allocate memory for module\n",
> +               elf->name);
> +        return -ENOMEM;
> +    }
> +    memset(buf, 0, core_size);
> +
> +    for ( i = 0; i < elf->hdr->e_shnum; i++ )
> +    {
> +        if ( elf->sec[i].sec->sh_flags & SHF_ALLOC )
> +        {
> +            elf->sec[i].load_addr = buf + elf->sec[i].sec->sh_entsize;
> +            memcpy(elf->sec[i].load_addr, elf->sec[i].data,
> +                   elf->sec[i].sec->sh_size);
> +            printk(XENLOG_DEBUG "%s: Loaded %s at 0x%p\n",
> +                   elf->name, elf->sec[i].name, elf->sec[i].load_addr);
> +        }
> +    }

I found this bug a while back but didn't get round to pushing it anywhere.

8-<------------------------------------------------
commit 72803a4c765026c54f31988a4c689048c8723575
Author: Ross Lagerwall <ross.lagerwall@citrix.com>
Date:   Fri Nov 6 12:48:39 2015 +0000

     Don't copy NOBITS sections (fixes BSS initialization)

diff --git a/xen/common/xsplice.c b/xen/common/xsplice.c
index 9450b2a..799ccb5 100644
--- a/xen/common/xsplice.c
+++ b/xen/common/xsplice.c
@@ -600,8 +600,9 @@ static int move_module(struct payload *payload, 
struct xsplice_elf *elf)
          if ( elf->sec[i].sec->sh_flags & SHF_ALLOC )
          {
              elf->sec[i].load_addr = buf + elf->sec[i].sec->sh_entsize;
-            memcpy(elf->sec[i].load_addr, elf->sec[i].data,
-                   elf->sec[i].sec->sh_size);
+            if ( elf->sec[i].sec->sh_type != SHT_NOBITS )
+                memcpy(elf->sec[i].load_addr, elf->sec[i].data,
+                       elf->sec[i].sec->sh_size);
              printk(XENLOG_DEBUG "Loaded %s at 0x%p\n",
                     elf->sec[i].name, elf->sec[i].load_addr);
          }

-- 
Ross Lagerwall

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 10/13] xen_hello_world.xsplice: Test payload for patching 'xen_extra_version'.
  2016-01-14 21:47 ` [PATCH v2 10/13] xen_hello_world.xsplice: Test payload for patching 'xen_extra_version' Konrad Rzeszutek Wilk
  2016-01-19 11:14   ` Wei Liu
  2016-01-19 14:57   ` Ross Lagerwall
@ 2016-01-19 16:47   ` Ross Lagerwall
  2 siblings, 0 replies; 45+ messages in thread
From: Ross Lagerwall @ 2016-01-19 16:47 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, xen-devel, mpohlack, andrew.cooper3,
	stefano.stabellini, jbeulich, ian.jackson, ian.campbell,
	wei.liu2, sasha.levin

On 01/14/2016 09:47 PM, Konrad Rzeszutek Wilk wrote:
snip
> diff --git a/tools/misc/xen_hello_world.c b/tools/misc/xen_hello_world.c
> new file mode 100644
> index 0000000..8c24d8f
> --- /dev/null
> +++ b/tools/misc/xen_hello_world.c
> @@ -0,0 +1,15 @@
> +#include "xsplice.h"
> +
> +/* Our replacement function for xen_extra_version. */
> +const char *xen_hello_world(void)
> +{
> +    return "Hello World";
> +}
> +
> +struct xsplice_patch_func xsplice_hello_world = {
> +    .name = "xen_extra_version",
> +    .new_addr = &xen_hello_world,

This line introduces a warning:
xen_hello_world.c:11:17: warning: initialization makes integer from 
pointer without a cast [-Wint-conversion]
      .new_addr = &xen_hello_world,
                  ^
xen_hello_world.c:11:17: note: (near initialization for 
‘xsplice_hello_world.new_addr’)

-- 
Ross Lagerwall

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 09/13] xsplice: Implement support for applying/reverting/replacing patches. (v2)
  2016-01-19 14:39   ` Ross Lagerwall
@ 2016-01-19 16:55     ` Konrad Rzeszutek Wilk
  2016-01-25 11:43       ` Ross Lagerwall
  0 siblings, 1 reply; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-19 16:55 UTC (permalink / raw)
  To: Ross Lagerwall
  Cc: wei.liu2, ian.campbell, andrew.cooper3, ian.jackson, mpohlack,
	stefano.stabellini, jbeulich, sasha.levin, xen-devel

On Tue, Jan 19, 2016 at 02:39:40PM +0000, Ross Lagerwall wrote:
> On 01/14/2016 09:47 PM, Konrad Rzeszutek Wilk wrote:
> >From: Ross Lagerwall <ross.lagerwall@citrix.com>
> >
> >Implement support for the apply, revert and replace actions.
> >
> snip
> >+#include <xen/cpu.h>
> >  #include <xen/guest_access.h>
> >  #include <xen/keyhandler.h>
> >  #include <xen/lib.h>
> >@@ -10,25 +11,38 @@
> >  #include <xen/mm.h>
> >  #include <xen/sched.h>
> >  #include <xen/smp.h>
> >+#include <xen/softirq.h>
> >  #include <xen/spinlock.h>
> >+#include <xen/wait.h>
> >  #include <xen/xsplice_elf.h>
> >  #include <xen/xsplice.h>
> >
> >  #include <asm/event.h>
> >+#include <asm/nmi.h>
> >  #include <public/sysctl.h>
> >
> >-static DEFINE_SPINLOCK(payload_list_lock);
> >+/*
> >+ * Protects against payload_list operations and also allows only one
> >+ * caller in schedule_work.
> >+ */
> >+static DEFINE_SPINLOCK(payload_lock);
> 
> I think it would be cleaner if all the payload_list_lock changes were folded
> into Patch 3.

Good idea.
> 
> >  static LIST_HEAD(payload_list);
> >
> >+static LIST_HEAD(applied_list);
> >+
> >  static unsigned int payload_cnt;
> >  static unsigned int payload_version = 1;
> >
> >  struct payload {
> >      int32_t state;                       /* One of the XSPLICE_STATE_*. */
> >      int32_t rc;                          /* 0 or -XEN_EXX. */
> >+    uint32_t timeout;                    /* Timeout to do the operation. */
> 
> This should go into struct xsplice_work.

/me nods.
> 
> >      struct list_head list;               /* Linked to 'payload_list'. */
> >      void *payload_address;               /* Virtual address mapped. */
> >      size_t payload_pages;                /* Nr of the pages. */
> >+    struct list_head applied_list;       /* Linked to 'applied_list'. */
> >+    struct xsplice_patch_func *funcs;    /* The array of functions to patch. */
> >+    unsigned int nfuncs;                 /* Nr of functions to patch. */
> >
> >      char name[XEN_XSPLICE_NAME_SIZE + 1];/* Name of it. */
> >  };
> >@@ -36,6 +50,23 @@ struct payload {
> >  static int load_payload_data(struct payload *payload, uint8_t *raw, ssize_t len);
> >  static void free_payload_data(struct payload *payload);
> >
> >+/* Defines an outstanding patching action. */
> >+struct xsplice_work
> >+{
> >+    atomic_t semaphore;          /* Used for rendezvous. First to grab it will
> >+                                    do the patching. */
> >+    atomic_t irq_semaphore;      /* Used to signal all IRQs disabled. */
> >+    struct payload *data;        /* The payload on which to act. */
> >+    volatile bool_t do_work;     /* Signals work to do. */
> >+    volatile bool_t ready;       /* Signals all CPUs synchronized. */
> >+    uint32_t cmd;                /* Action request: XSPLICE_ACTION_* */
> >+};
> >+
> >+/* There can be only one outstanding patching action. */
> >+static struct xsplice_work xsplice_work;
> >+
> >+static int schedule_work(struct payload *data, uint32_t cmd);
> >+
> snip
> >+
> >+/*
> >+ * This function is executed having all other CPUs with no stack (we may
> >+ * have cpu_idle on it) and IRQs disabled.
> >+ */
> >+static int revert_payload(struct payload *data)
> >+{
> >+    unsigned int i;
> >+
> >+    printk(XENLOG_DEBUG "%s: Reverting.\n", data->name);
> >+
> >+    for ( i = 0; i < data->nfuncs; i++ )
> >+        xsplice_revert_jmp(data->funcs + i);
> >+
> >+    list_del(&data->applied_list);
> >+
> >+    return 0;
> >+}
> >+
> >+/* Must be holding the payload_list lock. */
> 
> payload lock?
> 
> >+static int schedule_work(struct payload *data, uint32_t cmd)
> >+{
> >+    /* Fail if an operation is already scheduled. */
> >+    if ( xsplice_work.do_work )
> >+        return -EAGAIN;
> 
> Hmm, I don't think EAGAIN is correct. It will cause xen-xsplice to poll for
> a status update, but the operation hasn't actually been submitted.

-EBUSY -EDEADLK ?
> 
> >+
> >+    xsplice_work.cmd = cmd;
> >+    xsplice_work.data = data;
> >+    atomic_set(&xsplice_work.semaphore, -1);
> >+    atomic_set(&xsplice_work.irq_semaphore, -1);
> >+
> >+    xsplice_work.ready = 0;
> >+    smp_wmb();
> >+    xsplice_work.do_work = 1;
> >+    smp_wmb();
> >+
> >+    return 0;
> >+}
> >+
> >+/*
> >+ * Note that because of this NOP code the do_nmi is not safely patchable.
> >+ * Also if we do receive 'real' NMIs we have lost them.
> >+ */
> >+static int mask_nmi_callback(const struct cpu_user_regs *regs, int cpu)
> >+{
> >+    return 1;
> >+}
> >+
> >+static void reschedule_fn(void *unused)
> >+{
> >+    smp_mb(); /* Synchronize with setting do_work */
> >+    raise_softirq(SCHEDULE_SOFTIRQ);
> >+}
> >+
> >+static int xsplice_do_wait(atomic_t *counter, s_time_t timeout,
> >+                           unsigned int total_cpus, const char *s)
> >+{
> >+    int rc = 0;
> >+
> >+    while ( atomic_read(counter) != total_cpus && NOW() < timeout )
> >+        cpu_relax();
> >+
> >+    /* Log & abort. */
> >+    if ( atomic_read(counter) != total_cpus )
> >+    {
> >+        printk(XENLOG_DEBUG "%s: %s %u/%u\n", xsplice_work.data->name,
> >+               s, atomic_read(counter), total_cpus);
> >+        rc = -EBUSY;
> >+        xsplice_work.data->rc = rc;
> >+        xsplice_work.do_work = 0;
> >+        smp_wmb();
> >+        return rc;
> >+    }
> >+    return rc;
> >+}
> >+
> >+static void xsplice_do_single(unsigned int total_cpus)
> >+{
> >+    nmi_callback_t saved_nmi_callback;
> >+    s_time_t timeout;
> >+    struct payload *data, *tmp;
> >+    int rc;
> >+
> >+    data = xsplice_work.data;
> >+    timeout = data->timeout ? data->timeout : MILLISECS(30);
> 
> The design doc says that a timeout of 0 means infinity.

True. Lets update the document.
> 
> >+    printk(XENLOG_DEBUG "%s: timeout is %"PRI_stime"ms\n", data->name,
> >+           timeout / MILLISECS(1));
> >+
> >+    timeout += NOW();
> >+
> >+    if ( xsplice_do_wait(&xsplice_work.semaphore, timeout, total_cpus,
> >+                         "Timed out on CPU semaphore") )
> >+        return;
> >+
> >+    /* "Mask" NMIs. */
> >+    saved_nmi_callback = set_nmi_callback(mask_nmi_callback);
> >+
> >+    /* All CPUs are waiting, now signal to disable IRQs. */
> >+    xsplice_work.ready = 1;
> >+    smp_wmb();
> >+
> >+    atomic_inc(&xsplice_work.irq_semaphore);
> >+    if ( xsplice_do_wait(&xsplice_work.irq_semaphore, timeout, total_cpus,
> >+                         "Timed out on IRQ semaphore.") )
> >+        return;
> >+
> >+    local_irq_disable();
> 
> As far as I can tell, the mechanics of how this works haven't changed, the
> code has just been reorganized. Which means the points that Martin raised
> about this mechanism are still outstanding.

A bit. I added the extra timeout on both of the 'spin-around' and also
moved some of the barriers around. Also removed your spin-lock and used
the atomic_t mechanism to synchronize.

But the one thing that I didn't do was the spin on the 'workers?' that
are just spinnig idly. They will do that forever if say the 'master'
hasn't gone to the IRQ semaphore part.

My thinking was that the 'workers' could also use the timeout feature
but just multiple it by two?

> 
> -- 
> Ross Lagerwall

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 08/13] xsplice: Implement payload loading (v2)
  2016-01-19 14:34   ` Ross Lagerwall
@ 2016-01-19 16:59     ` Konrad Rzeszutek Wilk
  2016-01-25 11:21       ` Ross Lagerwall
  0 siblings, 1 reply; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-01-19 16:59 UTC (permalink / raw)
  To: Ross Lagerwall
  Cc: wei.liu2, ian.campbell, andrew.cooper3, ian.jackson, mpohlack,
	stefano.stabellini, jbeulich, sasha.levin, xen-devel

> >+static void find_hole(ssize_t pages, unsigned long *hole_start,
> >+                      unsigned long *hole_end)
> >+{
> >+    struct payload *data, *data2;
> >+
> >+    spin_lock(&payload_list_lock);
> >+    list_for_each_entry ( data, &payload_list, list )
> >+    {
> >+        list_for_each_entry ( data2, &payload_list, list )
> >+        {
> >+            unsigned long start, end;
> >+
> >+            start = (unsigned long)data2->payload_address;
> >+            end = start + data2->payload_pages * PAGE_SIZE;
> >+            if ( *hole_end > start && *hole_start < end )
> >+            {
> >+                *hole_start = end;
> >+                *hole_end = *hole_start + pages * PAGE_SIZE;
> >+                break;
> >+            }
> >+        }
> >+        if ( &data2->list == &payload_list )
> >+            break;
> >+    }
> >+    spin_unlock(&payload_list_lock);
> >+}
> 
> This function above should go down into the CONFIG_X86 section below.

Odd. I have it in my tree. Ah right I - I had the patch not committed in. <sigh>
.. snip..
> >+int xsplice_elf_resolve_symbols(struct xsplice_elf *elf)
.. snip..
> >+int xsplice_elf_perform_relocs(struct xsplice_elf *elf)
.. snip..
> 
> Is there a reason the above two functions weren't put in the previous patch?

Historical. I will move them there. Thanks!
> 
> >diff --git a/xen/include/asm-arm/config.h b/xen/include/asm-arm/config.h
> >index bd832df..4ea66bf 100644
> >--- a/xen/include/asm-arm/config.h
> >+++ b/xen/include/asm-arm/config.h
> >@@ -15,8 +15,10 @@
> >
> >  #if defined(CONFIG_ARM_64)
> >  # define LONG_BYTEORDER 3
> >+# define ELFSIZE 64
> >  #else
> >  # define LONG_BYTEORDER 2
> >+# define ELFSIZE 32
> >  #endif
> 
> What does this do?

Make Elf_Note and all the ELf_* macros actually work.
> 
> (And perhaps it should also be in the previous patch since it's mentioned in
> the previous patch's changelog?)

I kind of lost where it was added. 

I could spin it out as a seperate patch - or make it part of the previous
patch? Thoughts?
> 
> -- 
> Ross Lagerwall

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 08/13] xsplice: Implement payload loading (v2)
  2016-01-19 16:59     ` Konrad Rzeszutek Wilk
@ 2016-01-25 11:21       ` Ross Lagerwall
  0 siblings, 0 replies; 45+ messages in thread
From: Ross Lagerwall @ 2016-01-25 11:21 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel

On 01/19/2016 04:59 PM, Konrad Rzeszutek Wilk wrote:
snip
>>
>> (And perhaps it should also be in the previous patch since it's mentioned in
>> the previous patch's changelog?)
>
> I kind of lost where it was added.
>
> I could spin it out as a seperate patch - or make it part of the previous
> patch? Thoughts?
>>

I think a separate patch would be better.

-- 
Ross Lagerwall

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 09/13] xsplice: Implement support for applying/reverting/replacing patches. (v2)
  2016-01-19 16:55     ` Konrad Rzeszutek Wilk
@ 2016-01-25 11:43       ` Ross Lagerwall
  2016-02-05 19:30         ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 45+ messages in thread
From: Ross Lagerwall @ 2016-01-25 11:43 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: wei.liu2, ian.campbell, andrew.cooper3, ian.jackson, mpohlack,
	stefano.stabellini, jbeulich, sasha.levin, xen-devel

On 01/19/2016 04:55 PM, Konrad Rzeszutek Wilk wrote:
snip
>>> +/* Must be holding the payload_list lock. */
>>
>> payload lock?
>>
>>> +static int schedule_work(struct payload *data, uint32_t cmd)
>>> +{
>>> +    /* Fail if an operation is already scheduled. */
>>> +    if ( xsplice_work.do_work )
>>> +        return -EAGAIN;
>>
>> Hmm, I don't think EAGAIN is correct. It will cause xen-xsplice to poll for
>> a status update, but the operation hasn't actually been submitted.
>
> -EBUSY -EDEADLK ?

I would choose -EBUSY.

>>
>>> +
>>> +    xsplice_work.cmd = cmd;
>>> +    xsplice_work.data = data;
>>> +    atomic_set(&xsplice_work.semaphore, -1);
>>> +    atomic_set(&xsplice_work.irq_semaphore, -1);
>>> +
>>> +    xsplice_work.ready = 0;
>>> +    smp_wmb();
>>> +    xsplice_work.do_work = 1;
>>> +    smp_wmb();
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +/*
>>> + * Note that because of this NOP code the do_nmi is not safely patchable.
>>> + * Also if we do receive 'real' NMIs we have lost them.
>>> + */
>>> +static int mask_nmi_callback(const struct cpu_user_regs *regs, int cpu)
>>> +{
>>> +    return 1;
>>> +}
>>> +
>>> +static void reschedule_fn(void *unused)
>>> +{
>>> +    smp_mb(); /* Synchronize with setting do_work */
>>> +    raise_softirq(SCHEDULE_SOFTIRQ);
>>> +}
>>> +
>>> +static int xsplice_do_wait(atomic_t *counter, s_time_t timeout,
>>> +                           unsigned int total_cpus, const char *s)
>>> +{
>>> +    int rc = 0;
>>> +
>>> +    while ( atomic_read(counter) != total_cpus && NOW() < timeout )
>>> +        cpu_relax();
>>> +
>>> +    /* Log & abort. */
>>> +    if ( atomic_read(counter) != total_cpus )
>>> +    {
>>> +        printk(XENLOG_DEBUG "%s: %s %u/%u\n", xsplice_work.data->name,
>>> +               s, atomic_read(counter), total_cpus);
>>> +        rc = -EBUSY;
>>> +        xsplice_work.data->rc = rc;
>>> +        xsplice_work.do_work = 0;
>>> +        smp_wmb();
>>> +        return rc;
>>> +    }
>>> +    return rc;
>>> +}
>>> +
>>> +static void xsplice_do_single(unsigned int total_cpus)
>>> +{
>>> +    nmi_callback_t saved_nmi_callback;
>>> +    s_time_t timeout;
>>> +    struct payload *data, *tmp;
>>> +    int rc;
>>> +
>>> +    data = xsplice_work.data;
>>> +    timeout = data->timeout ? data->timeout : MILLISECS(30);
>>
>> The design doc says that a timeout of 0 means infinity.
>
> True. Lets update the document.
>>
>>> +    printk(XENLOG_DEBUG "%s: timeout is %"PRI_stime"ms\n", data->name,
>>> +           timeout / MILLISECS(1));
>>> +
>>> +    timeout += NOW();
>>> +
>>> +    if ( xsplice_do_wait(&xsplice_work.semaphore, timeout, total_cpus,
>>> +                         "Timed out on CPU semaphore") )
>>> +        return;
>>> +
>>> +    /* "Mask" NMIs. */
>>> +    saved_nmi_callback = set_nmi_callback(mask_nmi_callback);
>>> +
>>> +    /* All CPUs are waiting, now signal to disable IRQs. */
>>> +    xsplice_work.ready = 1;
>>> +    smp_wmb();
>>> +
>>> +    atomic_inc(&xsplice_work.irq_semaphore);
>>> +    if ( xsplice_do_wait(&xsplice_work.irq_semaphore, timeout, total_cpus,
>>> +                         "Timed out on IRQ semaphore.") )
>>> +        return;
>>> +
>>> +    local_irq_disable();
>>
>> As far as I can tell, the mechanics of how this works haven't changed, the
>> code has just been reorganized. Which means the points that Martin raised
>> about this mechanism are still outstanding.
>
> A bit. I added the extra timeout on both of the 'spin-around' and also
> moved some of the barriers around. Also removed your spin-lock and used
> the atomic_t mechanism to synchronize.
>
> But the one thing that I didn't do was the spin on the 'workers?' that
> are just spinnig idly. They will do that forever if say the 'master'
> hasn't gone to the IRQ semaphore part.
>
> My thinking was that the 'workers' could also use the timeout feature
> but just multiple it by two?
>

After looking at this again, I remembered that the algorithm I used is 
the same as the one used by stop_machine_run(). That function runs 
without timeouts at all (seemingly without problems), so why shouldn't 
this one? (The only reason stop_machine_run() itself isn't used for 
patching is because we need to enter the function without a stack, i.e. 
not from a tasklet.)

-- 
Ross Lagerwall

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2] xSplice v1 implementation.
  2016-01-15 16:58 ` [PATCH v2] xSplice v1 implementation Konrad Rzeszutek Wilk
@ 2016-01-25 11:57   ` Ross Lagerwall
  0 siblings, 0 replies; 45+ messages in thread
From: Ross Lagerwall @ 2016-01-25 11:57 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel

On 01/15/2016 04:58 PM, Konrad Rzeszutek Wilk wrote:
>> Or you can use git://github.com/rosslagerwall/xsplice-build.git tool
>> (it will need an extra patch, will send that shortly) - which
>> generates the ELF payloads.
>>
>> This link has a nice description of how to use the tool:
>> http://lists.xenproject.org/archives/html/xen-devel/2015-10/msg02595.html
>
> Attached.
>

Thanks. I've applied it with a couple of changes:
https://github.com/rosslagerwall/xsplice-build/commit/25d7b7d6c96c1ab44345cbfd62425f4672714a53

Firstly, reorganizing the struct requires the relocations to be 
calculated differently.
Secondly, I dropped the change to lookup.h to reduce the delta between 
xsplice-build and kpatch-build.

I've done a little testing and patch modules built with the tool can now 
be applied correctly to a build of your xsplice.v2 branch.

-- 
Ross Lagerwall

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 01/13] xsplice: Design document (v5).
  2016-01-14 21:46 ` [PATCH v2 01/13] xsplice: Design document (v5) Konrad Rzeszutek Wilk
  2016-01-19 11:14   ` Wei Liu
  2016-01-19 14:31   ` Ross Lagerwall
@ 2016-02-05 15:25   ` Jan Beulich
  2016-02-05 21:47     ` Konrad Rzeszutek Wilk
  2 siblings, 1 reply; 45+ messages in thread
From: Jan Beulich @ 2016-02-05 15:25 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: wei.liu2, ian.campbell, andrew.cooper3, ian.jackson, mpohlack,
	ross.lagerwall, stefano.stabellini, sasha.levin, xen-devel

>>> On 14.01.16 at 22:46, <konrad.wilk@oracle.com> wrote:
> +## Patching code
> +
> +The first mechanism to patch that comes in mind is in-place replacement.
> +That is replace the affected code with new code. Unfortunately the x86
> +ISA is variable size which places limits on how much space we have available
> +to replace the instructions. That is not a problem if the change is smaller
> +than the original opcode and we can fill it with nops. Problems will
> +appear if the replacement code is longer.
> +
> +The second mechanism is by replacing the call or jump to the
> +old function with the address of the new function.
> +
> +A third mechanism is to add a jump to the new function at the
> +start of the old function. N.B. The Xen hypervisor implements the third
> +mechanism.

Are we, btw, convinced that all functions will be at least 5 bytes
long? Granted it's not very likely for a smaller function to be buggy
and needing patching, but you never know... (Ah, just found a
respective note at the very end.)

> +### Example of trampoline and in-place splicing
> +
> +As example we will assume the hypervisor does not have XSA-132 (see
> +*domctl/sysctl: don't leak hypervisor stack to toolstacks*
> +4ff3449f0e9d175ceb9551d3f2aecb59273f639d) and we would like to binary patch
> +the hypervisor with it. The original code looks as so:
> +
> +<pre>
> +   48 89 e0                  mov    %rsp,%rax  
> +   48 25 00 80 ff ff         and    $0xffffffffffff8000,%rax  
> +</pre>
> +
> +while the new patched hypervisor would be:
> +
> +<pre>
> +   48 c7 45 b8 00 00 00 00   movq   $0x0,-0x48(%rbp)  
> +   48 c7 45 c0 00 00 00 00   movq   $0x0,-0x40(%rbp)  
> +   48 c7 45 c8 00 00 00 00   movq   $0x0,-0x38(%rbp)  
> +   48 89 e0                  mov    %rsp,%rax  
> +   48 25 00 80 ff ff         and    $0xffffffffffff8000,%rax  
> +</pre>
> +
> +This is inside the arch_do_domctl. This new change adds 21 extra
> +bytes of code which alters all the offsets inside the function. To alter
> +these offsets and add the extra 21 bytes of code we might not have enough
> +space in .text to squeeze this in.
> +
> +As such we could simplify this problem by only patching the site
> +which calls arch_do_domctl:
> +
> +<pre>
> +<do_domctl>:  
> + e8 4b b1 05 00          callq  ffff82d08015fbb9 <arch_do_domctl>  
> +</pre>
> +
> +with a new address for where the new `arch_do_domctl` would be (this
> +area would be allocated dynamically).
> +
> +Astute readers will wonder what we need to do if we were to patch 
> `do_domctl`
> +- which is not called directly by hypervisor but on behalf of the guests via
> +the `compat_hypercall_table` and `hypercall_table`.
> +Patching the offset in `hypercall_table` for `do_domctl:
> +(ffff82d080103079 <do_domctl>:)
> +<pre>
> +
> + ffff82d08024d490:   79 30  
> + ffff82d08024d492:   10 80 d0 82 ff ff   
> +
> +</pre>
> +with the new address where the new `do_domctl` is possible. The other
> +place where it is used is in `hvm_hypercall64_table` which would need
> +to be patched in a similar way. This would require an in-place splicing
> +of the new virtual address of `arch_do_domctl`.
> +
> +In summary this example patched the callee of the affected function by
> + * allocating memory for the new code to live in,
> + * changing the virtual address in all the functions which called the old
> +   code (computing the new offset, patching the callq with a new callq).
> + * changing the function pointer tables with the new virtual address of
> +   the function (splicing in the new virtual address). Since this table
> +   resides in the .rodata section we would need to temporarily change the
> +   page table permissions during this part.
> +
> +
> +However it has severe drawbacks - the safety checks which have to make sure
> +the function is not on the stack - must also check every caller. For some
> +patches this could mean - if there were an sufficient large amount of
> +callers - that we would never be able to apply the update.

While this is an issue, didn't we settle on doing the patching without
any deep call stacks only? Also why would that problem not apply to
to trampoline example right below this section (after all you can't
just go and patch a multi-byte instruction without making sure no
CPU is about to execute that code).

> +As such having the payload in an ELF file is the sensible way. We would be
> +carrying the various sets of structures (and data) in the ELF sections under
> +different names and with definitions. The prefix for the ELF section name
> +would always be: *.xsplice* to match up to the names of the structures.

Note that the use of * here is confusing - do you mean them to
represent quotes (matching up with this supposedly being a prefix)
or as wild card?

> +The xSplice core code loads the payload as a standard ELF binary, relocates it
> +and handles the architecture-specifc sections as needed. This process is much
> +like what the Linux kernel module loader does.
> +
> +The payload contains a section (xsplice_patch_func) with an array of structures
> +describing the functions to be patched:
> +<pre>
> +struct xsplice_patch_func {  
> +    const char *name;  
> +    unsigned long new_addr;  
> +    const unsigned long old_addr;  

Stray(?, and slightly confusing) const here...

> +    uint32_t new_size;  
> +    const uint32_t long old_size;  

... and here. This one also leaves the reader guess about the
actual type meant.

Also is using "long" here really a good idea? Shouldn't we rather use
fixed width or ELF types?

> +* `old_size` and `new_size` contain the sizes of the respective functions in bytes.
> +   The value **MUST** not be zero.

For old_size I can see this, but can't new_size being zero "NOP out
the entire code sequence"?

> +### XEN_SYSCTL_XSPLICE_UPLOAD (0)
> +
> +Upload a payload to the hypervisor. The payload is verified
> +against basic checks and if there are any issues the proper return code
> +will be returned. The payload is not applied at this time - that is
> +controlled by *XEN_SYSCTL_XSPLICE_ACTION*.
> +
> +The caller provides:
> +
> + * A `struct xen_xsplice_id` called `id` which has the unique id.
> + * `size` the size of the ELF payload (in bytes).
> + * `payload` the virtual address of where the ELF payload is.
> +
> +The `id` could be an UUID that stays fixed forever for a given
> +payload. It can be embedded into the ELF payload at creation time
> +and extracted by tools.
> +
> +The return value is zero if the payload was succesfully uploaded.
> +Otherwise an XEN_EXX return value is provided. Duplicate `id` are not supported.

Ca you, here and further down, make it unambiguous that the error
value returned is negative (or, as with the rc structure fields below,
maybe indeed positive)?

> +### XEN_SYSCTL_XSPLICE_LIST (2)
> +
> +Retrieve an array of abbreviated status and names of payloads that are loaded in the
> +hypervisor.
> +
> +The caller provides:
> +
> + * `version`. Initially (on first hypercall) *MUST* be zero.
> + * `idx` index iterator. On first call *MUST* be zero, subsequent calls varies.
> + * `nr` the max number of entries to populate.
> + * `pad` - *MUST* be zero.
> + * `status` virtual address of where to write `struct xen_xsplice_status`
> +   structures. Caller *MUST* allocate up to `nr` of them.
> + * `id` - virtual address of where to write the unique id of the payload.
> +   Caller *MUST* allocate up to `nr` of them. Each *MUST* be of
> +   **XEN_XSPLICE_NAME_SIZE** size.
> + * `len` - virtual address of where to write the length of each unique id
> +   of the payload. Caller *MUST* allocate up to `nr` of them. Each *MUST* be
> +   of sizeof(uint32_t) (4 bytes).
> +
> +If the hypercall returns an positive number, it is the number (up to `nr`)
> +of the payloads returned, along with `nr` updated with the number of remaining
> +payloads, `version` updated (it may be the same across hypercalls. If it
> +varies the data is stale and further calls could fail). The `status`,
> +`id`, and `len`' are updated at their designed index value (`idx`) with
> +the returned value of data.
> +
> +If the hypercall returns E2BIG the `count` is too big and should be
> +lowered.

s/count/nr/ ?

> +This operation can be preempted by the hypercall returning XEN_EAGAIN.
> +Retry.

Why is this necessary when preemption via the 'nr' field is already
possible?

Also what meaning would have zero as a return value here (not
spelled out above afaics)?

> +struct xen_sysctl_xsplice_list {  
> +    uint32_t version;                       /* IN/OUT: Initially *MUST* be zero.  
> +                                               On subsequent calls reuse value.  
> +                                               If varies between calls, we are  
> +                                             * getting stale data. */  
> +    uint32_t idx;                           /* IN/OUT: Index into array. */  
> +    uint32_t nr;                            /* IN: How many status, id, and len  
> +                                               should fill out.  
> +                                               OUT: How many payloads left. */  
> +    uint32_t pad;                           /* IN: Must be zero. */  
> +    XEN_GUEST_HANDLE_64(xen_xsplice_status_t) status;  /* OUT. Must have enough  
> +                                               space allocate for n of them. */  
> +    XEN_GUEST_HANDLE_64(char) id;           /* OUT: Array of ids. Each member  
> +                                               MUST XEN_XSPLICE_NAME_SIZE in size.  
> +                                               Must have n of them. */  
> +    XEN_GUEST_HANDLE_64(uint32) len;        /* OUT: Array of lengths of ids.  
> +                                               Must have n of them. */  

For all three, perhaps better refer to 'nr' instead of 'n'?

> +Note that patching functions that copy to or from guest memory requires
> +to support alternative support. This is due to SMAP (specifically *stac*
> +and *clac* operations) which is enabled on Broadwell and later architectures.

I think it should be emphasized that this is an example, and there
are other uses of alternative instructions (and likely more to come).

> +The v2 design must also have a mechanism for:
> +
> + *  An dependency mechanism for the payloads. To use that information to load:
> +    - The appropiate payload. To verify that payload is built against the
> +      hypervisor. This can be done via the `build-id`
> +      or via providing an copy of the old code - so that the hypervisor can
> +       verify it against the code in memory.

I was missing this above - do you really intend to do patching without
at least one of those two safety measures?

> +## Signature checking requirements.
> +
> +The signature checking requires that the layout of the data in memory
> +**MUST** be same for signature to be verified. This means that the payload
> +data layout in ELF format **MUST** match what the hypervisor would be
> +expecting such that it can properly do signature verification.
> +
> +The signature is based on the all of the payloads continuously laid out
> +in memory. The signature is to be appended at the end of the ELF payload
> +prefixed with the string '~Module signature appended~\n', followed by
> +an signature header then followed by the signature, key identifier, and signers
> +name.
> +
> +Specifically the signature header would be:
> +
> +<pre>
> +#define PKEY_ALGO_DSA       0  
> +#define PKEY_ALGO_RSA       1  
> +
> +#define PKEY_ID_PGP         0 /* OpenPGP generated key ID */  
> +#define PKEY_ID_X509        1 /* X.509 arbitrary subjectKeyIdentifier */  
> +
> +#define HASH_ALGO_MD4          0  
> +#define HASH_ALGO_MD5          1  
> +#define HASH_ALGO_SHA1         2  
> +#define HASH_ALGO_RIPE_MD_160  3  
> +#define HASH_ALGO_SHA256       4  
> +#define HASH_ALGO_SHA384       5  
> +#define HASH_ALGO_SHA512       6  
> +#define HASH_ALGO_SHA224       7  
> +#define HASH_ALGO_RIPE_MD_128  8  
> +#define HASH_ALGO_RIPE_MD_256  9  
> +#define HASH_ALGO_RIPE_MD_320 10  
> +#define HASH_ALGO_WP_256      11  
> +#define HASH_ALGO_WP_384      12  
> +#define HASH_ALGO_WP_512      13  
> +#define HASH_ALGO_TGR_128     14  
> +#define HASH_ALGO_TGR_160     15  
> +#define HASH_ALGO_TGR_192     16  
> +
> +
> +struct elf_payload_signature {  
> +	u8	algo;		/* Public-key crypto algorithm PKEY_ALGO_*. */  
> +	u8	hash;		/* Digest algorithm: HASH_ALGO_*. */  
> +	u8	id_type;	/* Key identifier type PKEY_ID*. */  
> +	u8	signer_len;	/* Length of signer's name */  
> +	u8	key_id_len;	/* Length of key identifier */  
> +	u8	__pad[3];  
> +	__be32	sig_len;	/* Length of signature data */  
> +};
> +
> +</pre>
> +(Note that this has been borrowed from Linux module signature code.).

It doesn't make clear who's supposed to do that verification. If
the hypervisor, this would seem to imply a whole lot of
cryptography code needing importing...

Jan

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 01/13] xsplice: Design document (v5).
  2016-01-19 14:31   ` Ross Lagerwall
@ 2016-02-05 18:27     ` Konrad Rzeszutek Wilk
  2016-02-05 18:34     ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-02-05 18:27 UTC (permalink / raw)
  To: Ross Lagerwall
  Cc: wei.liu2, ian.campbell, andrew.cooper3, ian.jackson, mpohlack,
	stefano.stabellini, jbeulich, sasha.levin, xen-devel

Hey,

I applied all your comments..
> >+The `struct xen_xsplice_status` structure contains an status of payload which includes:
> >+
> >+ * `status` - whether it has been:
> >+   * *XSPLICE_STATUS_LOADED* (1) has been loaded.
> >+   * *XSPLICE_STATUS_CHECKED*  (2) the ELF payload safety checks passed.
> 
> The ELF safety checks are done during load. At this stage we don't have any
> checks yet so I'm not sure what will go here.
> 

The one thing that I am not too crazy about is that the upload
operation takes a long time. As in, we allocate memory, parse it,
then load it, free some memory ,etc.

I am thinking that perhaps of doing that synchronously we schedule
an tasklet that will do that. And hence an payload will move from
LOADED (or UPLOADED?) to CHECKED from a tasklet?

That can solve also the problem of somebody (ahem) having 'sync_console
com1=9600 loglvl=all' and with us parsing - it taking way too long.

Let me see what is involved in this.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 01/13] xsplice: Design document (v5).
  2016-01-19 14:31   ` Ross Lagerwall
  2016-02-05 18:27     ` Konrad Rzeszutek Wilk
@ 2016-02-05 18:34     ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-02-05 18:34 UTC (permalink / raw)
  To: Ross Lagerwall
  Cc: wei.liu2, ian.campbell, andrew.cooper3, ian.jackson, mpohlack,
	stefano.stabellini, jbeulich, sasha.levin, xen-devel

> >+<pre>
> >+              /->\
> >+              \  /
> >+ UNLOAD <--- CHECK ---> REPLACE|APPLY --> REVERT --\
> >+                \                                  |
> >+                 \-------------------<-------------/
> 
> This doesn't make much sense to me. The actions need to be represented by
> arrows that move from one state to another.

They are. The '<-' or '->' are arrows.

However I have to say that I had a hard time coming up with a good
graphical representation of this. If you have ideas for better I am
all ears.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 07/13] xsplice: Add helper elf routines (v2)
  2016-01-19 14:33   ` Ross Lagerwall
@ 2016-02-05 18:38     ` Konrad Rzeszutek Wilk
  2016-02-05 20:34       ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-02-05 18:38 UTC (permalink / raw)
  To: Ross Lagerwall
  Cc: wei.liu2, ian.campbell, andrew.cooper3, ian.jackson, mpohlack,
	stefano.stabellini, jbeulich, sasha.levin, xen-devel

> >+#define return_(x) { printk(XENLOG_DEBUG "%s:%d rc: %d\n",  \
> >+                            __func__,__LINE__, x); return x; }
> >+

.. snip..
> >+        printk(XENLOG_ERR "Could not allocate memory for section table!\n");
> 
> Shouldn't this printk be removed if you're using return_?

I was torn on the return_ macro. At one hand it helps to identify what
went wrong with the payload file. But at the same time it is very
developer-centric - so perhaps not to be in the final piece.

And yes, if we do want to go ahead with the return_ macro, then this
should go away.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 09/13] xsplice: Implement support for applying/reverting/replacing patches. (v2)
  2016-01-25 11:43       ` Ross Lagerwall
@ 2016-02-05 19:30         ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-02-05 19:30 UTC (permalink / raw)
  To: Ross Lagerwall
  Cc: wei.liu2, ian.campbell, andrew.cooper3, ian.jackson, mpohlack,
	stefano.stabellini, jbeulich, sasha.levin, xen-devel

> >>As far as I can tell, the mechanics of how this works haven't changed, the
> >>code has just been reorganized. Which means the points that Martin raised
> >>about this mechanism are still outstanding.
> >
> >A bit. I added the extra timeout on both of the 'spin-around' and also
> >moved some of the barriers around. Also removed your spin-lock and used
> >the atomic_t mechanism to synchronize.
> >
> >But the one thing that I didn't do was the spin on the 'workers?' that
> >are just spinnig idly. They will do that forever if say the 'master'
> >hasn't gone to the IRQ semaphore part.
> >
> >My thinking was that the 'workers' could also use the timeout feature
> >but just multiple it by two?
> >
> 
> After looking at this again, I remembered that the algorithm I used is the
> same as the one used by stop_machine_run(). That function runs without
> timeouts at all (seemingly without problems), so why shouldn't this one?

Because we may have a very busy system and we do not want to impair
the running guests.

> (The only reason stop_machine_run() itself isn't used for patching is
> because we need to enter the function without a stack, i.e. not from a
> tasklet.)
> 
> -- 
> Ross Lagerwall

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 07/13] xsplice: Add helper elf routines (v2)
  2016-02-05 18:38     ` Konrad Rzeszutek Wilk
@ 2016-02-05 20:34       ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-02-05 20:34 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: wei.liu2, Ian Campbell, andrew.cooper3, xen.org, Martin Pohlack,
	Ross Lagerwall, stefano.stabellini, Jan Beulich, xen-devel,
	sasha.levin

On Fri, Feb 5, 2016 at 1:38 PM, Konrad Rzeszutek Wilk
<konrad.wilk@oracle.com> wrote:
>> >+#define return_(x) { printk(XENLOG_DEBUG "%s:%d rc: %d\n",  \
>> >+                            __func__,__LINE__, x); return x; }
>> >+
>
> .. snip..
>> >+        printk(XENLOG_ERR "Could not allocate memory for section table!\n");
>>
>> Shouldn't this printk be removed if you're using return_?
>
> I was torn on the return_ macro. At one hand it helps to identify what
> went wrong with the payload file. But at the same time it is very
> developer-centric - so perhaps not to be in the final piece.

And the answer is pretty obvious. If we compile as debug=y then we can include
them. If not, we just return the value.

>
> And yes, if we do want to go ahead with the return_ macro, then this
> should go away.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 01/13] xsplice: Design document (v5).
  2016-02-05 15:25   ` Jan Beulich
@ 2016-02-05 21:47     ` Konrad Rzeszutek Wilk
  2016-02-09  8:25       ` Jan Beulich
  0 siblings, 1 reply; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-02-05 21:47 UTC (permalink / raw)
  To: Jan Beulich
  Cc: wei.liu2, ian.campbell, andrew.cooper3, ian.jackson, mpohlack,
	ross.lagerwall, stefano.stabellini, sasha.levin, xen-devel

I've snipped the email. I've taken your reviews in account - and just
responding on some of them that I believe need more comments.

..snip..
> > +However it has severe drawbacks - the safety checks which have to make sure
> > +the function is not on the stack - must also check every caller. For some
> > +patches this could mean - if there were an sufficient large amount of
> > +callers - that we would never be able to apply the update.
> 
> While this is an issue, didn't we settle on doing the patching without
> any deep call stacks only? Also why would that problem not apply to
> to trampoline example right below this section (after all you can't
> just go and patch a multi-byte instruction without making sure no
> CPU is about to execute that code).

True. I massaged the comment and added another in the next one.

The thinking is that if we do inline patching we have way more of changes to
do as opposed to function patching. Hence the 'drawback' of doing the inline
patching is that we may not be able to satisfy the safety checks within
the time alloted. While the function has less of stacks to check.

Granted this is a bit up in the air - as we elected to do the patching on
code paths that have very defined stacks. But not sure if I should include that
in the design as opposed to the implementation.

> 
> > +As such having the payload in an ELF file is the sensible way. We would be
> > +carrying the various sets of structures (and data) in the ELF sections under
> > +different names and with definitions. The prefix for the ELF section name
> > +would always be: *.xsplice* to match up to the names of the structures.
> 
> Note that the use of * here is confusing - do you mean them to
> represent quotes (matching up with this supposedly being a prefix)
> or as wild card?

.. That had evolved a bit. Lets remove that - as earlier versions had
.xsplice for everything: .xsplice.reloc, .xsplice.data, ... etc.

> 
> > +The xSplice core code loads the payload as a standard ELF binary, relocates it
> > +and handles the architecture-specifc sections as needed. This process is much
> > +like what the Linux kernel module loader does.
> > +
> > +The payload contains a section (xsplice_patch_func) with an array of structures
> > +describing the functions to be patched:
> > +<pre>
> > +struct xsplice_patch_func {  
> > +    const char *name;  
> > +    unsigned long new_addr;  
> > +    const unsigned long old_addr;  
> 
> Stray(?, and slightly confusing) const here...
> 
> > +    uint32_t new_size;  
> > +    const uint32_t long old_size;  
> 
> ... and here. This one also leaves the reader guess about the
> actual type meant.

Ah, I put the const there in anticipation of you wanting an const!

> 
> Also is using "long" here really a good idea? Shouldn't we rather use
> fixed width or ELF types?

We can. It would look like this:

struct xsplice_patch_func {
    const unsigned char *name;
    Elf64_Xword new_addr;
    Elf64_Xword old_addr;
    Elf64_Word new_size;
    Elf64_Word old_size;
    uint8_t pad[32];
};

Much nicer.
> 
> > +* `old_size` and `new_size` contain the sizes of the respective functions in bytes.
> > +   The value **MUST** not be zero.
> 
> For old_size I can see this, but can't new_size being zero "NOP out
> the entire code sequence"?

The patchset does not (yet) support that. Nor the short branch instructions.
I am trying to keep the amount of 'features' to the minimum so that reviews
can be easier.

Let me add in todo list (v2: Not Yet Done) this request.

..snip..
> > +### XEN_SYSCTL_XSPLICE_LIST (2)
> > +
> > +Retrieve an array of abbreviated status and names of payloads that are loaded in the
> > +hypervisor.
> > +
> > +The caller provides:
> > +
> > + * `version`. Initially (on first hypercall) *MUST* be zero.
> > + * `idx` index iterator. On first call *MUST* be zero, subsequent calls varies.
> > + * `nr` the max number of entries to populate.
> > + * `pad` - *MUST* be zero.
> > + * `status` virtual address of where to write `struct xen_xsplice_status`
> > +   structures. Caller *MUST* allocate up to `nr` of them.
> > + * `id` - virtual address of where to write the unique id of the payload.
> > +   Caller *MUST* allocate up to `nr` of them. Each *MUST* be of
> > +   **XEN_XSPLICE_NAME_SIZE** size.
> > + * `len` - virtual address of where to write the length of each unique id
> > +   of the payload. Caller *MUST* allocate up to `nr` of them. Each *MUST* be
> > +   of sizeof(uint32_t) (4 bytes).
> > +
> > +If the hypercall returns an positive number, it is the number (up to `nr`)
> > +of the payloads returned, along with `nr` updated with the number of remaining
> > +payloads, `version` updated (it may be the same across hypercalls. If it
> > +varies the data is stale and further calls could fail). The `status`,
> > +`id`, and `len`' are updated at their designed index value (`idx`) with
> > +the returned value of data.
> > +
> > +If the hypercall returns E2BIG the `count` is too big and should be
> > +lowered.
> 
> s/count/nr/ ?
> 
> > +This operation can be preempted by the hypercall returning XEN_EAGAIN.
> > +Retry.
> 
> Why is this necessary when preemption via the 'nr' field is already
> possible?

I should explain that the XEN_EAGAIN is the mechanism by which the hypervisor
signals that it could only fulfill its 'nr' value.

> 
> Also what meaning would have zero as a return value here (not
> spelled out above afaics)?

Added it in - it means there is absolutly no payloads uploaded.
..snip..
> 
> > +Note that patching functions that copy to or from guest memory requires
> > +to support alternative support. This is due to SMAP (specifically *stac*
> > +and *clac* operations) which is enabled on Broadwell and later architectures.
> 
> I think it should be emphasized that this is an example, and there
> are other uses of alternative instructions (and likely more to come).
> 
> > +The v2 design must also have a mechanism for:
> > +
> > + *  An dependency mechanism for the payloads. To use that information to load:
> > +    - The appropiate payload. To verify that payload is built against the
> > +      hypervisor. This can be done via the `build-id`
> > +      or via providing an copy of the old code - so that the hypervisor can
> > +       verify it against the code in memory.
> 
> I was missing this above - do you really intend to do patching without
> at least one of those two safety measures?

Ross wrote the patches and I will make them part of the patch series. But the
problem is that there will be now over 30 patches - so to make it easier
to review I was thinking to roll them out in 'waves'. I can most certainly
include it in the next posting.
> 
> > +## Signature checking requirements.
.. snip..
> > +struct elf_payload_signature {  
> > +	u8	algo;		/* Public-key crypto algorithm PKEY_ALGO_*. */  
> > +	u8	hash;		/* Digest algorithm: HASH_ALGO_*. */  
> > +	u8	id_type;	/* Key identifier type PKEY_ID*. */  
> > +	u8	signer_len;	/* Length of signer's name */  
> > +	u8	key_id_len;	/* Length of key identifier */  
> > +	u8	__pad[3];  
> > +	__be32	sig_len;	/* Length of signature data */  
> > +};
> > +
> > +</pre>
> > +(Note that this has been borrowed from Linux module signature code.).
> 
> It doesn't make clear who's supposed to do that verification. If
> the hypervisor, this would seem to imply a whole lot of
> cryptography code needing importing...

Oh yes :-) Which is why this is on the 'Not Yet Done' part.

> 
> Jan

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 03/13] xen/xsplice: Hypervisor implementation of XEN_XSPLICE_op (v7)
  2016-01-14 21:47 ` [PATCH v2 03/13] xen/xsplice: Hypervisor implementation of XEN_XSPLICE_op (v7) Konrad Rzeszutek Wilk
  2016-01-19 14:30   ` Ross Lagerwall
@ 2016-02-06 22:35   ` Doug Goldstein
  2016-02-09  8:28     ` Jan Beulich
  2016-02-09 14:39     ` Konrad Rzeszutek Wilk
  1 sibling, 2 replies; 45+ messages in thread
From: Doug Goldstein @ 2016-02-06 22:35 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk, xen-devel, ross.lagerwall, mpohlack,
	andrew.cooper3, stefano.stabellini, jbeulich, ian.jackson,
	ian.campbell, wei.liu2, sasha.levin


[-- Attachment #1.1: Type: text/plain, Size: 26192 bytes --]

On 1/14/16 3:47 PM, Konrad Rzeszutek Wilk wrote:
> The implementation does not actually do any patching.
> 
> It just adds the framework for doing the hypercalls,
> keeping track of ELF payloads, and the basic operations:
>  - query which payloads exist,
>  - query for specific payloads,
>  - check*1, apply*1, replace*1, and unload payloads.
> 
> *1: Which of course in this patch are nops.
> 
> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
> 
> ---
> v2: Rebased on keyhandler: rework keyhandler infrastructure
> v3: Fixed XSM.
> v4: Removed REVERTED state.
>     Split status and error code.
>     Add REPLACE action.
>     Separate payload data from the payload structure.
>     s/XSPLICE_ID_../XSPLICE_NAME_../
> v5: Add xsplice and CONFIG_XSPLICE build toption.
>     Fix code per Jan's review.
>     Update the sysctl.h (change bits to enum like)
> v6: Rebase on Kconfig changes.
> v7: Add missing pad checks. Re-order keyhandler.h to build on ARM.
> 
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  tools/flask/policy/policy/modules/xen/xen.te |   1 +
>  xen/arch/arm/Kconfig                         |   1 +
>  xen/arch/x86/Kconfig                         |   1 +
>  xen/common/Kconfig                           |  14 +
>  xen/common/Makefile                          |   2 +
>  xen/common/sysctl.c                          |   8 +
>  xen/common/xsplice.c                         | 386 +++++++++++++++++++++++++++
>  xen/include/public/sysctl.h                  | 156 +++++++++++
>  xen/include/xen/xsplice.h                    |   7 +
>  xen/xsm/flask/hooks.c                        |   6 +
>  xen/xsm/flask/policy/access_vectors          |   2 +
>  11 files changed, 584 insertions(+)
>  create mode 100644 xen/common/xsplice.c
>  create mode 100644 xen/include/xen/xsplice.h
> 
> diff --git a/tools/flask/policy/policy/modules/xen/xen.te b/tools/flask/policy/policy/modules/xen/xen.te
> index d35ae22..542c3e1 100644
> --- a/tools/flask/policy/policy/modules/xen/xen.te
> +++ b/tools/flask/policy/policy/modules/xen/xen.te
> @@ -72,6 +72,7 @@ allow dom0_t xen_t:xen2 {
>  allow dom0_t xen_t:xen2 {
>      pmu_ctrl
>      get_symbol
> +    xsplice_op
>  };
>  allow dom0_t xen_t:mmu memorymap;
>  
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index 60e923c..3780949 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -23,6 +23,7 @@ config ARM
>  	select HAS_PASSTHROUGH
>  	select HAS_PDX
>  	select HAS_VIDEO
> +	select HAS_XSPLICE
>  
>  config ARCH_DEFCONFIG
>  	string
> diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
> index 4781b34..2b6c832 100644
> --- a/xen/arch/x86/Kconfig
> +++ b/xen/arch/x86/Kconfig
> @@ -18,6 +18,7 @@ config X86
>  	select HAS_PCI
>  	select HAS_PDX
>  	select HAS_VGA
> +	select HAS_XSPLICE
>  
>  config ARCH_DEFCONFIG
>  	string
> diff --git a/xen/common/Kconfig b/xen/common/Kconfig
> index eadfc3b..aaf4053 100644
> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -51,6 +51,9 @@ config HAS_GDBSX
>  config HAS_IOPORTS
>  	bool
>  
> +config HAS_XSPLICE
> +	bool
> +
>  # Enable/Disable kexec support
>  config KEXEC
>  	bool "kexec support"
> @@ -97,4 +100,15 @@ config XSM
>  
>  	  If unsure, say N.
>  
> +# Enable/Disable xsplice support
> +config XSPLICE
> +	bool "xsplice support"
> +	default y
> +	depends on HAS_XSPLICE
> +	---help---
> +	  Allows a running Xen hypervisor to be patched without rebooting.
> +	  This is primarily used to patch an hypervisor with XSA fixes.
> +
> +	  If unsure, say Y.
> +
>  endmenu

I'm indifferent on the HAS_XSPLICE, you can drop that if you want to
simply stuff.


> diff --git a/xen/common/Makefile b/xen/common/Makefile
> index 9f8b214..6fdeccf 100644
> --- a/xen/common/Makefile
> +++ b/xen/common/Makefile
> @@ -71,3 +71,5 @@ subdir-$(coverage) += gcov
>  
>  subdir-y += libelf
>  subdir-$(CONFIG_HAS_DEVICE_TREE) += libfdt
> +
> +obj-$(CONFIG_XSPLICE) += xsplice.o
> diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
> index a3007b8..55e6cfa 100644
> --- a/xen/common/sysctl.c
> +++ b/xen/common/sysctl.c
> @@ -28,6 +28,7 @@
>  #include <xsm/xsm.h>
>  #include <xen/pmstat.h>
>  #include <xen/gcov.h>
> +#include <xen/xsplice.h>
>  
>  long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
>  {
> @@ -460,6 +461,13 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
>          ret = tmem_control(&op->u.tmem_op);
>          break;
>  
> +#ifdef CONFIG_XSPLICE
> +    case XEN_SYSCTL_xsplice_op:
> +        ret = xsplice_control(&op->u.xsplice);
> +        copyback = 1;
> +        break;
> +#endif

Should the case statement still exist and not just return -ENOSYS?
Otherwise we're needlessly going into arch_do_sysctl() just to get the
same result.


> +
>      default:
>          ret = arch_do_sysctl(op, u_sysctl);
>          copyback = 0;
> diff --git a/xen/common/xsplice.c b/xen/common/xsplice.c
> new file mode 100644
> index 0000000..3c6acc3
> --- /dev/null
> +++ b/xen/common/xsplice.c
> @@ -0,0 +1,386 @@
> +/*
> + * Copyright (c) 2016 Oracle and/or its affiliates. All rights reserved.
> + *
> + */
> +
> +#include <xen/guest_access.h>
> +#include <xen/keyhandler.h>
> +#include <xen/lib.h>
> +#include <xen/list.h>
> +#include <xen/mm.h>
> +#include <xen/sched.h>
> +#include <xen/smp.h>
> +#include <xen/spinlock.h>
> +#include <xen/xsplice.h>
> +
> +#include <asm/event.h>
> +#include <public/sysctl.h>
> +
> +static DEFINE_SPINLOCK(payload_list_lock);
> +static LIST_HEAD(payload_list);
> +
> +static unsigned int payload_cnt;
> +static unsigned int payload_version = 1;
> +
> +struct payload {
> +    int32_t state;                       /* One of the XSPLICE_STATE_*. */
> +    int32_t rc;                          /* 0 or -XEN_EXX. */
> +    struct list_head list;               /* Linked to 'payload_list'. */
> +    char name[XEN_XSPLICE_NAME_SIZE + 1];/* Name of it. */
> +};
> +
> +static const char *state2str(int32_t state)
> +{
> +#define STATE(x) [XSPLICE_STATE_##x] = #x
> +    static const char *const names[] = {
> +            STATE(LOADED),
> +            STATE(CHECKED),
> +            STATE(APPLIED),
> +    };
> +#undef STATE
> +
> +    if (state >= ARRAY_SIZE(names))
> +        return "unknown";
> +
> +    if (state < 0)
> +        return "-EXX";
> +
> +    if (!names[state])
> +        return "unknown";
> +
> +    return names[state];
> +}
> +
> +static void xsplice_printall(unsigned char key)
> +{
> +    struct payload *data;
> +
> +    spin_lock(&payload_list_lock);
> +
> +    list_for_each_entry ( data, &payload_list, list )
> +        printk(" name=%s state=%s(%d)\n", data->name,
> +               state2str(data->state), data->state);
> +
> +    spin_unlock(&payload_list_lock);
> +}
> +
> +static int verify_name(xen_xsplice_name_t *name)
> +{
> +    if ( name->size == 0 || name->size > XEN_XSPLICE_NAME_SIZE )
> +        return -EINVAL;
> +
> +    if ( name->pad[0] || name->pad[1] || name->pad[2] )
> +        return -EINVAL;
> +
> +    if ( !guest_handle_okay(name->name, name->size) )
> +        return -EINVAL;
> +
> +    return 0;
> +}
> +
> +static int find_payload(xen_xsplice_name_t *name, bool_t need_lock,
> +                        struct payload **f)
> +{
> +    struct payload *data;
> +    XEN_GUEST_HANDLE_PARAM(char) str;
> +    char n[XEN_XSPLICE_NAME_SIZE + 1] = { 0 };
> +    int rc = -EINVAL;
> +
> +    rc = verify_name(name);
> +    if ( rc )
> +        return rc;
> +
> +    str = guest_handle_cast(name->name, char);
> +    if ( copy_from_guest(n, str, name->size) )
> +        return -EFAULT;
> +
> +    if ( need_lock )
> +        spin_lock(&payload_list_lock);
> +
> +    rc = -ENOENT;
> +    list_for_each_entry ( data, &payload_list, list )
> +    {
> +        if ( !strcmp(data->name, n) )
> +        {
> +            *f = data;
> +            rc = 0;
> +            break;
> +        }
> +    }
> +
> +    if ( need_lock )
> +        spin_unlock(&payload_list_lock);
> +
> +    return rc;
> +}
> +
> +static int verify_payload(xen_sysctl_xsplice_upload_t *upload)
> +{
> +    if ( verify_name(&upload->name) )
> +        return -EINVAL;
> +
> +    if ( upload->size == 0 )
> +        return -EINVAL;
> +
> +    if ( !guest_handle_okay(upload->payload, upload->size) )
> +        return -EFAULT;
> +
> +    return 0;
> +}
> +
> +/*
> + * We MUST be holding the payload_list_lock spinlock.
> + */
> +static void free_payload(struct payload *data)
> +{
> +    list_del(&data->list);
> +    payload_cnt--;
> +    payload_version++;
> +    xfree(data);
> +}
> +
> +static int xsplice_upload(xen_sysctl_xsplice_upload_t *upload)
> +{
> +    struct payload *data = NULL;
> +    uint8_t *raw_data;
> +    int rc;
> +
> +    rc = verify_payload(upload);
> +    if ( rc )
> +        return rc;
> +
> +    rc = find_payload(&upload->name, 1 /* true. */, &data);
> +    if ( rc == 0 /* Found. */ )
> +        return -EEXIST;
> +
> +    if ( rc != -ENOENT )
> +        return rc;
> +
> +    data = xzalloc(struct payload);
> +    if ( !data )
> +        return -ENOMEM;
> +
> +    memset(data, 0, sizeof *data);
> +    rc = -EFAULT;
> +    if ( copy_from_guest(data->name, upload->name.name, upload->name.size) )
> +        goto err_data;
> +
> +    rc = -ENOMEM;
> +    raw_data = alloc_xenheap_pages(get_order_from_bytes(upload->size), 0);
> +    if ( !raw_data )
> +        goto err_data;
> +
> +    rc = -EFAULT;
> +    if ( copy_from_guest(raw_data, upload->payload, upload->size) )
> +        goto err_raw;
> +
> +    data->state = XSPLICE_STATE_LOADED;
> +    data->rc = 0;
> +    INIT_LIST_HEAD(&data->list);
> +
> +    spin_lock(&payload_list_lock);
> +    list_add_tail(&data->list, &payload_list);
> +    payload_cnt++;
> +    payload_version++;
> +    spin_unlock(&payload_list_lock);
> +
> +    free_xenheap_pages(raw_data, get_order_from_bytes(upload->size));
> +    return 0;
> +
> + err_raw:
> +    free_xenheap_pages(raw_data, get_order_from_bytes(upload->size));
> + err_data:
> +    xfree(data);
> +    return rc;
> +}
> +
> +static int xsplice_get(xen_sysctl_xsplice_summary_t *summary)
> +{
> +    struct payload *data;
> +    int rc;
> +
> +    if ( summary->status.state )
> +        return -EINVAL;
> +
> +    if ( summary->status.rc != 0 )
> +        return -EINVAL;
> +
> +    rc = verify_name(&summary->name);
> +    if ( rc )
> +        return rc;
> +
> +    rc = find_payload(&summary->name, 1 /* true. */, &data);
> +    if ( rc )
> +        return rc;
> +
> +    summary->status.state = data->state;
> +    summary->status.rc = data->rc;
> +
> +    return 0;
> +}
> +
> +static int xsplice_list(xen_sysctl_xsplice_list_t *list)
> +{
> +    xen_xsplice_status_t status;
> +    struct payload *data;
> +    unsigned int idx = 0, i = 0;
> +    int rc = 0;
> +
> +    if ( list->nr > 1024 )
> +        return -E2BIG;
> +
> +    if ( list->pad != 0 )
> +        return -EINVAL;
> +
> +    if ( !guest_handle_okay(list->status, sizeof(status) * list->nr) ||
> +         !guest_handle_okay(list->name, XEN_XSPLICE_NAME_SIZE * list->nr) ||
> +         !guest_handle_okay(list->len, sizeof(uint32_t) * list->nr) )
> +        return -EINVAL;
> +
> +    spin_lock(&payload_list_lock);
> +    if ( list->idx > payload_cnt || !list->nr )
> +    {
> +        spin_unlock(&payload_list_lock);
> +        return -EINVAL;
> +    }
> +
> +    list_for_each_entry( data, &payload_list, list )
> +    {
> +        uint32_t len;
> +
> +        if ( list->idx > i++ )
> +            continue;
> +
> +        status.state = data->state;
> +        status.rc = data->rc;
> +        len = strlen(data->name);
> +
> +        /* N.B. 'idx' != 'i'. */
> +        if ( __copy_to_guest_offset(list->name, idx * XEN_XSPLICE_NAME_SIZE,
> +                                    data->name, len) ||
> +             __copy_to_guest_offset(list->len, idx, &len, 1) ||
> +             __copy_to_guest_offset(list->status, idx, &status, 1) )
> +        {
> +            rc = -EFAULT;
> +            break;
> +        }
> +        idx++;
> +        if ( hypercall_preempt_check() || (idx + 1 > list->nr) )
> +            break;
> +    }
> +    list->nr = payload_cnt - i; /* Remaining amount. */
> +    list->version = payload_version;
> +    spin_unlock(&payload_list_lock);
> +
> +    /* And how many we have processed. */
> +    return rc ? : idx;
> +}
> +
> +static int xsplice_action(xen_sysctl_xsplice_action_t *action)
> +{
> +    struct payload *data;
> +    int rc;
> +
> +    rc = verify_name(&action->name);
> +    if ( rc )
> +        return rc;
> +
> +    spin_lock(&payload_list_lock);
> +    rc = find_payload(&action->name, 0 /* We are holding the lock. */, &data);
> +    if ( rc )
> +        goto out;
> +
> +    switch ( action->cmd )
> +    {
> +    case XSPLICE_ACTION_CHECK:
> +        if ( (data->state == XSPLICE_STATE_LOADED) ||
> +             (data->state == XSPLICE_STATE_CHECKED) )
> +        {
> +            /* No implementation yet. */
> +            data->state = XSPLICE_STATE_CHECKED;
> +            data->rc = 0;
> +            rc = 0;
> +        }
> +        break;
> +    case XSPLICE_ACTION_UNLOAD:
> +        if ( (data->state == XSPLICE_STATE_LOADED) ||
> +             (data->state == XSPLICE_STATE_CHECKED) )
> +        {
> +            free_payload(data);
> +            /* No touching 'data' from here on! */
> +            rc = 0;
> +        }
> +        break;
> +    case XSPLICE_ACTION_REVERT:
> +        if ( data->state == XSPLICE_STATE_APPLIED )
> +        {
> +            /* No implementation yet. */
> +            data->state = XSPLICE_STATE_CHECKED;
> +            data->rc = 0;
> +            rc = 0;
> +        }
> +        break;
> +    case XSPLICE_ACTION_APPLY:
> +        if ( (data->state == XSPLICE_STATE_CHECKED) )
> +        {
> +            /* No implementation yet. */
> +            data->state = XSPLICE_STATE_APPLIED;
> +            data->rc = 0;
> +            rc = 0;
> +        }
> +        break;
> +    case XSPLICE_ACTION_REPLACE:
> +        if ( data->state == XSPLICE_STATE_CHECKED )
> +        {
> +            /* No implementation yet. */
> +            data->state = XSPLICE_STATE_CHECKED;
> +            data->rc = 0;
> +            rc = 0;
> +        }
> +        break;
> +    default:
> +        rc = -EOPNOTSUPP;
> +        break;
> +    }
> +
> + out:
> +    spin_unlock(&payload_list_lock);
> +
> +    return rc;
> +}
> +
> +int xsplice_control(xen_sysctl_xsplice_op_t *xsplice)
> +{
> +    int rc;
> +
> +    if ( xsplice->pad != 0 )
> +        return -EINVAL;
> +
> +    switch ( xsplice->cmd )
> +    {
> +    case XEN_SYSCTL_XSPLICE_UPLOAD:
> +        rc = xsplice_upload(&xsplice->u.upload);
> +        break;
> +    case XEN_SYSCTL_XSPLICE_GET:
> +        rc = xsplice_get(&xsplice->u.get);
> +        break;
> +    case XEN_SYSCTL_XSPLICE_LIST:
> +        rc = xsplice_list(&xsplice->u.list);
> +        break;
> +    case XEN_SYSCTL_XSPLICE_ACTION:
> +        rc = xsplice_action(&xsplice->u.action);
> +        break;
> +    default:
> +        rc = -EOPNOTSUPP;
> +        break;
> +   }
> +
> +    return rc;
> +}
> +
> +static int __init xsplice_init(void)
> +{
> +    register_keyhandler('x', xsplice_printall, "print xsplicing info", 1);
> +    return 0;
> +}
> +__initcall(xsplice_init);
> diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
> index 96680eb..0b0b879 100644
> --- a/xen/include/public/sysctl.h
> +++ b/xen/include/public/sysctl.h
> @@ -766,6 +766,160 @@ struct xen_sysctl_tmem_op {
>  typedef struct xen_sysctl_tmem_op xen_sysctl_tmem_op_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_tmem_op_t);
>  
> +/*
> + * XEN_SYSCTL_XSPLICE_op
> + *
> + * Refer to the http://xenbits.xenproject.org/docs/unstable/misc/xsplice.html
> + * for the design details of this hyprcall.
> + */
> +
> +/*
> + * Structure describing an ELF payload. Uniquely identifies the
> + * payload. Should be human readable.
> + * Recommended length is upto XEN_XSPLICE_NAME_SIZE.
> + */
> +#define XEN_XSPLICE_NAME_SIZE 128
> +struct xen_xsplice_name {
> +    XEN_GUEST_HANDLE_64(char) name;         /* IN: pointer to name. */
> +    uint16_t size;                          /* IN: size of name. May be upto
> +                                               XEN_XSPLICE_NAME_SIZE. */
> +    uint16_t pad[3];                        /* IN: MUST be zero. */
> +};
> +typedef struct xen_xsplice_name xen_xsplice_name_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_xsplice_name_t);
> +
> +/*
> + * Upload a payload to the hypervisor. The payload is verified
> + * against basic checks and if there are any issues the proper return code
> + * will be returned. The payload is not applied at this time - that is
> + * controlled by XEN_SYSCTL_XSPLICE_ACTION.
> + *
> + * The return value is zero if the payload was succesfully uploaded.
> + * Otherwise an EXX return value is provided. Duplicate `name` are not
> + * supported.
> + *
> + * The payload at this point is verified against the basic checks.
> + *
> + * The `payload` is the ELF payload as mentioned in the `Payload format`
> + * section in the xSplice design document.
> + */
> +#define XEN_SYSCTL_XSPLICE_UPLOAD 0
> +struct xen_sysctl_xsplice_upload {
> +    xen_xsplice_name_t name;                /* IN, name of the patch. */
> +    uint64_t size;                          /* IN, size of the ELF file. */
> +    XEN_GUEST_HANDLE_64(uint8) payload;     /* IN, the ELF file. */
> +};
> +typedef struct xen_sysctl_xsplice_upload xen_sysctl_xsplice_upload_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_xsplice_upload_t);
> +
> +/*
> + * Retrieve an status of an specific payload.
> + *
> + * Upon completion the `struct xen_xsplice_status` is updated.
> + *
> + * The return value is zero on success and XEN_EXX on failure. This operation
> + * is synchronous and does not require preemption.
> + */
> +#define XEN_SYSCTL_XSPLICE_GET 1
> +
> +struct xen_xsplice_status {
> +#define XSPLICE_STATE_LOADED       1
> +#define XSPLICE_STATE_CHECKED      2
> +#define XSPLICE_STATE_APPLIED      3
> +    int32_t state;                 /* OUT: XSPLICE_STATE_*. IN: MUST be zero. */
> +    int32_t rc;                    /* OUT: 0 if no error, otherwise -XEN_EXX. */
> +                                   /* IN: MUST be zero. */
> +};
> +typedef struct xen_xsplice_status xen_xsplice_status_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_xsplice_status_t);
> +
> +struct xen_sysctl_xsplice_summary {
> +    xen_xsplice_name_t name;                /* IN, name of the payload. */
> +    xen_xsplice_status_t status;            /* IN/OUT, state of it. */
> +};
> +typedef struct xen_sysctl_xsplice_summary xen_sysctl_xsplice_summary_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_xsplice_summary_t);
> +
> +/*
> + * Retrieve an array of abbreviated status and names of payloads that are
> + * loaded in the hypervisor.
> + *
> + * If the hypercall returns an positive number, it is the number (up to `nr`)
> + * of the payloads returned, along with `nr` updated with the number of remaining
> + * payloads, `version` updated (it may be the same across hypercalls. If it
> + * varies the data is stale and further calls could fail). The `status`,
> + * `name`, and `len`' are updated at their designed index value (`idx`) with
> + * the returned value of data.
> + *
> + * If the hypercall returns E2BIG the `nr` is too big and should be
> + * lowered.
> + *
> + * This operation can be preempted by the hypercall returning EAGAIN.
> + * Retry.
> + *
> + * Note that due to the asynchronous nature of hypercalls the domain might have
> + * added or removed the number of payloads making this information stale. It is
> + * the responsibility of the toolstack to use the `version` field to check
> + * between each invocation. if the version differs it should discard the stale
> + * data and start from scratch. It is OK for the toolstack to use the new
> + * `version` field.
> + */
> +#define XEN_SYSCTL_XSPLICE_LIST 2
> +struct xen_sysctl_xsplice_list {
> +    uint32_t version;                       /* IN/OUT: Initially *MUST* be zero.
> +                                               On subsequent calls reuse value.
> +                                               If varies between calls, we are
> +                                             * getting stale data. */
> +    uint32_t idx;                           /* IN/OUT: Index into array. */
> +    uint32_t nr;                            /* IN: How many status, id, and len
> +                                               should fill out.
> +                                               OUT: How many payloads left. */
> +    uint32_t pad;                           /* IN: Must be zero. */
> +    XEN_GUEST_HANDLE_64(xen_xsplice_status_t) status;  /* OUT. Must have enough
> +                                               space allocate for nr of them. */
> +    XEN_GUEST_HANDLE_64(char) name;         /* OUT: Array of ids. Each member
> +                                               MUST XEN_XSPLICE_NAME_SIZE in size.
> +                                               Must have nr of them. */
> +    XEN_GUEST_HANDLE_64(uint32) len;        /* OUT: Array of lengths of ids.
> +                                               Must have nr of them. */
> +};
> +typedef struct xen_sysctl_xsplice_list xen_sysctl_xsplice_list_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_xsplice_list_t);
> +
> +/*
> + * Perform an operation on the payload structure referenced by the `name` field.
> + * The operation request is asynchronous and the status should be retrieved
> + * by using either XEN_SYSCTL_XSPLICE_GET or XEN_SYSCTL_XSPLICE_LIST hypercall.
> + */
> +#define XEN_SYSCTL_XSPLICE_ACTION 3
> +struct xen_sysctl_xsplice_action {
> +    xen_xsplice_name_t name;                /* IN, name of the patch. */
> +#define XSPLICE_ACTION_CHECK        1
> +#define XSPLICE_ACTION_UNLOAD       2
> +#define XSPLICE_ACTION_REVERT       3
> +#define XSPLICE_ACTION_APPLY        4
> +#define XSPLICE_ACTION_REPLACE      5
> +    uint32_t cmd;                           /* IN: XSPLICE_ACTION_*. */
> +    uint32_t timeout;                       /* IN: Zero if no timeout. */
> +                                            /* Or upper bound of time (ms) */
> +                                            /* for operation to take. */
> +};
> +typedef struct xen_sysctl_xsplice_action xen_sysctl_xsplice_action_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_xsplice_action_t);
> +
> +struct xen_sysctl_xsplice_op {
> +    uint32_t cmd;                           /* IN: XEN_SYSCTL_XSPLICE_*. */
> +    uint32_t pad;                           /* IN: Always zero. */
> +    union {
> +        xen_sysctl_xsplice_upload_t upload;
> +        xen_sysctl_xsplice_list_t list;
> +        xen_sysctl_xsplice_summary_t get;
> +        xen_sysctl_xsplice_action_t action;
> +    } u;
> +};
> +typedef struct xen_sysctl_xsplice_op xen_sysctl_xsplice_op_t;
> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_xsplice_op_t);
> +
>  struct xen_sysctl {
>      uint32_t cmd;
>  #define XEN_SYSCTL_readconsole                    1
> @@ -791,6 +945,7 @@ struct xen_sysctl {
>  #define XEN_SYSCTL_pcitopoinfo                   22
>  #define XEN_SYSCTL_psr_cat_op                    23
>  #define XEN_SYSCTL_tmem_op                       24
> +#define XEN_SYSCTL_xsplice_op                    25
>      uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
>      union {
>          struct xen_sysctl_readconsole       readconsole;
> @@ -816,6 +971,7 @@ struct xen_sysctl {
>          struct xen_sysctl_psr_cmt_op        psr_cmt_op;
>          struct xen_sysctl_psr_cat_op        psr_cat_op;
>          struct xen_sysctl_tmem_op           tmem_op;
> +        struct xen_sysctl_xsplice_op        xsplice;
>          uint8_t                             pad[128];
>      } u;
>  };
> diff --git a/xen/include/xen/xsplice.h b/xen/include/xen/xsplice.h
> new file mode 100644
> index 0000000..2cb2035
> --- /dev/null
> +++ b/xen/include/xen/xsplice.h
> @@ -0,0 +1,7 @@
> +#ifndef __XEN_XSPLICE_H__
> +#define __XEN_XSPLICE_H__
> +
> +struct xen_sysctl_xsplice_op;
> +int xsplice_control(struct xen_sysctl_xsplice_op *);
> +
> +#endif /* __XEN_XSPLICE_H__ */
> diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
> index 9b7de30..5346dcf 100644
> --- a/xen/xsm/flask/hooks.c
> +++ b/xen/xsm/flask/hooks.c
> @@ -807,6 +807,12 @@ static int flask_sysctl(int cmd)
>      case XEN_SYSCTL_tmem_op:
>          return domain_has_xen(current->domain, XEN__TMEM_CONTROL);
>  
> +#ifdef CONFIG_XSPLICE
> +    case XEN_SYSCTL_xsplice_op:
> +        return avc_current_has_perm(SECINITSID_XEN, SECCLASS_XEN2,
> +                                    XEN2__XSPLICE_OP, NULL);
> +#endif
> +
>      default:
>          printk("flask_sysctl: Unknown op %d\n", cmd);
>          return -EPERM;
> diff --git a/xen/xsm/flask/policy/access_vectors b/xen/xsm/flask/policy/access_vectors
> index effb59f..5f08d05 100644
> --- a/xen/xsm/flask/policy/access_vectors
> +++ b/xen/xsm/flask/policy/access_vectors
> @@ -93,6 +93,8 @@ class xen2
>      pmu_ctrl
>  # PMU use (domains, including unprivileged ones, will be using this operation)
>      pmu_use
> +# XEN_SYSCTL_xsplice_op
> +    xsplice_op
>  }
>  
>  # Classes domain and domain2 consist of operations that a domain performs on
> 


-- 
Doug Goldstein


[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 959 bytes --]

[-- Attachment #2: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 01/13] xsplice: Design document (v5).
  2016-02-05 21:47     ` Konrad Rzeszutek Wilk
@ 2016-02-09  8:25       ` Jan Beulich
  0 siblings, 0 replies; 45+ messages in thread
From: Jan Beulich @ 2016-02-09  8:25 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: wei.liu2, ian.campbell, andrew.cooper3, ian.jackson, mpohlack,
	ross.lagerwall, stefano.stabellini, sasha.levin, xen-devel

>>> On 05.02.16 at 22:47, <konrad.wilk@oracle.com> wrote:
>> Also is using "long" here really a good idea? Shouldn't we rather use
>> fixed width or ELF types?
> 
> We can. It would look like this:
> 
> struct xsplice_patch_func {
>     const unsigned char *name;
>     Elf64_Xword new_addr;
>     Elf64_Xword old_addr;
>     Elf64_Word new_size;
>     Elf64_Word old_size;
>     uint8_t pad[32];
> };
> 
> Much nicer.

Leaving only the question on whether then we should have two (for
now) variants - one for ELF32 (which at least ARM32 will want) and
another for ELF64.

>> > +* `old_size` and `new_size` contain the sizes of the respective functions in bytes.
>> > +   The value **MUST** not be zero.
>> 
>> For old_size I can see this, but can't new_size being zero "NOP out
>> the entire code sequence"?
> 
> The patchset does not (yet) support that. Nor the short branch instructions.
> I am trying to keep the amount of 'features' to the minimum so that reviews
> can be easier.

Understood. However, the emphasized "**MUST**" pretty much
excludes the possibility for the future: One thing is the specification
that you present here, another the implementation, which may of
course initially lack certain functionality.

>> > +### XEN_SYSCTL_XSPLICE_LIST (2)
>> > +
>> > +Retrieve an array of abbreviated status and names of payloads that are loaded in the
>> > +hypervisor.
>> > +
>> > +The caller provides:
>> > +
>> > + * `version`. Initially (on first hypercall) *MUST* be zero.
>> > + * `idx` index iterator. On first call *MUST* be zero, subsequent calls varies.
>> > + * `nr` the max number of entries to populate.
>> > + * `pad` - *MUST* be zero.
>> > + * `status` virtual address of where to write `struct xen_xsplice_status`
>> > +   structures. Caller *MUST* allocate up to `nr` of them.
>> > + * `id` - virtual address of where to write the unique id of the payload.
>> > +   Caller *MUST* allocate up to `nr` of them. Each *MUST* be of
>> > +   **XEN_XSPLICE_NAME_SIZE** size.
>> > + * `len` - virtual address of where to write the length of each unique id
>> > +   of the payload. Caller *MUST* allocate up to `nr` of them. Each *MUST* be
>> > +   of sizeof(uint32_t) (4 bytes).
>> > +
>> > +If the hypercall returns an positive number, it is the number (up to `nr`)
>> > +of the payloads returned, along with `nr` updated with the number of remaining
>> > +payloads, `version` updated (it may be the same across hypercalls. If it
>> > +varies the data is stale and further calls could fail). The `status`,
>> > +`id`, and `len`' are updated at their designed index value (`idx`) with
>> > +the returned value of data.
>> > +
>> > +If the hypercall returns E2BIG the `count` is too big and should be
>> > +lowered.
>> 
>> s/count/nr/ ?
>> 
>> > +This operation can be preempted by the hypercall returning XEN_EAGAIN.
>> > +Retry.
>> 
>> Why is this necessary when preemption via the 'nr' field is already
>> possible?
> 
> I should explain that the XEN_EAGAIN is the mechanism by which the 
> hypervisor
> signals that it could only fulfill its 'nr' value.

But such a model seems to contradict "If the hypercall returns an
positive number, ..." above: Either you expect a positive number
to be returned in this case, or -XEN_EAGAIN.

>> > +The v2 design must also have a mechanism for:
>> > +
>> > + *  An dependency mechanism for the payloads. To use that information to load:
>> > +    - The appropiate payload. To verify that payload is built against the
>> > +      hypervisor. This can be done via the `build-id`
>> > +      or via providing an copy of the old code - so that the hypervisor can
>> > +       verify it against the code in memory.
>> 
>> I was missing this above - do you really intend to do patching without
>> at least one of those two safety measures?
> 
> Ross wrote the patches and I will make them part of the patch series. But 
> the
> problem is that there will be now over 30 patches - so to make it easier
> to review I was thinking to roll them out in 'waves'. I can most certainly
> include it in the next posting.

Trying to break up large series is much appreciated, but I think this
shouldn't lead to stuff going in being overly fragile.

Jan

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 03/13] xen/xsplice: Hypervisor implementation of XEN_XSPLICE_op (v7)
  2016-02-06 22:35   ` Doug Goldstein
@ 2016-02-09  8:28     ` Jan Beulich
  2016-02-09 14:39     ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 45+ messages in thread
From: Jan Beulich @ 2016-02-09  8:28 UTC (permalink / raw)
  To: Doug Goldstein
  Cc: wei.liu2, ian.campbell, andrew.cooper3, ian.jackson, mpohlack,
	ross.lagerwall, stefano.stabellini, sasha.levin, xen-devel

>>> On 06.02.16 at 23:35, <cardoe@cardoe.com> wrote:

(Just to demonstrate the effect - please go right to the end.)

> On 1/14/16 3:47 PM, Konrad Rzeszutek Wilk wrote:
>> The implementation does not actually do any patching.
>> 
>> It just adds the framework for doing the hypercalls,
>> keeping track of ELF payloads, and the basic operations:
>>  - query which payloads exist,
>>  - query for specific payloads,
>>  - check*1, apply*1, replace*1, and unload payloads.
>> 
>> *1: Which of course in this patch are nops.
>> 
>> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
>> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>> Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
>> 
>> ---
>> v2: Rebased on keyhandler: rework keyhandler infrastructure
>> v3: Fixed XSM.
>> v4: Removed REVERTED state.
>>     Split status and error code.
>>     Add REPLACE action.
>>     Separate payload data from the payload structure.
>>     s/XSPLICE_ID_../XSPLICE_NAME_../
>> v5: Add xsplice and CONFIG_XSPLICE build toption.
>>     Fix code per Jan's review.
>>     Update the sysctl.h (change bits to enum like)
>> v6: Rebase on Kconfig changes.
>> v7: Add missing pad checks. Re-order keyhandler.h to build on ARM.
>> 
>> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>> ---
>>  tools/flask/policy/policy/modules/xen/xen.te |   1 +
>>  xen/arch/arm/Kconfig                         |   1 +
>>  xen/arch/x86/Kconfig                         |   1 +
>>  xen/common/Kconfig                           |  14 +
>>  xen/common/Makefile                          |   2 +
>>  xen/common/sysctl.c                          |   8 +
>>  xen/common/xsplice.c                         | 386 
> +++++++++++++++++++++++++++
>>  xen/include/public/sysctl.h                  | 156 +++++++++++
>>  xen/include/xen/xsplice.h                    |   7 +
>>  xen/xsm/flask/hooks.c                        |   6 +
>>  xen/xsm/flask/policy/access_vectors          |   2 +
>>  11 files changed, 584 insertions(+)
>>  create mode 100644 xen/common/xsplice.c
>>  create mode 100644 xen/include/xen/xsplice.h
>> 
>> diff --git a/tools/flask/policy/policy/modules/xen/xen.te 
> b/tools/flask/policy/policy/modules/xen/xen.te
>> index d35ae22..542c3e1 100644
>> --- a/tools/flask/policy/policy/modules/xen/xen.te
>> +++ b/tools/flask/policy/policy/modules/xen/xen.te
>> @@ -72,6 +72,7 @@ allow dom0_t xen_t:xen2 {
>>  allow dom0_t xen_t:xen2 {
>>      pmu_ctrl
>>      get_symbol
>> +    xsplice_op
>>  };
>>  allow dom0_t xen_t:mmu memorymap;
>>  
>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>> index 60e923c..3780949 100644
>> --- a/xen/arch/arm/Kconfig
>> +++ b/xen/arch/arm/Kconfig
>> @@ -23,6 +23,7 @@ config ARM
>>  	select HAS_PASSTHROUGH
>>  	select HAS_PDX
>>  	select HAS_VIDEO
>> +	select HAS_XSPLICE
>>  
>>  config ARCH_DEFCONFIG
>>  	string
>> diff --git a/xen/arch/x86/Kconfig b/xen/arch/x86/Kconfig
>> index 4781b34..2b6c832 100644
>> --- a/xen/arch/x86/Kconfig
>> +++ b/xen/arch/x86/Kconfig
>> @@ -18,6 +18,7 @@ config X86
>>  	select HAS_PCI
>>  	select HAS_PDX
>>  	select HAS_VGA
>> +	select HAS_XSPLICE
>>  
>>  config ARCH_DEFCONFIG
>>  	string
>> diff --git a/xen/common/Kconfig b/xen/common/Kconfig
>> index eadfc3b..aaf4053 100644
>> --- a/xen/common/Kconfig
>> +++ b/xen/common/Kconfig
>> @@ -51,6 +51,9 @@ config HAS_GDBSX
>>  config HAS_IOPORTS
>>  	bool
>>  
>> +config HAS_XSPLICE
>> +	bool
>> +
>>  # Enable/Disable kexec support
>>  config KEXEC
>>  	bool "kexec support"
>> @@ -97,4 +100,15 @@ config XSM
>>  
>>  	  If unsure, say N.
>>  
>> +# Enable/Disable xsplice support
>> +config XSPLICE
>> +	bool "xsplice support"
>> +	default y
>> +	depends on HAS_XSPLICE
>> +	---help---
>> +	  Allows a running Xen hypervisor to be patched without rebooting.
>> +	  This is primarily used to patch an hypervisor with XSA fixes.
>> +
>> +	  If unsure, say Y.
>> +
>>  endmenu
> 
> I'm indifferent on the HAS_XSPLICE, you can drop that if you want to
> simply stuff.
> 
> 
>> diff --git a/xen/common/Makefile b/xen/common/Makefile
>> index 9f8b214..6fdeccf 100644
>> --- a/xen/common/Makefile
>> +++ b/xen/common/Makefile
>> @@ -71,3 +71,5 @@ subdir-$(coverage) += gcov
>>  
>>  subdir-y += libelf
>>  subdir-$(CONFIG_HAS_DEVICE_TREE) += libfdt
>> +
>> +obj-$(CONFIG_XSPLICE) += xsplice.o
>> diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
>> index a3007b8..55e6cfa 100644
>> --- a/xen/common/sysctl.c
>> +++ b/xen/common/sysctl.c
>> @@ -28,6 +28,7 @@
>>  #include <xsm/xsm.h>
>>  #include <xen/pmstat.h>
>>  #include <xen/gcov.h>
>> +#include <xen/xsplice.h>
>>  
>>  long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
>>  {
>> @@ -460,6 +461,13 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) 
> u_sysctl)
>>          ret = tmem_control(&op->u.tmem_op);
>>          break;
>>  
>> +#ifdef CONFIG_XSPLICE
>> +    case XEN_SYSCTL_xsplice_op:
>> +        ret = xsplice_control(&op->u.xsplice);
>> +        copyback = 1;
>> +        break;
>> +#endif
> 
> Should the case statement still exist and not just return -ENOSYS?
> Otherwise we're needlessly going into arch_do_sysctl() just to get the
> same result.
> 
> 
>> +
>>      default:
>>          ret = arch_do_sysctl(op, u_sysctl);
>>          copyback = 0;
>> diff --git a/xen/common/xsplice.c b/xen/common/xsplice.c
>> new file mode 100644
>> index 0000000..3c6acc3
>> --- /dev/null
>> +++ b/xen/common/xsplice.c
>> @@ -0,0 +1,386 @@
>> +/*
>> + * Copyright (c) 2016 Oracle and/or its affiliates. All rights reserved.
>> + *
>> + */
>> +
>> +#include <xen/guest_access.h>
>> +#include <xen/keyhandler.h>
>> +#include <xen/lib.h>
>> +#include <xen/list.h>
>> +#include <xen/mm.h>
>> +#include <xen/sched.h>
>> +#include <xen/smp.h>
>> +#include <xen/spinlock.h>
>> +#include <xen/xsplice.h>
>> +
>> +#include <asm/event.h>
>> +#include <public/sysctl.h>
>> +
>> +static DEFINE_SPINLOCK(payload_list_lock);
>> +static LIST_HEAD(payload_list);
>> +
>> +static unsigned int payload_cnt;
>> +static unsigned int payload_version = 1;
>> +
>> +struct payload {
>> +    int32_t state;                       /* One of the XSPLICE_STATE_*. */
>> +    int32_t rc;                          /* 0 or -XEN_EXX. */
>> +    struct list_head list;               /* Linked to 'payload_list'. */
>> +    char name[XEN_XSPLICE_NAME_SIZE + 1];/* Name of it. */
>> +};
>> +
>> +static const char *state2str(int32_t state)
>> +{
>> +#define STATE(x) [XSPLICE_STATE_##x] = #x
>> +    static const char *const names[] = {
>> +            STATE(LOADED),
>> +            STATE(CHECKED),
>> +            STATE(APPLIED),
>> +    };
>> +#undef STATE
>> +
>> +    if (state >= ARRAY_SIZE(names))
>> +        return "unknown";
>> +
>> +    if (state < 0)
>> +        return "-EXX";
>> +
>> +    if (!names[state])
>> +        return "unknown";
>> +
>> +    return names[state];
>> +}
>> +
>> +static void xsplice_printall(unsigned char key)
>> +{
>> +    struct payload *data;
>> +
>> +    spin_lock(&payload_list_lock);
>> +
>> +    list_for_each_entry ( data, &payload_list, list )
>> +        printk(" name=%s state=%s(%d)\n", data->name,
>> +               state2str(data->state), data->state);
>> +
>> +    spin_unlock(&payload_list_lock);
>> +}
>> +
>> +static int verify_name(xen_xsplice_name_t *name)
>> +{
>> +    if ( name->size == 0 || name->size > XEN_XSPLICE_NAME_SIZE )
>> +        return -EINVAL;
>> +
>> +    if ( name->pad[0] || name->pad[1] || name->pad[2] )
>> +        return -EINVAL;
>> +
>> +    if ( !guest_handle_okay(name->name, name->size) )
>> +        return -EINVAL;
>> +
>> +    return 0;
>> +}
>> +
>> +static int find_payload(xen_xsplice_name_t *name, bool_t need_lock,
>> +                        struct payload **f)
>> +{
>> +    struct payload *data;
>> +    XEN_GUEST_HANDLE_PARAM(char) str;
>> +    char n[XEN_XSPLICE_NAME_SIZE + 1] = { 0 };
>> +    int rc = -EINVAL;
>> +
>> +    rc = verify_name(name);
>> +    if ( rc )
>> +        return rc;
>> +
>> +    str = guest_handle_cast(name->name, char);
>> +    if ( copy_from_guest(n, str, name->size) )
>> +        return -EFAULT;
>> +
>> +    if ( need_lock )
>> +        spin_lock(&payload_list_lock);
>> +
>> +    rc = -ENOENT;
>> +    list_for_each_entry ( data, &payload_list, list )
>> +    {
>> +        if ( !strcmp(data->name, n) )
>> +        {
>> +            *f = data;
>> +            rc = 0;
>> +            break;
>> +        }
>> +    }
>> +
>> +    if ( need_lock )
>> +        spin_unlock(&payload_list_lock);
>> +
>> +    return rc;
>> +}
>> +
>> +static int verify_payload(xen_sysctl_xsplice_upload_t *upload)
>> +{
>> +    if ( verify_name(&upload->name) )
>> +        return -EINVAL;
>> +
>> +    if ( upload->size == 0 )
>> +        return -EINVAL;
>> +
>> +    if ( !guest_handle_okay(upload->payload, upload->size) )
>> +        return -EFAULT;
>> +
>> +    return 0;
>> +}
>> +
>> +/*
>> + * We MUST be holding the payload_list_lock spinlock.
>> + */
>> +static void free_payload(struct payload *data)
>> +{
>> +    list_del(&data->list);
>> +    payload_cnt--;
>> +    payload_version++;
>> +    xfree(data);
>> +}
>> +
>> +static int xsplice_upload(xen_sysctl_xsplice_upload_t *upload)
>> +{
>> +    struct payload *data = NULL;
>> +    uint8_t *raw_data;
>> +    int rc;
>> +
>> +    rc = verify_payload(upload);
>> +    if ( rc )
>> +        return rc;
>> +
>> +    rc = find_payload(&upload->name, 1 /* true. */, &data);
>> +    if ( rc == 0 /* Found. */ )
>> +        return -EEXIST;
>> +
>> +    if ( rc != -ENOENT )
>> +        return rc;
>> +
>> +    data = xzalloc(struct payload);
>> +    if ( !data )
>> +        return -ENOMEM;
>> +
>> +    memset(data, 0, sizeof *data);
>> +    rc = -EFAULT;
>> +    if ( copy_from_guest(data->name, upload->name.name, upload->name.size) )
>> +        goto err_data;
>> +
>> +    rc = -ENOMEM;
>> +    raw_data = alloc_xenheap_pages(get_order_from_bytes(upload->size), 0);
>> +    if ( !raw_data )
>> +        goto err_data;
>> +
>> +    rc = -EFAULT;
>> +    if ( copy_from_guest(raw_data, upload->payload, upload->size) )
>> +        goto err_raw;
>> +
>> +    data->state = XSPLICE_STATE_LOADED;
>> +    data->rc = 0;
>> +    INIT_LIST_HEAD(&data->list);
>> +
>> +    spin_lock(&payload_list_lock);
>> +    list_add_tail(&data->list, &payload_list);
>> +    payload_cnt++;
>> +    payload_version++;
>> +    spin_unlock(&payload_list_lock);
>> +
>> +    free_xenheap_pages(raw_data, get_order_from_bytes(upload->size));
>> +    return 0;
>> +
>> + err_raw:
>> +    free_xenheap_pages(raw_data, get_order_from_bytes(upload->size));
>> + err_data:
>> +    xfree(data);
>> +    return rc;
>> +}
>> +
>> +static int xsplice_get(xen_sysctl_xsplice_summary_t *summary)
>> +{
>> +    struct payload *data;
>> +    int rc;
>> +
>> +    if ( summary->status.state )
>> +        return -EINVAL;
>> +
>> +    if ( summary->status.rc != 0 )
>> +        return -EINVAL;
>> +
>> +    rc = verify_name(&summary->name);
>> +    if ( rc )
>> +        return rc;
>> +
>> +    rc = find_payload(&summary->name, 1 /* true. */, &data);
>> +    if ( rc )
>> +        return rc;
>> +
>> +    summary->status.state = data->state;
>> +    summary->status.rc = data->rc;
>> +
>> +    return 0;
>> +}
>> +
>> +static int xsplice_list(xen_sysctl_xsplice_list_t *list)
>> +{
>> +    xen_xsplice_status_t status;
>> +    struct payload *data;
>> +    unsigned int idx = 0, i = 0;
>> +    int rc = 0;
>> +
>> +    if ( list->nr > 1024 )
>> +        return -E2BIG;
>> +
>> +    if ( list->pad != 0 )
>> +        return -EINVAL;
>> +
>> +    if ( !guest_handle_okay(list->status, sizeof(status) * list->nr) ||
>> +         !guest_handle_okay(list->name, XEN_XSPLICE_NAME_SIZE * list->nr) ||
>> +         !guest_handle_okay(list->len, sizeof(uint32_t) * list->nr) )
>> +        return -EINVAL;
>> +
>> +    spin_lock(&payload_list_lock);
>> +    if ( list->idx > payload_cnt || !list->nr )
>> +    {
>> +        spin_unlock(&payload_list_lock);
>> +        return -EINVAL;
>> +    }
>> +
>> +    list_for_each_entry( data, &payload_list, list )
>> +    {
>> +        uint32_t len;
>> +
>> +        if ( list->idx > i++ )
>> +            continue;
>> +
>> +        status.state = data->state;
>> +        status.rc = data->rc;
>> +        len = strlen(data->name);
>> +
>> +        /* N.B. 'idx' != 'i'. */
>> +        if ( __copy_to_guest_offset(list->name, idx * XEN_XSPLICE_NAME_SIZE,
>> +                                    data->name, len) ||
>> +             __copy_to_guest_offset(list->len, idx, &len, 1) ||
>> +             __copy_to_guest_offset(list->status, idx, &status, 1) )
>> +        {
>> +            rc = -EFAULT;
>> +            break;
>> +        }
>> +        idx++;
>> +        if ( hypercall_preempt_check() || (idx + 1 > list->nr) )
>> +            break;
>> +    }
>> +    list->nr = payload_cnt - i; /* Remaining amount. */
>> +    list->version = payload_version;
>> +    spin_unlock(&payload_list_lock);
>> +
>> +    /* And how many we have processed. */
>> +    return rc ? : idx;
>> +}
>> +
>> +static int xsplice_action(xen_sysctl_xsplice_action_t *action)
>> +{
>> +    struct payload *data;
>> +    int rc;
>> +
>> +    rc = verify_name(&action->name);
>> +    if ( rc )
>> +        return rc;
>> +
>> +    spin_lock(&payload_list_lock);
>> +    rc = find_payload(&action->name, 0 /* We are holding the lock. */, &data);
>> +    if ( rc )
>> +        goto out;
>> +
>> +    switch ( action->cmd )
>> +    {
>> +    case XSPLICE_ACTION_CHECK:
>> +        if ( (data->state == XSPLICE_STATE_LOADED) ||
>> +             (data->state == XSPLICE_STATE_CHECKED) )
>> +        {
>> +            /* No implementation yet. */
>> +            data->state = XSPLICE_STATE_CHECKED;
>> +            data->rc = 0;
>> +            rc = 0;
>> +        }
>> +        break;
>> +    case XSPLICE_ACTION_UNLOAD:
>> +        if ( (data->state == XSPLICE_STATE_LOADED) ||
>> +             (data->state == XSPLICE_STATE_CHECKED) )
>> +        {
>> +            free_payload(data);
>> +            /* No touching 'data' from here on! */
>> +            rc = 0;
>> +        }
>> +        break;
>> +    case XSPLICE_ACTION_REVERT:
>> +        if ( data->state == XSPLICE_STATE_APPLIED )
>> +        {
>> +            /* No implementation yet. */
>> +            data->state = XSPLICE_STATE_CHECKED;
>> +            data->rc = 0;
>> +            rc = 0;
>> +        }
>> +        break;
>> +    case XSPLICE_ACTION_APPLY:
>> +        if ( (data->state == XSPLICE_STATE_CHECKED) )
>> +        {
>> +            /* No implementation yet. */
>> +            data->state = XSPLICE_STATE_APPLIED;
>> +            data->rc = 0;
>> +            rc = 0;
>> +        }
>> +        break;
>> +    case XSPLICE_ACTION_REPLACE:
>> +        if ( data->state == XSPLICE_STATE_CHECKED )
>> +        {
>> +            /* No implementation yet. */
>> +            data->state = XSPLICE_STATE_CHECKED;
>> +            data->rc = 0;
>> +            rc = 0;
>> +        }
>> +        break;
>> +    default:
>> +        rc = -EOPNOTSUPP;
>> +        break;
>> +    }
>> +
>> + out:
>> +    spin_unlock(&payload_list_lock);
>> +
>> +    return rc;
>> +}
>> +
>> +int xsplice_control(xen_sysctl_xsplice_op_t *xsplice)
>> +{
>> +    int rc;
>> +
>> +    if ( xsplice->pad != 0 )
>> +        return -EINVAL;
>> +
>> +    switch ( xsplice->cmd )
>> +    {
>> +    case XEN_SYSCTL_XSPLICE_UPLOAD:
>> +        rc = xsplice_upload(&xsplice->u.upload);
>> +        break;
>> +    case XEN_SYSCTL_XSPLICE_GET:
>> +        rc = xsplice_get(&xsplice->u.get);
>> +        break;
>> +    case XEN_SYSCTL_XSPLICE_LIST:
>> +        rc = xsplice_list(&xsplice->u.list);
>> +        break;
>> +    case XEN_SYSCTL_XSPLICE_ACTION:
>> +        rc = xsplice_action(&xsplice->u.action);
>> +        break;
>> +    default:
>> +        rc = -EOPNOTSUPP;
>> +        break;
>> +   }
>> +
>> +    return rc;
>> +}
>> +
>> +static int __init xsplice_init(void)
>> +{
>> +    register_keyhandler('x', xsplice_printall, "print xsplicing info", 1);
>> +    return 0;
>> +}
>> +__initcall(xsplice_init);
>> diff --git a/xen/include/public/sysctl.h b/xen/include/public/sysctl.h
>> index 96680eb..0b0b879 100644
>> --- a/xen/include/public/sysctl.h
>> +++ b/xen/include/public/sysctl.h
>> @@ -766,6 +766,160 @@ struct xen_sysctl_tmem_op {
>>  typedef struct xen_sysctl_tmem_op xen_sysctl_tmem_op_t;
>>  DEFINE_XEN_GUEST_HANDLE(xen_sysctl_tmem_op_t);
>>  
>> +/*
>> + * XEN_SYSCTL_XSPLICE_op
>> + *
>> + * Refer to the 
> http://xenbits.xenproject.org/docs/unstable/misc/xsplice.html 
>> + * for the design details of this hyprcall.
>> + */
>> +
>> +/*
>> + * Structure describing an ELF payload. Uniquely identifies the
>> + * payload. Should be human readable.
>> + * Recommended length is upto XEN_XSPLICE_NAME_SIZE.
>> + */
>> +#define XEN_XSPLICE_NAME_SIZE 128
>> +struct xen_xsplice_name {
>> +    XEN_GUEST_HANDLE_64(char) name;         /* IN: pointer to name. */
>> +    uint16_t size;                          /* IN: size of name. May be 
> upto
>> +                                               XEN_XSPLICE_NAME_SIZE. */
>> +    uint16_t pad[3];                        /* IN: MUST be zero. */
>> +};
>> +typedef struct xen_xsplice_name xen_xsplice_name_t;
>> +DEFINE_XEN_GUEST_HANDLE(xen_xsplice_name_t);
>> +
>> +/*
>> + * Upload a payload to the hypervisor. The payload is verified
>> + * against basic checks and if there are any issues the proper return code
>> + * will be returned. The payload is not applied at this time - that is
>> + * controlled by XEN_SYSCTL_XSPLICE_ACTION.
>> + *
>> + * The return value is zero if the payload was succesfully uploaded.
>> + * Otherwise an EXX return value is provided. Duplicate `name` are not
>> + * supported.
>> + *
>> + * The payload at this point is verified against the basic checks.
>> + *
>> + * The `payload` is the ELF payload as mentioned in the `Payload format`
>> + * section in the xSplice design document.
>> + */
>> +#define XEN_SYSCTL_XSPLICE_UPLOAD 0
>> +struct xen_sysctl_xsplice_upload {
>> +    xen_xsplice_name_t name;                /* IN, name of the patch. */
>> +    uint64_t size;                          /* IN, size of the ELF file. */
>> +    XEN_GUEST_HANDLE_64(uint8) payload;     /* IN, the ELF file. */
>> +};
>> +typedef struct xen_sysctl_xsplice_upload xen_sysctl_xsplice_upload_t;
>> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_xsplice_upload_t);
>> +
>> +/*
>> + * Retrieve an status of an specific payload.
>> + *
>> + * Upon completion the `struct xen_xsplice_status` is updated.
>> + *
>> + * The return value is zero on success and XEN_EXX on failure. This 
> operation
>> + * is synchronous and does not require preemption.
>> + */
>> +#define XEN_SYSCTL_XSPLICE_GET 1
>> +
>> +struct xen_xsplice_status {
>> +#define XSPLICE_STATE_LOADED       1
>> +#define XSPLICE_STATE_CHECKED      2
>> +#define XSPLICE_STATE_APPLIED      3
>> +    int32_t state;                 /* OUT: XSPLICE_STATE_*. IN: MUST be 
> zero. */
>> +    int32_t rc;                    /* OUT: 0 if no error, otherwise 
> -XEN_EXX. */
>> +                                   /* IN: MUST be zero. */
>> +};
>> +typedef struct xen_xsplice_status xen_xsplice_status_t;
>> +DEFINE_XEN_GUEST_HANDLE(xen_xsplice_status_t);
>> +
>> +struct xen_sysctl_xsplice_summary {
>> +    xen_xsplice_name_t name;                /* IN, name of the payload. */
>> +    xen_xsplice_status_t status;            /* IN/OUT, state of it. */
>> +};
>> +typedef struct xen_sysctl_xsplice_summary xen_sysctl_xsplice_summary_t;
>> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_xsplice_summary_t);
>> +
>> +/*
>> + * Retrieve an array of abbreviated status and names of payloads that are
>> + * loaded in the hypervisor.
>> + *
>> + * If the hypercall returns an positive number, it is the number (up to 
> `nr`)
>> + * of the payloads returned, along with `nr` updated with the number of 
> remaining
>> + * payloads, `version` updated (it may be the same across hypercalls. If it
>> + * varies the data is stale and further calls could fail). The `status`,
>> + * `name`, and `len`' are updated at their designed index value (`idx`) 
> with
>> + * the returned value of data.
>> + *
>> + * If the hypercall returns E2BIG the `nr` is too big and should be
>> + * lowered.
>> + *
>> + * This operation can be preempted by the hypercall returning EAGAIN.
>> + * Retry.
>> + *
>> + * Note that due to the asynchronous nature of hypercalls the domain might 
> have
>> + * added or removed the number of payloads making this information stale. 
> It is
>> + * the responsibility of the toolstack to use the `version` field to check
>> + * between each invocation. if the version differs it should discard the 
> stale
>> + * data and start from scratch. It is OK for the toolstack to use the new
>> + * `version` field.
>> + */
>> +#define XEN_SYSCTL_XSPLICE_LIST 2
>> +struct xen_sysctl_xsplice_list {
>> +    uint32_t version;                       /* IN/OUT: Initially *MUST* be 
> zero.
>> +                                               On subsequent calls reuse 
> value.
>> +                                               If varies between calls, we 
> are
>> +                                             * getting stale data. */
>> +    uint32_t idx;                           /* IN/OUT: Index into array. */
>> +    uint32_t nr;                            /* IN: How many status, id, and 
> len
>> +                                               should fill out.
>> +                                               OUT: How many payloads left. 
> */
>> +    uint32_t pad;                           /* IN: Must be zero. */
>> +    XEN_GUEST_HANDLE_64(xen_xsplice_status_t) status;  /* OUT. Must have 
> enough
>> +                                               space allocate for nr of 
> them. */
>> +    XEN_GUEST_HANDLE_64(char) name;         /* OUT: Array of ids. Each 
> member
>> +                                               MUST XEN_XSPLICE_NAME_SIZE 
> in size.
>> +                                               Must have nr of them. */
>> +    XEN_GUEST_HANDLE_64(uint32) len;        /* OUT: Array of lengths of 
> ids.
>> +                                               Must have nr of them. */
>> +};
>> +typedef struct xen_sysctl_xsplice_list xen_sysctl_xsplice_list_t;
>> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_xsplice_list_t);
>> +
>> +/*
>> + * Perform an operation on the payload structure referenced by the `name` 
> field.
>> + * The operation request is asynchronous and the status should be retrieved
>> + * by using either XEN_SYSCTL_XSPLICE_GET or XEN_SYSCTL_XSPLICE_LIST 
> hypercall.
>> + */
>> +#define XEN_SYSCTL_XSPLICE_ACTION 3
>> +struct xen_sysctl_xsplice_action {
>> +    xen_xsplice_name_t name;                /* IN, name of the patch. */
>> +#define XSPLICE_ACTION_CHECK        1
>> +#define XSPLICE_ACTION_UNLOAD       2
>> +#define XSPLICE_ACTION_REVERT       3
>> +#define XSPLICE_ACTION_APPLY        4
>> +#define XSPLICE_ACTION_REPLACE      5
>> +    uint32_t cmd;                           /* IN: XSPLICE_ACTION_*. */
>> +    uint32_t timeout;                       /* IN: Zero if no timeout. */
>> +                                            /* Or upper bound of time (ms) 
> */
>> +                                            /* for operation to take. */
>> +};
>> +typedef struct xen_sysctl_xsplice_action xen_sysctl_xsplice_action_t;
>> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_xsplice_action_t);
>> +
>> +struct xen_sysctl_xsplice_op {
>> +    uint32_t cmd;                           /* IN: XEN_SYSCTL_XSPLICE_*. */
>> +    uint32_t pad;                           /* IN: Always zero. */
>> +    union {
>> +        xen_sysctl_xsplice_upload_t upload;
>> +        xen_sysctl_xsplice_list_t list;
>> +        xen_sysctl_xsplice_summary_t get;
>> +        xen_sysctl_xsplice_action_t action;
>> +    } u;
>> +};
>> +typedef struct xen_sysctl_xsplice_op xen_sysctl_xsplice_op_t;
>> +DEFINE_XEN_GUEST_HANDLE(xen_sysctl_xsplice_op_t);
>> +
>>  struct xen_sysctl {
>>      uint32_t cmd;
>>  #define XEN_SYSCTL_readconsole                    1
>> @@ -791,6 +945,7 @@ struct xen_sysctl {
>>  #define XEN_SYSCTL_pcitopoinfo                   22
>>  #define XEN_SYSCTL_psr_cat_op                    23
>>  #define XEN_SYSCTL_tmem_op                       24
>> +#define XEN_SYSCTL_xsplice_op                    25
>>      uint32_t interface_version; /* XEN_SYSCTL_INTERFACE_VERSION */
>>      union {
>>          struct xen_sysctl_readconsole       readconsole;
>> @@ -816,6 +971,7 @@ struct xen_sysctl {
>>          struct xen_sysctl_psr_cmt_op        psr_cmt_op;
>>          struct xen_sysctl_psr_cat_op        psr_cat_op;
>>          struct xen_sysctl_tmem_op           tmem_op;
>> +        struct xen_sysctl_xsplice_op        xsplice;
>>          uint8_t                             pad[128];
>>      } u;
>>  };
>> diff --git a/xen/include/xen/xsplice.h b/xen/include/xen/xsplice.h
>> new file mode 100644
>> index 0000000..2cb2035
>> --- /dev/null
>> +++ b/xen/include/xen/xsplice.h
>> @@ -0,0 +1,7 @@
>> +#ifndef __XEN_XSPLICE_H__
>> +#define __XEN_XSPLICE_H__
>> +
>> +struct xen_sysctl_xsplice_op;
>> +int xsplice_control(struct xen_sysctl_xsplice_op *);
>> +
>> +#endif /* __XEN_XSPLICE_H__ */
>> diff --git a/xen/xsm/flask/hooks.c b/xen/xsm/flask/hooks.c
>> index 9b7de30..5346dcf 100644
>> --- a/xen/xsm/flask/hooks.c
>> +++ b/xen/xsm/flask/hooks.c
>> @@ -807,6 +807,12 @@ static int flask_sysctl(int cmd)
>>      case XEN_SYSCTL_tmem_op:
>>          return domain_has_xen(current->domain, XEN__TMEM_CONTROL);
>>  
>> +#ifdef CONFIG_XSPLICE
>> +    case XEN_SYSCTL_xsplice_op:
>> +        return avc_current_has_perm(SECINITSID_XEN, SECCLASS_XEN2,
>> +                                    XEN2__XSPLICE_OP, NULL);
>> +#endif
>> +
>>      default:
>>          printk("flask_sysctl: Unknown op %d\n", cmd);
>>          return -EPERM;
>> diff --git a/xen/xsm/flask/policy/access_vectors 
> b/xen/xsm/flask/policy/access_vectors
>> index effb59f..5f08d05 100644
>> --- a/xen/xsm/flask/policy/access_vectors
>> +++ b/xen/xsm/flask/policy/access_vectors
>> @@ -93,6 +93,8 @@ class xen2
>>      pmu_ctrl
>>  # PMU use (domains, including unprivileged ones, will be using this 
> operation)
>>      pmu_use
>> +# XEN_SYSCTL_xsplice_op
>> +    xsplice_op
>>  }
>>  
>>  # Classes domain and domain2 consist of operations that a domain performs 
> on
>> 
> 
> 
> -- 
> Doug Goldstein

Please trim your replies.

Thanks, Jan

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v2 03/13] xen/xsplice: Hypervisor implementation of XEN_XSPLICE_op (v7)
  2016-02-06 22:35   ` Doug Goldstein
  2016-02-09  8:28     ` Jan Beulich
@ 2016-02-09 14:39     ` Konrad Rzeszutek Wilk
  1 sibling, 0 replies; 45+ messages in thread
From: Konrad Rzeszutek Wilk @ 2016-02-09 14:39 UTC (permalink / raw)
  To: Doug Goldstein
  Cc: wei.liu2, ian.campbell, andrew.cooper3, ian.jackson, mpohlack,
	ross.lagerwall, stefano.stabellini, jbeulich, sasha.levin,
	xen-devel

> > +# Enable/Disable xsplice support
> > +config XSPLICE
> > +	bool "xsplice support"
> > +	default y
> > +	depends on HAS_XSPLICE
> > +	---help---
> > +	  Allows a running Xen hypervisor to be patched without rebooting.
> > +	  This is primarily used to patch an hypervisor with XSA fixes.
> > +
> > +	  If unsure, say Y.
> > +
> >  endmenu
> 
> I'm indifferent on the HAS_XSPLICE, you can drop that if you want to
> simply stuff.

OK, let me remove it. Thanks!
.. snip..
> > +#ifdef CONFIG_XSPLICE
> > +    case XEN_SYSCTL_xsplice_op:
> > +        ret = xsplice_control(&op->u.xsplice);
> > +        copyback = 1;
> > +        break;
> > +#endif
> 
> Should the case statement still exist and not just return -ENOSYS?
> Otherwise we're needlessly going into arch_do_sysctl() just to get the
> same result.

There is some other code in that function, such as:

398 
399 #ifdef TEST_COVERAGE
400     case XEN_SYSCTL_coverage_op:
401         ret = sysctl_coverage_op(&op->u.coverage_op);
402         break;
403 #endif


Which follows that pattern. I figured I would the same thing as the rest
of the code around it. It is not that neat ..

But I could also make this a bit neater and have the header file return
-ENOSYS, like this: 

diff --git a/xen/common/sysctl.c b/xen/common/sysctl.c
index b81b8cd..a8355c5 100644
--- a/xen/common/sysctl.c
+++ b/xen/common/sysctl.c
@@ -461,12 +461,11 @@ long do_sysctl(XEN_GUEST_HANDLE_PARAM(xen_sysctl_t) u_sysctl)
         ret = tmem_control(&op->u.tmem_op);
         break;
 
-#ifdef CONFIG_XSPLICE
     case XEN_SYSCTL_xsplice_op:
         ret = xsplice_control(&op->u.xsplice);
-        copyback = 1;
+        if ( ret != -ENOSYS )
+            copyback = 1;
         break;
-#endif
 
     default:
         ret = arch_do_sysctl(op, u_sysctl);
diff --git a/xen/include/xen/xsplice.h b/xen/include/xen/xsplice.h
index 3c1b2e4..a0ae95e 100644
--- a/xen/include/xen/xsplice.h
+++ b/xen/include/xen/xsplice.h
@@ -23,15 +23,18 @@ struct xsplice_patch_func {
     uint8_t pad[24];
 };
 
-int xsplice_control(struct xen_sysctl_xsplice_op *);
-
 #ifdef CONFIG_XSPLICE
+int xsplice_control(struct xen_sysctl_xsplice_op *);
 void do_xsplice(void);
 struct bug_frame *xsplice_find_bug(const char *eip, int *id);
 bool_t is_module(const void *addr);
 bool_t is_active_module_text(unsigned long addr);
 unsigned long search_module_extables(unsigned long addr);
 #else
+static inline xsplice_control(struct xen_sysctl_xsplice_op *)
+{
+    return -ENOSYS;
+}
 static inline void do_xsplice(void) { };
 static inline struct bug_frame *xsplice_find_bug(const char *eip, int *id)
 {

Yeah, let me roll that in.
Thanks!

^ permalink raw reply related	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2016-02-09 14:39 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-14 21:46 [PATCH v2] xSplice v1 implementation Konrad Rzeszutek Wilk
2016-01-14 21:46 ` [PATCH v2 01/13] xsplice: Design document (v5) Konrad Rzeszutek Wilk
2016-01-19 11:14   ` Wei Liu
2016-01-19 14:31   ` Ross Lagerwall
2016-02-05 18:27     ` Konrad Rzeszutek Wilk
2016-02-05 18:34     ` Konrad Rzeszutek Wilk
2016-02-05 15:25   ` Jan Beulich
2016-02-05 21:47     ` Konrad Rzeszutek Wilk
2016-02-09  8:25       ` Jan Beulich
2016-01-14 21:47 ` [PATCH v2 02/13] hypervisor/arm/keyhandler: Declare struct cpu_user_regs; Konrad Rzeszutek Wilk
2016-01-14 21:47 ` [PATCH v2 03/13] xen/xsplice: Hypervisor implementation of XEN_XSPLICE_op (v7) Konrad Rzeszutek Wilk
2016-01-19 14:30   ` Ross Lagerwall
2016-02-06 22:35   ` Doug Goldstein
2016-02-09  8:28     ` Jan Beulich
2016-02-09 14:39     ` Konrad Rzeszutek Wilk
2016-01-14 21:47 ` [PATCH v2 04/13] libxc: Implementation of XEN_XSPLICE_op in libxc (v4) Konrad Rzeszutek Wilk
2016-01-19 11:14   ` Wei Liu
2016-01-14 21:47 ` [PATCH v2 05/13] xen-xsplice: Tool to manipulate xsplice payloads (v3) Konrad Rzeszutek Wilk
2016-01-19 11:14   ` Wei Liu
2016-01-19 14:30   ` Ross Lagerwall
2016-01-14 21:47 ` [PATCH v2 06/13] elf: Add relocation types to elfstructs.h Konrad Rzeszutek Wilk
2016-01-14 21:47 ` [PATCH v2 07/13] xsplice: Add helper elf routines (v2) Konrad Rzeszutek Wilk
2016-01-19 14:33   ` Ross Lagerwall
2016-02-05 18:38     ` Konrad Rzeszutek Wilk
2016-02-05 20:34       ` Konrad Rzeszutek Wilk
2016-01-14 21:47 ` [PATCH v2 08/13] xsplice: Implement payload loading (v2) Konrad Rzeszutek Wilk
2016-01-19 14:34   ` Ross Lagerwall
2016-01-19 16:59     ` Konrad Rzeszutek Wilk
2016-01-25 11:21       ` Ross Lagerwall
2016-01-19 16:45   ` Ross Lagerwall
2016-01-14 21:47 ` [PATCH v2 09/13] xsplice: Implement support for applying/reverting/replacing patches. (v2) Konrad Rzeszutek Wilk
2016-01-19 14:39   ` Ross Lagerwall
2016-01-19 16:55     ` Konrad Rzeszutek Wilk
2016-01-25 11:43       ` Ross Lagerwall
2016-02-05 19:30         ` Konrad Rzeszutek Wilk
2016-01-14 21:47 ` [PATCH v2 10/13] xen_hello_world.xsplice: Test payload for patching 'xen_extra_version' Konrad Rzeszutek Wilk
2016-01-19 11:14   ` Wei Liu
2016-01-19 14:57   ` Ross Lagerwall
2016-01-19 16:47   ` Ross Lagerwall
2016-01-14 21:47 ` [PATCH v2 11/13] xsplice: Add support for bug frames. (v2) Konrad Rzeszutek Wilk
2016-01-19 14:42   ` Ross Lagerwall
2016-01-14 21:47 ` [PATCH v2 12/13] xsplice: Add support for exception tables. (v2) Konrad Rzeszutek Wilk
2016-01-14 21:47 ` [PATCH v2 13/13] xsplice: Add support for alternatives Konrad Rzeszutek Wilk
2016-01-15 16:58 ` [PATCH v2] xSplice v1 implementation Konrad Rzeszutek Wilk
2016-01-25 11:57   ` Ross Lagerwall

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.