* [PATCH v15 00/11] KVM//x86/arm/arm64: dirty page logging for ARMv7/8 (3.18.0-rc2)
@ 2014-12-15  7:27 ` Mario Smarduch
  0 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:27 UTC (permalink / raw)
  To: pbonzini, james.hogan, christoffer.dall, agraf, marc.zyngier,
	cornelia.huck, borntraeger, catalin.marinas
  Cc: kvmarm, kvm, kvm-ppc, kvm-ia64, linux-arm-kernel, steve.capper,
	peter.maydell, Mario Smarduch

This patch series adds support for armv7/8 dirty page logging. It also moves
towards a generic dirty page logging interface, moving some common code into a
generic layer now shared by x86, armv7 and armv8.

armv7/8 dirty page logging implementation overview:
- initially write protect the memory region's 2nd stage page tables
- read the dirty page log and again write protect the dirty pages for the next
  pass (a sketch of the userspace read-and-copy loop follows below)
- 2nd stage huge pages are dissolved into small pages to keep track of
  dirty memory at page granularity. Tracking at huge page granularity limits
  migration to an almost idle system, while small page logging supports higher
  memory dirty rates. armv7 supports only 2MB huge pages; armv8 may support
  2MB with the kernel configured for 4k pages, and 512MB for 64k pages.
  Additional logic has been included to support PUD-sized 2nd stage 1GB huge
  pages, which apply to 4k pages with a 48-bit address space. The host kernel
  and ARM KVM support 2MB and 512MB huge pages.
- in the event migration is canceled, normal behavior resumes and huge pages
  are rebuilt over time.

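For reference, a minimal sketch of the userspace side of one pre-copy pass,
reading the dirty page log for a memslot with the KVM_GET_DIRTY_LOG ioctl and
recopying the reported pages. This is illustrative only and not part of the
series; vm_fd, slot, npages and copy_one_page() are assumed to be provided by
the surrounding migration code.

#include <linux/kvm.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Assumed helper: recopy one guest page (gfn) to the destination. */
extern void copy_one_page(unsigned long gfn);

static void precopy_pass(int vm_fd, uint32_t slot, unsigned long npages)
{
	unsigned long nbits = 8 * sizeof(unsigned long);
	unsigned long nwords = (npages + nbits - 1) / nbits;
	unsigned long bitmap[nwords];
	struct kvm_dirty_log log;
	unsigned long i, j;

	memset(bitmap, 0, sizeof(bitmap));
	memset(&log, 0, sizeof(log));
	log.slot = slot;
	log.dirty_bitmap = bitmap;

	/* Snapshot and clear the log; with this series the reported pages are
	 * also write protected again so the next pass sees new writes. */
	if (ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log) < 0)
		return;

	for (i = 0; i < nwords; i++) {
		if (!bitmap[i])
			continue;
		for (j = 0; j < nbits; j++)
			if (bitmap[i] & (1UL << j))
				copy_one_page(i * nbits + j);
	}
}
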
Testing:
- ARMv7: 
  o Fast Models live migration and the shared memory mmio test described below.
    For both, correctness is validated by comparing checksums of the source and
    destination file copies on both ends. Precise tests with instructions
    will appear shortly at:
	https://github.com/mjsmar/arm-dirtylog-tests
  o To test migration, Christoffer's "Fix vgic initialization problems" patches
    need to be applied:
    https://lists.cs.columbia.edu/pipermail/kvmarm/2014-December/012809.html
    You can try to validate without the patches (through checksums), but the
    destination VM will not be responsive.
  o Tested with 2MB huge pages and 4k pages.

- ARMv8:
  o Migration is not currently supported on ARMv8, so another method is used to
    validate dirty page logging. Foundation Model 9.0.xx was used for testing.
    Again, details will appear at:
	https://github.com/mjsmar/arm-dirtylog-tests

  o Test description:
    - Added an mmio device to the QEMU 'virt' machine with on-board memory
      (8MB in this case). The device memory is a POSIX shared memory segment
      visible to the host, and dirty logging is enabled for that memslot.
    - Added a memslot migration thread to export the dirty bitmap to the host.
    - Implemented a memory migration thread on the host.
  
  o Operation:
    - On the guest, an application mmap()s the region and writes to it.
    - The host migration thread does a pre-copy of /dev/shm/aeshmem to a host
      file, then repeatedly requests the memory region dirty page log from QEMU
      and incrementally copies the dirty pages from /dev/shm/aeshmem to the
      host file (a sketch of such an incremental copy pass follows below).
    - The guest application is stopped, and both /dev/shm/aeshmem and the host
      file are checksummed; a match confirms the dirty page log drove complete
      incremental updates, validating dirty page logging.
    - Tested with 2MB huge pages and 64k pages.
    - 512MB huge pages not tested yet due to hardware limitations.
    - 1GB huge pages not tested; this will require a customized setup and
      hardcoding in the kernel.
  o To test, a modified QEMU is needed to map the VM GICC at the same offset as
    the Foundation Model's gic-v3 GICV (thanks to Marc's insight); currently
    QEMU hardcodes GICC to a 64KB-aligned page.
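
For illustration, a minimal sketch of one incremental copy pass of the test
above: pages marked dirty in a bitmap (obtained from QEMU's dirty page log for
the memslot) are recopied from /dev/shm/aeshmem to the host file at the same
offsets. The destination file name and the way the bitmap is obtained are
assumptions, not part of the actual test code.

#include <fcntl.h>
#include <unistd.h>

static void copy_dirty_pages(const unsigned long *bitmap, unsigned long npages,
			     long page_size)
{
	unsigned long nbits = 8 * sizeof(unsigned long);
	int src = open("/dev/shm/aeshmem", O_RDONLY);
	int dst = open("aeshmem.copy", O_WRONLY | O_CREAT, 0644); /* assumed name */
	char buf[page_size];
	unsigned long i;

	for (i = 0; i < npages; i++) {
		if (!(bitmap[i / nbits] & (1UL << (i % nbits))))
			continue;	/* page not dirtied since last pass */
		pread(src, buf, page_size, (off_t)i * page_size);
		pwrite(dst, buf, page_size, (off_t)i * page_size);
	}
	close(src);
	close(dst);
}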

Changes since v14:
- Fixed a bug referencing the 2nd stage pmd pfn instead of the IPA when
  flushing the 2nd stage TLB.
- Fixed initial write protect to include the KVM_MR_MOVE case.
- Fixed a timing issue between TLB flush and completion on other CPUs.
- Added PUD write protect and clear.
- Refactored some code in kvm/mmu.c due to the 3rd issue above.
- Combined the armv7 and armv8 patches into one series.
- Reworded the description for kvm_vm_ioctl_get_dirty_log(), applied Paolo's
  changes.
- Rebased to 3.18.0-rc2.

Changes since v13:
- Addressed comments from Cornelia, Paolo, Marc, and Christoffer.
- The most significant change is reducing the number of arguments to
  stage2_set_pte().
- Another is introducing a Kconfig symbol for the generic
  kvm_get_dirty_log_protect().

Changes since v12:
- Addressed Paolo's and James Hogan's comments to extend kvm_get_dirty_log(),
  making it more generic by adding write protection in addition to dirty
  bitmap handling. This led to the new generic function
  kvm_get_dirty_log_protect().

Changes since v11:
- Implemented Alex's comments to simplify the generic layer.

Changes since v10:
- Addressed wanghaibin's comments.
- Addressed Christoffer's comments.

Changes since v9:
- Split patches into generic and architecture-specific variants for TLB
  flushing and dirty log read (patches 1,2 & 3,4,5,6).
- Rebased to 3.16.0-rc1.
- Applied Christoffer's comments.

Mario Smarduch (10):
  KVM: Add architecture-defined TLB flush support
  KVM: Add generic support for dirty page logging
  KVM: arm: Add ARMv7 API to flush TLBs
  KVM: arm: Add initial dirty page locking support
  KVM: arm: dirty logging write protect support
  KVM: arm: page logging 2nd stage fault handling
  KVM: arm64: ARMv8 header changes for page logging
  KVM: arm64: Add HYP interface to flush 1st/2nd stage
  KVM: arm/arm64: Enable Dirty Page logging for ARMv8
  KVM: arm/arm64: Add support to dissolve huge PUD

Paolo Bonzini (1):
  KVM: x86: switch to kvm_get_dirty_log_protect

 arch/arm/include/asm/kvm_asm.h         |   1 +
 arch/arm/include/asm/kvm_host.h        |   2 +
 arch/arm/include/asm/kvm_mmu.h         |  29 ++++
 arch/arm/include/asm/pgtable-3level.h  |   1 +
 arch/arm/kvm/Kconfig                   |   2 +
 arch/arm/kvm/arm.c                     |  32 +++-
 arch/arm/kvm/interrupts.S              |  11 ++
 arch/arm/kvm/mmu.c                     | 304 ++++++++++++++++++++++++++++++++-
 arch/arm64/include/asm/kvm_asm.h       |   1 +
 arch/arm64/include/asm/kvm_host.h      |   1 +
 arch/arm64/include/asm/kvm_mmu.h       |  30 ++++
 arch/arm64/include/asm/pgtable-hwdef.h |   4 +
 arch/arm64/kvm/Kconfig                 |   2 +
 arch/arm64/kvm/hyp.S                   |  22 +++
 arch/x86/include/asm/kvm_host.h        |   3 -
 arch/x86/kvm/Kconfig                   |   1 +
 arch/x86/kvm/mmu.c                     |   4 +-
 arch/x86/kvm/x86.c                     |  72 ++------
 include/linux/kvm_host.h               |   9 +
 virt/kvm/Kconfig                       |   9 +
 virt/kvm/kvm_main.c                    |  82 +++++++++
 21 files changed, 549 insertions(+), 73 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 110+ messages in thread

* [PATCH v15 01/11] KVM: Add architecture-defined TLB flush support
  2014-12-15  7:27 ` Mario Smarduch
@ 2014-12-15  7:27   ` Mario Smarduch
  -1 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:27 UTC (permalink / raw)
  To: pbonzini, james.hogan, christoffer.dall, agraf, marc.zyngier,
	cornelia.huck, borntraeger, catalin.marinas
  Cc: kvmarm, kvm, kvm-ppc, kvm-ia64, linux-arm-kernel, steve.capper,
	peter.maydell, Mario Smarduch

Allow architectures to override the generic kvm_flush_remote_tlbs()
function via HAVE_KVM_ARCH_TLB_FLUSH_ALL. ARMv7 will need this to
provide its own TLB flush interface.
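
As an illustration (not part of this patch), an architecture that selects
HAVE_KVM_ARCH_TLB_FLUSH_ALL in its KVM Kconfig compiles out the generic
definition above and supplies its own, roughly along these lines; the inner
helper name is a placeholder, the real ARM implementation comes later in this
series:

void kvm_flush_remote_tlbs(struct kvm *kvm)
{
	/* Invalidate this VM's stage-2 TLB entries on all CPUs. */
	arch_invalidate_stage2_tlbs(kvm);	/* placeholder */
}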

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 virt/kvm/Kconfig    | 3 +++
 virt/kvm/kvm_main.c | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index fc0c5e6..3796a21 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -37,3 +37,6 @@ config HAVE_KVM_CPU_RELAX_INTERCEPT
 
 config KVM_VFIO
        bool
+
+config HAVE_KVM_ARCH_TLB_FLUSH_ALL
+       bool
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 3cee7b1..51e9dfa 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -185,6 +185,7 @@ bool kvm_make_all_cpus_request(struct kvm *kvm, unsigned int req)
 	return called;
 }
 
+#ifndef CONFIG_HAVE_KVM_ARCH_TLB_FLUSH_ALL
 void kvm_flush_remote_tlbs(struct kvm *kvm)
 {
 	long dirty_count = kvm->tlbs_dirty;
@@ -195,6 +196,7 @@ void kvm_flush_remote_tlbs(struct kvm *kvm)
 	cmpxchg(&kvm->tlbs_dirty, dirty_count, 0);
 }
 EXPORT_SYMBOL_GPL(kvm_flush_remote_tlbs);
+#endif
 
 void kvm_reload_remote_mmus(struct kvm *kvm)
 {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [PATCH v15 02/11] KVM: Add generic support for dirty page logging
  2014-12-15  7:27 ` Mario Smarduch
@ 2014-12-15  7:27   ` Mario Smarduch
  -1 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:27 UTC (permalink / raw)
  To: pbonzini, james.hogan, christoffer.dall, agraf, marc.zyngier,
	cornelia.huck, borntraeger, catalin.marinas
  Cc: kvmarm, kvm, kvm-ppc, kvm-ia64, linux-arm-kernel, steve.capper,
	peter.maydell, Mario Smarduch

kvm_get_dirty_log() provides generic handling of the dirty bitmap and is
currently reused by several architectures. Building on that, we introduce
kvm_get_dirty_log_protect(), which in addition write protects the pages
reported dirty so that writes are caught again before the next
KVM_GET_DIRTY_LOG ioctl call from user space.
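
For illustration (not part of this patch): an architecture opts in by selecting
KVM_GENERIC_DIRTYLOG_READ_PROTECT and implementing
kvm_arch_mmu_write_protect_pt_masked(); its dirty log ioctl then reduces to
roughly the following sketch, mirroring the x86 conversion later in the series:

int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
{
	bool is_dirty = false;
	int r;

	mutex_lock(&kvm->slots_lock);

	/* Steps 1-3: snapshot, clear + write protect, copy to user space. */
	r = kvm_get_dirty_log_protect(kvm, log, &is_dirty);

	/* Step 4: flush stale TLB entries for the newly write protected pages. */
	if (is_dirty)
		kvm_flush_remote_tlbs(kvm);

	mutex_unlock(&kvm->slots_lock);
	return r;
}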

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 include/linux/kvm_host.h |  9 ++++++
 virt/kvm/Kconfig         |  6 ++++
 virt/kvm/kvm_main.c      | 80 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 95 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index e4d8f70..ed29e79 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -602,6 +602,15 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext);
 
 int kvm_get_dirty_log(struct kvm *kvm,
 			struct kvm_dirty_log *log, int *is_dirty);
+
+int kvm_get_dirty_log_protect(struct kvm *kvm,
+			struct kvm_dirty_log *log, bool *is_dirty);
+
+void kvm_arch_mmu_write_protect_pt_masked(struct kvm *kvm,
+					struct kvm_memory_slot *slot,
+					gfn_t gfn_offset,
+					unsigned long mask);
+
 int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
 				struct kvm_dirty_log *log);
 
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 3796a21..314950c 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -40,3 +40,9 @@ config KVM_VFIO
 
 config HAVE_KVM_ARCH_TLB_FLUSH_ALL
        bool
+
+config HAVE_KVM_ARCH_DIRTY_LOG_PROTECT
+       bool
+
+config KVM_GENERIC_DIRTYLOG_READ_PROTECT
+       bool
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 51e9dfa..55a16b2 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1001,6 +1001,86 @@ out:
 }
 EXPORT_SYMBOL_GPL(kvm_get_dirty_log);
 
+#ifdef CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT
+/**
+ * kvm_get_dirty_log_protect - get a snapshot of dirty pages, and if any pages
+ *	are dirty write protect them for next write.
+ * @kvm:	pointer to kvm instance
+ * @log:	slot id and address to which we copy the log
+ * @is_dirty:	flag set if any page is dirty
+ *
+ * We need to keep it in mind that VCPU threads can write to the bitmap
+ * concurrently. So, to avoid losing track of dirty pages we keep the
+ * following order:
+ *
+ *    1. Take a snapshot of the bit and clear it if needed.
+ *    2. Write protect the corresponding page.
+ *    3. Copy the snapshot to the userspace.
+ *    4. Upon return caller flushes TLB's if needed.
+ *
+ * Between 2 and 4, the guest may write to the page using the remaining TLB
+ * entry.  This is not a problem because the page is reported dirty using
+ * the snapshot taken before and step 4 ensures that writes done after
+ * exiting to userspace will be logged for the next call.
+ *
+ */
+int kvm_get_dirty_log_protect(struct kvm *kvm,
+			struct kvm_dirty_log *log, bool *is_dirty)
+{
+	struct kvm_memory_slot *memslot;
+	int r, i;
+	unsigned long n;
+	unsigned long *dirty_bitmap;
+	unsigned long *dirty_bitmap_buffer;
+
+	r = -EINVAL;
+	if (log->slot >= KVM_USER_MEM_SLOTS)
+		goto out;
+
+	memslot = id_to_memslot(kvm->memslots, log->slot);
+
+	dirty_bitmap = memslot->dirty_bitmap;
+	r = -ENOENT;
+	if (!dirty_bitmap)
+		goto out;
+
+	n = kvm_dirty_bitmap_bytes(memslot);
+
+	dirty_bitmap_buffer = dirty_bitmap + n / sizeof(long);
+	memset(dirty_bitmap_buffer, 0, n);
+
+	spin_lock(&kvm->mmu_lock);
+	*is_dirty = false;
+	for (i = 0; i < n / sizeof(long); i++) {
+		unsigned long mask;
+		gfn_t offset;
+
+		if (!dirty_bitmap[i])
+			continue;
+
+		*is_dirty = true;
+
+		mask = xchg(&dirty_bitmap[i], 0);
+		dirty_bitmap_buffer[i] = mask;
+
+		offset = i * BITS_PER_LONG;
+		kvm_arch_mmu_write_protect_pt_masked(kvm, memslot, offset,
+								mask);
+	}
+
+	spin_unlock(&kvm->mmu_lock);
+
+	r = -EFAULT;
+	if (copy_to_user(log->dirty_bitmap, dirty_bitmap_buffer, n))
+		goto out;
+
+	r = 0;
+out:
+	return r;
+}
+EXPORT_SYMBOL_GPL(kvm_get_dirty_log_protect);
+#endif
+
 bool kvm_largepages_enabled(void)
 {
 	return largepages_enabled;
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [PATCH v15 03/11] KVM: x86: switch to kvm_get_dirty_log_protect
  2014-12-15  7:27 ` Mario Smarduch
@ 2014-12-15  7:28   ` Mario Smarduch
  -1 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: pbonzini, james.hogan, christoffer.dall, agraf, marc.zyngier,
	cornelia.huck, borntraeger, catalin.marinas
  Cc: kvmarm, kvm, kvm-ppc, kvm-ia64, linux-arm-kernel, steve.capper,
	peter.maydell, Mario Smarduch

From: Paolo Bonzini <pbonzini@redhat.com>

We now have a generic function that does most of the work of
kvm_vm_ioctl_get_dirty_log(); switch to using it.

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/x86/include/asm/kvm_host.h |  3 --
 arch/x86/kvm/Kconfig            |  1 +
 arch/x86/kvm/mmu.c              |  4 +--
 arch/x86/kvm/x86.c              | 72 ++++++++---------------------------------
 4 files changed, 16 insertions(+), 64 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6ed0c30..ae7db3e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -804,9 +804,6 @@ void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask,
 
 void kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
 void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot);
-void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
-				     struct kvm_memory_slot *slot,
-				     gfn_t gfn_offset, unsigned long mask);
 void kvm_mmu_zap_all(struct kvm *kvm);
 void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm);
 unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm);
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index f9d16ff..d073594 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -39,6 +39,7 @@ config KVM
 	select PERF_EVENTS
 	select HAVE_KVM_MSI
 	select HAVE_KVM_CPU_RELAX_INTERCEPT
+	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
 	select KVM_VFIO
 	---help---
 	  Support hosting fully virtualized guest machines using hardware
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 978f402..89ab64c 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1218,7 +1218,7 @@ static bool __rmap_write_protect(struct kvm *kvm, unsigned long *rmapp,
 }
 
 /**
- * kvm_mmu_write_protect_pt_masked - write protect selected PT level pages
+ * kvm_arch_mmu_write_protect_pt_masked - write protect selected PT level pages
  * @kvm: kvm instance
  * @slot: slot to protect
  * @gfn_offset: start of the BITS_PER_LONG pages we care about
@@ -1227,7 +1227,7 @@ static bool __rmap_write_protect(struct kvm *kvm, unsigned long *rmapp,
  * Used when we do not need to care about huge page mappings: e.g. during dirty
  * logging we do not have any such mappings.
  */
-void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
+void kvm_arch_mmu_write_protect_pt_masked(struct kvm *kvm,
 				     struct kvm_memory_slot *slot,
 				     gfn_t gfn_offset, unsigned long mask)
 {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0033df3..80769f4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3657,83 +3657,37 @@ static int kvm_vm_ioctl_reinject(struct kvm *kvm,
  * @kvm: kvm instance
  * @log: slot id and address to which we copy the log
  *
- * We need to keep it in mind that VCPU threads can write to the bitmap
- * concurrently.  So, to avoid losing data, we keep the following order for
- * each bit:
+ * Steps 1-4 below provide general overview of dirty page logging. See
+ * kvm_get_dirty_log_protect() function description for additional details.
+ *
+ * We call kvm_get_dirty_log_protect() to handle steps 1-3, upon return we
+ * always flush the TLB (step 4) even if previous step failed  and the dirty
+ * bitmap may be corrupt. Regardless of previous outcome the KVM logging API
+ * does not preclude user space subsequent dirty log read. Flushing TLB ensures
+ * writes will be marked dirty for next log read.
  *
  *   1. Take a snapshot of the bit and clear it if needed.
  *   2. Write protect the corresponding page.
- *   3. Flush TLB's if needed.
- *   4. Copy the snapshot to the userspace.
- *
- * Between 2 and 3, the guest may write to the page using the remaining TLB
- * entry.  This is not a problem because the page will be reported dirty at
- * step 4 using the snapshot taken before and step 3 ensures that successive
- * writes will be logged for the next call.
+ *   3. Copy the snapshot to the userspace.
+ *   4. Flush TLB's if needed.
  */
 int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
 {
-	int r;
-	struct kvm_memory_slot *memslot;
-	unsigned long n, i;
-	unsigned long *dirty_bitmap;
-	unsigned long *dirty_bitmap_buffer;
 	bool is_dirty = false;
+	int r;
 
 	mutex_lock(&kvm->slots_lock);
 
-	r = -EINVAL;
-	if (log->slot >= KVM_USER_MEM_SLOTS)
-		goto out;
-
-	memslot = id_to_memslot(kvm->memslots, log->slot);
-
-	dirty_bitmap = memslot->dirty_bitmap;
-	r = -ENOENT;
-	if (!dirty_bitmap)
-		goto out;
-
-	n = kvm_dirty_bitmap_bytes(memslot);
-
-	dirty_bitmap_buffer = dirty_bitmap + n / sizeof(long);
-	memset(dirty_bitmap_buffer, 0, n);
-
-	spin_lock(&kvm->mmu_lock);
-
-	for (i = 0; i < n / sizeof(long); i++) {
-		unsigned long mask;
-		gfn_t offset;
-
-		if (!dirty_bitmap[i])
-			continue;
-
-		is_dirty = true;
-
-		mask = xchg(&dirty_bitmap[i], 0);
-		dirty_bitmap_buffer[i] = mask;
-
-		offset = i * BITS_PER_LONG;
-		kvm_mmu_write_protect_pt_masked(kvm, memslot, offset, mask);
-	}
-
-	spin_unlock(&kvm->mmu_lock);
-
-	/* See the comments in kvm_mmu_slot_remove_write_access(). */
-	lockdep_assert_held(&kvm->slots_lock);
+	r = kvm_get_dirty_log_protect(kvm, log, &is_dirty);
 
 	/*
 	 * All the TLBs can be flushed out of mmu lock, see the comments in
 	 * kvm_mmu_slot_remove_write_access().
 	 */
+	lockdep_assert_held(&kvm->slots_lock);
 	if (is_dirty)
 		kvm_flush_remote_tlbs(kvm);
 
-	r = -EFAULT;
-	if (copy_to_user(log->dirty_bitmap, dirty_bitmap_buffer, n))
-		goto out;
-
-	r = 0;
-out:
 	mutex_unlock(&kvm->slots_lock);
 	return r;
 }
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 110+ messages in thread
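
For reference, the generic helper this patch switches to behaves roughly like
the open-coded x86 loop removed above. The following is a simplified sketch
only (error handling abbreviated); the actual helper is added to the generic
KVM code earlier in this series behind the KVM_GENERIC_DIRTYLOG_READ_PROTECT
Kconfig option, and the arch caller remains responsible for the final TLB
flush (step 4).

int kvm_get_dirty_log_protect(struct kvm *kvm, struct kvm_dirty_log *log,
			      bool *is_dirty)
{
	struct kvm_memory_slot *memslot;
	unsigned long *dirty_bitmap, *dirty_bitmap_buffer;
	unsigned long i, n;

	if (log->slot >= KVM_USER_MEM_SLOTS)
		return -EINVAL;

	memslot = id_to_memslot(kvm->memslots, log->slot);
	dirty_bitmap = memslot->dirty_bitmap;
	if (!dirty_bitmap)
		return -ENOENT;

	n = kvm_dirty_bitmap_bytes(memslot);
	dirty_bitmap_buffer = dirty_bitmap + n / sizeof(long);
	memset(dirty_bitmap_buffer, 0, n);

	spin_lock(&kvm->mmu_lock);
	for (i = 0; i < n / sizeof(long); i++) {
		unsigned long mask;
		gfn_t offset;

		if (!dirty_bitmap[i])
			continue;

		*is_dirty = true;

		/* Step 1: snapshot and clear the dirty bits. */
		mask = xchg(&dirty_bitmap[i], 0);
		dirty_bitmap_buffer[i] = mask;

		/* Step 2: write protect the pages behind those bits. */
		offset = i * BITS_PER_LONG;
		kvm_arch_mmu_write_protect_pt_masked(kvm, memslot, offset,
						     mask);
	}
	spin_unlock(&kvm->mmu_lock);

	/*
	 * Step 3: hand the snapshot to user space. Step 4, the TLB flush,
	 * is left to the arch caller such as kvm_vm_ioctl_get_dirty_log()
	 * above.
	 */
	if (copy_to_user(log->dirty_bitmap, dirty_bitmap_buffer, n))
		return -EFAULT;

	return 0;
}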

* [PATCH v15 04/11] KVM: arm: Add ARMv7 API to flush TLBs
  2014-12-15  7:27 ` Mario Smarduch
  (?)
  (?)
@ 2014-12-15  7:28   ` Mario Smarduch
  -1 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: pbonzini, james.hogan, christoffer.dall, agraf, marc.zyngier,
	cornelia.huck, borntraeger, catalin.marinas
  Cc: kvmarm, kvm, kvm-ppc, kvm-ia64, linux-arm-kernel, steve.capper,
	peter.maydell, Mario Smarduch

This patch adds the ARMv7 architecture TLB flush function.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/include/asm/kvm_asm.h  |  1 +
 arch/arm/include/asm/kvm_host.h | 12 ++++++++++++
 arch/arm/kvm/Kconfig            |  1 +
 arch/arm/kvm/interrupts.S       | 11 +++++++++++
 4 files changed, 25 insertions(+)

diff --git a/arch/arm/include/asm/kvm_asm.h b/arch/arm/include/asm/kvm_asm.h
index 3a67bec..25410b2 100644
--- a/arch/arm/include/asm/kvm_asm.h
+++ b/arch/arm/include/asm/kvm_asm.h
@@ -96,6 +96,7 @@ extern char __kvm_hyp_code_end[];
 
 extern void __kvm_flush_vm_context(void);
 extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
+extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
 
 extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
 #endif
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 53036e2..9eb286e 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -223,6 +223,18 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
 	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
 }
 
+/**
+ * kvm_flush_remote_tlbs() - flush all VM TLB entries
+ * @kvm:	pointer to kvm structure.
+ *
+ * Interface to the HYP function that flushes all VM TLB entries without an
+ * address parameter.
+ */
+static inline void kvm_flush_remote_tlbs(struct kvm *kvm)
+{
+	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
+}
+
 static inline int kvm_arch_dev_ioctl_check_extension(long ext)
 {
 	return 0;
diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index 466bd29..f27f336 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -21,6 +21,7 @@ config KVM
 	select PREEMPT_NOTIFIERS
 	select ANON_INODES
 	select HAVE_KVM_CPU_RELAX_INTERCEPT
+	select HAVE_KVM_ARCH_TLB_FLUSH_ALL
 	select KVM_MMIO
 	select KVM_ARM_HOST
 	depends on ARM_VIRT_EXT && ARM_LPAE
diff --git a/arch/arm/kvm/interrupts.S b/arch/arm/kvm/interrupts.S
index 01dcb0e..79caf79 100644
--- a/arch/arm/kvm/interrupts.S
+++ b/arch/arm/kvm/interrupts.S
@@ -66,6 +66,17 @@ ENTRY(__kvm_tlb_flush_vmid_ipa)
 	bx	lr
 ENDPROC(__kvm_tlb_flush_vmid_ipa)
 
+/**
+ * void __kvm_tlb_flush_vmid(struct kvm *kvm) - Flush per-VMID TLBs
+ *
+ * Reuses __kvm_tlb_flush_vmid_ipa() for ARMv7, without passing an address
+ * parameter.
+ */
+
+ENTRY(__kvm_tlb_flush_vmid)
+	b	__kvm_tlb_flush_vmid_ipa
+ENDPROC(__kvm_tlb_flush_vmid)
+
 /********************************************************************
  * Flush TLBs and instruction caches of all CPUs inside the inner-shareable
  * domain, for all VMIDs
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 110+ messages in thread
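
As a usage sketch (illustrative, not part of this patch), the new per-VM
flush backs kvm_flush_remote_tlbs() call sites once HAVE_KVM_ARCH_TLB_FLUSH_ALL
is selected. The ARM-side caller pattern, based on the write-protect helper
added in patch 05/11, looks roughly like this; the function name below is
made up:

/* Illustrative only: write protect a stage-2 range, then invalidate the
 * whole VMID's TLB entries through the new HYP call. */
static void example_wp_and_flush(struct kvm *kvm,
				 phys_addr_t start, phys_addr_t end)
{
	spin_lock(&kvm->mmu_lock);
	stage2_wp_range(kvm, start, end);	/* added in patch 05/11 */
	spin_unlock(&kvm->mmu_lock);

	/* On ARMv7 this resolves to kvm_call_hyp(__kvm_tlb_flush_vmid, kvm). */
	kvm_flush_remote_tlbs(kvm);
}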

* [PATCH v15 05/11] KVM: arm: Add initial dirty page locking support
  2014-12-15  7:27 ` Mario Smarduch
  (?)
  (?)
@ 2014-12-15  7:28   ` Mario Smarduch
  -1 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: pbonzini, james.hogan, christoffer.dall, agraf, marc.zyngier,
	cornelia.huck, borntraeger, catalin.marinas
  Cc: kvmarm, kvm, kvm-ppc, kvm-ia64, linux-arm-kernel, steve.capper,
	peter.maydell, Mario Smarduch

Add support for initial write protection of VM memslots. This patch
series assumes that huge PUDs will not be used in 2nd stage tables, which is
always valid on ARMv7.

Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/include/asm/kvm_host.h       |   2 +
 arch/arm/include/asm/kvm_mmu.h        |  21 ++++++
 arch/arm/include/asm/pgtable-3level.h |   1 +
 arch/arm/kvm/mmu.c                    | 135 ++++++++++++++++++++++++++++++++++
 4 files changed, 159 insertions(+)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 9eb286e..b138431 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -248,6 +248,8 @@ static inline void vgic_arch_setup(const struct vgic_params *vgic)
 int kvm_perf_init(void);
 int kvm_perf_teardown(void);
 
+void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
+
 static inline void kvm_arch_hardware_disable(void) {}
 static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index f867060..dda0046 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -113,6 +113,27 @@ static inline void kvm_set_s2pmd_writable(pmd_t *pmd)
 	pmd_val(*pmd) |= L_PMD_S2_RDWR;
 }
 
+static inline void kvm_set_s2pte_readonly(pte_t *pte)
+{
+	pte_val(*pte) = (pte_val(*pte) & ~L_PTE_S2_RDWR) | L_PTE_S2_RDONLY;
+}
+
+static inline bool kvm_s2pte_readonly(pte_t *pte)
+{
+	return (pte_val(*pte) & L_PTE_S2_RDWR) == L_PTE_S2_RDONLY;
+}
+
+static inline void kvm_set_s2pmd_readonly(pmd_t *pmd)
+{
+	pmd_val(*pmd) = (pmd_val(*pmd) & ~L_PMD_S2_RDWR) | L_PMD_S2_RDONLY;
+}
+
+static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
+{
+	return (pmd_val(*pmd) & L_PMD_S2_RDWR) == L_PMD_S2_RDONLY;
+}
+
+
 /* Open coded p*d_addr_end that can deal with 64bit addresses */
 #define kvm_pgd_addr_end(addr, end)					\
 ({	u64 __boundary = ((addr) + PGDIR_SIZE) & PGDIR_MASK;		\
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index a31ecdad..ae1d30a1 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -130,6 +130,7 @@
 #define L_PTE_S2_RDONLY			(_AT(pteval_t, 1) << 6)   /* HAP[1]   */
 #define L_PTE_S2_RDWR			(_AT(pteval_t, 3) << 6)   /* HAP[2:1] */
 
+#define L_PMD_S2_RDONLY			(_AT(pmdval_t, 1) << 6)   /* HAP[1]   */
 #define L_PMD_S2_RDWR			(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 
 /*
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 16ae5f0..29e0108 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -45,6 +45,7 @@ static phys_addr_t hyp_idmap_vector;
 #define hyp_pgd_order get_order(PTRS_PER_PGD * sizeof(pgd_t))
 
 #define kvm_pmd_huge(_x)	(pmd_huge(_x) || pmd_trans_huge(_x))
+#define kvm_pud_huge(_x)	pud_huge(_x)
 
 static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 {
@@ -840,6 +841,131 @@ static bool kvm_is_device_pfn(unsigned long pfn)
 	return !pfn_valid(pfn);
 }
 
+#ifdef CONFIG_ARM
+/**
+ * stage2_wp_ptes - write protect PMD range
+ * @pmd:	pointer to pmd entry
+ * @addr:	range start address
+ * @end:	range end address
+ */
+static void stage2_wp_ptes(pmd_t *pmd, phys_addr_t addr, phys_addr_t end)
+{
+	pte_t *pte;
+
+	pte = pte_offset_kernel(pmd, addr);
+	do {
+		if (!pte_none(*pte)) {
+			if (!kvm_s2pte_readonly(pte))
+				kvm_set_s2pte_readonly(pte);
+		}
+	} while (pte++, addr += PAGE_SIZE, addr != end);
+}
+
+/**
+ * stage2_wp_pmds - write protect PUD range
+ * @pud:	pointer to pud entry
+ * @addr:	range start address
+ * @end:	range end address
+ */
+static void stage2_wp_pmds(pud_t *pud, phys_addr_t addr, phys_addr_t end)
+{
+	pmd_t *pmd;
+	phys_addr_t next;
+
+	pmd = pmd_offset(pud, addr);
+
+	do {
+		next = kvm_pmd_addr_end(addr, end);
+		if (!pmd_none(*pmd)) {
+			if (kvm_pmd_huge(*pmd)) {
+				if (!kvm_s2pmd_readonly(pmd))
+					kvm_set_s2pmd_readonly(pmd);
+			} else {
+				stage2_wp_ptes(pmd, addr, next);
+			}
+		}
+	} while (pmd++, addr = next, addr != end);
+}
+
+/**
+ * stage2_wp_puds - write protect PGD range
+ * @pgd:	pointer to pgd entry
+ * @addr:	range start address
+ * @end:	range end address
+ *
+ * Process PUD entries; huge PUDs are not supported and trigger a BUG().
+ */
+static void  stage2_wp_puds(pgd_t *pgd, phys_addr_t addr, phys_addr_t end)
+{
+	pud_t *pud;
+	phys_addr_t next;
+
+	pud = pud_offset(pgd, addr);
+	do {
+		next = kvm_pud_addr_end(addr, end);
+		if (!pud_none(*pud)) {
+			/* TODO:PUD not supported, revisit later if supported */
+			BUG_ON(kvm_pud_huge(*pud));
+			stage2_wp_pmds(pud, addr, next);
+		}
+	} while (pud++, addr = next, addr != end);
+}
+
+/**
+ * stage2_wp_range() - write protect stage2 memory region range
+ * @kvm:	The KVM pointer
+ * @addr:	Start address of range
+ * @end:	End address of range
+ */
+static void stage2_wp_range(struct kvm *kvm, phys_addr_t addr, phys_addr_t end)
+{
+	pgd_t *pgd;
+	phys_addr_t next;
+
+	pgd = kvm->arch.pgd + pgd_index(addr);
+	do {
+		/*
+		 * Release kvm_mmu_lock periodically if the memory region is
+		 * large. Otherwise, we may see kernel panics with
+		 * CONFIG_DETECT_HUNG_TASK, CONFIG_LOCKUP_DETECTOR or
+		 * CONFIG_LOCKDEP. Additionally, holding the lock too long
+		 * will starve other vCPUs.
+		 */
+		if (need_resched() || spin_needbreak(&kvm->mmu_lock))
+			cond_resched_lock(&kvm->mmu_lock);
+
+		next = kvm_pgd_addr_end(addr, end);
+		if (pgd_present(*pgd))
+			stage2_wp_puds(pgd, addr, next);
+	} while (pgd++, addr = next, addr != end);
+}
+
+/**
+ * kvm_mmu_wp_memory_region() - write protect stage 2 entries for memory slot
+ * @kvm:	The KVM pointer
+ * @slot:	The memory slot to write protect
+ *
+ * Called to start logging dirty pages after the KVM_MEM_LOG_DIRTY_PAGES
+ * flag is set on a memory region. After this function returns, all present
+ * PMD and PTE entries in the memory region are write protected and the
+ * dirty page log can be read.
+ *
+ * Acquires kvm_mmu_lock. Called with kvm->slots_lock mutex acquired,
+ * serializing operations for VM memory regions.
+ */
+void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot)
+{
+	struct kvm_memory_slot *memslot = id_to_memslot(kvm->memslots, slot);
+	phys_addr_t start = memslot->base_gfn << PAGE_SHIFT;
+	phys_addr_t end = (memslot->base_gfn + memslot->npages) << PAGE_SHIFT;
+
+	spin_lock(&kvm->mmu_lock);
+	stage2_wp_range(kvm, start, end);
+	spin_unlock(&kvm->mmu_lock);
+	kvm_flush_remote_tlbs(kvm);
+}
+#endif
+
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  struct kvm_memory_slot *memslot, unsigned long hva,
 			  unsigned long fault_status)
@@ -1227,6 +1353,15 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 				   const struct kvm_memory_slot *old,
 				   enum kvm_mr_change change)
 {
+#ifdef CONFIG_ARM
+	/*
+	 * At this point the memslot has been committed and there is an
+	 * allocated dirty_bitmap[]; dirty pages will be tracked while the
+	 * memory slot is write protected.
+	 */
+	if (change != KVM_MR_DELETE && mem->flags & KVM_MEM_LOG_DIRTY_PAGES)
+		kvm_mmu_wp_memory_region(kvm, mem->slot);
+#endif
 }
 
 int kvm_arch_prepare_memory_region(struct kvm *kvm,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 110+ messages in thread
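
For context, user space drives this path through the existing KVM ioctls:
setting KVM_MEM_LOG_DIRTY_PAGES on a memslot is what reaches
kvm_mmu_wp_memory_region() above, and each KVM_GET_DIRTY_LOG call does one
read-and-reprotect pass. The sketch below is illustrative only; the slot
number, guest physical base and sizes are made up:

#include <linux/kvm.h>
#include <stdlib.h>
#include <sys/ioctl.h>

/* Illustrative user-space sketch: enable dirty logging on slot 0, then
 * fetch one round of the dirty bitmap. */
static int example_dirty_log_round(int vm_fd, void *host_mem,
				   unsigned long mem_size,
				   unsigned long page_size)
{
	struct kvm_userspace_memory_region region = {
		.slot = 0,
		.flags = KVM_MEM_LOG_DIRTY_PAGES,
		.guest_phys_addr = 0x80000000UL,	/* made-up base */
		.memory_size = mem_size,
		.userspace_addr = (unsigned long)host_mem,
	};
	struct kvm_dirty_log log = { .slot = 0 };

	/* Re-register the slot with dirty logging enabled; this triggers
	 * initial write protection of its stage-2 entries. */
	if (ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region) < 0)
		return -1;

	log.dirty_bitmap = calloc(1, mem_size / page_size / 8);
	if (!log.dirty_bitmap)
		return -1;

	/* Snapshot dirty bits and write protect dirty pages for the next pass. */
	return ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log);
}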

* [PATCH v15 05/11] KVM: arm: Add initial dirty page locking support
@ 2014-12-15  7:28   ` Mario Smarduch
  0 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: kvm-ia64

Add support for initial write protection of VM memslots. This patch
series assumes that huge PUDs will not be used in 2nd stage tables, which is
always valid on ARMv7.

Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/include/asm/kvm_host.h       |   2 +
 arch/arm/include/asm/kvm_mmu.h        |  21 ++++++
 arch/arm/include/asm/pgtable-3level.h |   1 +
 arch/arm/kvm/mmu.c                    | 135 ++++++++++++++++++++++++++++++++++
 4 files changed, 159 insertions(+)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 9eb286e..b138431 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -248,6 +248,8 @@ static inline void vgic_arch_setup(const struct vgic_params *vgic)
 int kvm_perf_init(void);
 int kvm_perf_teardown(void);
 
+void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
+
 static inline void kvm_arch_hardware_disable(void) {}
 static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index f867060..dda0046 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -113,6 +113,27 @@ static inline void kvm_set_s2pmd_writable(pmd_t *pmd)
 	pmd_val(*pmd) |= L_PMD_S2_RDWR;
 }
 
+static inline void kvm_set_s2pte_readonly(pte_t *pte)
+{
+	pte_val(*pte) = (pte_val(*pte) & ~L_PTE_S2_RDWR) | L_PTE_S2_RDONLY;
+}
+
+static inline bool kvm_s2pte_readonly(pte_t *pte)
+{
+	return (pte_val(*pte) & L_PTE_S2_RDWR) == L_PTE_S2_RDONLY;
+}
+
+static inline void kvm_set_s2pmd_readonly(pmd_t *pmd)
+{
+	pmd_val(*pmd) = (pmd_val(*pmd) & ~L_PMD_S2_RDWR) | L_PMD_S2_RDONLY;
+}
+
+static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
+{
+	return (pmd_val(*pmd) & L_PMD_S2_RDWR) == L_PMD_S2_RDONLY;
+}
+
+
 /* Open coded p*d_addr_end that can deal with 64bit addresses */
 #define kvm_pgd_addr_end(addr, end)					\
 ({	u64 __boundary = ((addr) + PGDIR_SIZE) & PGDIR_MASK;		\
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h
index a31ecdad..ae1d30a1 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -130,6 +130,7 @@
 #define L_PTE_S2_RDONLY			(_AT(pteval_t, 1) << 6)   /* HAP[1]   */
 #define L_PTE_S2_RDWR			(_AT(pteval_t, 3) << 6)   /* HAP[2:1] */
 
+#define L_PMD_S2_RDONLY			(_AT(pmdval_t, 1) << 6)   /* HAP[1]   */
 #define L_PMD_S2_RDWR			(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 
 /*
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 16ae5f0..29e0108 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -45,6 +45,7 @@ static phys_addr_t hyp_idmap_vector;
 #define hyp_pgd_order get_order(PTRS_PER_PGD * sizeof(pgd_t))
 
 #define kvm_pmd_huge(_x)	(pmd_huge(_x) || pmd_trans_huge(_x))
+#define kvm_pud_huge(_x)	pud_huge(_x)
 
 static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 {
@@ -840,6 +841,131 @@ static bool kvm_is_device_pfn(unsigned long pfn)
 	return !pfn_valid(pfn);
 }
 
+#ifdef CONFIG_ARM
+/**
+ * stage2_wp_ptes - write protect PMD range
+ * @pmd:	pointer to pmd entry
+ * @addr:	range start address
+ * @end:	range end address
+ */
+static void stage2_wp_ptes(pmd_t *pmd, phys_addr_t addr, phys_addr_t end)
+{
+	pte_t *pte;
+
+	pte = pte_offset_kernel(pmd, addr);
+	do {
+		if (!pte_none(*pte)) {
+			if (!kvm_s2pte_readonly(pte))
+				kvm_set_s2pte_readonly(pte);
+		}
+	} while (pte++, addr += PAGE_SIZE, addr != end);
+}
+
+/**
+ * stage2_wp_pmds - write protect PUD range
+ * @pud:	pointer to pud entry
+ * @addr:	range start address
+ * @end:	range end address
+ */
+static void stage2_wp_pmds(pud_t *pud, phys_addr_t addr, phys_addr_t end)
+{
+	pmd_t *pmd;
+	phys_addr_t next;
+
+	pmd = pmd_offset(pud, addr);
+
+	do {
+		next = kvm_pmd_addr_end(addr, end);
+		if (!pmd_none(*pmd)) {
+			if (kvm_pmd_huge(*pmd)) {
+				if (!kvm_s2pmd_readonly(pmd))
+					kvm_set_s2pmd_readonly(pmd);
+			} else {
+				stage2_wp_ptes(pmd, addr, next);
+			}
+		}
+	} while (pmd++, addr = next, addr != end);
+}
+
+/**
+  * stage2_wp_puds - write protect PGD range
+  * @pgd:	pointer to pgd entry
+  * @addr:	range start address
+  * @end:	range end address
+  *
+  * Process PUD entries, for a huge PUD we cause a panic.
+  */
+static void  stage2_wp_puds(pgd_t *pgd, phys_addr_t addr, phys_addr_t end)
+{
+	pud_t *pud;
+	phys_addr_t next;
+
+	pud = pud_offset(pgd, addr);
+	do {
+		next = kvm_pud_addr_end(addr, end);
+		if (!pud_none(*pud)) {
+			/* TODO:PUD not supported, revisit later if supported */
+			BUG_ON(kvm_pud_huge(*pud));
+			stage2_wp_pmds(pud, addr, next);
+		}
+	} while (pud++, addr = next, addr != end);
+}
+
+/**
+ * stage2_wp_range() - write protect stage2 memory region range
+ * @kvm:	The KVM pointer
+ * @addr:	Start address of range
+ * @end:	End address of range
+ */
+static void stage2_wp_range(struct kvm *kvm, phys_addr_t addr, phys_addr_t end)
+{
+	pgd_t *pgd;
+	phys_addr_t next;
+
+	pgd = kvm->arch.pgd + pgd_index(addr);
+	do {
+		/*
+		 * Release kvm_mmu_lock periodically if the memory region is
+		 * large. Otherwise, we may see kernel panics with
+		 * CONFIG_DETECT_HUNG_TASK, CONFIG_LOCKUP_DETECTOR,
+		 * CONFIG_LOCKDEP. Additionally, holding the lock too long
+		 * will also starve other vCPUs.
+		 */
+		if (need_resched() || spin_needbreak(&kvm->mmu_lock))
+			cond_resched_lock(&kvm->mmu_lock);
+
+		next = kvm_pgd_addr_end(addr, end);
+		if (pgd_present(*pgd))
+			stage2_wp_puds(pgd, addr, next);
+	} while (pgd++, addr = next, addr != end);
+}
+
+/**
+ * kvm_mmu_wp_memory_region() - write protect stage 2 entries for memory slot
+ * @kvm:	The KVM pointer
+ * @slot:	The memory slot to write protect
+ *
+ * Called to start logging dirty pages after memory region
+ * KVM_MEM_LOG_DIRTY_PAGES operation is called. After this function returns
+ * all present PMDs and PTEs are write protected in the memory region.
+ * Afterwards the dirty page log can be read.
+ *
+ * Acquires kvm_mmu_lock. Called with kvm->slots_lock mutex acquired,
+ * serializing operations for VM memory regions.
+ */
+void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot)
+{
+	struct kvm_memory_slot *memslot = id_to_memslot(kvm->memslots, slot);
+	phys_addr_t start = memslot->base_gfn << PAGE_SHIFT;
+	phys_addr_t end = (memslot->base_gfn + memslot->npages) << PAGE_SHIFT;
+
+	spin_lock(&kvm->mmu_lock);
+	stage2_wp_range(kvm, start, end);
+	spin_unlock(&kvm->mmu_lock);
+	kvm_flush_remote_tlbs(kvm);
+}
+#endif
+
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  struct kvm_memory_slot *memslot, unsigned long hva,
 			  unsigned long fault_status)
@@ -1227,6 +1353,15 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 				   const struct kvm_memory_slot *old,
 				   enum kvm_mr_change change)
 {
+#ifdef CONFIG_ARM
+	/*
+	 * At this point memslot has been committed and there is an
+	 * allocated dirty_bitmap[], dirty pages will be tracked while the
+	 * memory slot is write protected.
+	 */
+	if (change != KVM_MR_DELETE && mem->flags & KVM_MEM_LOG_DIRTY_PAGES)
+		kvm_mmu_wp_memory_region(kvm, mem->slot);
+#endif
 }
 
 int kvm_arch_prepare_memory_region(struct kvm *kvm,
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [PATCH v15 06/11] KVM: arm: dirty logging write protect support
  2014-12-15  7:27 ` Mario Smarduch
  (?)
  (?)
@ 2014-12-15  7:28   ` Mario Smarduch
  -1 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: pbonzini, james.hogan, christoffer.dall, agraf, marc.zyngier,
	cornelia.huck, borntraeger, catalin.marinas
  Cc: kvmarm, kvm, kvm-ppc, kvm-ia64, linux-arm-kernel, steve.capper,
	peter.maydell, Mario Smarduch

Add support to track dirty pages between user space KVM_GET_DIRTY_LOG ioctl
calls. We call the kvm_get_dirty_log_protect() function to do most of the work.
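
For reference, the user-space consumer of this path is the existing
KVM_GET_DIRTY_LOG ioctl. A rough sketch of one logging pass (illustrative
only, not part of the patch; vm_fd and npages are assumed, and
handle_dirty_page() is a hypothetical helper):

    #include <linux/kvm.h>
    #include <sys/ioctl.h>
    #include <stdlib.h>

    size_t bpl = sizeof(unsigned long) * 8;         /* bits per bitmap word */
    size_t words = (npages + bpl - 1) / bpl;
    unsigned long *bitmap = calloc(words, sizeof(unsigned long));

    struct kvm_dirty_log log = {
            .slot         = 0,
            .dirty_bitmap = bitmap,
    };

    /*
     * The kernel snapshots and clears its internal bitmap, write protects
     * the pages that were dirtied, and copies the snapshot into 'bitmap';
     * pages written after this point fault again and show up in the next
     * call.
     */
    if (ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log) == 0) {
            for (size_t i = 0; i < npages; i++)
                    if (bitmap[i / bpl] & (1UL << (i % bpl)))
                            handle_dirty_page(i);   /* e.g. resend page i */
    }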

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/kvm/Kconfig |  1 +
 arch/arm/kvm/arm.c   | 34 ++++++++++++++++++++++++++++++++++
 arch/arm/kvm/mmu.c   | 22 ++++++++++++++++++++++
 3 files changed, 57 insertions(+)

diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index f27f336..a8d1ace 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -24,6 +24,7 @@ config KVM
 	select HAVE_KVM_ARCH_TLB_FLUSH_ALL
 	select KVM_MMIO
 	select KVM_ARM_HOST
+	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
 	depends on ARM_VIRT_EXT && ARM_LPAE
 	---help---
 	  Support hosting virtualized guest machines. You will also
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 9e193c8..6e4290c 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -719,9 +719,43 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 	}
 }
 
+/**
+ * kvm_vm_ioctl_get_dirty_log - get and clear the log of dirty pages in a slot
+ * @kvm: kvm instance
+ * @log: slot id and address to which we copy the log
+ *
+ * Steps 1-4 below provide general overview of dirty page logging. See
+ * kvm_get_dirty_log_protect() function description for additional details.
+ *
+ * We call kvm_get_dirty_log_protect() to handle steps 1-3, upon return we
+ * always flush the TLB (step 4) even if previous step failed  and the dirty
+ * bitmap may be corrupt. Regardless of previous outcome the KVM logging API
+ * does not preclude user space subsequent dirty log read. Flushing TLB ensures
+ * writes will be marked dirty for next log read.
+ *
+ *   1. Take a snapshot of the bit and clear it if needed.
+ *   2. Write protect the corresponding page.
+ *   3. Copy the snapshot to the userspace.
+ *   4. Flush TLB's if needed.
+ */
 int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
 {
+#ifdef CONFIG_ARM
+	bool is_dirty = false;
+	int r;
+
+	mutex_lock(&kvm->slots_lock);
+
+	r = kvm_get_dirty_log_protect(kvm, log, &is_dirty);
+
+	if (is_dirty)
+		kvm_flush_remote_tlbs(kvm);
+
+	mutex_unlock(&kvm->slots_lock);
+	return r;
+#else /* arm64 */
 	return -EINVAL;
+#endif
 }
 
 static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 29e0108..73d506f 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -964,6 +964,28 @@ void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot)
 	spin_unlock(&kvm->mmu_lock);
 	kvm_flush_remote_tlbs(kvm);
 }
+
+/**
+ * kvm_arch_mmu_write_protect_pt_masked() - write protect dirty pages
+ * @kvm:	The KVM pointer
+ * @slot:	The memory slot associated with mask
+ * @gfn_offset:	The gfn offset in memory slot
+ * @mask:	The mask of dirty pages at offset 'gfn_offset' in this memory
+ *		slot to be write protected
+ *
+ * Walks bits set in mask and write protects the associated pte's. Caller must
+ * acquire kvm_mmu_lock.
+ */
+void kvm_arch_mmu_write_protect_pt_masked(struct kvm *kvm,
+		struct kvm_memory_slot *slot,
+		gfn_t gfn_offset, unsigned long mask)
+{
+	phys_addr_t base_gfn = slot->base_gfn + gfn_offset;
+	phys_addr_t start = (base_gfn +  __ffs(mask)) << PAGE_SHIFT;
+	phys_addr_t end = (base_gfn + __fls(mask) + 1) << PAGE_SHIFT;
+
+	stage2_wp_range(kvm, start, end);
+}
 #endif
 
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [PATCH v15 06/11] KVM: arm: dirty logging write protect support
@ 2014-12-15  7:28   ` Mario Smarduch
  0 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: linux-arm-kernel

Add support to track dirty pages between user space KVM_GET_DIRTY_LOG ioctl
calls. We call the kvm_get_dirty_log_protect() function to do most of the work.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/kvm/Kconfig |  1 +
 arch/arm/kvm/arm.c   | 34 ++++++++++++++++++++++++++++++++++
 arch/arm/kvm/mmu.c   | 22 ++++++++++++++++++++++
 3 files changed, 57 insertions(+)

diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index f27f336..a8d1ace 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -24,6 +24,7 @@ config KVM
 	select HAVE_KVM_ARCH_TLB_FLUSH_ALL
 	select KVM_MMIO
 	select KVM_ARM_HOST
+	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
 	depends on ARM_VIRT_EXT && ARM_LPAE
 	---help---
 	  Support hosting virtualized guest machines. You will also
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 9e193c8..6e4290c 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -719,9 +719,43 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 	}
 }
 
+/**
+ * kvm_vm_ioctl_get_dirty_log - get and clear the log of dirty pages in a slot
+ * @kvm: kvm instance
+ * @log: slot id and address to which we copy the log
+ *
+ * Steps 1-4 below provide general overview of dirty page logging. See
+ * kvm_get_dirty_log_protect() function description for additional details.
+ *
+ * We call kvm_get_dirty_log_protect() to handle steps 1-3, upon return we
+ * always flush the TLB (step 4) even if previous step failed  and the dirty
+ * bitmap may be corrupt. Regardless of previous outcome the KVM logging API
+ * does not preclude user space subsequent dirty log read. Flushing TLB ensures
+ * writes will be marked dirty for next log read.
+ *
+ *   1. Take a snapshot of the bit and clear it if needed.
+ *   2. Write protect the corresponding page.
+ *   3. Copy the snapshot to the userspace.
+ *   4. Flush TLB's if needed.
+ */
 int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
 {
+#ifdef CONFIG_ARM
+	bool is_dirty = false;
+	int r;
+
+	mutex_lock(&kvm->slots_lock);
+
+	r = kvm_get_dirty_log_protect(kvm, log, &is_dirty);
+
+	if (is_dirty)
+		kvm_flush_remote_tlbs(kvm);
+
+	mutex_unlock(&kvm->slots_lock);
+	return r;
+#else /* arm64 */
 	return -EINVAL;
+#endif
 }
 
 static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 29e0108..73d506f 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -964,6 +964,28 @@ void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot)
 	spin_unlock(&kvm->mmu_lock);
 	kvm_flush_remote_tlbs(kvm);
 }
+
+/**
+ * kvm_arch_mmu_write_protect_pt_masked() - write protect dirty pages
+ * @kvm:	The KVM pointer
+ * @slot:	The memory slot associated with mask
+ * @gfn_offset:	The gfn offset in memory slot
+ * @mask:	The mask of dirty pages at offset 'gfn_offset' in this memory
+ *		slot to be write protected
+ *
+ * Walks bits set in mask and write protects the associated pte's. Caller must
+ * acquire kvm_mmu_lock.
+ */
+void kvm_arch_mmu_write_protect_pt_masked(struct kvm *kvm,
+		struct kvm_memory_slot *slot,
+		gfn_t gfn_offset, unsigned long mask)
+{
+	phys_addr_t base_gfn = slot->base_gfn + gfn_offset;
+	phys_addr_t start = (base_gfn +  __ffs(mask)) << PAGE_SHIFT;
+	phys_addr_t end = (base_gfn + __fls(mask) + 1) << PAGE_SHIFT;
+
+	stage2_wp_range(kvm, start, end);
+}
 #endif
 
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [PATCH v15 06/11] KVM: arm: dirty logging write protect support
@ 2014-12-15  7:28   ` Mario Smarduch
  0 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: pbonzini, james.hogan, christoffer.dall, agraf, marc.zyngier,
	cornelia.huck, borntraeger, catalin.marinas
  Cc: kvmarm, kvm, kvm-ppc, kvm-ia64, linux-arm-kernel, steve.capper,
	peter.maydell, Mario Smarduch

Add support to track dirty pages between user space KVM_GET_DIRTY_LOG ioctl
calls. We call the kvm_get_dirty_log_protect() function to do most of the work.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/kvm/Kconfig |  1 +
 arch/arm/kvm/arm.c   | 34 ++++++++++++++++++++++++++++++++++
 arch/arm/kvm/mmu.c   | 22 ++++++++++++++++++++++
 3 files changed, 57 insertions(+)

diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index f27f336..a8d1ace 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -24,6 +24,7 @@ config KVM
 	select HAVE_KVM_ARCH_TLB_FLUSH_ALL
 	select KVM_MMIO
 	select KVM_ARM_HOST
+	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
 	depends on ARM_VIRT_EXT && ARM_LPAE
 	---help---
 	  Support hosting virtualized guest machines. You will also
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 9e193c8..6e4290c 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -719,9 +719,43 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 	}
 }
 
+/**
+ * kvm_vm_ioctl_get_dirty_log - get and clear the log of dirty pages in a slot
+ * @kvm: kvm instance
+ * @log: slot id and address to which we copy the log
+ *
+ * Steps 1-4 below provide general overview of dirty page logging. See
+ * kvm_get_dirty_log_protect() function description for additional details.
+ *
+ * We call kvm_get_dirty_log_protect() to handle steps 1-3, upon return we
+ * always flush the TLB (step 4) even if previous step failed  and the dirty
+ * bitmap may be corrupt. Regardless of previous outcome the KVM logging API
+ * does not preclude user space subsequent dirty log read. Flushing TLB ensures
+ * writes will be marked dirty for next log read.
+ *
+ *   1. Take a snapshot of the bit and clear it if needed.
+ *   2. Write protect the corresponding page.
+ *   3. Copy the snapshot to the userspace.
+ *   4. Flush TLB's if needed.
+ */
 int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
 {
+#ifdef CONFIG_ARM
+	bool is_dirty = false;
+	int r;
+
+	mutex_lock(&kvm->slots_lock);
+
+	r = kvm_get_dirty_log_protect(kvm, log, &is_dirty);
+
+	if (is_dirty)
+		kvm_flush_remote_tlbs(kvm);
+
+	mutex_unlock(&kvm->slots_lock);
+	return r;
+#else /* arm64 */
 	return -EINVAL;
+#endif
 }
 
 static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 29e0108..73d506f 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -964,6 +964,28 @@ void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot)
 	spin_unlock(&kvm->mmu_lock);
 	kvm_flush_remote_tlbs(kvm);
 }
+
+/**
+ * kvm_arch_mmu_write_protect_pt_masked() - write protect dirty pages
+ * @kvm:	The KVM pointer
+ * @slot:	The memory slot associated with mask
+ * @gfn_offset:	The gfn offset in memory slot
+ * @mask:	The mask of dirty pages at offset 'gfn_offset' in this memory
+ *		slot to be write protected
+ *
+ * Walks bits set in mask and write protects the associated pte's. Caller must
+ * acquire kvm_mmu_lock.
+ */
+void kvm_arch_mmu_write_protect_pt_masked(struct kvm *kvm,
+		struct kvm_memory_slot *slot,
+		gfn_t gfn_offset, unsigned long mask)
+{
+	phys_addr_t base_gfn = slot->base_gfn + gfn_offset;
+	phys_addr_t start = (base_gfn +  __ffs(mask)) << PAGE_SHIFT;
+	phys_addr_t end = (base_gfn + __fls(mask) + 1) << PAGE_SHIFT;
+
+	stage2_wp_range(kvm, start, end);
+}
 #endif
 
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [PATCH v15 06/11] KVM: arm: dirty logging write protect support
@ 2014-12-15  7:28   ` Mario Smarduch
  0 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: kvm-ia64

Add support to track dirty pages between user space KVM_GET_DIRTY_LOG ioctl
calls. We call the kvm_get_dirty_log_protect() function to do most of the work.

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/kvm/Kconfig |  1 +
 arch/arm/kvm/arm.c   | 34 ++++++++++++++++++++++++++++++++++
 arch/arm/kvm/mmu.c   | 22 ++++++++++++++++++++++
 3 files changed, 57 insertions(+)

diff --git a/arch/arm/kvm/Kconfig b/arch/arm/kvm/Kconfig
index f27f336..a8d1ace 100644
--- a/arch/arm/kvm/Kconfig
+++ b/arch/arm/kvm/Kconfig
@@ -24,6 +24,7 @@ config KVM
 	select HAVE_KVM_ARCH_TLB_FLUSH_ALL
 	select KVM_MMIO
 	select KVM_ARM_HOST
+	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
 	depends on ARM_VIRT_EXT && ARM_LPAE
 	---help---
 	  Support hosting virtualized guest machines. You will also
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 9e193c8..6e4290c 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -719,9 +719,43 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 	}
 }
 
+/**
+ * kvm_vm_ioctl_get_dirty_log - get and clear the log of dirty pages in a slot
+ * @kvm: kvm instance
+ * @log: slot id and address to which we copy the log
+ *
+ * Steps 1-4 below provide general overview of dirty page logging. See
+ * kvm_get_dirty_log_protect() function description for additional details.
+ *
+ * We call kvm_get_dirty_log_protect() to handle steps 1-3, upon return we
+ * always flush the TLB (step 4) even if previous step failed  and the dirty
+ * bitmap may be corrupt. Regardless of previous outcome the KVM logging API
+ * does not preclude user space subsequent dirty log read. Flushing TLB ensures
+ * writes will be marked dirty for next log read.
+ *
+ *   1. Take a snapshot of the bit and clear it if needed.
+ *   2. Write protect the corresponding page.
+ *   3. Copy the snapshot to the userspace.
+ *   4. Flush TLB's if needed.
+ */
 int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
 {
+#ifdef CONFIG_ARM
+	bool is_dirty = false;
+	int r;
+
+	mutex_lock(&kvm->slots_lock);
+
+	r = kvm_get_dirty_log_protect(kvm, log, &is_dirty);
+
+	if (is_dirty)
+		kvm_flush_remote_tlbs(kvm);
+
+	mutex_unlock(&kvm->slots_lock);
+	return r;
+#else /* arm64 */
 	return -EINVAL;
+#endif
 }
 
 static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 29e0108..73d506f 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -964,6 +964,28 @@ void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot)
 	spin_unlock(&kvm->mmu_lock);
 	kvm_flush_remote_tlbs(kvm);
 }
+
+/**
+ * kvm_arch_mmu_write_protect_pt_masked() - write protect dirty pages
+ * @kvm:	The KVM pointer
+ * @slot:	The memory slot associated with mask
+ * @gfn_offset:	The gfn offset in memory slot
+ * @mask:	The mask of dirty pages at offset 'gfn_offset' in this memory
+ *		slot to be write protected
+ *
+ * Walks bits set in mask and write protects the associated pte's. Caller must
+ * acquire kvm_mmu_lock.
+ */
+void kvm_arch_mmu_write_protect_pt_masked(struct kvm *kvm,
+		struct kvm_memory_slot *slot,
+		gfn_t gfn_offset, unsigned long mask)
+{
+	phys_addr_t base_gfn = slot->base_gfn + gfn_offset;
+	phys_addr_t start = (base_gfn +  __ffs(mask)) << PAGE_SHIFT;
+	phys_addr_t end = (base_gfn + __fls(mask) + 1) << PAGE_SHIFT;
+
+	stage2_wp_range(kvm, start, end);
+}
 #endif
 
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [PATCH v15 07/11] KVM: arm: page logging 2nd stage fault handling
  2014-12-15  7:27 ` Mario Smarduch
  (?)
  (?)
@ 2014-12-15  7:28   ` Mario Smarduch
  -1 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: pbonzini, james.hogan, christoffer.dall, agraf, marc.zyngier,
	cornelia.huck, borntraeger, catalin.marinas
  Cc: kvmarm, kvm, kvm-ppc, kvm-ia64, linux-arm-kernel, steve.capper,
	peter.maydell, Mario Smarduch

This patch adds support for handling 2nd stage page faults during migration:
it disables faulting in huge pages and dissolves huge pages into page tables.
In case migration is canceled, huge pages are used again.

Also, since the last version an issue was found on an SMP host running an SMP
guest when clearing a huge TLB entry. Multiple CPUs can write to the same huge
page range, so all pages in the range are marked dirty after the TLB is
flushed. It didn't show up on hardware, but appeared on Fast Models, perhaps
because the TLB flush is slower. To prevent clutter in user_mem_abort(), some
code was refactored into functions.
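
To put a number on the dirty marking done by stage2_dissolve_pmd() below, a
worked example (illustrative; assumes the usual ARMv7 setup of 4K pages and
2MB stage 2 huge PMDs):

    /* A write faults at IPA 0x40230000 while logging is active and the IPA
     * is still mapped by a huge PMD: */
    gfn_t gfn = (0x40230000UL & PMD_MASK) >> PAGE_SHIFT;    /* gfn 0x40200 */

    /* The PMD is cleared and its TLB entry flushed, but another vCPU may
     * already have written elsewhere in the 2MB range through a stale TLB
     * entry, so all PTRS_PER_PMD (512) pages get marked dirty: */
    for (int i = 0; i < PTRS_PER_PMD; i++)
            mark_page_dirty(kvm, gfn + i);          /* gfns 0x40200-0x403ff */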

Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/kvm/mmu.c | 86 +++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 78 insertions(+), 8 deletions(-)

diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 73d506f..dc763bb 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -47,6 +47,18 @@ static phys_addr_t hyp_idmap_vector;
 #define kvm_pmd_huge(_x)	(pmd_huge(_x) || pmd_trans_huge(_x))
 #define kvm_pud_huge(_x)	pud_huge(_x)
 
+#define KVM_S2PTE_FLAG_IS_IOMAP		(1UL << 0)
+#define KVM_S2PTE_FLAG_LOGGING_ACTIVE	(1UL << 1)
+
+static bool kvm_get_logging_state(struct kvm_memory_slot *memslot)
+{
+#ifdef CONFIG_ARM
+	return !!memslot->dirty_bitmap;
+#else
+	return false;
+#endif
+}
+
 static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 {
 	/*
@@ -59,6 +71,37 @@ static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 		kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, kvm, ipa);
 }
 
+/**
+ * stage2_dissolve_pmd() - clear and flush huge PMD entry
+ * @kvm:	pointer to kvm structure.
+ * @addr	IPA
+ * @pmd	pmd pointer for IPA
+ *
+ * Function clears a PMD entry, flushes addr 1st and 2nd stage TLBs. Marks all
+ * pages in the range dirty.
+ */
+void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
+{
+	gfn_t gfn;
+	int i;
+
+	if (kvm_pmd_huge(*pmd)) {
+		pmd_clear(pmd);
+		kvm_tlb_flush_vmid_ipa(kvm, addr);
+		put_page(virt_to_page(pmd));
+#ifdef CONFIG_SMP
+		gfn = (addr & PMD_MASK) >> PAGE_SHIFT;
+
+		/*
+		 * Mark all pages in PMD range dirty, in case other CPUs are
+		 * writing to it.
+		 */
+		for (i = 0; i < PTRS_PER_PMD; i++)
+			mark_page_dirty(kvm, gfn + i);
+#endif
+	}
+}
+
 static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
 				  int min, int max)
 {
@@ -703,10 +746,13 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
 }
 
 static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
-			  phys_addr_t addr, const pte_t *new_pte, bool iomap)
+			  phys_addr_t addr, const pte_t *new_pte,
+			  unsigned long flags)
 {
 	pmd_t *pmd;
 	pte_t *pte, old_pte;
+	unsigned long iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
+	unsigned long logging_active = flags & KVM_S2PTE_FLAG_LOGGING_ACTIVE;
 
 	/* Create stage-2 page table mapping - Levels 0 and 1 */
 	pmd = stage2_get_pmd(kvm, cache, addr);
@@ -718,6 +764,13 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
 		return 0;
 	}
 
+	/*
+	 * While dirty page logging - dissolve huge PMD, then continue on to
+	 * allocate page.
+	 */
+	if (logging_active)
+		stage2_dissolve_pmd(kvm, addr, pmd);
+
 	/* Create stage-2 page mappings - Level 2 */
 	if (pmd_none(*pmd)) {
 		if (!cache)
@@ -774,7 +827,8 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
 		if (ret)
 			goto out;
 		spin_lock(&kvm->mmu_lock);
-		ret = stage2_set_pte(kvm, &cache, addr, &pte, true);
+		ret = stage2_set_pte(kvm, &cache, addr, &pte,
+						KVM_S2PTE_FLAG_IS_IOMAP);
 		spin_unlock(&kvm->mmu_lock);
 		if (ret)
 			goto out;
@@ -1002,6 +1056,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	pfn_t pfn;
 	pgprot_t mem_type = PAGE_S2;
 	bool fault_ipa_uncached;
+	unsigned long logging_active = 0;
+
+	if (kvm_get_logging_state(memslot))
+		logging_active = KVM_S2PTE_FLAG_LOGGING_ACTIVE;
 
 	write_fault = kvm_is_write_fault(vcpu);
 	if (fault_status == FSC_PERM && !write_fault) {
@@ -1018,7 +1076,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		return -EFAULT;
 	}
 
-	if (is_vm_hugetlb_page(vma)) {
+	if (is_vm_hugetlb_page(vma) && !logging_active) {
 		hugetlb = true;
 		gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
 	} else {
@@ -1065,7 +1123,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	spin_lock(&kvm->mmu_lock);
 	if (mmu_notifier_retry(kvm, mmu_seq))
 		goto out_unlock;
-	if (!hugetlb && !force_pte)
+	if (!hugetlb && !force_pte && !logging_active)
 		hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
 
 	fault_ipa_uncached = memslot->flags & KVM_MEMSLOT_INCOHERENT;
@@ -1082,17 +1140,22 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
 	} else {
 		pte_t new_pte = pfn_pte(pfn, mem_type);
+		unsigned long flags = logging_active;
+
+		if (pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE))
+			flags |= KVM_S2PTE_FLAG_IS_IOMAP;
+
 		if (writable) {
 			kvm_set_s2pte_writable(&new_pte);
 			kvm_set_pfn_dirty(pfn);
 		}
 		coherent_cache_guest_page(vcpu, hva, PAGE_SIZE,
 					  fault_ipa_uncached);
-		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte,
-			pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE));
+		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, flags);
 	}
 
-
+	if (write_fault)
+		mark_page_dirty(kvm, gfn);
 out_unlock:
 	spin_unlock(&kvm->mmu_lock);
 	kvm_release_pfn_clean(pfn);
@@ -1242,7 +1305,14 @@ static void kvm_set_spte_handler(struct kvm *kvm, gpa_t gpa, void *data)
 {
 	pte_t *pte = (pte_t *)data;
 
-	stage2_set_pte(kvm, NULL, gpa, pte, false);
+	/*
+	 * We can always call stage2_set_pte with KVM_S2PTE_FLAG_LOGGING_ACTIVE
+	 * flag set because MMU notifiers will have unmapped a huge PMD before
+	 * calling ->change_pte() (which in turn calls kvm_set_spte_hva()) and
+	 * therefore stage2_set_pte() never needs to clear out a huge PMD
+	 * through this calling path.
+	 */
+	stage2_set_pte(kvm, NULL, gpa, pte, 0);
 }
 
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [PATCH v15 07/11] KVM: arm: page logging 2nd stage fault handling
@ 2014-12-15  7:28   ` Mario Smarduch
  0 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: linux-arm-kernel

This patch adds support for handling 2nd stage page faults during migration:
it disables faulting in huge pages and dissolves huge pages into page tables.
In case migration is canceled, huge pages are used again.

Also, since the last version an issue was found on an SMP host running an SMP
guest when clearing a huge TLB entry. Multiple CPUs can write to the same huge
page range, so all pages in the range are marked dirty after the TLB is
flushed. It didn't show up on hardware, but appeared on Fast Models, perhaps
because the TLB flush is slower. To prevent clutter in user_mem_abort(), some
code was refactored into functions.

Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/kvm/mmu.c | 86 +++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 78 insertions(+), 8 deletions(-)

diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 73d506f..dc763bb 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -47,6 +47,18 @@ static phys_addr_t hyp_idmap_vector;
 #define kvm_pmd_huge(_x)	(pmd_huge(_x) || pmd_trans_huge(_x))
 #define kvm_pud_huge(_x)	pud_huge(_x)
 
+#define KVM_S2PTE_FLAG_IS_IOMAP		(1UL << 0)
+#define KVM_S2PTE_FLAG_LOGGING_ACTIVE	(1UL << 1)
+
+static bool kvm_get_logging_state(struct kvm_memory_slot *memslot)
+{
+#ifdef CONFIG_ARM
+	return !!memslot->dirty_bitmap;
+#else
+	return false;
+#endif
+}
+
 static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 {
 	/*
@@ -59,6 +71,37 @@ static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 		kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, kvm, ipa);
 }
 
+/**
+ * stage2_dissolve_pmd() - clear and flush huge PMD entry
+ * @kvm:	pointer to kvm structure.
+ * @addr	IPA
+ * @pmd	pmd pointer for IPA
+ *
+ * Function clears a PMD entry, flushes addr 1st and 2nd stage TLBs. Marks all
+ * pages in the range dirty.
+ */
+void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
+{
+	gfn_t gfn;
+	int i;
+
+	if (kvm_pmd_huge(*pmd)) {
+		pmd_clear(pmd);
+		kvm_tlb_flush_vmid_ipa(kvm, addr);
+		put_page(virt_to_page(pmd));
+#ifdef CONFIG_SMP
+		gfn = (addr & PMD_MASK) >> PAGE_SHIFT;
+
+		/*
+		 * Mark all pages in PMD range dirty, in case other CPUs are
+		 * writing to it.
+		 */
+		for (i = 0; i < PTRS_PER_PMD; i++)
+			mark_page_dirty(kvm, gfn + i);
+#endif
+	}
+}
+
 static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
 				  int min, int max)
 {
@@ -703,10 +746,13 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
 }
 
 static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
-			  phys_addr_t addr, const pte_t *new_pte, bool iomap)
+			  phys_addr_t addr, const pte_t *new_pte,
+			  unsigned long flags)
 {
 	pmd_t *pmd;
 	pte_t *pte, old_pte;
+	unsigned long iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
+	unsigned long logging_active = flags & KVM_S2PTE_FLAG_LOGGING_ACTIVE;
 
 	/* Create stage-2 page table mapping - Levels 0 and 1 */
 	pmd = stage2_get_pmd(kvm, cache, addr);
@@ -718,6 +764,13 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
 		return 0;
 	}
 
+	/*
+	 * While dirty page logging - dissolve huge PMD, then continue on to
+	 * allocate page.
+	 */
+	if (logging_active)
+		stage2_dissolve_pmd(kvm, addr, pmd);
+
 	/* Create stage-2 page mappings - Level 2 */
 	if (pmd_none(*pmd)) {
 		if (!cache)
@@ -774,7 +827,8 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
 		if (ret)
 			goto out;
 		spin_lock(&kvm->mmu_lock);
-		ret = stage2_set_pte(kvm, &cache, addr, &pte, true);
+		ret = stage2_set_pte(kvm, &cache, addr, &pte,
+						KVM_S2PTE_FLAG_IS_IOMAP);
 		spin_unlock(&kvm->mmu_lock);
 		if (ret)
 			goto out;
@@ -1002,6 +1056,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	pfn_t pfn;
 	pgprot_t mem_type = PAGE_S2;
 	bool fault_ipa_uncached;
+	unsigned long logging_active = 0;
+
+	if (kvm_get_logging_state(memslot))
+		logging_active = KVM_S2PTE_FLAG_LOGGING_ACTIVE;
 
 	write_fault = kvm_is_write_fault(vcpu);
 	if (fault_status == FSC_PERM && !write_fault) {
@@ -1018,7 +1076,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		return -EFAULT;
 	}
 
-	if (is_vm_hugetlb_page(vma)) {
+	if (is_vm_hugetlb_page(vma) && !logging_active) {
 		hugetlb = true;
 		gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
 	} else {
@@ -1065,7 +1123,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	spin_lock(&kvm->mmu_lock);
 	if (mmu_notifier_retry(kvm, mmu_seq))
 		goto out_unlock;
-	if (!hugetlb && !force_pte)
+	if (!hugetlb && !force_pte && !logging_active)
 		hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
 
 	fault_ipa_uncached = memslot->flags & KVM_MEMSLOT_INCOHERENT;
@@ -1082,17 +1140,22 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
 	} else {
 		pte_t new_pte = pfn_pte(pfn, mem_type);
+		unsigned long flags = logging_active;
+
+		if (pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE))
+			flags |= KVM_S2PTE_FLAG_IS_IOMAP;
+
 		if (writable) {
 			kvm_set_s2pte_writable(&new_pte);
 			kvm_set_pfn_dirty(pfn);
 		}
 		coherent_cache_guest_page(vcpu, hva, PAGE_SIZE,
 					  fault_ipa_uncached);
-		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte,
-			pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE));
+		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, flags);
 	}
 
-
+	if (write_fault)
+		mark_page_dirty(kvm, gfn);
 out_unlock:
 	spin_unlock(&kvm->mmu_lock);
 	kvm_release_pfn_clean(pfn);
@@ -1242,7 +1305,14 @@ static void kvm_set_spte_handler(struct kvm *kvm, gpa_t gpa, void *data)
 {
 	pte_t *pte = (pte_t *)data;
 
-	stage2_set_pte(kvm, NULL, gpa, pte, false);
+	/*
+	 * We can always call stage2_set_pte with KVM_S2PTE_FLAG_LOGGING_ACTIVE
+	 * flag set because MMU notifiers will have unmapped a huge PMD before
+	 * calling ->change_pte() (which in turn calls kvm_set_spte_hva()) and
+	 * therefore stage2_set_pte() never needs to clear out a huge PMD
+	 * through this calling path.
+	 */
+	stage2_set_pte(kvm, NULL, gpa, pte, 0);
 }
 
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [PATCH v15 07/11] KVM: arm: page logging 2nd stage fault handling
@ 2014-12-15  7:28   ` Mario Smarduch
  0 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: pbonzini, james.hogan, christoffer.dall, agraf, marc.zyngier,
	cornelia.huck, borntraeger, catalin.marinas
  Cc: kvmarm, kvm, kvm-ppc, kvm-ia64, linux-arm-kernel, steve.capper,
	peter.maydell, Mario Smarduch

This patch adds support for handling 2nd stage page faults during migration:
it disables faulting in huge pages and dissolves huge pages into page tables.
In case migration is canceled, huge pages are used again.

Also, since the last version an issue was found on an SMP host running an SMP
guest when clearing a huge TLB entry. Multiple CPUs can write to the same huge
page range, so all pages in the range are marked dirty after the TLB is
flushed. It didn't show up on hardware, but appeared on Fast Models, perhaps
because the TLB flush is slower. To prevent clutter in user_mem_abort(), some
code was refactored into functions.

Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/kvm/mmu.c | 86 +++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 78 insertions(+), 8 deletions(-)

diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 73d506f..dc763bb 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -47,6 +47,18 @@ static phys_addr_t hyp_idmap_vector;
 #define kvm_pmd_huge(_x)	(pmd_huge(_x) || pmd_trans_huge(_x))
 #define kvm_pud_huge(_x)	pud_huge(_x)
 
+#define KVM_S2PTE_FLAG_IS_IOMAP		(1UL << 0)
+#define KVM_S2PTE_FLAG_LOGGING_ACTIVE	(1UL << 1)
+
+static bool kvm_get_logging_state(struct kvm_memory_slot *memslot)
+{
+#ifdef CONFIG_ARM
+	return !!memslot->dirty_bitmap;
+#else
+	return false;
+#endif
+}
+
 static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 {
 	/*
@@ -59,6 +71,37 @@ static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 		kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, kvm, ipa);
 }
 
+/**
+ * stage2_dissolve_pmd() - clear and flush huge PMD entry
+ * @kvm:	pointer to kvm structure.
+ * @addr	IPA
+ * @pmd	pmd pointer for IPA
+ *
+ * Function clears a PMD entry, flushes addr 1st and 2nd stage TLBs. Marks all
+ * pages in the range dirty.
+ */
+void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
+{
+	gfn_t gfn;
+	int i;
+
+	if (kvm_pmd_huge(*pmd)) {
+		pmd_clear(pmd);
+		kvm_tlb_flush_vmid_ipa(kvm, addr);
+		put_page(virt_to_page(pmd));
+#ifdef CONFIG_SMP
+		gfn = (addr & PMD_MASK) >> PAGE_SHIFT;
+
+		/*
+		 * Mark all pages in PMD range dirty, in case other CPUs are
+		 * writing to it.
+		 */
+		for (i = 0; i < PTRS_PER_PMD; i++)
+			mark_page_dirty(kvm, gfn + i);
+#endif
+	}
+}
+
 static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
 				  int min, int max)
 {
@@ -703,10 +746,13 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
 }
 
 static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
-			  phys_addr_t addr, const pte_t *new_pte, bool iomap)
+			  phys_addr_t addr, const pte_t *new_pte,
+			  unsigned long flags)
 {
 	pmd_t *pmd;
 	pte_t *pte, old_pte;
+	unsigned long iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
+	unsigned long logging_active = flags & KVM_S2PTE_FLAG_LOGGING_ACTIVE;
 
 	/* Create stage-2 page table mapping - Levels 0 and 1 */
 	pmd = stage2_get_pmd(kvm, cache, addr);
@@ -718,6 +764,13 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
 		return 0;
 	}
 
+	/*
+	 * While dirty page logging - dissolve huge PMD, then continue on to
+	 * allocate page.
+	 */
+	if (logging_active)
+		stage2_dissolve_pmd(kvm, addr, pmd);
+
 	/* Create stage-2 page mappings - Level 2 */
 	if (pmd_none(*pmd)) {
 		if (!cache)
@@ -774,7 +827,8 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
 		if (ret)
 			goto out;
 		spin_lock(&kvm->mmu_lock);
-		ret = stage2_set_pte(kvm, &cache, addr, &pte, true);
+		ret = stage2_set_pte(kvm, &cache, addr, &pte,
+						KVM_S2PTE_FLAG_IS_IOMAP);
 		spin_unlock(&kvm->mmu_lock);
 		if (ret)
 			goto out;
@@ -1002,6 +1056,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	pfn_t pfn;
 	pgprot_t mem_type = PAGE_S2;
 	bool fault_ipa_uncached;
+	unsigned long logging_active = 0;
+
+	if (kvm_get_logging_state(memslot))
+		logging_active = KVM_S2PTE_FLAG_LOGGING_ACTIVE;
 
 	write_fault = kvm_is_write_fault(vcpu);
 	if (fault_status == FSC_PERM && !write_fault) {
@@ -1018,7 +1076,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		return -EFAULT;
 	}
 
-	if (is_vm_hugetlb_page(vma)) {
+	if (is_vm_hugetlb_page(vma) && !logging_active) {
 		hugetlb = true;
 		gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
 	} else {
@@ -1065,7 +1123,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	spin_lock(&kvm->mmu_lock);
 	if (mmu_notifier_retry(kvm, mmu_seq))
 		goto out_unlock;
-	if (!hugetlb && !force_pte)
+	if (!hugetlb && !force_pte && !logging_active)
 		hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
 
 	fault_ipa_uncached = memslot->flags & KVM_MEMSLOT_INCOHERENT;
@@ -1082,17 +1140,22 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
 	} else {
 		pte_t new_pte = pfn_pte(pfn, mem_type);
+		unsigned long flags = logging_active;
+
+		if (pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE))
+			flags |= KVM_S2PTE_FLAG_IS_IOMAP;
+
 		if (writable) {
 			kvm_set_s2pte_writable(&new_pte);
 			kvm_set_pfn_dirty(pfn);
 		}
 		coherent_cache_guest_page(vcpu, hva, PAGE_SIZE,
 					  fault_ipa_uncached);
-		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte,
-			pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE));
+		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, flags);
 	}
 
-
+	if (write_fault)
+		mark_page_dirty(kvm, gfn);
 out_unlock:
 	spin_unlock(&kvm->mmu_lock);
 	kvm_release_pfn_clean(pfn);
@@ -1242,7 +1305,14 @@ static void kvm_set_spte_handler(struct kvm *kvm, gpa_t gpa, void *data)
 {
 	pte_t *pte = (pte_t *)data;
 
-	stage2_set_pte(kvm, NULL, gpa, pte, false);
+	/*
+	 * We can always call stage2_set_pte with KVM_S2PTE_FLAG_LOGGING_ACTIVE
+	 * flag set because MMU notifiers will have unmapped a huge PMD before
+	 * calling ->change_pte() (which in turn calls kvm_set_spte_hva()) and
+	 * therefore stage2_set_pte() never needs to clear out a huge PMD
+	 * through this calling path.
+	 */
+	stage2_set_pte(kvm, NULL, gpa, pte, 0);
 }
 
 
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [PATCH v15 07/11] KVM: arm: page logging 2nd stage fault handling
@ 2014-12-15  7:28   ` Mario Smarduch
  0 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: kvm-ia64

This patch adds support for handling 2nd stage page faults during migration:
it disables faulting in huge pages and dissolves huge pages into page tables.
In case migration is canceled, huge pages are used again.

Also, since the last version an issue was found on an SMP host running an SMP
guest when clearing a huge TLB entry. Multiple CPUs can write to the same huge
page range, so all pages in the range are marked dirty after the TLB is
flushed. It didn't show up on hardware, but appeared on Fast Models, perhaps
because the TLB flush is slower. To prevent clutter in user_mem_abort(), some
code was refactored into functions.

Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/kvm/mmu.c | 86 +++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 78 insertions(+), 8 deletions(-)

diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 73d506f..dc763bb 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -47,6 +47,18 @@ static phys_addr_t hyp_idmap_vector;
 #define kvm_pmd_huge(_x)	(pmd_huge(_x) || pmd_trans_huge(_x))
 #define kvm_pud_huge(_x)	pud_huge(_x)
 
+#define KVM_S2PTE_FLAG_IS_IOMAP		(1UL << 0)
+#define KVM_S2PTE_FLAG_LOGGING_ACTIVE	(1UL << 1)
+
+static bool kvm_get_logging_state(struct kvm_memory_slot *memslot)
+{
+#ifdef CONFIG_ARM
+	return !!memslot->dirty_bitmap;
+#else
+	return false;
+#endif
+}
+
 static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 {
 	/*
@@ -59,6 +71,37 @@ static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 		kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, kvm, ipa);
 }
 
+/**
+ * stage2_dissolve_pmd() - clear and flush huge PMD entry
+ * @kvm:	pointer to kvm structure.
+ * @addr	IPA
+ * @pmd	pmd pointer for IPA
+ *
+ * Function clears a PMD entry, flushes addr 1st and 2nd stage TLBs. Marks all
+ * pages in the range dirty.
+ */
+void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
+{
+	gfn_t gfn;
+	int i;
+
+	if (kvm_pmd_huge(*pmd)) {
+		pmd_clear(pmd);
+		kvm_tlb_flush_vmid_ipa(kvm, addr);
+		put_page(virt_to_page(pmd));
+#ifdef CONFIG_SMP
+		gfn = (addr & PMD_MASK) >> PAGE_SHIFT;
+
+		/*
+		 * Mark all pages in PMD range dirty, in case other CPUs are
+		 * writing to it.
+		 */
+		for (i = 0; i < PTRS_PER_PMD; i++)
+			mark_page_dirty(kvm, gfn + i);
+#endif
+	}
+}
+
 static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
 				  int min, int max)
 {
@@ -703,10 +746,13 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
 }
 
 static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
-			  phys_addr_t addr, const pte_t *new_pte, bool iomap)
+			  phys_addr_t addr, const pte_t *new_pte,
+			  unsigned long flags)
 {
 	pmd_t *pmd;
 	pte_t *pte, old_pte;
+	unsigned long iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
+	unsigned long logging_active = flags & KVM_S2PTE_FLAG_LOGGING_ACTIVE;
 
 	/* Create stage-2 page table mapping - Levels 0 and 1 */
 	pmd = stage2_get_pmd(kvm, cache, addr);
@@ -718,6 +764,13 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
 		return 0;
 	}
 
+	/*
+	 * While dirty page logging - dissolve huge PMD, then continue on to
+	 * allocate page.
+	 */
+	if (logging_active)
+		stage2_dissolve_pmd(kvm, addr, pmd);
+
 	/* Create stage-2 page mappings - Level 2 */
 	if (pmd_none(*pmd)) {
 		if (!cache)
@@ -774,7 +827,8 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
 		if (ret)
 			goto out;
 		spin_lock(&kvm->mmu_lock);
-		ret = stage2_set_pte(kvm, &cache, addr, &pte, true);
+		ret = stage2_set_pte(kvm, &cache, addr, &pte,
+						KVM_S2PTE_FLAG_IS_IOMAP);
 		spin_unlock(&kvm->mmu_lock);
 		if (ret)
 			goto out;
@@ -1002,6 +1056,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	pfn_t pfn;
 	pgprot_t mem_type = PAGE_S2;
 	bool fault_ipa_uncached;
+	unsigned long logging_active = 0;
+
+	if (kvm_get_logging_state(memslot))
+		logging_active = KVM_S2PTE_FLAG_LOGGING_ACTIVE;
 
 	write_fault = kvm_is_write_fault(vcpu);
 	if (fault_status == FSC_PERM && !write_fault) {
@@ -1018,7 +1076,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		return -EFAULT;
 	}
 
-	if (is_vm_hugetlb_page(vma)) {
+	if (is_vm_hugetlb_page(vma) && !logging_active) {
 		hugetlb = true;
 		gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
 	} else {
@@ -1065,7 +1123,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	spin_lock(&kvm->mmu_lock);
 	if (mmu_notifier_retry(kvm, mmu_seq))
 		goto out_unlock;
-	if (!hugetlb && !force_pte)
+	if (!hugetlb && !force_pte && !logging_active)
 		hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
 
 	fault_ipa_uncached = memslot->flags & KVM_MEMSLOT_INCOHERENT;
@@ -1082,17 +1140,22 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
 	} else {
 		pte_t new_pte = pfn_pte(pfn, mem_type);
+		unsigned long flags = logging_active;
+
+		if (pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE))
+			flags |= KVM_S2PTE_FLAG_IS_IOMAP;
+
 		if (writable) {
 			kvm_set_s2pte_writable(&new_pte);
 			kvm_set_pfn_dirty(pfn);
 		}
 		coherent_cache_guest_page(vcpu, hva, PAGE_SIZE,
 					  fault_ipa_uncached);
-		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte,
-			pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE));
+		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, flags);
 	}
 
-
+	if (write_fault)
+		mark_page_dirty(kvm, gfn);
 out_unlock:
 	spin_unlock(&kvm->mmu_lock);
 	kvm_release_pfn_clean(pfn);
@@ -1242,7 +1305,14 @@ static void kvm_set_spte_handler(struct kvm *kvm, gpa_t gpa, void *data)
 {
 	pte_t *pte = (pte_t *)data;
 
-	stage2_set_pte(kvm, NULL, gpa, pte, false);
+	/*
+	 * We can always call stage2_set_pte with KVM_S2PTE_FLAG_LOGGING_ACTIVE
+	 * flag set because MMU notifiers will have unmapped a huge PMD before
+	 * calling ->change_pte() (which in turn calls kvm_set_spte_hva()) and
+	 * therefore stage2_set_pte() never needs to clear out a huge PMD
+	 * through this calling path.
+	 */
+	stage2_set_pte(kvm, NULL, gpa, pte, 0);
 }
 
 
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [PATCH v15 08/11] KVM: arm64: ARMv8 header changes for page logging
  2014-12-15  7:27 ` Mario Smarduch
  (?)
  (?)
@ 2014-12-15  7:28   ` Mario Smarduch
  -1 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: pbonzini, james.hogan, christoffer.dall, agraf, marc.zyngier,
	cornelia.huck, borntraeger, catalin.marinas
  Cc: kvmarm, kvm, kvm-ppc, kvm-ia64, linux-arm-kernel, steve.capper,
	peter.maydell, Mario Smarduch

This patch adds arm64 helpers to write protect pmds/ptes and retrieve
permissions while logging dirty pages. Also adds a prototype to write protect
a memory slot and a pmd define to check for read-only pmds.
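
For reference, the stage 2 access permission encoding (HAP/S2AP, descriptor
bits [7:6]) that the new define and helpers rely on; this only spells out the
existing values, nothing new is introduced here:

    /*
     *   bits [7:6] = 01  ->  read-only    PTE_S2_RDONLY = (1 << 6)
     *   bits [7:6] = 11  ->  read/write   PTE_S2_RDWR   = (3 << 6)
     *
     * so making an entry read-only just drops bit 7, e.g.:
     */
    pte_val(*pte) = (pte_val(*pte) & ~PTE_S2_RDWR) | PTE_S2_RDONLY;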

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm64/include/asm/kvm_asm.h       |  1 +
 arch/arm64/include/asm/kvm_host.h      |  1 +
 arch/arm64/include/asm/kvm_mmu.h       | 21 +++++++++++++++++++++
 arch/arm64/include/asm/pgtable-hwdef.h |  1 +
 4 files changed, 24 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 4838421..4f7310f 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -126,6 +126,7 @@ extern char __kvm_hyp_vector[];
 
 extern void __kvm_flush_vm_context(void);
 extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
+extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
 
 extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
 
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 2012c4b..8b60c0f 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -200,6 +200,7 @@ struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
 struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void);
 
 u64 kvm_call_hyp(void *hypfn, ...);
+void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
 
 int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		int exception_index);
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 123b521..f925e40 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -117,6 +117,27 @@ static inline void kvm_set_s2pmd_writable(pmd_t *pmd)
 	pmd_val(*pmd) |= PMD_S2_RDWR;
 }
 
+static inline void kvm_set_s2pte_readonly(pte_t *pte)
+{
+	pte_val(*pte) = (pte_val(*pte) & ~PTE_S2_RDWR) | PTE_S2_RDONLY;
+}
+
+static inline bool kvm_s2pte_readonly(pte_t *pte)
+{
+	return (pte_val(*pte) & PTE_S2_RDWR) == PTE_S2_RDONLY;
+}
+
+static inline void kvm_set_s2pmd_readonly(pmd_t *pmd)
+{
+	pmd_val(*pmd) = (pmd_val(*pmd) & ~PMD_S2_RDWR) | PMD_S2_RDONLY;
+}
+
+static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
+{
+	return (pmd_val(*pmd) & PMD_S2_RDWR) == PMD_S2_RDONLY;
+}
+
+
 #define kvm_pgd_addr_end(addr, end)	pgd_addr_end(addr, end)
 #define kvm_pud_addr_end(addr, end)	pud_addr_end(addr, end)
 #define kvm_pmd_addr_end(addr, end)	pmd_addr_end(addr, end)
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 88174e0..5f930cc 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -119,6 +119,7 @@
 #define PTE_S2_RDONLY		(_AT(pteval_t, 1) << 6)   /* HAP[2:1] */
 #define PTE_S2_RDWR		(_AT(pteval_t, 3) << 6)   /* HAP[2:1] */
 
+#define PMD_S2_RDONLY		(_AT(pmdval_t, 1) << 6)   /* HAP[2:1] */
 #define PMD_S2_RDWR		(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 
 /*
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [PATCH v15 08/11] KVM: arm64: ARMv8 header changes for page logging
@ 2014-12-15  7:28   ` Mario Smarduch
  0 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: linux-arm-kernel

This patch adds arm64 helpers to write protect pmds/ptes and retrieve
permissions while logging dirty pages. Also adds a prototype to write protect
a memory slot and a pmd define to check for read-only pmds.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm64/include/asm/kvm_asm.h       |  1 +
 arch/arm64/include/asm/kvm_host.h      |  1 +
 arch/arm64/include/asm/kvm_mmu.h       | 21 +++++++++++++++++++++
 arch/arm64/include/asm/pgtable-hwdef.h |  1 +
 4 files changed, 24 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 4838421..4f7310f 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -126,6 +126,7 @@ extern char __kvm_hyp_vector[];
 
 extern void __kvm_flush_vm_context(void);
 extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
+extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
 
 extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
 
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 2012c4b..8b60c0f 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -200,6 +200,7 @@ struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
 struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void);
 
 u64 kvm_call_hyp(void *hypfn, ...);
+void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
 
 int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		int exception_index);
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 123b521..f925e40 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -117,6 +117,27 @@ static inline void kvm_set_s2pmd_writable(pmd_t *pmd)
 	pmd_val(*pmd) |= PMD_S2_RDWR;
 }
 
+static inline void kvm_set_s2pte_readonly(pte_t *pte)
+{
+	pte_val(*pte) = (pte_val(*pte) & ~PTE_S2_RDWR) | PTE_S2_RDONLY;
+}
+
+static inline bool kvm_s2pte_readonly(pte_t *pte)
+{
+	return (pte_val(*pte) & PTE_S2_RDWR) == PTE_S2_RDONLY;
+}
+
+static inline void kvm_set_s2pmd_readonly(pmd_t *pmd)
+{
+	pmd_val(*pmd) = (pmd_val(*pmd) & ~PMD_S2_RDWR) | PMD_S2_RDONLY;
+}
+
+static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
+{
+	return (pmd_val(*pmd) & PMD_S2_RDWR) == PMD_S2_RDONLY;
+}
+
+
 #define kvm_pgd_addr_end(addr, end)	pgd_addr_end(addr, end)
 #define kvm_pud_addr_end(addr, end)	pud_addr_end(addr, end)
 #define kvm_pmd_addr_end(addr, end)	pmd_addr_end(addr, end)
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 88174e0..5f930cc 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -119,6 +119,7 @@
 #define PTE_S2_RDONLY		(_AT(pteval_t, 1) << 6)   /* HAP[2:1] */
 #define PTE_S2_RDWR		(_AT(pteval_t, 3) << 6)   /* HAP[2:1] */
 
+#define PMD_S2_RDONLY		(_AT(pmdval_t, 1) << 6)   /* HAP[2:1] */
 #define PMD_S2_RDWR		(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 
 /*
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [PATCH v15 08/11] KVM: arm64: ARMv8 header changes for page logging
@ 2014-12-15  7:28   ` Mario Smarduch
  0 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: pbonzini, james.hogan, christoffer.dall, agraf, marc.zyngier,
	cornelia.huck, borntraeger, catalin.marinas
  Cc: kvmarm, kvm, kvm-ppc, kvm-ia64, linux-arm-kernel, steve.capper,
	peter.maydell, Mario Smarduch

This patch adds arm64 helpers to write protect pmds/ptes and retrieve
permissions while logging dirty pages. Also adds a prototype to write protect
a memory slot and a pmd define to check for read-only pmds.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm64/include/asm/kvm_asm.h       |  1 +
 arch/arm64/include/asm/kvm_host.h      |  1 +
 arch/arm64/include/asm/kvm_mmu.h       | 21 +++++++++++++++++++++
 arch/arm64/include/asm/pgtable-hwdef.h |  1 +
 4 files changed, 24 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 4838421..4f7310f 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -126,6 +126,7 @@ extern char __kvm_hyp_vector[];
 
 extern void __kvm_flush_vm_context(void);
 extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
+extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
 
 extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
 
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 2012c4b..8b60c0f 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -200,6 +200,7 @@ struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
 struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void);
 
 u64 kvm_call_hyp(void *hypfn, ...);
+void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
 
 int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		int exception_index);
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 123b521..f925e40 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -117,6 +117,27 @@ static inline void kvm_set_s2pmd_writable(pmd_t *pmd)
 	pmd_val(*pmd) |= PMD_S2_RDWR;
 }
 
+static inline void kvm_set_s2pte_readonly(pte_t *pte)
+{
+	pte_val(*pte) = (pte_val(*pte) & ~PTE_S2_RDWR) | PTE_S2_RDONLY;
+}
+
+static inline bool kvm_s2pte_readonly(pte_t *pte)
+{
+	return (pte_val(*pte) & PTE_S2_RDWR) == PTE_S2_RDONLY;
+}
+
+static inline void kvm_set_s2pmd_readonly(pmd_t *pmd)
+{
+	pmd_val(*pmd) = (pmd_val(*pmd) & ~PMD_S2_RDWR) | PMD_S2_RDONLY;
+}
+
+static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
+{
+	return (pmd_val(*pmd) & PMD_S2_RDWR) == PMD_S2_RDONLY;
+}
+
+
 #define kvm_pgd_addr_end(addr, end)	pgd_addr_end(addr, end)
 #define kvm_pud_addr_end(addr, end)	pud_addr_end(addr, end)
 #define kvm_pmd_addr_end(addr, end)	pmd_addr_end(addr, end)
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 88174e0..5f930cc 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -119,6 +119,7 @@
 #define PTE_S2_RDONLY		(_AT(pteval_t, 1) << 6)   /* HAP[2:1] */
 #define PTE_S2_RDWR		(_AT(pteval_t, 3) << 6)   /* HAP[2:1] */
 
+#define PMD_S2_RDONLY		(_AT(pmdval_t, 1) << 6)   /* HAP[2:1] */
 #define PMD_S2_RDWR		(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 
 /*
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 110+ messages in thread
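
The helpers above only manipulate the stage-2 HAP[2:1] field of a descriptor:
01 encodes read-only, 11 encodes read/write. A minimal standalone sketch of
that encoding (plain C, not kernel code; the constants simply mirror the
values in the hunk above):

#include <stdint.h>
#include <stdio.h>

#define S2_RDONLY (UINT64_C(1) << 6)	/* HAP[2:1] = 01, read-only */
#define S2_RDWR   (UINT64_C(3) << 6)	/* HAP[2:1] = 11, read/write */

/* Clear both HAP bits, then set the read-only encoding. */
static uint64_t s2_set_readonly(uint64_t desc)
{
	return (desc & ~S2_RDWR) | S2_RDONLY;
}

/* A descriptor is read-only only when HAP[2:1] is exactly 01. */
static int s2_is_readonly(uint64_t desc)
{
	return (desc & S2_RDWR) == S2_RDONLY;
}

int main(void)
{
	uint64_t desc = S2_RDWR;	/* start as a writable descriptor */

	desc = s2_set_readonly(desc);
	printf("read-only: %d\n", s2_is_readonly(desc));	/* prints 1 */
	return 0;
}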

* [PATCH v15 08/11] KVM: arm64: ARMv8 header changes for page logging
@ 2014-12-15  7:28   ` Mario Smarduch
  0 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: kvm-ia64

This patch adds arm64 helpers to write protect PMDs/PTEs and retrieve
permissions while logging dirty pages. It also adds a prototype to write
protect a memory slot and a PMD define to check for read-only PMDs.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm64/include/asm/kvm_asm.h       |  1 +
 arch/arm64/include/asm/kvm_host.h      |  1 +
 arch/arm64/include/asm/kvm_mmu.h       | 21 +++++++++++++++++++++
 arch/arm64/include/asm/pgtable-hwdef.h |  1 +
 4 files changed, 24 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 4838421..4f7310f 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -126,6 +126,7 @@ extern char __kvm_hyp_vector[];
 
 extern void __kvm_flush_vm_context(void);
 extern void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa);
+extern void __kvm_tlb_flush_vmid(struct kvm *kvm);
 
 extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
 
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 2012c4b..8b60c0f 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -200,6 +200,7 @@ struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
 struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void);
 
 u64 kvm_call_hyp(void *hypfn, ...);
+void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
 
 int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		int exception_index);
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 123b521..f925e40 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -117,6 +117,27 @@ static inline void kvm_set_s2pmd_writable(pmd_t *pmd)
 	pmd_val(*pmd) |= PMD_S2_RDWR;
 }
 
+static inline void kvm_set_s2pte_readonly(pte_t *pte)
+{
+	pte_val(*pte) = (pte_val(*pte) & ~PTE_S2_RDWR) | PTE_S2_RDONLY;
+}
+
+static inline bool kvm_s2pte_readonly(pte_t *pte)
+{
+	return (pte_val(*pte) & PTE_S2_RDWR) == PTE_S2_RDONLY;
+}
+
+static inline void kvm_set_s2pmd_readonly(pmd_t *pmd)
+{
+	pmd_val(*pmd) = (pmd_val(*pmd) & ~PMD_S2_RDWR) | PMD_S2_RDONLY;
+}
+
+static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
+{
+	return (pmd_val(*pmd) & PMD_S2_RDWR) == PMD_S2_RDONLY;
+}
+
+
 #define kvm_pgd_addr_end(addr, end)	pgd_addr_end(addr, end)
 #define kvm_pud_addr_end(addr, end)	pud_addr_end(addr, end)
 #define kvm_pmd_addr_end(addr, end)	pmd_addr_end(addr, end)
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 88174e0..5f930cc 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -119,6 +119,7 @@
 #define PTE_S2_RDONLY		(_AT(pteval_t, 1) << 6)   /* HAP[2:1] */
 #define PTE_S2_RDWR		(_AT(pteval_t, 3) << 6)   /* HAP[2:1] */
 
+#define PMD_S2_RDONLY		(_AT(pmdval_t, 1) << 6)   /* HAP[2:1] */
 #define PMD_S2_RDWR		(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 
 /*
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [PATCH v15 09/11] KVM: arm64: Add HYP interface to flush VM Stage 1/2 TLB entries
  2014-12-15  7:27 ` Mario Smarduch
  (?)
  (?)
@ 2014-12-15  7:28   ` Mario Smarduch
  -1 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: pbonzini, james.hogan, christoffer.dall, agraf, marc.zyngier,
	cornelia.huck, borntraeger, catalin.marinas
  Cc: kvmarm, kvm, kvm-ppc, kvm-ia64, linux-arm-kernel, steve.capper,
	peter.maydell, Mario Smarduch

This patch adds an arm64 HYP interface to flush all TLB entries associated
with a VMID.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm64/kvm/hyp.S | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index b72aa9f..6e1b5df 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -1030,6 +1030,28 @@ ENTRY(__kvm_tlb_flush_vmid_ipa)
 	ret
 ENDPROC(__kvm_tlb_flush_vmid_ipa)
 
+/**
+ * void __kvm_tlb_flush_vmid(struct kvm *kvm) - Flush per-VMID TLBs
+ * @struct kvm *kvm - pointer to kvm structure
+ *
+ * Invalidates all Stage 1 and 2 TLB entries for current VMID.
+ */
+ENTRY(__kvm_tlb_flush_vmid)
+	dsb     ishst
+
+	kern_hyp_va     x0
+	ldr     x2, [x0, #KVM_VTTBR]
+	msr     vttbr_el2, x2
+	isb
+
+	tlbi    vmalls12e1is
+	dsb     ish
+	isb
+
+	msr     vttbr_el2, xzr
+	ret
+ENDPROC(__kvm_tlb_flush_vmid)
+
 ENTRY(__kvm_flush_vm_context)
 	dsb	ishst
 	tlbi	alle1is
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [PATCH v15 09/11] KVM: arm64: Add HYP interface to flush VM Stage 1/2 TLB entries
@ 2014-12-15  7:28   ` Mario Smarduch
  0 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: linux-arm-kernel

This patch adds an arm64 HYP interface to flush all TLB entries associated
with a VMID.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm64/kvm/hyp.S | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index b72aa9f..6e1b5df 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -1030,6 +1030,28 @@ ENTRY(__kvm_tlb_flush_vmid_ipa)
 	ret
 ENDPROC(__kvm_tlb_flush_vmid_ipa)
 
+/**
+ * void __kvm_tlb_flush_vmid(struct kvm *kvm) - Flush per-VMID TLBs
+ * @struct kvm *kvm - pointer to kvm structure
+ *
+ * Invalidates all Stage 1 and 2 TLB entries for current VMID.
+ */
+ENTRY(__kvm_tlb_flush_vmid)
+	dsb     ishst
+
+	kern_hyp_va     x0
+	ldr     x2, [x0, #KVM_VTTBR]
+	msr     vttbr_el2, x2
+	isb
+
+	tlbi    vmalls12e1is
+	dsb     ish
+	isb
+
+	msr     vttbr_el2, xzr
+	ret
+ENDPROC(__kvm_tlb_flush_vmid)
+
 ENTRY(__kvm_flush_vm_context)
 	dsb	ishst
 	tlbi	alle1is
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [PATCH v15 09/11] KVM: arm64: Add HYP interface to flush VM Stage 1/2 TLB entries
@ 2014-12-15  7:28   ` Mario Smarduch
  0 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: pbonzini, james.hogan, christoffer.dall, agraf, marc.zyngier,
	cornelia.huck, borntraeger, catalin.marinas
  Cc: kvmarm, kvm, kvm-ppc, kvm-ia64, linux-arm-kernel, steve.capper,
	peter.maydell, Mario Smarduch

This patch adds an arm64 HYP interface to flush all TLB entries associated
with a VMID.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm64/kvm/hyp.S | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index b72aa9f..6e1b5df 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -1030,6 +1030,28 @@ ENTRY(__kvm_tlb_flush_vmid_ipa)
 	ret
 ENDPROC(__kvm_tlb_flush_vmid_ipa)
 
+/**
+ * void __kvm_tlb_flush_vmid(struct kvm *kvm) - Flush per-VMID TLBs
+ * @struct kvm *kvm - pointer to kvm structure
+ *
+ * Invalidates all Stage 1 and 2 TLB entries for current VMID.
+ */
+ENTRY(__kvm_tlb_flush_vmid)
+	dsb     ishst
+
+	kern_hyp_va     x0
+	ldr     x2, [x0, #KVM_VTTBR]
+	msr     vttbr_el2, x2
+	isb
+
+	tlbi    vmalls12e1is
+	dsb     ish
+	isb
+
+	msr     vttbr_el2, xzr
+	ret
+ENDPROC(__kvm_tlb_flush_vmid)
+
 ENTRY(__kvm_flush_vm_context)
 	dsb	ishst
 	tlbi	alle1is
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [PATCH v15 09/11] KVM: arm64: Add HYP interface to flush VM Stage 1/2 TLB entries
@ 2014-12-15  7:28   ` Mario Smarduch
  0 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: kvm-ia64

This patch adds an arm64 HYP interface to flush all TLB entries associated
with a VMID.

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm64/kvm/hyp.S | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index b72aa9f..6e1b5df 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -1030,6 +1030,28 @@ ENTRY(__kvm_tlb_flush_vmid_ipa)
 	ret
 ENDPROC(__kvm_tlb_flush_vmid_ipa)
 
+/**
+ * void __kvm_tlb_flush_vmid(struct kvm *kvm) - Flush per-VMID TLBs
+ * @struct kvm *kvm - pointer to kvm structure
+ *
+ * Invalidates all Stage 1 and 2 TLB entries for current VMID.
+ */
+ENTRY(__kvm_tlb_flush_vmid)
+	dsb     ishst
+
+	kern_hyp_va     x0
+	ldr     x2, [x0, #KVM_VTTBR]
+	msr     vttbr_el2, x2
+	isb
+
+	tlbi    vmalls12e1is
+	dsb     ish
+	isb
+
+	msr     vttbr_el2, xzr
+	ret
+ENDPROC(__kvm_tlb_flush_vmid)
+
 ENTRY(__kvm_flush_vm_context)
 	dsb	ishst
 	tlbi	alle1is
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [PATCH v15 10/11] KVM: arm/arm64: Enable Dirty Page logging for ARMv8
  2014-12-15  7:27 ` Mario Smarduch
  (?)
  (?)
@ 2014-12-15  7:28   ` Mario Smarduch
  -1 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: pbonzini, james.hogan, christoffer.dall, agraf, marc.zyngier,
	cornelia.huck, borntraeger, catalin.marinas
  Cc: kvmarm, kvm, kvm-ppc, kvm-ia64, linux-arm-kernel, steve.capper,
	peter.maydell, Mario Smarduch

This patch enables ARMv8 dirty page logging support. It plugs ARMv8 into the
generic layer through a Kconfig symbol and drops the earlier ARM64 constraints
so that logging is enabled at the architecture layer.

Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/include/asm/kvm_host.h | 12 ------------
 arch/arm/kvm/arm.c              |  4 ----
 arch/arm/kvm/mmu.c              | 19 +++++++++++--------
 arch/arm64/kvm/Kconfig          |  2 ++
 4 files changed, 13 insertions(+), 24 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index b138431..088ea87 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -223,18 +223,6 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
 	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
 }
 
-/**
- * kvm_flush_remote_tlbs() - flush all VM TLB entries
- * @kvm:	pointer to kvm structure.
- *
- * Interface to HYP function to flush all VM TLB entries without address
- * parameter.
- */
-static inline void kvm_flush_remote_tlbs(struct kvm *kvm)
-{
-	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
-}
-
 static inline int kvm_arch_dev_ioctl_check_extension(long ext)
 {
 	return 0;
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 6e4290c..1b6577c 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -740,7 +740,6 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
  */
 int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
 {
-#ifdef CONFIG_ARM
 	bool is_dirty = false;
 	int r;
 
@@ -753,9 +752,6 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
 
 	mutex_unlock(&kvm->slots_lock);
 	return r;
-#else /* arm64 */
-	return -EINVAL;
-#endif
 }
 
 static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index dc763bb..59003df 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -52,11 +52,18 @@ static phys_addr_t hyp_idmap_vector;
 
 static bool kvm_get_logging_state(struct kvm_memory_slot *memslot)
 {
-#ifdef CONFIG_ARM
 	return !!memslot->dirty_bitmap;
-#else
-	return false;
-#endif
+}
+
+/**
+ * kvm_flush_remote_tlbs() - flush all VM TLB entries for v7/8
+ * @kvm:	pointer to kvm structure.
+ *
+ * Interface to HYP function to flush all VM TLB entries
+ */
+inline void kvm_flush_remote_tlbs(struct kvm *kvm)
+{
+	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
 }
 
 static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
@@ -895,7 +902,6 @@ static bool kvm_is_device_pfn(unsigned long pfn)
 	return !pfn_valid(pfn);
 }
 
-#ifdef CONFIG_ARM
 /**
  * stage2_wp_ptes - write protect PMD range
  * @pmd:	pointer to pmd entry
@@ -1040,7 +1046,6 @@ void kvm_arch_mmu_write_protect_pt_masked(struct kvm *kvm,
 
 	stage2_wp_range(kvm, start, end);
 }
-#endif
 
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  struct kvm_memory_slot *memslot, unsigned long hva,
@@ -1445,7 +1450,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 				   const struct kvm_memory_slot *old,
 				   enum kvm_mr_change change)
 {
-#ifdef CONFIG_ARM
 	/*
 	 * At this point memslot has been committed and there is an
 	 * allocated dirty_bitmap[], dirty pages will be be tracked while the
@@ -1453,7 +1457,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 	 */
 	if (change != KVM_MR_DELETE && mem->flags & KVM_MEM_LOG_DIRTY_PAGES)
 		kvm_mmu_wp_memory_region(kvm, mem->slot);
-#endif
 }
 
 int kvm_arch_prepare_memory_region(struct kvm *kvm,
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index 8ba85e9..3ce389b 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -22,10 +22,12 @@ config KVM
 	select PREEMPT_NOTIFIERS
 	select ANON_INODES
 	select HAVE_KVM_CPU_RELAX_INTERCEPT
+	select HAVE_KVM_ARCH_TLB_FLUSH_ALL
 	select KVM_MMIO
 	select KVM_ARM_HOST
 	select KVM_ARM_VGIC
 	select KVM_ARM_TIMER
+	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
 	---help---
 	  Support hosting virtualized guest machines.
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 110+ messages in thread
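
With this enabled, userspace retrieves the per-memslot dirty bitmap through
the standard KVM_GET_DIRTY_LOG ioctl. A hedged userspace sketch of that side
(vm_fd, slot and slot_pages are placeholders for values the VMM already
tracks; error handling trimmed to the minimum):

#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Fetch the dirty bitmap for one memslot; one bit per guest page. */
static void *get_dirty_log(int vm_fd, unsigned int slot, size_t slot_pages)
{
	struct kvm_dirty_log log;
	size_t bitmap_bytes = (slot_pages + 7) / 8;
	void *bitmap = calloc(1, bitmap_bytes);

	if (!bitmap)
		return NULL;

	memset(&log, 0, sizeof(log));
	log.slot = slot;
	log.dirty_bitmap = bitmap;

	/*
	 * With KVM_GENERIC_DIRTYLOG_READ_PROTECT selected above, the pages
	 * reported here are write protected again for the next pass.
	 */
	if (ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log) < 0) {
		free(bitmap);
		return NULL;
	}
	return bitmap;
}

The caller walks the returned bitmap, copies the pages it marks, and repeats
until the remaining dirty set is small enough to stop the guest.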

* [PATCH v15 10/11] KVM: arm/arm64: Enable Dirty Page logging for ARMv8
@ 2014-12-15  7:28   ` Mario Smarduch
  0 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: linux-arm-kernel

This patch enables ARMv8 dirty page logging support. It plugs ARMv8 into the
generic layer through a Kconfig symbol and drops the earlier ARM64 constraints
so that logging is enabled at the architecture layer.

Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/include/asm/kvm_host.h | 12 ------------
 arch/arm/kvm/arm.c              |  4 ----
 arch/arm/kvm/mmu.c              | 19 +++++++++++--------
 arch/arm64/kvm/Kconfig          |  2 ++
 4 files changed, 13 insertions(+), 24 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index b138431..088ea87 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -223,18 +223,6 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
 	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
 }
 
-/**
- * kvm_flush_remote_tlbs() - flush all VM TLB entries
- * @kvm:	pointer to kvm structure.
- *
- * Interface to HYP function to flush all VM TLB entries without address
- * parameter.
- */
-static inline void kvm_flush_remote_tlbs(struct kvm *kvm)
-{
-	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
-}
-
 static inline int kvm_arch_dev_ioctl_check_extension(long ext)
 {
 	return 0;
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 6e4290c..1b6577c 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -740,7 +740,6 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
  */
 int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
 {
-#ifdef CONFIG_ARM
 	bool is_dirty = false;
 	int r;
 
@@ -753,9 +752,6 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
 
 	mutex_unlock(&kvm->slots_lock);
 	return r;
-#else /* arm64 */
-	return -EINVAL;
-#endif
 }
 
 static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index dc763bb..59003df 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -52,11 +52,18 @@ static phys_addr_t hyp_idmap_vector;
 
 static bool kvm_get_logging_state(struct kvm_memory_slot *memslot)
 {
-#ifdef CONFIG_ARM
 	return !!memslot->dirty_bitmap;
-#else
-	return false;
-#endif
+}
+
+/**
+ * kvm_flush_remote_tlbs() - flush all VM TLB entries for v7/8
+ * @kvm:	pointer to kvm structure.
+ *
+ * Interface to HYP function to flush all VM TLB entries
+ */
+inline void kvm_flush_remote_tlbs(struct kvm *kvm)
+{
+	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
 }
 
 static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
@@ -895,7 +902,6 @@ static bool kvm_is_device_pfn(unsigned long pfn)
 	return !pfn_valid(pfn);
 }
 
-#ifdef CONFIG_ARM
 /**
  * stage2_wp_ptes - write protect PMD range
  * @pmd:	pointer to pmd entry
@@ -1040,7 +1046,6 @@ void kvm_arch_mmu_write_protect_pt_masked(struct kvm *kvm,
 
 	stage2_wp_range(kvm, start, end);
 }
-#endif
 
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  struct kvm_memory_slot *memslot, unsigned long hva,
@@ -1445,7 +1450,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 				   const struct kvm_memory_slot *old,
 				   enum kvm_mr_change change)
 {
-#ifdef CONFIG_ARM
 	/*
 	 * At this point memslot has been committed and there is an
 	 * allocated dirty_bitmap[], dirty pages will be be tracked while the
@@ -1453,7 +1457,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 	 */
 	if (change != KVM_MR_DELETE && mem->flags & KVM_MEM_LOG_DIRTY_PAGES)
 		kvm_mmu_wp_memory_region(kvm, mem->slot);
-#endif
 }
 
 int kvm_arch_prepare_memory_region(struct kvm *kvm,
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index 8ba85e9..3ce389b 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -22,10 +22,12 @@ config KVM
 	select PREEMPT_NOTIFIERS
 	select ANON_INODES
 	select HAVE_KVM_CPU_RELAX_INTERCEPT
+	select HAVE_KVM_ARCH_TLB_FLUSH_ALL
 	select KVM_MMIO
 	select KVM_ARM_HOST
 	select KVM_ARM_VGIC
 	select KVM_ARM_TIMER
+	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
 	---help---
 	  Support hosting virtualized guest machines.
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [PATCH v15 10/11] KVM: arm/arm64: Enable Dirty Page logging for ARMv8
@ 2014-12-15  7:28   ` Mario Smarduch
  0 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: pbonzini, james.hogan, christoffer.dall, agraf, marc.zyngier,
	cornelia.huck, borntraeger, catalin.marinas
  Cc: kvmarm, kvm, kvm-ppc, kvm-ia64, linux-arm-kernel, steve.capper,
	peter.maydell, Mario Smarduch

This patch enables ARMv8 dirty page logging support. It plugs ARMv8 into the
generic layer through a Kconfig symbol and drops the earlier ARM64 constraints
so that logging is enabled at the architecture layer.

Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/include/asm/kvm_host.h | 12 ------------
 arch/arm/kvm/arm.c              |  4 ----
 arch/arm/kvm/mmu.c              | 19 +++++++++++--------
 arch/arm64/kvm/Kconfig          |  2 ++
 4 files changed, 13 insertions(+), 24 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index b138431..088ea87 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -223,18 +223,6 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
 	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
 }
 
-/**
- * kvm_flush_remote_tlbs() - flush all VM TLB entries
- * @kvm:	pointer to kvm structure.
- *
- * Interface to HYP function to flush all VM TLB entries without address
- * parameter.
- */
-static inline void kvm_flush_remote_tlbs(struct kvm *kvm)
-{
-	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
-}
-
 static inline int kvm_arch_dev_ioctl_check_extension(long ext)
 {
 	return 0;
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 6e4290c..1b6577c 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -740,7 +740,6 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
  */
 int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
 {
-#ifdef CONFIG_ARM
 	bool is_dirty = false;
 	int r;
 
@@ -753,9 +752,6 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
 
 	mutex_unlock(&kvm->slots_lock);
 	return r;
-#else /* arm64 */
-	return -EINVAL;
-#endif
 }
 
 static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index dc763bb..59003df 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -52,11 +52,18 @@ static phys_addr_t hyp_idmap_vector;
 
 static bool kvm_get_logging_state(struct kvm_memory_slot *memslot)
 {
-#ifdef CONFIG_ARM
 	return !!memslot->dirty_bitmap;
-#else
-	return false;
-#endif
+}
+
+/**
+ * kvm_flush_remote_tlbs() - flush all VM TLB entries for v7/8
+ * @kvm:	pointer to kvm structure.
+ *
+ * Interface to HYP function to flush all VM TLB entries
+ */
+inline void kvm_flush_remote_tlbs(struct kvm *kvm)
+{
+	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
 }
 
 static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
@@ -895,7 +902,6 @@ static bool kvm_is_device_pfn(unsigned long pfn)
 	return !pfn_valid(pfn);
 }
 
-#ifdef CONFIG_ARM
 /**
  * stage2_wp_ptes - write protect PMD range
  * @pmd:	pointer to pmd entry
@@ -1040,7 +1046,6 @@ void kvm_arch_mmu_write_protect_pt_masked(struct kvm *kvm,
 
 	stage2_wp_range(kvm, start, end);
 }
-#endif
 
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  struct kvm_memory_slot *memslot, unsigned long hva,
@@ -1445,7 +1450,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 				   const struct kvm_memory_slot *old,
 				   enum kvm_mr_change change)
 {
-#ifdef CONFIG_ARM
 	/*
 	 * At this point memslot has been committed and there is an
 	 * allocated dirty_bitmap[], dirty pages will be be tracked while the
@@ -1453,7 +1457,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 	 */
 	if (change != KVM_MR_DELETE && mem->flags & KVM_MEM_LOG_DIRTY_PAGES)
 		kvm_mmu_wp_memory_region(kvm, mem->slot);
-#endif
 }
 
 int kvm_arch_prepare_memory_region(struct kvm *kvm,
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index 8ba85e9..3ce389b 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -22,10 +22,12 @@ config KVM
 	select PREEMPT_NOTIFIERS
 	select ANON_INODES
 	select HAVE_KVM_CPU_RELAX_INTERCEPT
+	select HAVE_KVM_ARCH_TLB_FLUSH_ALL
 	select KVM_MMIO
 	select KVM_ARM_HOST
 	select KVM_ARM_VGIC
 	select KVM_ARM_TIMER
+	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
 	---help---
 	  Support hosting virtualized guest machines.
 
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [PATCH v15 10/11] KVM: arm/arm64: Enable Dirty Page logging for ARMv8
@ 2014-12-15  7:28   ` Mario Smarduch
  0 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: kvm-ia64

This patch enables ARMv8 dirty page logging support. It plugs ARMv8 into the
generic layer through a Kconfig symbol and drops the earlier ARM64 constraints
so that logging is enabled at the architecture layer.

Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/include/asm/kvm_host.h | 12 ------------
 arch/arm/kvm/arm.c              |  4 ----
 arch/arm/kvm/mmu.c              | 19 +++++++++++--------
 arch/arm64/kvm/Kconfig          |  2 ++
 4 files changed, 13 insertions(+), 24 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index b138431..088ea87 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -223,18 +223,6 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
 	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
 }
 
-/**
- * kvm_flush_remote_tlbs() - flush all VM TLB entries
- * @kvm:	pointer to kvm structure.
- *
- * Interface to HYP function to flush all VM TLB entries without address
- * parameter.
- */
-static inline void kvm_flush_remote_tlbs(struct kvm *kvm)
-{
-	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
-}
-
 static inline int kvm_arch_dev_ioctl_check_extension(long ext)
 {
 	return 0;
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 6e4290c..1b6577c 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -740,7 +740,6 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
  */
 int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
 {
-#ifdef CONFIG_ARM
 	bool is_dirty = false;
 	int r;
 
@@ -753,9 +752,6 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
 
 	mutex_unlock(&kvm->slots_lock);
 	return r;
-#else /* arm64 */
-	return -EINVAL;
-#endif
 }
 
 static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index dc763bb..59003df 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -52,11 +52,18 @@ static phys_addr_t hyp_idmap_vector;
 
 static bool kvm_get_logging_state(struct kvm_memory_slot *memslot)
 {
-#ifdef CONFIG_ARM
 	return !!memslot->dirty_bitmap;
-#else
-	return false;
-#endif
+}
+
+/**
+ * kvm_flush_remote_tlbs() - flush all VM TLB entries for v7/8
+ * @kvm:	pointer to kvm structure.
+ *
+ * Interface to HYP function to flush all VM TLB entries
+ */
+inline void kvm_flush_remote_tlbs(struct kvm *kvm)
+{
+	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
 }
 
 static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
@@ -895,7 +902,6 @@ static bool kvm_is_device_pfn(unsigned long pfn)
 	return !pfn_valid(pfn);
 }
 
-#ifdef CONFIG_ARM
 /**
  * stage2_wp_ptes - write protect PMD range
  * @pmd:	pointer to pmd entry
@@ -1040,7 +1046,6 @@ void kvm_arch_mmu_write_protect_pt_masked(struct kvm *kvm,
 
 	stage2_wp_range(kvm, start, end);
 }
-#endif
 
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  struct kvm_memory_slot *memslot, unsigned long hva,
@@ -1445,7 +1450,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 				   const struct kvm_memory_slot *old,
 				   enum kvm_mr_change change)
 {
-#ifdef CONFIG_ARM
 	/*
 	 * At this point memslot has been committed and there is an
 	 * allocated dirty_bitmap[], dirty pages will be be tracked while the
@@ -1453,7 +1457,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 	 */
 	if (change != KVM_MR_DELETE && mem->flags & KVM_MEM_LOG_DIRTY_PAGES)
 		kvm_mmu_wp_memory_region(kvm, mem->slot);
-#endif
 }
 
 int kvm_arch_prepare_memory_region(struct kvm *kvm,
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index 8ba85e9..3ce389b 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -22,10 +22,12 @@ config KVM
 	select PREEMPT_NOTIFIERS
 	select ANON_INODES
 	select HAVE_KVM_CPU_RELAX_INTERCEPT
+	select HAVE_KVM_ARCH_TLB_FLUSH_ALL
 	select KVM_MMIO
 	select KVM_ARM_HOST
 	select KVM_ARM_VGIC
 	select KVM_ARM_TIMER
+	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
 	---help---
 	  Support hosting virtualized guest machines.
 
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [PATCH v15 11/11] KVM: arm/arm64: Add support to dissolve huge PUD
  2014-12-15  7:27 ` Mario Smarduch
  (?)
  (?)
@ 2014-12-15  7:28   ` Mario Smarduch
  -1 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: pbonzini, james.hogan, christoffer.dall, agraf, marc.zyngier,
	cornelia.huck, borntraeger, catalin.marinas
  Cc: kvmarm, kvm, kvm-ppc, kvm-ia64, linux-arm-kernel, steve.capper,
	peter.maydell, Mario Smarduch

This patch adds the same support for huge PUDs as for PMDs. A huge PUD is
write protected during initial memory region write protection, and code to
dissolve a huge PUD is supported in user_mem_abort(). At this time this code
has not been tested; a test similar to the current ARMv8 page logging test is
in progress, limiting kernel memory and mapping 1 or 2 GB into the guest
address space on a 4k page/48-bit host. Some host kernel test code still
needs to be added to detect page faults to this region and side-step general
processing. As in the PMD case, all pages in the range are marked dirty when
the PUD entry is cleared.

Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/include/asm/kvm_mmu.h         |  8 +++++
 arch/arm/kvm/mmu.c                     | 64 ++++++++++++++++++++++++++++++++--
 arch/arm64/include/asm/kvm_mmu.h       |  9 +++++
 arch/arm64/include/asm/pgtable-hwdef.h |  3 ++
 4 files changed, 81 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index dda0046..703d04d 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -133,6 +133,14 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
 	return (pmd_val(*pmd) & L_PMD_S2_RDWR) == L_PMD_S2_RDONLY;
 }
 
+static inline void kvm_set_s2pud_readonly(pud_t *pud)
+{
+}
+
+static inline bool kvm_s2pud_readonly(pud_t *pud)
+{
+	return false;
+}
 
 /* Open coded p*d_addr_end that can deal with 64bit addresses */
 #define kvm_pgd_addr_end(addr, end)					\
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 59003df..35840fb 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -109,6 +109,55 @@ void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
 	}
 }
 
+/**
+  * stage2_find_pud() - find a PUD entry
+  * @kvm:	pointer to kvm structure.
+  * @addr:	IPA address
+  *
+  * Return address of PUD entry or NULL if not allocated.
+  */
+static pud_t *stage2_find_pud(struct kvm *kvm, phys_addr_t addr)
+{
+	pgd_t *pgd;
+
+	pgd = kvm->arch.pgd + pgd_index(addr);
+	if (pgd_none(*pgd))
+		return NULL;
+
+	return pud_offset(pgd, addr);
+}
+
+/**
+ * stage2_dissolve_pud() - clear and flush huge PUD entry
+ * @kvm:	pointer to kvm structure.
+ * @addr	IPA
+ *
+ * Function clears a PUD entry, flushes addr 1st and 2nd stage TLBs. Marks all
+ * pages in the range dirty.
+ */
+void stage2_dissolve_pud(struct kvm *kvm, phys_addr_t addr)
+{
+	pud_t *pud;
+	gfn_t gfn;
+	long i;
+
+	pud = stage2_find_pud(kvm, addr);
+	if (pud && !pud_none(*pud) && kvm_pud_huge(*pud)) {
+		pud_clear(pud);
+		kvm_tlb_flush_vmid_ipa(kvm, addr);
+		put_page(virt_to_page(pud));
+#ifdef CONFIG_SMP
+		gfn = (addr & PUD_MASK) >> PAGE_SHIFT;
+		/*
+		 * Mark all pages in PUD range dirty, in case other
+		 * CPUs are  writing to it.
+		 */
+		for (i = 0; i < PTRS_PER_PUD * PTRS_PER_PMD; i++)
+			mark_page_dirty(kvm, gfn + i);
+#endif
+	}
+}
+
 static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
 				  int min, int max)
 {
@@ -761,6 +810,13 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
 	unsigned long iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
 	unsigned long logging_active = flags & KVM_S2PTE_FLAG_LOGGING_ACTIVE;
 
+	/*
+	 * While dirty page logging - dissolve huge PUD, then continue on to
+	 * allocate page.
+	 */
+	if (logging_active)
+		stage2_dissolve_pud(kvm, addr);
+
 	/* Create stage-2 page table mapping - Levels 0 and 1 */
 	pmd = stage2_get_pmd(kvm, cache, addr);
 	if (!pmd) {
@@ -964,9 +1020,11 @@ static void  stage2_wp_puds(pgd_t *pgd, phys_addr_t addr, phys_addr_t end)
 	do {
 		next = kvm_pud_addr_end(addr, end);
 		if (!pud_none(*pud)) {
-			/* TODO:PUD not supported, revisit later if supported */
-			BUG_ON(kvm_pud_huge(*pud));
-			stage2_wp_pmds(pud, addr, next);
+			if (kvm_pud_huge(*pud)) {
+				if (!kvm_s2pud_readonly(pud))
+					kvm_set_s2pud_readonly(pud);
+			} else
+				stage2_wp_pmds(pud, addr, next);
 		}
 	} while (pud++, addr = next, addr != end);
 }
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index f925e40..3b692c5 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -137,6 +137,15 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
 	return (pmd_val(*pmd) & PMD_S2_RDWR) == PMD_S2_RDONLY;
 }
 
+static inline void kvm_set_s2pud_readonly(pud_t *pud)
+{
+	pud_val(*pud) = (pud_val(*pud) & ~PUD_S2_RDWR) | PUD_S2_RDONLY;
+}
+
+static inline bool kvm_s2pud_readonly(pud_t *pud)
+{
+	return (pud_val(*pud) & PUD_S2_RDWR) == PUD_S2_RDONLY;
+}
 
 #define kvm_pgd_addr_end(addr, end)	pgd_addr_end(addr, end)
 #define kvm_pud_addr_end(addr, end)	pud_addr_end(addr, end)
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 5f930cc..1714c84 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -122,6 +122,9 @@
 #define PMD_S2_RDONLY		(_AT(pmdval_t, 1) << 6)   /* HAP[2:1] */
 #define PMD_S2_RDWR		(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 
+#define PUD_S2_RDONLY		(_AT(pudval_t, 1) << 6)   /* HAP[2:1] */
+#define PUD_S2_RDWR		(_AT(pudval_t, 3) << 6)   /* HAP[2:1] */
+
 /*
  * Memory Attribute override for Stage-2 (MemAttr[3:0])
  */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 110+ messages in thread
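
The dirty-marking loop in stage2_dissolve_pud() above visits every gfn backed
by the dissolved huge PUD. A small standalone sketch of that arithmetic,
assuming the 4k page / 48-bit configuration mentioned in the commit message
(512-entry PMD and PTE tables), just to make the cost explicit:

#include <stdio.h>

int main(void)
{
	unsigned long page_size = 4096;		/* 4k pages (assumption) */
	unsigned long ptrs_per_pmd = 512;	/* entries per PMD table */
	unsigned long ptrs_per_pte = 512;	/* entries per PTE table */

	/* pages covered by one huge PUD = PMD entries * PTE entries */
	unsigned long pages = ptrs_per_pmd * ptrs_per_pte;

	printf("pages per huge PUD: %lu (%lu MB)\n",
	       pages, pages * page_size >> 20);	/* 262144 pages, 1024 MB */
	return 0;
}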

* [PATCH v15 11/11] KVM: arm/arm64: Add support to dissolve huge PUD
@ 2014-12-15  7:28   ` Mario Smarduch
  0 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: linux-arm-kernel

This patch adds the same support for huge PUDs as for PMDs. A huge PUD is
write protected during initial memory region write protection, and code to
dissolve a huge PUD is supported in user_mem_abort(). At this time this code
has not been tested; a test similar to the current ARMv8 page logging test is
in progress, limiting kernel memory and mapping 1 or 2 GB into the guest
address space on a 4k page/48-bit host. Some host kernel test code still
needs to be added to detect page faults to this region and side-step general
processing. As in the PMD case, all pages in the range are marked dirty when
the PUD entry is cleared.

Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/include/asm/kvm_mmu.h         |  8 +++++
 arch/arm/kvm/mmu.c                     | 64 ++++++++++++++++++++++++++++++++--
 arch/arm64/include/asm/kvm_mmu.h       |  9 +++++
 arch/arm64/include/asm/pgtable-hwdef.h |  3 ++
 4 files changed, 81 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index dda0046..703d04d 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -133,6 +133,14 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
 	return (pmd_val(*pmd) & L_PMD_S2_RDWR) == L_PMD_S2_RDONLY;
 }
 
+static inline void kvm_set_s2pud_readonly(pud_t *pud)
+{
+}
+
+static inline bool kvm_s2pud_readonly(pud_t *pud)
+{
+	return false;
+}
 
 /* Open coded p*d_addr_end that can deal with 64bit addresses */
 #define kvm_pgd_addr_end(addr, end)					\
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 59003df..35840fb 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -109,6 +109,55 @@ void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
 	}
 }
 
+/**
+  * stage2_find_pud() - find a PUD entry
+  * @kvm:	pointer to kvm structure.
+  * @addr:	IPA address
+  *
+  * Return address of PUD entry or NULL if not allocated.
+  */
+static pud_t *stage2_find_pud(struct kvm *kvm, phys_addr_t addr)
+{
+	pgd_t *pgd;
+
+	pgd = kvm->arch.pgd + pgd_index(addr);
+	if (pgd_none(*pgd))
+		return NULL;
+
+	return pud_offset(pgd, addr);
+}
+
+/**
+ * stage2_dissolve_pud() - clear and flush huge PUD entry
+ * @kvm:	pointer to kvm structure.
+ * @addr	IPA
+ *
+ * Function clears a PUD entry, flushes addr 1st and 2nd stage TLBs. Marks all
+ * pages in the range dirty.
+ */
+void stage2_dissolve_pud(struct kvm *kvm, phys_addr_t addr)
+{
+	pud_t *pud;
+	gfn_t gfn;
+	long i;
+
+	pud = stage2_find_pud(kvm, addr);
+	if (pud && !pud_none(*pud) && kvm_pud_huge(*pud)) {
+		pud_clear(pud);
+		kvm_tlb_flush_vmid_ipa(kvm, addr);
+		put_page(virt_to_page(pud));
+#ifdef CONFIG_SMP
+		gfn = (addr & PUD_MASK) >> PAGE_SHIFT;
+		/*
+		 * Mark all pages in PUD range dirty, in case other
+		 * CPUs are  writing to it.
+		 */
+		for (i = 0; i < PTRS_PER_PUD * PTRS_PER_PMD; i++)
+			mark_page_dirty(kvm, gfn + i);
+#endif
+	}
+}
+
 static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
 				  int min, int max)
 {
@@ -761,6 +810,13 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
 	unsigned long iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
 	unsigned long logging_active = flags & KVM_S2PTE_FLAG_LOGGING_ACTIVE;
 
+	/*
+	 * While dirty page logging - dissolve huge PUD, then continue on to
+	 * allocate page.
+	 */
+	if (logging_active)
+		stage2_dissolve_pud(kvm, addr);
+
 	/* Create stage-2 page table mapping - Levels 0 and 1 */
 	pmd = stage2_get_pmd(kvm, cache, addr);
 	if (!pmd) {
@@ -964,9 +1020,11 @@ static void  stage2_wp_puds(pgd_t *pgd, phys_addr_t addr, phys_addr_t end)
 	do {
 		next = kvm_pud_addr_end(addr, end);
 		if (!pud_none(*pud)) {
-			/* TODO:PUD not supported, revisit later if supported */
-			BUG_ON(kvm_pud_huge(*pud));
-			stage2_wp_pmds(pud, addr, next);
+			if (kvm_pud_huge(*pud)) {
+				if (!kvm_s2pud_readonly(pud))
+					kvm_set_s2pud_readonly(pud);
+			} else
+				stage2_wp_pmds(pud, addr, next);
 		}
 	} while (pud++, addr = next, addr != end);
 }
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index f925e40..3b692c5 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -137,6 +137,15 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
 	return (pmd_val(*pmd) & PMD_S2_RDWR) == PMD_S2_RDONLY;
 }
 
+static inline void kvm_set_s2pud_readonly(pud_t *pud)
+{
+	pud_val(*pud) = (pud_val(*pud) & ~PUD_S2_RDWR) | PUD_S2_RDONLY;
+}
+
+static inline bool kvm_s2pud_readonly(pud_t *pud)
+{
+	return (pud_val(*pud) & PUD_S2_RDWR) == PUD_S2_RDONLY;
+}
 
 #define kvm_pgd_addr_end(addr, end)	pgd_addr_end(addr, end)
 #define kvm_pud_addr_end(addr, end)	pud_addr_end(addr, end)
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 5f930cc..1714c84 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -122,6 +122,9 @@
 #define PMD_S2_RDONLY		(_AT(pmdval_t, 1) << 6)   /* HAP[2:1] */
 #define PMD_S2_RDWR		(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 
+#define PUD_S2_RDONLY		(_AT(pudval_t, 1) << 6)   /* HAP[2:1] */
+#define PUD_S2_RDWR		(_AT(pudval_t, 3) << 6)   /* HAP[2:1] */
+
 /*
  * Memory Attribute override for Stage-2 (MemAttr[3:0])
  */
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [PATCH v15 11/11] KVM: arm/arm64: Add support to dissolve huge PUD
@ 2014-12-15  7:28   ` Mario Smarduch
  0 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: pbonzini, james.hogan, christoffer.dall, agraf, marc.zyngier,
	cornelia.huck, borntraeger, catalin.marinas
  Cc: kvmarm, kvm, kvm-ppc, kvm-ia64, linux-arm-kernel, steve.capper,
	peter.maydell, Mario Smarduch

This patch adds the same support for huge PUDs as for PMDs. A huge PUD is
write protected during initial memory region write protection, and code to
dissolve a huge PUD is supported in user_mem_abort(). At this time this code
has not been tested; a test similar to the current ARMv8 page logging test is
in progress, limiting kernel memory and mapping 1 or 2 GB into the guest
address space on a 4k page/48-bit host. Some host kernel test code still
needs to be added to detect page faults to this region and side-step general
processing. As in the PMD case, all pages in the range are marked dirty when
the PUD entry is cleared.

Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/include/asm/kvm_mmu.h         |  8 +++++
 arch/arm/kvm/mmu.c                     | 64 ++++++++++++++++++++++++++++++++--
 arch/arm64/include/asm/kvm_mmu.h       |  9 +++++
 arch/arm64/include/asm/pgtable-hwdef.h |  3 ++
 4 files changed, 81 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index dda0046..703d04d 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -133,6 +133,14 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
 	return (pmd_val(*pmd) & L_PMD_S2_RDWR) == L_PMD_S2_RDONLY;
 }
 
+static inline void kvm_set_s2pud_readonly(pud_t *pud)
+{
+}
+
+static inline bool kvm_s2pud_readonly(pud_t *pud)
+{
+	return false;
+}
 
 /* Open coded p*d_addr_end that can deal with 64bit addresses */
 #define kvm_pgd_addr_end(addr, end)					\
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 59003df..35840fb 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -109,6 +109,55 @@ void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
 	}
 }
 
+/**
+  * stage2_find_pud() - find a PUD entry
+  * @kvm:	pointer to kvm structure.
+  * @addr:	IPA address
+  *
+  * Return address of PUD entry or NULL if not allocated.
+  */
+static pud_t *stage2_find_pud(struct kvm *kvm, phys_addr_t addr)
+{
+	pgd_t *pgd;
+
+	pgd = kvm->arch.pgd + pgd_index(addr);
+	if (pgd_none(*pgd))
+		return NULL;
+
+	return pud_offset(pgd, addr);
+}
+
+/**
+ * stage2_dissolve_pud() - clear and flush huge PUD entry
+ * @kvm:	pointer to kvm structure.
+ * @addr	IPA
+ *
+ * Function clears a PUD entry, flushes addr 1st and 2nd stage TLBs. Marks all
+ * pages in the range dirty.
+ */
+void stage2_dissolve_pud(struct kvm *kvm, phys_addr_t addr)
+{
+	pud_t *pud;
+	gfn_t gfn;
+	long i;
+
+	pud = stage2_find_pud(kvm, addr);
+	if (pud && !pud_none(*pud) && kvm_pud_huge(*pud)) {
+		pud_clear(pud);
+		kvm_tlb_flush_vmid_ipa(kvm, addr);
+		put_page(virt_to_page(pud));
+#ifdef CONFIG_SMP
+		gfn = (addr & PUD_MASK) >> PAGE_SHIFT;
+		/*
+		 * Mark all pages in PUD range dirty, in case other
+		 * CPUs are  writing to it.
+		 */
+		for (i = 0; i < PTRS_PER_PUD * PTRS_PER_PMD; i++)
+			mark_page_dirty(kvm, gfn + i);
+#endif
+	}
+}
+
 static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
 				  int min, int max)
 {
@@ -761,6 +810,13 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
 	unsigned long iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
 	unsigned long logging_active = flags & KVM_S2PTE_FLAG_LOGGING_ACTIVE;
 
+	/*
+	 * While dirty page logging - dissolve huge PUD, then continue on to
+	 * allocate page.
+	 */
+	if (logging_active)
+		stage2_dissolve_pud(kvm, addr);
+
 	/* Create stage-2 page table mapping - Levels 0 and 1 */
 	pmd = stage2_get_pmd(kvm, cache, addr);
 	if (!pmd) {
@@ -964,9 +1020,11 @@ static void  stage2_wp_puds(pgd_t *pgd, phys_addr_t addr, phys_addr_t end)
 	do {
 		next = kvm_pud_addr_end(addr, end);
 		if (!pud_none(*pud)) {
-			/* TODO:PUD not supported, revisit later if supported */
-			BUG_ON(kvm_pud_huge(*pud));
-			stage2_wp_pmds(pud, addr, next);
+			if (kvm_pud_huge(*pud)) {
+				if (!kvm_s2pud_readonly(pud))
+					kvm_set_s2pud_readonly(pud);
+			} else
+				stage2_wp_pmds(pud, addr, next);
 		}
 	} while (pud++, addr = next, addr != end);
 }
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index f925e40..3b692c5 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -137,6 +137,15 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
 	return (pmd_val(*pmd) & PMD_S2_RDWR) == PMD_S2_RDONLY;
 }
 
+static inline void kvm_set_s2pud_readonly(pud_t *pud)
+{
+	pud_val(*pud) = (pud_val(*pud) & ~PUD_S2_RDWR) | PUD_S2_RDONLY;
+}
+
+static inline bool kvm_s2pud_readonly(pud_t *pud)
+{
+	return (pud_val(*pud) & PUD_S2_RDWR) == PUD_S2_RDONLY;
+}
 
 #define kvm_pgd_addr_end(addr, end)	pgd_addr_end(addr, end)
 #define kvm_pud_addr_end(addr, end)	pud_addr_end(addr, end)
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 5f930cc..1714c84 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -122,6 +122,9 @@
 #define PMD_S2_RDONLY		(_AT(pmdval_t, 1) << 6)   /* HAP[2:1] */
 #define PMD_S2_RDWR		(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 
+#define PUD_S2_RDONLY		(_AT(pudval_t, 1) << 6)   /* HAP[2:1] */
+#define PUD_S2_RDWR		(_AT(pudval_t, 3) << 6)   /* HAP[2:1] */
+
 /*
  * Memory Attribute override for Stage-2 (MemAttr[3:0])
  */
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [PATCH v15 11/11] KVM: arm/arm64: Add support to dissolve huge PUD
@ 2014-12-15  7:28   ` Mario Smarduch
  0 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-15  7:28 UTC (permalink / raw)
  To: kvm-ia64

This patch adds the same support for huge PUDs as for PMDs. A huge PUD is
write protected during initial memory region write protection, and code to
dissolve a huge PUD is supported in user_mem_abort(). At this time this code
has not been tested; a test similar to the current ARMv8 page logging test is
in progress, limiting kernel memory and mapping 1 or 2 GB into the guest
address space on a 4k page/48-bit host. Some host kernel test code still
needs to be added to detect page faults to this region and side-step general
processing. As in the PMD case, all pages in the range are marked dirty when
the PUD entry is cleared.

Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/include/asm/kvm_mmu.h         |  8 +++++
 arch/arm/kvm/mmu.c                     | 64 ++++++++++++++++++++++++++++++++--
 arch/arm64/include/asm/kvm_mmu.h       |  9 +++++
 arch/arm64/include/asm/pgtable-hwdef.h |  3 ++
 4 files changed, 81 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index dda0046..703d04d 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -133,6 +133,14 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
 	return (pmd_val(*pmd) & L_PMD_S2_RDWR) == L_PMD_S2_RDONLY;
 }
 
+static inline void kvm_set_s2pud_readonly(pud_t *pud)
+{
+}
+
+static inline bool kvm_s2pud_readonly(pud_t *pud)
+{
+	return false;
+}
 
 /* Open coded p*d_addr_end that can deal with 64bit addresses */
 #define kvm_pgd_addr_end(addr, end)					\
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 59003df..35840fb 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -109,6 +109,55 @@ void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
 	}
 }
 
+/**
+  * stage2_find_pud() - find a PUD entry
+  * @kvm:	pointer to kvm structure.
+  * @addr:	IPA address
+  *
+  * Return address of PUD entry or NULL if not allocated.
+  */
+static pud_t *stage2_find_pud(struct kvm *kvm, phys_addr_t addr)
+{
+	pgd_t *pgd;
+
+	pgd = kvm->arch.pgd + pgd_index(addr);
+	if (pgd_none(*pgd))
+		return NULL;
+
+	return pud_offset(pgd, addr);
+}
+
+/**
+ * stage2_dissolve_pud() - clear and flush huge PUD entry
+ * @kvm:	pointer to kvm structure.
+ * @addr	IPA
+ *
+ * Function clears a PUD entry, flushes addr 1st and 2nd stage TLBs. Marks all
+ * pages in the range dirty.
+ */
+void stage2_dissolve_pud(struct kvm *kvm, phys_addr_t addr)
+{
+	pud_t *pud;
+	gfn_t gfn;
+	long i;
+
+	pud = stage2_find_pud(kvm, addr);
+	if (pud && !pud_none(*pud) && kvm_pud_huge(*pud)) {
+		pud_clear(pud);
+		kvm_tlb_flush_vmid_ipa(kvm, addr);
+		put_page(virt_to_page(pud));
+#ifdef CONFIG_SMP
+		gfn = (addr & PUD_MASK) >> PAGE_SHIFT;
+		/*
+		 * Mark all pages in PUD range dirty, in case other
+		 * CPUs are  writing to it.
+		 */
+		for (i = 0; i < PTRS_PER_PUD * PTRS_PER_PMD; i++)
+			mark_page_dirty(kvm, gfn + i);
+#endif
+	}
+}
+
 static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
 				  int min, int max)
 {
@@ -761,6 +810,13 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
 	unsigned long iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
 	unsigned long logging_active = flags & KVM_S2PTE_FLAG_LOGGING_ACTIVE;
 
+	/*
+	 * While dirty page logging - dissolve huge PUD, then continue on to
+	 * allocate page.
+	 */
+	if (logging_active)
+		stage2_dissolve_pud(kvm, addr);
+
 	/* Create stage-2 page table mapping - Levels 0 and 1 */
 	pmd = stage2_get_pmd(kvm, cache, addr);
 	if (!pmd) {
@@ -964,9 +1020,11 @@ static void  stage2_wp_puds(pgd_t *pgd, phys_addr_t addr, phys_addr_t end)
 	do {
 		next = kvm_pud_addr_end(addr, end);
 		if (!pud_none(*pud)) {
-			/* TODO:PUD not supported, revisit later if supported */
-			BUG_ON(kvm_pud_huge(*pud));
-			stage2_wp_pmds(pud, addr, next);
+			if (kvm_pud_huge(*pud)) {
+				if (!kvm_s2pud_readonly(pud))
+					kvm_set_s2pud_readonly(pud);
+			} else
+				stage2_wp_pmds(pud, addr, next);
 		}
 	} while (pud++, addr = next, addr != end);
 }
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index f925e40..3b692c5 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -137,6 +137,15 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
 	return (pmd_val(*pmd) & PMD_S2_RDWR) == PMD_S2_RDONLY;
 }
 
+static inline void kvm_set_s2pud_readonly(pud_t *pud)
+{
+	pud_val(*pud) = (pud_val(*pud) & ~PUD_S2_RDWR) | PUD_S2_RDONLY;
+}
+
+static inline bool kvm_s2pud_readonly(pud_t *pud)
+{
+	return (pud_val(*pud) & PUD_S2_RDWR) == PUD_S2_RDONLY;
+}
 
 #define kvm_pgd_addr_end(addr, end)	pgd_addr_end(addr, end)
 #define kvm_pud_addr_end(addr, end)	pud_addr_end(addr, end)
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 5f930cc..1714c84 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -122,6 +122,9 @@
 #define PMD_S2_RDONLY		(_AT(pmdval_t, 1) << 6)   /* HAP[2:1] */
 #define PMD_S2_RDWR		(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
 
+#define PUD_S2_RDONLY		(_AT(pudval_t, 1) << 6)   /* HAP[2:1] */
+#define PUD_S2_RDWR		(_AT(pudval_t, 3) << 6)   /* HAP[2:1] */
+
 /*
  * Memory Attribute override for Stage-2 (MemAttr[3:0])
  */
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [RESEND PATCH v15 07/11] KVM: arm: page logging 2nd stage fault handling
@ 2014-12-18  2:07   ` Mario Smarduch
  0 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-18  2:07 UTC (permalink / raw)
  To: christoffer.dall, marc.zyngier
  Cc: kvmarm, kvm, linux-arm-kernel, pbonzini, catalin.marinas, Mario Smarduch

This patch is a followup to the v15 patch series, with the following changes:
- When clearing/dissolving a huge PMD, mark the huge page range dirty, since
  the state of the whole range is unknown. After the huge page is dissolved,
  dirty page logging is done at page granularity.
- Correct a comment that was based on misinterpreted test results.

Retested, everything appears to work fine.
  

Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/kvm/mmu.c |   86 +++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 78 insertions(+), 8 deletions(-)

diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 73d506f..7e83a16 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -47,6 +47,18 @@ static phys_addr_t hyp_idmap_vector;
 #define kvm_pmd_huge(_x)	(pmd_huge(_x) || pmd_trans_huge(_x))
 #define kvm_pud_huge(_x)	pud_huge(_x)
 
+#define KVM_S2PTE_FLAG_IS_IOMAP		(1UL << 0)
+#define KVM_S2PTE_FLAG_LOGGING_ACTIVE	(1UL << 1)
+
+static bool kvm_get_logging_state(struct kvm_memory_slot *memslot)
+{
+#ifdef CONFIG_ARM
+	return !!memslot->dirty_bitmap;
+#else
+	return false;
+#endif
+}
+
 static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 {
 	/*
@@ -59,6 +71,37 @@ static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 		kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, kvm, ipa);
 }
 
+/**
+ * stage2_dissolve_pmd() - clear and flush huge PMD entry
+ * @kvm:	pointer to kvm structure.
+ * @addr	IPA
+ * @pmd	pmd pointer for IPA
+ *
+ * Function clears a PMD entry, flushes addr 1st and 2nd stage TLBs. Marks all
+ * pages in the range dirty.
+ */
+void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
+{
+	gfn_t gfn;
+	int i;
+
+	if (kvm_pmd_huge(*pmd)) {
+
+		pmd_clear(pmd);
+		kvm_tlb_flush_vmid_ipa(kvm, addr);
+		put_page(virt_to_page(pmd));
+
+		gfn = (addr & PMD_MASK) >> PAGE_SHIFT;
+
+		/*
+		 * The write is to a huge page, mark the whole page dirty
+		 * including this gfn.
+		 */
+		for (i = 0; i < PTRS_PER_PMD; i++)
+			mark_page_dirty(kvm, gfn + i);
+	}
+}
+
 static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
 				  int min, int max)
 {
@@ -703,10 +746,13 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
 }
 
 static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
-			  phys_addr_t addr, const pte_t *new_pte, bool iomap)
+			  phys_addr_t addr, const pte_t *new_pte,
+			  unsigned long flags)
 {
 	pmd_t *pmd;
 	pte_t *pte, old_pte;
+	unsigned long iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
+	unsigned long logging_active = flags & KVM_S2PTE_FLAG_LOGGING_ACTIVE;
 
 	/* Create stage-2 page table mapping - Levels 0 and 1 */
 	pmd = stage2_get_pmd(kvm, cache, addr);
@@ -718,6 +764,13 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
 		return 0;
 	}
 
+	/*
+	 * While dirty page logging - dissolve huge PMD, then continue on to
+	 * allocate page.
+	 */
+	if (logging_active)
+		stage2_dissolve_pmd(kvm, addr, pmd);
+
 	/* Create stage-2 page mappings - Level 2 */
 	if (pmd_none(*pmd)) {
 		if (!cache)
@@ -774,7 +827,8 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
 		if (ret)
 			goto out;
 		spin_lock(&kvm->mmu_lock);
-		ret = stage2_set_pte(kvm, &cache, addr, &pte, true);
+		ret = stage2_set_pte(kvm, &cache, addr, &pte,
+						KVM_S2PTE_FLAG_IS_IOMAP);
 		spin_unlock(&kvm->mmu_lock);
 		if (ret)
 			goto out;
@@ -1002,6 +1056,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	pfn_t pfn;
 	pgprot_t mem_type = PAGE_S2;
 	bool fault_ipa_uncached;
+	unsigned long logging_active = 0;
 
 	write_fault = kvm_is_write_fault(vcpu);
 	if (fault_status == FSC_PERM && !write_fault) {
@@ -1009,6 +1064,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		return -EFAULT;
 	}
 
+	if (kvm_get_logging_state(memslot) && write_fault)
+		logging_active = KVM_S2PTE_FLAG_LOGGING_ACTIVE;
+
 	/* Let's check if we will get back a huge page backed by hugetlbfs */
 	down_read(&current->mm->mmap_sem);
 	vma = find_vma_intersection(current->mm, hva, hva + 1);
@@ -1018,7 +1076,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		return -EFAULT;
 	}
 
-	if (is_vm_hugetlb_page(vma)) {
+	if (is_vm_hugetlb_page(vma) && !logging_active) {
 		hugetlb = true;
 		gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
 	} else {
@@ -1065,7 +1123,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	spin_lock(&kvm->mmu_lock);
 	if (mmu_notifier_retry(kvm, mmu_seq))
 		goto out_unlock;
-	if (!hugetlb && !force_pte)
+	if (!hugetlb && !force_pte && !logging_active)
 		hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
 
 	fault_ipa_uncached = memslot->flags & KVM_MEMSLOT_INCOHERENT;
@@ -1082,17 +1140,22 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
 	} else {
 		pte_t new_pte = pfn_pte(pfn, mem_type);
+		unsigned long flags = logging_active;
+
+		if (pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE))
+			flags |= KVM_S2PTE_FLAG_IS_IOMAP;
+
 		if (writable) {
 			kvm_set_s2pte_writable(&new_pte);
 			kvm_set_pfn_dirty(pfn);
 		}
 		coherent_cache_guest_page(vcpu, hva, PAGE_SIZE,
 					  fault_ipa_uncached);
-		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte,
-			pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE));
+		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, flags);
 	}
 
-
+	if (write_fault)
+		mark_page_dirty(kvm, gfn);
 out_unlock:
 	spin_unlock(&kvm->mmu_lock);
 	kvm_release_pfn_clean(pfn);
@@ -1242,7 +1305,14 @@ static void kvm_set_spte_handler(struct kvm *kvm, gpa_t gpa, void *data)
 {
 	pte_t *pte = (pte_t *)data;
 
-	stage2_set_pte(kvm, NULL, gpa, pte, false);
+	/*
+	 * We can always call stage2_set_pte with KVM_S2PTE_FLAG_LOGGING_ACTIVE
+	 * flag set because MMU notifiers will have unmapped a huge PMD before
+	 * calling ->change_pte() (which in turn calls kvm_set_spte_hva()) and
+	 * therefore stage2_set_pte() never needs to clear out a huge PMD
+	 * through this calling path.
+	 */
+	stage2_set_pte(kvm, NULL, gpa, pte, 0);
 }
 
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 110+ messages in thread
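
Replacing the bool iomap argument with a flags word is what lets
user_mem_abort() pass both the IOMAP and LOGGING_ACTIVE conditions through a
single parameter. A standalone sketch of that flag composition (plain C,
mirroring the bit values defined in the patch; not kernel code, and the
helper below is only illustrative):

#include <stdio.h>

#define S2PTE_FLAG_IS_IOMAP		(1UL << 0)
#define S2PTE_FLAG_LOGGING_ACTIVE	(1UL << 1)

/* Compose the flags word roughly the way user_mem_abort() does above. */
static unsigned long build_flags(int logging, int device_mem)
{
	unsigned long flags = 0;

	if (logging)
		flags |= S2PTE_FLAG_LOGGING_ACTIVE;
	if (device_mem)
		flags |= S2PTE_FLAG_IS_IOMAP;
	return flags;
}

int main(void)
{
	unsigned long flags = build_flags(1, 0);

	printf("logging_active=%lu iomap=%lu\n",
	       flags & S2PTE_FLAG_LOGGING_ACTIVE,
	       flags & S2PTE_FLAG_IS_IOMAP);	/* prints 2 and 0 */
	return 0;
}

Keeping the conditions in one unsigned long also leaves room for further
flags without changing the stage2_set_pte() signature again.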

* [RESEND PATCH v15 07/11] KVM: arm: page logging 2nd stage fault handling
@ 2014-12-18  2:07   ` Mario Smarduch
  0 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2014-12-18  2:07 UTC (permalink / raw)
  To: linux-arm-kernel

This patch is a followup to the v15 patch series, with the following changes:
- When clearing/dissolving a huge PMD, mark the whole huge page range dirty,
  since the state of the whole range is unknown. After the huge page is
  dissolved, dirty page logging is at page granularity.
- Correct comment due to misinterpreted test results

Retested, everything appears to work fine. 
  

Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
---
 arch/arm/kvm/mmu.c |   86 +++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 78 insertions(+), 8 deletions(-)

diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 73d506f..7e83a16 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -47,6 +47,18 @@ static phys_addr_t hyp_idmap_vector;
 #define kvm_pmd_huge(_x)	(pmd_huge(_x) || pmd_trans_huge(_x))
 #define kvm_pud_huge(_x)	pud_huge(_x)
 
+#define KVM_S2PTE_FLAG_IS_IOMAP		(1UL << 0)
+#define KVM_S2PTE_FLAG_LOGGING_ACTIVE	(1UL << 1)
+
+static bool kvm_get_logging_state(struct kvm_memory_slot *memslot)
+{
+#ifdef CONFIG_ARM
+	return !!memslot->dirty_bitmap;
+#else
+	return false;
+#endif
+}
+
 static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 {
 	/*
@@ -59,6 +71,37 @@ static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
 		kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, kvm, ipa);
 }
 
+/**
+ * stage2_dissolve_pmd() - clear and flush huge PMD entry
+ * @kvm:	pointer to kvm structure.
+ * @addr	IPA
+ * @pmd	pmd pointer for IPA
+ *
+ * Function clears a PMD entry, flushes addr 1st and 2nd stage TLBs. Marks all
+ * pages in the range dirty.
+ */
+void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
+{
+	gfn_t gfn;
+	int i;
+
+	if (kvm_pmd_huge(*pmd)) {
+
+		pmd_clear(pmd);
+		kvm_tlb_flush_vmid_ipa(kvm, addr);
+		put_page(virt_to_page(pmd));
+
+		gfn = (addr & PMD_MASK) >> PAGE_SHIFT;
+
+		/*
+		 * The write is to a huge page, mark the whole page dirty
+		 * including this gfn.
+		 */
+		for (i = 0; i < PTRS_PER_PMD; i++)
+			mark_page_dirty(kvm, gfn + i);
+	}
+}
+
 static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
 				  int min, int max)
 {
@@ -703,10 +746,13 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
 }
 
 static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
-			  phys_addr_t addr, const pte_t *new_pte, bool iomap)
+			  phys_addr_t addr, const pte_t *new_pte,
+			  unsigned long flags)
 {
 	pmd_t *pmd;
 	pte_t *pte, old_pte;
+	unsigned long iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
+	unsigned long logging_active = flags & KVM_S2PTE_FLAG_LOGGING_ACTIVE;
 
 	/* Create stage-2 page table mapping - Levels 0 and 1 */
 	pmd = stage2_get_pmd(kvm, cache, addr);
@@ -718,6 +764,13 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
 		return 0;
 	}
 
+	/*
+	 * While dirty page logging - dissolve huge PMD, then continue on to
+	 * allocate page.
+	 */
+	if (logging_active)
+		stage2_dissolve_pmd(kvm, addr, pmd);
+
 	/* Create stage-2 page mappings - Level 2 */
 	if (pmd_none(*pmd)) {
 		if (!cache)
@@ -774,7 +827,8 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
 		if (ret)
 			goto out;
 		spin_lock(&kvm->mmu_lock);
-		ret = stage2_set_pte(kvm, &cache, addr, &pte, true);
+		ret = stage2_set_pte(kvm, &cache, addr, &pte,
+						KVM_S2PTE_FLAG_IS_IOMAP);
 		spin_unlock(&kvm->mmu_lock);
 		if (ret)
 			goto out;
@@ -1002,6 +1056,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	pfn_t pfn;
 	pgprot_t mem_type = PAGE_S2;
 	bool fault_ipa_uncached;
+	unsigned long logging_active = 0;
 
 	write_fault = kvm_is_write_fault(vcpu);
 	if (fault_status == FSC_PERM && !write_fault) {
@@ -1009,6 +1064,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		return -EFAULT;
 	}
 
+	if (kvm_get_logging_state(memslot) && write_fault)
+		logging_active = KVM_S2PTE_FLAG_LOGGING_ACTIVE;
+
 	/* Let's check if we will get back a huge page backed by hugetlbfs */
 	down_read(&current->mm->mmap_sem);
 	vma = find_vma_intersection(current->mm, hva, hva + 1);
@@ -1018,7 +1076,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		return -EFAULT;
 	}
 
-	if (is_vm_hugetlb_page(vma)) {
+	if (is_vm_hugetlb_page(vma) && !logging_active) {
 		hugetlb = true;
 		gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
 	} else {
@@ -1065,7 +1123,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	spin_lock(&kvm->mmu_lock);
 	if (mmu_notifier_retry(kvm, mmu_seq))
 		goto out_unlock;
-	if (!hugetlb && !force_pte)
+	if (!hugetlb && !force_pte && !logging_active)
 		hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
 
 	fault_ipa_uncached = memslot->flags & KVM_MEMSLOT_INCOHERENT;
@@ -1082,17 +1140,22 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
 	} else {
 		pte_t new_pte = pfn_pte(pfn, mem_type);
+		unsigned long flags = logging_active;
+
+		if (pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE))
+			flags |= KVM_S2PTE_FLAG_IS_IOMAP;
+
 		if (writable) {
 			kvm_set_s2pte_writable(&new_pte);
 			kvm_set_pfn_dirty(pfn);
 		}
 		coherent_cache_guest_page(vcpu, hva, PAGE_SIZE,
 					  fault_ipa_uncached);
-		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte,
-			pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE));
+		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, flags);
 	}
 
-
+	if (write_fault)
+		mark_page_dirty(kvm, gfn);
 out_unlock:
 	spin_unlock(&kvm->mmu_lock);
 	kvm_release_pfn_clean(pfn);
@@ -1242,7 +1305,14 @@ static void kvm_set_spte_handler(struct kvm *kvm, gpa_t gpa, void *data)
 {
 	pte_t *pte = (pte_t *)data;
 
-	stage2_set_pte(kvm, NULL, gpa, pte, false);
+	/*
+	 * We can always call stage2_set_pte with KVM_S2PTE_FLAG_LOGGING_ACTIVE
+	 * flag set because MMU notifiers will have unmapped a huge PMD before
+	 * calling ->change_pte() (which in turn calls kvm_set_spte_hva()) and
+	 * therefore stage2_set_pte() never needs to clear out a huge PMD
+	 * through this calling path.
+	 */
+	stage2_set_pte(kvm, NULL, gpa, pte, 0);
 }
 
 
-- 
1.7.9.5

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* Re: [RESEND PATCH v15 07/11] KVM: arm: page logging 2nd stage fault handling
  2014-12-18  2:07   ` Mario Smarduch
@ 2015-01-07 12:38     ` Christoffer Dall
  -1 siblings, 0 replies; 110+ messages in thread
From: Christoffer Dall @ 2015-01-07 12:38 UTC (permalink / raw)
  To: Mario Smarduch
  Cc: marc.zyngier, kvmarm, kvm, linux-arm-kernel, pbonzini, catalin.marinas

On Wed, Dec 17, 2014 at 06:07:29PM -0800, Mario Smarduch wrote:
> This patch is a followup to the v15 patch series, with the following changes:
> - When clearing/dissolving a huge PMD, mark the whole huge page range dirty,
>   since the state of the whole range is unknown. After the huge page is
>   dissolved, dirty page logging is at page granularity.

What is the sequence of events where you could have dirtied another page
within the PMD range after the user initially requested dirty page
logging?

> - Correct comment due to misinterpreted test results
> 
> Retested, everything appears to work fine. 

you should resend this with the proper commit message, and changelogs
should go beneath the '---' separator.

>   
> 
> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
> ---
>  arch/arm/kvm/mmu.c |   86 +++++++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 78 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 73d506f..7e83a16 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -47,6 +47,18 @@ static phys_addr_t hyp_idmap_vector;
>  #define kvm_pmd_huge(_x)	(pmd_huge(_x) || pmd_trans_huge(_x))
>  #define kvm_pud_huge(_x)	pud_huge(_x)
>  
> +#define KVM_S2PTE_FLAG_IS_IOMAP		(1UL << 0)
> +#define KVM_S2PTE_FLAG_LOGGING_ACTIVE	(1UL << 1)
> +
> +static bool kvm_get_logging_state(struct kvm_memory_slot *memslot)

nit: if you respin I think this would be slightly more clear if it was
named something like memslot_is_logging() - I have a vague feeling I was
the one who suggested this name in the past but now it annoys me
slightly.
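
Something along these lines, perhaps (untested sketch, keeping the
existing CONFIG_ARM guard):

	static bool memslot_is_logging(struct kvm_memory_slot *memslot)
	{
	#ifdef CONFIG_ARM
		return !!memslot->dirty_bitmap;
	#else
		return false;
	#endif
	}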

> +{
> +#ifdef CONFIG_ARM
> +	return !!memslot->dirty_bitmap;
> +#else
> +	return false;
> +#endif
> +}
> +
>  static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>  {
>  	/*
> @@ -59,6 +71,37 @@ static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>  		kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, kvm, ipa);
>  }
>  
> +/**
> + * stage2_dissolve_pmd() - clear and flush huge PMD entry
> + * @kvm:	pointer to kvm structure.
> + * @addr	IPA
> + * @pmd	pmd pointer for IPA
> + *
> + * Function clears a PMD entry, flushes addr 1st and 2nd stage TLBs. Marks all
> + * pages in the range dirty.
> + */
> +void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)

this can be a static

> +{
> +	gfn_t gfn;
> +	int i;
> +
> +	if (kvm_pmd_huge(*pmd)) {

Can you invert this, so you return early if it's not a
kvm_pmd_huge(*pmd) ?
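
I.e. something like this (just a sketch of the inverted form):

	if (!kvm_pmd_huge(*pmd))
		return;

	pmd_clear(pmd);
	kvm_tlb_flush_vmid_ipa(kvm, addr);
	put_page(virt_to_page(pmd));

	gfn = (addr & PMD_MASK) >> PAGE_SHIFT;
	for (i = 0; i < PTRS_PER_PMD; i++)
		mark_page_dirty(kvm, gfn + i);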

> +
> +		pmd_clear(pmd);
> +		kvm_tlb_flush_vmid_ipa(kvm, addr);
> +		put_page(virt_to_page(pmd));
> +
> +		gfn = (addr & PMD_MASK) >> PAGE_SHIFT;
> +
> +		/*
> +		 * The write is to a huge page, mark the whole page dirty
> +		 * including this gfn.
> +		 */

we need the explanation I'm asking for in the commit message as part of
the comment here. Currently the comment explains what the code is quite
obviously doing, but not *why*....

> +		for (i = 0; i < PTRS_PER_PMD; i++)
> +			mark_page_dirty(kvm, gfn + i);
> +	}
> +}
> +
>  static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
>  				  int min, int max)
>  {
> @@ -703,10 +746,13 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
>  }
>  
>  static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
> -			  phys_addr_t addr, const pte_t *new_pte, bool iomap)
> +			  phys_addr_t addr, const pte_t *new_pte,
> +			  unsigned long flags)
>  {
>  	pmd_t *pmd;
>  	pte_t *pte, old_pte;
> +	unsigned long iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
> +	unsigned long logging_active = flags & KVM_S2PTE_FLAG_LOGGING_ACTIVE;

why not declare these as bool?
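
I.e. just (sketch; the implicit conversion to bool takes care of the
masking):

	bool iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
	bool logging_active = flags & KVM_S2PTE_FLAG_LOGGING_ACTIVE;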

>  
>  	/* Create stage-2 page table mapping - Levels 0 and 1 */
>  	pmd = stage2_get_pmd(kvm, cache, addr);
> @@ -718,6 +764,13 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
>  		return 0;
>  	}
>  
> +	/*
> +	 * While dirty page logging - dissolve huge PMD, then continue on to
> +	 * allocate page.
> +	 */
> +	if (logging_active)
> +		stage2_dissolve_pmd(kvm, addr, pmd);
> +
>  	/* Create stage-2 page mappings - Level 2 */
>  	if (pmd_none(*pmd)) {
>  		if (!cache)
> @@ -774,7 +827,8 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
>  		if (ret)
>  			goto out;
>  		spin_lock(&kvm->mmu_lock);
> -		ret = stage2_set_pte(kvm, &cache, addr, &pte, true);
> +		ret = stage2_set_pte(kvm, &cache, addr, &pte,
> +						KVM_S2PTE_FLAG_IS_IOMAP);
>  		spin_unlock(&kvm->mmu_lock);
>  		if (ret)
>  			goto out;
> @@ -1002,6 +1056,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	pfn_t pfn;
>  	pgprot_t mem_type = PAGE_S2;
>  	bool fault_ipa_uncached;
> +	unsigned long logging_active = 0;

can you change this to a bool and set the flag explicitly once you've
declared flags further down?  I think that's more clear.
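
Something like this, roughly (sketch only):

	bool logging_active = false;
	...
	if (kvm_get_logging_state(memslot) && write_fault)
		logging_active = true;
	...
	unsigned long flags = 0;

	if (logging_active)
		flags |= KVM_S2PTE_FLAG_LOGGING_ACTIVE;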

>  
>  	write_fault = kvm_is_write_fault(vcpu);
>  	if (fault_status == FSC_PERM && !write_fault) {
> @@ -1009,6 +1064,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  		return -EFAULT;
>  	}
>  
> +	if (kvm_get_logging_state(memslot) && write_fault)
> +		logging_active = KVM_S2PTE_FLAG_LOGGING_ACTIVE;
> +

so if the guest is faulting on a read of a huge page then we're going to
map it as a huge page, but not if it's faulting on a write.  Why
exactly?  A slight optimization?  Perhaps it's worth a comment.

>  	/* Let's check if we will get back a huge page backed by hugetlbfs */
>  	down_read(&current->mm->mmap_sem);
>  	vma = find_vma_intersection(current->mm, hva, hva + 1);
> @@ -1018,7 +1076,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  		return -EFAULT;
>  	}
>  
> -	if (is_vm_hugetlb_page(vma)) {
> +	if (is_vm_hugetlb_page(vma) && !logging_active) {

So I think this whole thing could look nicer if you set force_pte = true
together with setting logging_active above, and then change this check
to check && !force_pte here and get rid of the extra check of
!logging_active for the THP check below.

Sorry to be a bit pedantic, but this code is really critical.
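
Roughly (sketch, following the bool suggestion above and assuming
force_pte is already initialised to false further up in the function):

	if (kvm_get_logging_state(memslot) && write_fault) {
		logging_active = true;
		force_pte = true;
	}
	...
	if (is_vm_hugetlb_page(vma) && !force_pte) {
	...
	if (!hugetlb && !force_pte)
		hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);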

>  		hugetlb = true;
>  		gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
>  	} else {
> @@ -1065,7 +1123,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	spin_lock(&kvm->mmu_lock);
>  	if (mmu_notifier_retry(kvm, mmu_seq))
>  		goto out_unlock;
> -	if (!hugetlb && !force_pte)
> +	if (!hugetlb && !force_pte && !logging_active)
>  		hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
>  
>  	fault_ipa_uncached = memslot->flags & KVM_MEMSLOT_INCOHERENT;
> @@ -1082,17 +1140,22 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  		ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
>  	} else {
>  		pte_t new_pte = pfn_pte(pfn, mem_type);
> +		unsigned long flags = logging_active;
> +
> +		if (pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE))
> +			flags |= KVM_S2PTE_FLAG_IS_IOMAP;
> +
>  		if (writable) {
>  			kvm_set_s2pte_writable(&new_pte);
>  			kvm_set_pfn_dirty(pfn);
>  		}
>  		coherent_cache_guest_page(vcpu, hva, PAGE_SIZE,
>  					  fault_ipa_uncached);
> -		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte,
> -			pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE));
> +		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, flags);
>  	}
>  
> -
> +	if (write_fault)
> +		mark_page_dirty(kvm, gfn);
>  out_unlock:
>  	spin_unlock(&kvm->mmu_lock);
>  	kvm_release_pfn_clean(pfn);
> @@ -1242,7 +1305,14 @@ static void kvm_set_spte_handler(struct kvm *kvm, gpa_t gpa, void *data)
>  {
>  	pte_t *pte = (pte_t *)data;
>  
> -	stage2_set_pte(kvm, NULL, gpa, pte, false);
> +	/*
> +	 * We can always call stage2_set_pte with KVM_S2PTE_FLAG_LOGGING_ACTIVE
> +	 * flag set because MMU notifiers will have unmapped a huge PMD before

                ^^^ surely you mean 'clear', right?

> +	 * calling ->change_pte() (which in turn calls kvm_set_spte_hva()) and
> +	 * therefore stage2_set_pte() never needs to clear out a huge PMD
> +	 * through this calling path.
> +	 */
> +	stage2_set_pte(kvm, NULL, gpa, pte, 0);
>  }
>  
>  
> -- 
> 1.7.9.5
> 

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 110+ messages in thread

* [RESEND PATCH v15 07/11] KVM: arm: page logging 2nd stage fault handling
@ 2015-01-07 12:38     ` Christoffer Dall
  0 siblings, 0 replies; 110+ messages in thread
From: Christoffer Dall @ 2015-01-07 12:38 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 17, 2014 at 06:07:29PM -0800, Mario Smarduch wrote:
> This patch is a followup to the v15 patch series, with the following changes:
> - When clearing/dissolving a huge PMD, mark the whole huge page range dirty,
>   since the state of the whole range is unknown. After the huge page is
>   dissolved, dirty page logging is at page granularity.

What is the sequence of events where you could have dirtied another page
within the PMD range after the user initially requested dirty page
logging?

> - Correct comment due to misinterpreted test results
> 
> Retested, everything appears to work fine. 

you should resend this with the proper commit message, and changelogs
should go beneath the '---' separator.

>   
> 
> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
> ---
>  arch/arm/kvm/mmu.c |   86 +++++++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 78 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 73d506f..7e83a16 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -47,6 +47,18 @@ static phys_addr_t hyp_idmap_vector;
>  #define kvm_pmd_huge(_x)	(pmd_huge(_x) || pmd_trans_huge(_x))
>  #define kvm_pud_huge(_x)	pud_huge(_x)
>  
> +#define KVM_S2PTE_FLAG_IS_IOMAP		(1UL << 0)
> +#define KVM_S2PTE_FLAG_LOGGING_ACTIVE	(1UL << 1)
> +
> +static bool kvm_get_logging_state(struct kvm_memory_slot *memslot)

nit: if you respin I think this would be slightly more clear if it was
named something like memslot_is_logging() - I have a vague feeling I was
the one who suggested this name in the past but now it annoys me
slightly.

> +{
> +#ifdef CONFIG_ARM
> +	return !!memslot->dirty_bitmap;
> +#else
> +	return false;
> +#endif
> +}
> +
>  static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>  {
>  	/*
> @@ -59,6 +71,37 @@ static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>  		kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, kvm, ipa);
>  }
>  
> +/**
> + * stage2_dissolve_pmd() - clear and flush huge PMD entry
> + * @kvm:	pointer to kvm structure.
> + * @addr	IPA
> + * @pmd	pmd pointer for IPA
> + *
> + * Function clears a PMD entry, flushes addr 1st and 2nd stage TLBs. Marks all
> + * pages in the range dirty.
> + */
> +void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)

this can be a static

> +{
> +	gfn_t gfn;
> +	int i;
> +
> +	if (kvm_pmd_huge(*pmd)) {

Can you invert this, so you return early if it's not a
kvm_pmd_huge(*pmd) ?

> +
> +		pmd_clear(pmd);
> +		kvm_tlb_flush_vmid_ipa(kvm, addr);
> +		put_page(virt_to_page(pmd));
> +
> +		gfn = (addr & PMD_MASK) >> PAGE_SHIFT;
> +
> +		/*
> +		 * The write is to a huge page, mark the whole page dirty
> +		 * including this gfn.
> +		 */

we need the explanation I'm asking for in the commit message as part of
the comment here. Currently the comment explains what the code is quite
obviously doing, but not *why*....

> +		for (i = 0; i < PTRS_PER_PMD; i++)
> +			mark_page_dirty(kvm, gfn + i);
> +	}
> +}
> +
>  static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
>  				  int min, int max)
>  {
> @@ -703,10 +746,13 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
>  }
>  
>  static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
> -			  phys_addr_t addr, const pte_t *new_pte, bool iomap)
> +			  phys_addr_t addr, const pte_t *new_pte,
> +			  unsigned long flags)
>  {
>  	pmd_t *pmd;
>  	pte_t *pte, old_pte;
> +	unsigned long iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
> +	unsigned long logging_active = flags & KVM_S2PTE_FLAG_LOGGING_ACTIVE;

why not declare these as bool?

>  
>  	/* Create stage-2 page table mapping - Levels 0 and 1 */
>  	pmd = stage2_get_pmd(kvm, cache, addr);
> @@ -718,6 +764,13 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
>  		return 0;
>  	}
>  
> +	/*
> +	 * While dirty page logging - dissolve huge PMD, then continue on to
> +	 * allocate page.
> +	 */
> +	if (logging_active)
> +		stage2_dissolve_pmd(kvm, addr, pmd);
> +
>  	/* Create stage-2 page mappings - Level 2 */
>  	if (pmd_none(*pmd)) {
>  		if (!cache)
> @@ -774,7 +827,8 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
>  		if (ret)
>  			goto out;
>  		spin_lock(&kvm->mmu_lock);
> -		ret = stage2_set_pte(kvm, &cache, addr, &pte, true);
> +		ret = stage2_set_pte(kvm, &cache, addr, &pte,
> +						KVM_S2PTE_FLAG_IS_IOMAP);
>  		spin_unlock(&kvm->mmu_lock);
>  		if (ret)
>  			goto out;
> @@ -1002,6 +1056,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	pfn_t pfn;
>  	pgprot_t mem_type = PAGE_S2;
>  	bool fault_ipa_uncached;
> +	unsigned long logging_active = 0;

can you change this to a bool and set the flag explicitly once you've
declared flags further down?  I think that's more clear.

>  
>  	write_fault = kvm_is_write_fault(vcpu);
>  	if (fault_status == FSC_PERM && !write_fault) {
> @@ -1009,6 +1064,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  		return -EFAULT;
>  	}
>  
> +	if (kvm_get_logging_state(memslot) && write_fault)
> +		logging_active = KVM_S2PTE_FLAG_LOGGING_ACTIVE;
> +

so if the guest is faulting on a read of a huge page then we're going to
map it as a huge page, but not if it's faulting on a write.  Why
exactly?  A slight optimization?  Perhaps it's worth a comment.

>  	/* Let's check if we will get back a huge page backed by hugetlbfs */
>  	down_read(&current->mm->mmap_sem);
>  	vma = find_vma_intersection(current->mm, hva, hva + 1);
> @@ -1018,7 +1076,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  		return -EFAULT;
>  	}
>  
> -	if (is_vm_hugetlb_page(vma)) {
> +	if (is_vm_hugetlb_page(vma) && !logging_active) {

So I think this whole thing could look nicer if you set force_pte = true
together with setting logging_active above, and then change this check
to check && !force_pte here and get rid of the extra check of
!logging_active for the THP check below.

Sorry to be a bit pedantic, but this code is really critical.

>  		hugetlb = true;
>  		gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
>  	} else {
> @@ -1065,7 +1123,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  	spin_lock(&kvm->mmu_lock);
>  	if (mmu_notifier_retry(kvm, mmu_seq))
>  		goto out_unlock;
> -	if (!hugetlb && !force_pte)
> +	if (!hugetlb && !force_pte && !logging_active)
>  		hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
>  
>  	fault_ipa_uncached = memslot->flags & KVM_MEMSLOT_INCOHERENT;
> @@ -1082,17 +1140,22 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  		ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
>  	} else {
>  		pte_t new_pte = pfn_pte(pfn, mem_type);
> +		unsigned long flags = logging_active;
> +
> +		if (pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE))
> +			flags |= KVM_S2PTE_FLAG_IS_IOMAP;
> +
>  		if (writable) {
>  			kvm_set_s2pte_writable(&new_pte);
>  			kvm_set_pfn_dirty(pfn);
>  		}
>  		coherent_cache_guest_page(vcpu, hva, PAGE_SIZE,
>  					  fault_ipa_uncached);
> -		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte,
> -			pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE));
> +		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, flags);
>  	}
>  
> -
> +	if (write_fault)
> +		mark_page_dirty(kvm, gfn);
>  out_unlock:
>  	spin_unlock(&kvm->mmu_lock);
>  	kvm_release_pfn_clean(pfn);
> @@ -1242,7 +1305,14 @@ static void kvm_set_spte_handler(struct kvm *kvm, gpa_t gpa, void *data)
>  {
>  	pte_t *pte = (pte_t *)data;
>  
> -	stage2_set_pte(kvm, NULL, gpa, pte, false);
> +	/*
> +	 * We can always call stage2_set_pte with KVM_S2PTE_FLAG_LOGGING_ACTIVE
> +	 * flag set because MMU notifiers will have unmapped a huge PMD before

                ^^^ surely you mean 'clear', right?

> +	 * calling ->change_pte() (which in turn calls kvm_set_spte_hva()) and
> +	 * therefore stage2_set_pte() never needs to clear out a huge PMD
> +	 * through this calling path.
> +	 */
> +	stage2_set_pte(kvm, NULL, gpa, pte, 0);
>  }
>  
>  
> -- 
> 1.7.9.5
> 

Thanks,
-Christoffer

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [PATCH v15 10/11] KVM: arm/arm64: Enable Dirty Page logging for ARMv8
  2014-12-15  7:28   ` Mario Smarduch
  (?)
  (?)
@ 2015-01-07 12:47     ` Christoffer Dall
  -1 siblings, 0 replies; 110+ messages in thread
From: Christoffer Dall @ 2015-01-07 12:47 UTC (permalink / raw)
  To: Mario Smarduch
  Cc: pbonzini, james.hogan, agraf, marc.zyngier, cornelia.huck,
	borntraeger, catalin.marinas, kvmarm, kvm, kvm-ppc, kvm-ia64,
	linux-arm-kernel, steve.capper, peter.maydell

On Sun, Dec 14, 2014 at 11:28:07PM -0800, Mario Smarduch wrote:
> This patch enables ARMv8 ditry page logging support. Plugs ARMv8 into generic

                           dirty

> layer through Kconfig symbol, and drops earlier ARM64 constraints to enable
> logging at architecture layer.
> 
> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
> ---
>  arch/arm/include/asm/kvm_host.h | 12 ------------
>  arch/arm/kvm/arm.c              |  4 ----
>  arch/arm/kvm/mmu.c              | 19 +++++++++++--------
>  arch/arm64/kvm/Kconfig          |  2 ++
>  4 files changed, 13 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index b138431..088ea87 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -223,18 +223,6 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
>  	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
>  }
>  
> -/**
> - * kvm_flush_remote_tlbs() - flush all VM TLB entries
> - * @kvm:	pointer to kvm structure.
> - *
> - * Interface to HYP function to flush all VM TLB entries without address
> - * parameter.
> - */
> -static inline void kvm_flush_remote_tlbs(struct kvm *kvm)
> -{
> -	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
> -}
> -
>  static inline int kvm_arch_dev_ioctl_check_extension(long ext)
>  {
>  	return 0;
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 6e4290c..1b6577c 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -740,7 +740,6 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>   */
>  int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>  {
> -#ifdef CONFIG_ARM
>  	bool is_dirty = false;
>  	int r;
>  
> @@ -753,9 +752,6 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>  
>  	mutex_unlock(&kvm->slots_lock);
>  	return r;
> -#else /* arm64 */
> -	return -EINVAL;
> -#endif
>  }
>  
>  static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index dc763bb..59003df 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -52,11 +52,18 @@ static phys_addr_t hyp_idmap_vector;
>  
>  static bool kvm_get_logging_state(struct kvm_memory_slot *memslot)
>  {
> -#ifdef CONFIG_ARM
>  	return !!memslot->dirty_bitmap;
> -#else
> -	return false;
> -#endif
> +}
> +
> +/**
> + * kvm_flush_remote_tlbs() - flush all VM TLB entries for v7/8
> + * @kvm:	pointer to kvm structure.
> + *
> + * Interface to HYP function to flush all VM TLB entries
> + */
> +inline void kvm_flush_remote_tlbs(struct kvm *kvm)

did you intend for a non-static inline here?
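
If this is meant to be the arch implementation behind
HAVE_KVM_ARCH_TLB_FLUSH_ALL it presumably needs external linkage, so
probably just (sketch):

	void kvm_flush_remote_tlbs(struct kvm *kvm)
	{
		kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
	}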

> +{
> +	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
>  }
>  
>  static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
> @@ -895,7 +902,6 @@ static bool kvm_is_device_pfn(unsigned long pfn)
>  	return !pfn_valid(pfn);
>  }
>  
> -#ifdef CONFIG_ARM
>  /**
>   * stage2_wp_ptes - write protect PMD range
>   * @pmd:	pointer to pmd entry
> @@ -1040,7 +1046,6 @@ void kvm_arch_mmu_write_protect_pt_masked(struct kvm *kvm,
>  
>  	stage2_wp_range(kvm, start, end);
>  }
> -#endif
>  
>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  			  struct kvm_memory_slot *memslot, unsigned long hva,
> @@ -1445,7 +1450,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
>  				   const struct kvm_memory_slot *old,
>  				   enum kvm_mr_change change)
>  {
> -#ifdef CONFIG_ARM
>  	/*
>  	 * At this point memslot has been committed and there is an
>  	 * allocated dirty_bitmap[], dirty pages will be be tracked while the
> @@ -1453,7 +1457,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
>  	 */
>  	if (change != KVM_MR_DELETE && mem->flags & KVM_MEM_LOG_DIRTY_PAGES)
>  		kvm_mmu_wp_memory_region(kvm, mem->slot);
> -#endif
>  }
>  
>  int kvm_arch_prepare_memory_region(struct kvm *kvm,
> diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
> index 8ba85e9..3ce389b 100644
> --- a/arch/arm64/kvm/Kconfig
> +++ b/arch/arm64/kvm/Kconfig
> @@ -22,10 +22,12 @@ config KVM
>  	select PREEMPT_NOTIFIERS
>  	select ANON_INODES
>  	select HAVE_KVM_CPU_RELAX_INTERCEPT
> +	select HAVE_KVM_ARCH_TLB_FLUSH_ALL
>  	select KVM_MMIO
>  	select KVM_ARM_HOST
>  	select KVM_ARM_VGIC
>  	select KVM_ARM_TIMER
> +	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
>  	---help---
>  	  Support hosting virtualized guest machines.
>  
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 110+ messages in thread

* [PATCH v15 10/11] KVM: arm/arm64: Enable Dirty Page logging for ARMv8
@ 2015-01-07 12:47     ` Christoffer Dall
  0 siblings, 0 replies; 110+ messages in thread
From: Christoffer Dall @ 2015-01-07 12:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Dec 14, 2014 at 11:28:07PM -0800, Mario Smarduch wrote:
> This patch enables ARMv8 ditry page logging support. Plugs ARMv8 into generic

                           dirty

> layer through Kconfig symbol, and drops earlier ARM64 constraints to enable
> logging at architecture layer.
> 
> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
> ---
>  arch/arm/include/asm/kvm_host.h | 12 ------------
>  arch/arm/kvm/arm.c              |  4 ----
>  arch/arm/kvm/mmu.c              | 19 +++++++++++--------
>  arch/arm64/kvm/Kconfig          |  2 ++
>  4 files changed, 13 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index b138431..088ea87 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -223,18 +223,6 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
>  	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
>  }
>  
> -/**
> - * kvm_flush_remote_tlbs() - flush all VM TLB entries
> - * @kvm:	pointer to kvm structure.
> - *
> - * Interface to HYP function to flush all VM TLB entries without address
> - * parameter.
> - */
> -static inline void kvm_flush_remote_tlbs(struct kvm *kvm)
> -{
> -	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
> -}
> -
>  static inline int kvm_arch_dev_ioctl_check_extension(long ext)
>  {
>  	return 0;
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 6e4290c..1b6577c 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -740,7 +740,6 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>   */
>  int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>  {
> -#ifdef CONFIG_ARM
>  	bool is_dirty = false;
>  	int r;
>  
> @@ -753,9 +752,6 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>  
>  	mutex_unlock(&kvm->slots_lock);
>  	return r;
> -#else /* arm64 */
> -	return -EINVAL;
> -#endif
>  }
>  
>  static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index dc763bb..59003df 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -52,11 +52,18 @@ static phys_addr_t hyp_idmap_vector;
>  
>  static bool kvm_get_logging_state(struct kvm_memory_slot *memslot)
>  {
> -#ifdef CONFIG_ARM
>  	return !!memslot->dirty_bitmap;
> -#else
> -	return false;
> -#endif
> +}
> +
> +/**
> + * kvm_flush_remote_tlbs() - flush all VM TLB entries for v7/8
> + * @kvm:	pointer to kvm structure.
> + *
> + * Interface to HYP function to flush all VM TLB entries
> + */
> +inline void kvm_flush_remote_tlbs(struct kvm *kvm)

did you intend for a non-static inline here?

> +{
> +	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
>  }
>  
>  static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
> @@ -895,7 +902,6 @@ static bool kvm_is_device_pfn(unsigned long pfn)
>  	return !pfn_valid(pfn);
>  }
>  
> -#ifdef CONFIG_ARM
>  /**
>   * stage2_wp_ptes - write protect PMD range
>   * @pmd:	pointer to pmd entry
> @@ -1040,7 +1046,6 @@ void kvm_arch_mmu_write_protect_pt_masked(struct kvm *kvm,
>  
>  	stage2_wp_range(kvm, start, end);
>  }
> -#endif
>  
>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  			  struct kvm_memory_slot *memslot, unsigned long hva,
> @@ -1445,7 +1450,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
>  				   const struct kvm_memory_slot *old,
>  				   enum kvm_mr_change change)
>  {
> -#ifdef CONFIG_ARM
>  	/*
>  	 * At this point memslot has been committed and there is an
>  	 * allocated dirty_bitmap[], dirty pages will be be tracked while the
> @@ -1453,7 +1457,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
>  	 */
>  	if (change != KVM_MR_DELETE && mem->flags & KVM_MEM_LOG_DIRTY_PAGES)
>  		kvm_mmu_wp_memory_region(kvm, mem->slot);
> -#endif
>  }
>  
>  int kvm_arch_prepare_memory_region(struct kvm *kvm,
> diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
> index 8ba85e9..3ce389b 100644
> --- a/arch/arm64/kvm/Kconfig
> +++ b/arch/arm64/kvm/Kconfig
> @@ -22,10 +22,12 @@ config KVM
>  	select PREEMPT_NOTIFIERS
>  	select ANON_INODES
>  	select HAVE_KVM_CPU_RELAX_INTERCEPT
> +	select HAVE_KVM_ARCH_TLB_FLUSH_ALL
>  	select KVM_MMIO
>  	select KVM_ARM_HOST
>  	select KVM_ARM_VGIC
>  	select KVM_ARM_TIMER
> +	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
>  	---help---
>  	  Support hosting virtualized guest machines.
>  
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [PATCH v15 10/11] KVM: arm/arm64: Enable Dirty Page logging for ARMv8
@ 2015-01-07 12:47     ` Christoffer Dall
  0 siblings, 0 replies; 110+ messages in thread
From: Christoffer Dall @ 2015-01-07 12:47 UTC (permalink / raw)
  To: Mario Smarduch
  Cc: pbonzini, james.hogan, agraf, marc.zyngier, cornelia.huck,
	borntraeger, catalin.marinas, kvmarm, kvm, kvm-ppc, kvm-ia64,
	linux-arm-kernel, steve.capper, peter.maydell

On Sun, Dec 14, 2014 at 11:28:07PM -0800, Mario Smarduch wrote:
> This patch enables ARMv8 ditry page logging support. Plugs ARMv8 into generic

                           dirty

> layer through Kconfig symbol, and drops earlier ARM64 constraints to enable
> logging at architecture layer.
> 
> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
> ---
>  arch/arm/include/asm/kvm_host.h | 12 ------------
>  arch/arm/kvm/arm.c              |  4 ----
>  arch/arm/kvm/mmu.c              | 19 +++++++++++--------
>  arch/arm64/kvm/Kconfig          |  2 ++
>  4 files changed, 13 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index b138431..088ea87 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -223,18 +223,6 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
>  	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
>  }
>  
> -/**
> - * kvm_flush_remote_tlbs() - flush all VM TLB entries
> - * @kvm:	pointer to kvm structure.
> - *
> - * Interface to HYP function to flush all VM TLB entries without address
> - * parameter.
> - */
> -static inline void kvm_flush_remote_tlbs(struct kvm *kvm)
> -{
> -	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
> -}
> -
>  static inline int kvm_arch_dev_ioctl_check_extension(long ext)
>  {
>  	return 0;
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 6e4290c..1b6577c 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -740,7 +740,6 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>   */
>  int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>  {
> -#ifdef CONFIG_ARM
>  	bool is_dirty = false;
>  	int r;
>  
> @@ -753,9 +752,6 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>  
>  	mutex_unlock(&kvm->slots_lock);
>  	return r;
> -#else /* arm64 */
> -	return -EINVAL;
> -#endif
>  }
>  
>  static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index dc763bb..59003df 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -52,11 +52,18 @@ static phys_addr_t hyp_idmap_vector;
>  
>  static bool kvm_get_logging_state(struct kvm_memory_slot *memslot)
>  {
> -#ifdef CONFIG_ARM
>  	return !!memslot->dirty_bitmap;
> -#else
> -	return false;
> -#endif
> +}
> +
> +/**
> + * kvm_flush_remote_tlbs() - flush all VM TLB entries for v7/8
> + * @kvm:	pointer to kvm structure.
> + *
> + * Interface to HYP function to flush all VM TLB entries
> + */
> +inline void kvm_flush_remote_tlbs(struct kvm *kvm)

did you intend for a non-static inline here?

> +{
> +	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
>  }
>  
>  static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
> @@ -895,7 +902,6 @@ static bool kvm_is_device_pfn(unsigned long pfn)
>  	return !pfn_valid(pfn);
>  }
>  
> -#ifdef CONFIG_ARM
>  /**
>   * stage2_wp_ptes - write protect PMD range
>   * @pmd:	pointer to pmd entry
> @@ -1040,7 +1046,6 @@ void kvm_arch_mmu_write_protect_pt_masked(struct kvm *kvm,
>  
>  	stage2_wp_range(kvm, start, end);
>  }
> -#endif
>  
>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  			  struct kvm_memory_slot *memslot, unsigned long hva,
> @@ -1445,7 +1450,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
>  				   const struct kvm_memory_slot *old,
>  				   enum kvm_mr_change change)
>  {
> -#ifdef CONFIG_ARM
>  	/*
>  	 * At this point memslot has been committed and there is an
>  	 * allocated dirty_bitmap[], dirty pages will be be tracked while the
> @@ -1453,7 +1457,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
>  	 */
>  	if (change != KVM_MR_DELETE && mem->flags & KVM_MEM_LOG_DIRTY_PAGES)
>  		kvm_mmu_wp_memory_region(kvm, mem->slot);
> -#endif
>  }
>  
>  int kvm_arch_prepare_memory_region(struct kvm *kvm,
> diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
> index 8ba85e9..3ce389b 100644
> --- a/arch/arm64/kvm/Kconfig
> +++ b/arch/arm64/kvm/Kconfig
> @@ -22,10 +22,12 @@ config KVM
>  	select PREEMPT_NOTIFIERS
>  	select ANON_INODES
>  	select HAVE_KVM_CPU_RELAX_INTERCEPT
> +	select HAVE_KVM_ARCH_TLB_FLUSH_ALL
>  	select KVM_MMIO
>  	select KVM_ARM_HOST
>  	select KVM_ARM_VGIC
>  	select KVM_ARM_TIMER
> +	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
>  	---help---
>  	  Support hosting virtualized guest machines.
>  
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [PATCH v15 10/11] KVM: arm/arm64: Enable Dirty Page logging for ARMv8
@ 2015-01-07 12:47     ` Christoffer Dall
  0 siblings, 0 replies; 110+ messages in thread
From: Christoffer Dall @ 2015-01-07 12:47 UTC (permalink / raw)
  To: kvm-ia64

On Sun, Dec 14, 2014 at 11:28:07PM -0800, Mario Smarduch wrote:
> This patch enables ARMv8 ditry page logging support. Plugs ARMv8 into generic

                           dirty

> layer through Kconfig symbol, and drops earlier ARM64 constraints to enable
> logging at architecture layer.
> 
> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
> ---
>  arch/arm/include/asm/kvm_host.h | 12 ------------
>  arch/arm/kvm/arm.c              |  4 ----
>  arch/arm/kvm/mmu.c              | 19 +++++++++++--------
>  arch/arm64/kvm/Kconfig          |  2 ++
>  4 files changed, 13 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index b138431..088ea87 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -223,18 +223,6 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
>  	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
>  }
>  
> -/**
> - * kvm_flush_remote_tlbs() - flush all VM TLB entries
> - * @kvm:	pointer to kvm structure.
> - *
> - * Interface to HYP function to flush all VM TLB entries without address
> - * parameter.
> - */
> -static inline void kvm_flush_remote_tlbs(struct kvm *kvm)
> -{
> -	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
> -}
> -
>  static inline int kvm_arch_dev_ioctl_check_extension(long ext)
>  {
>  	return 0;
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index 6e4290c..1b6577c 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -740,7 +740,6 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>   */
>  int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>  {
> -#ifdef CONFIG_ARM
>  	bool is_dirty = false;
>  	int r;
>  
> @@ -753,9 +752,6 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>  
>  	mutex_unlock(&kvm->slots_lock);
>  	return r;
> -#else /* arm64 */
> -	return -EINVAL;
> -#endif
>  }
>  
>  static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index dc763bb..59003df 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -52,11 +52,18 @@ static phys_addr_t hyp_idmap_vector;
>  
>  static bool kvm_get_logging_state(struct kvm_memory_slot *memslot)
>  {
> -#ifdef CONFIG_ARM
>  	return !!memslot->dirty_bitmap;
> -#else
> -	return false;
> -#endif
> +}
> +
> +/**
> + * kvm_flush_remote_tlbs() - flush all VM TLB entries for v7/8
> + * @kvm:	pointer to kvm structure.
> + *
> + * Interface to HYP function to flush all VM TLB entries
> + */
> +inline void kvm_flush_remote_tlbs(struct kvm *kvm)

did you intend for a non-static inline here?

> +{
> +	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
>  }
>  
>  static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
> @@ -895,7 +902,6 @@ static bool kvm_is_device_pfn(unsigned long pfn)
>  	return !pfn_valid(pfn);
>  }
>  
> -#ifdef CONFIG_ARM
>  /**
>   * stage2_wp_ptes - write protect PMD range
>   * @pmd:	pointer to pmd entry
> @@ -1040,7 +1046,6 @@ void kvm_arch_mmu_write_protect_pt_masked(struct kvm *kvm,
>  
>  	stage2_wp_range(kvm, start, end);
>  }
> -#endif
>  
>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  			  struct kvm_memory_slot *memslot, unsigned long hva,
> @@ -1445,7 +1450,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
>  				   const struct kvm_memory_slot *old,
>  				   enum kvm_mr_change change)
>  {
> -#ifdef CONFIG_ARM
>  	/*
>  	 * At this point memslot has been committed and there is an
>  	 * allocated dirty_bitmap[], dirty pages will be be tracked while the
> @@ -1453,7 +1457,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
>  	 */
>  	if (change != KVM_MR_DELETE && mem->flags & KVM_MEM_LOG_DIRTY_PAGES)
>  		kvm_mmu_wp_memory_region(kvm, mem->slot);
> -#endif
>  }
>  
>  int kvm_arch_prepare_memory_region(struct kvm *kvm,
> diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
> index 8ba85e9..3ce389b 100644
> --- a/arch/arm64/kvm/Kconfig
> +++ b/arch/arm64/kvm/Kconfig
> @@ -22,10 +22,12 @@ config KVM
>  	select PREEMPT_NOTIFIERS
>  	select ANON_INODES
>  	select HAVE_KVM_CPU_RELAX_INTERCEPT
> +	select HAVE_KVM_ARCH_TLB_FLUSH_ALL
>  	select KVM_MMIO
>  	select KVM_ARM_HOST
>  	select KVM_ARM_VGIC
>  	select KVM_ARM_TIMER
> +	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
>  	---help---
>  	  Support hosting virtualized guest machines.
>  
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [PATCH v15 11/11] KVM: arm/arm64: Add support to dissolve huge PUD
  2014-12-15  7:28   ` Mario Smarduch
  (?)
  (?)
@ 2015-01-07 13:05     ` Christoffer Dall
  -1 siblings, 0 replies; 110+ messages in thread
From: Christoffer Dall @ 2015-01-07 13:05 UTC (permalink / raw)
  To: Mario Smarduch
  Cc: pbonzini, james.hogan, agraf, marc.zyngier, cornelia.huck,
	borntraeger, catalin.marinas, kvmarm, kvm, kvm-ppc, kvm-ia64,
	linux-arm-kernel, steve.capper, peter.maydell

On Sun, Dec 14, 2014 at 11:28:08PM -0800, Mario Smarduch wrote:
> This patch adds the same support for PUD huge page as for PMD. Huge PUD is 
> write protected for initial memory region write protection. Code to dissolve 
> huge PUD is supported in user_mem_abort(). At this time this code has not been
> tested, but an approach similar to the current ARMv8 page logging test is in the
> works: limiting kernel memory, mapping 1 or 2GB into the guest address space on
> a 4k page/48-bit host, and adding some host kernel test code to detect page
> faults to this region and side-step general processing. Also, similar to the
> PMD case, all pages in the range are marked dirty when the PUD entry is cleared.

the note about this code being untested shouldn't be part of the commit
message; it should go after the '---' separator or in the cover letter, I think.

> 
> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
> ---
>  arch/arm/include/asm/kvm_mmu.h         |  8 +++++
>  arch/arm/kvm/mmu.c                     | 64 ++++++++++++++++++++++++++++++++--
>  arch/arm64/include/asm/kvm_mmu.h       |  9 +++++
>  arch/arm64/include/asm/pgtable-hwdef.h |  3 ++
>  4 files changed, 81 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index dda0046..703d04d 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -133,6 +133,14 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
>  	return (pmd_val(*pmd) & L_PMD_S2_RDWR) == L_PMD_S2_RDONLY;
>  }
>  
> +static inline void kvm_set_s2pud_readonly(pud_t *pud)
> +{
> +}
> +
> +static inline bool kvm_s2pud_readonly(pud_t *pud)
> +{
> +	return false;
> +}
>  
>  /* Open coded p*d_addr_end that can deal with 64bit addresses */
>  #define kvm_pgd_addr_end(addr, end)					\
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 59003df..35840fb 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -109,6 +109,55 @@ void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
>  	}
>  }
>  
> +/**
> +  * stage2_find_pud() - find a PUD entry
> +  * @kvm:	pointer to kvm structure.
> +  * @addr:	IPA address
> +  *
> +  * Return address of PUD entry or NULL if not allocated.
> +  */
> +static pud_t *stage2_find_pud(struct kvm *kvm, phys_addr_t addr)

why can't you reuse stage2_get_pud here?
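
E.g. (sketch; assuming the NULL-cache case of stage2_get_pud() just
does the lookup without allocating anything):

	pud = stage2_get_pud(kvm, NULL, addr);
	if (!pud || pud_none(*pud) || !kvm_pud_huge(*pud))
		return;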

> +{
> +	pgd_t *pgd;
> +
> +	pgd = kvm->arch.pgd + pgd_index(addr);
> +	if (pgd_none(*pgd))
> +		return NULL;
> +
> +	return pud_offset(pgd, addr);
> +}
> +
> +/**
> + * stage2_dissolve_pud() - clear and flush huge PUD entry
> + * @kvm:	pointer to kvm structure.
> + * @addr	IPA
> + *
> + * Function clears a PUD entry, flushes addr 1st and 2nd stage TLBs. Marks all
> + * pages in the range dirty.
> + */
> +void stage2_dissolve_pud(struct kvm *kvm, phys_addr_t addr)
> +{
> +	pud_t *pud;
> +	gfn_t gfn;
> +	long i;
> +
> +	pud = stage2_find_pud(kvm, addr);
> +	if (pud && !pud_none(*pud) && kvm_pud_huge(*pud)) {

I'm just thinking here, why do we need to check if we get a valid pud
back here, but we don't need the equivalent check in dissolve_pmd from
patch 7?

I think the rationale is that it should never happen because we never
call these functions with the logging and iomap flags at the same
time...

> +		pud_clear(pud);
> +		kvm_tlb_flush_vmid_ipa(kvm, addr);
> +		put_page(virt_to_page(pud));
> +#ifdef CONFIG_SMP
> +		gfn = (addr & PUD_MASK) >> PAGE_SHIFT;
> +		/*
> +		 * Mark all pages in PUD range dirty, in case other
> +		 * CPUs are  writing to it.
> +		 */
> +		for (i = 0; i < PTRS_PER_PUD * PTRS_PER_PMD; i++)
> +			mark_page_dirty(kvm, gfn + i);
> +#endif
> +	}
> +}
> +
>  static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
>  				  int min, int max)
>  {
> @@ -761,6 +810,13 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
>  	unsigned long iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
>  	unsigned long logging_active = flags & KVM_S2PTE_FLAG_LOGGING_ACTIVE;
>  
> +	/*
> +	 * While dirty page logging - dissolve huge PUD, then continue on to
> +	 * allocate page.
> +	 */
> +	if (logging_active)
> +		stage2_dissolve_pud(kvm, addr);
> +

I know I asked for this, but what's the purpose really when we never set
a huge stage-2 pud, shouldn't we just WARN/BUG if we encounter one?

Marc, you may have some thoughts here...

>  	/* Create stage-2 page table mapping - Levels 0 and 1 */
>  	pmd = stage2_get_pmd(kvm, cache, addr);
>  	if (!pmd) {
> @@ -964,9 +1020,11 @@ static void  stage2_wp_puds(pgd_t *pgd, phys_addr_t addr, phys_addr_t end)
>  	do {
>  		next = kvm_pud_addr_end(addr, end);
>  		if (!pud_none(*pud)) {
> -			/* TODO:PUD not supported, revisit later if supported */
> -			BUG_ON(kvm_pud_huge(*pud));
> -			stage2_wp_pmds(pud, addr, next);
> +			if (kvm_pud_huge(*pud)) {
> +				if (!kvm_s2pud_readonly(pud))
> +					kvm_set_s2pud_readonly(pud);

I guess the same question that I had above applies here as well (sorry
for making you go rounds on this one).

> +			} else
> +				stage2_wp_pmds(pud, addr, next);
>  		}
>  	} while (pud++, addr = next, addr != end);
>  }
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index f925e40..3b692c5 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -137,6 +137,15 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
>  	return (pmd_val(*pmd) & PMD_S2_RDWR) == PMD_S2_RDONLY;
>  }
>  
> +static inline void kvm_set_s2pud_readonly(pud_t *pud)
> +{
> +	pud_val(*pud) = (pud_val(*pud) & ~PUD_S2_RDWR) | PUD_S2_RDONLY;
> +}
> +
> +static inline bool kvm_s2pud_readonly(pud_t *pud)
> +{
> +	return (pud_val(*pud) & PUD_S2_RDWR) == PUD_S2_RDONLY;
> +}
>  
>  #define kvm_pgd_addr_end(addr, end)	pgd_addr_end(addr, end)
>  #define kvm_pud_addr_end(addr, end)	pud_addr_end(addr, end)
> diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
> index 5f930cc..1714c84 100644
> --- a/arch/arm64/include/asm/pgtable-hwdef.h
> +++ b/arch/arm64/include/asm/pgtable-hwdef.h
> @@ -122,6 +122,9 @@
>  #define PMD_S2_RDONLY		(_AT(pmdval_t, 1) << 6)   /* HAP[2:1] */
>  #define PMD_S2_RDWR		(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
>  
> +#define PUD_S2_RDONLY		(_AT(pudval_t, 1) << 6)   /* HAP[2:1] */
> +#define PUD_S2_RDWR		(_AT(pudval_t, 3) << 6)   /* HAP[2:1] */
> +
>  /*
>   * Memory Attribute override for Stage-2 (MemAttr[3:0])
>   */
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 110+ messages in thread

* [PATCH v15 11/11] KVM: arm/arm64: Add support to dissolve huge PUD
@ 2015-01-07 13:05     ` Christoffer Dall
  0 siblings, 0 replies; 110+ messages in thread
From: Christoffer Dall @ 2015-01-07 13:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Dec 14, 2014 at 11:28:08PM -0800, Mario Smarduch wrote:
> This patch adds the same support for PUD huge page as for PMD. Huge PUD is 
> write protected for initial memory region write protection. Code to dissolve 
> huge PUD is supported in user_mem_abort(). At this time this code has not been
> tested, but an approach similar to the current ARMv8 page logging test is in the
> works: limiting kernel memory, mapping 1 or 2GB into the guest address space on
> a 4k page/48-bit host, and adding some host kernel test code to detect page
> faults to this region and side-step general processing. Also, similar to the
> PMD case, all pages in the range are marked dirty when the PUD entry is cleared.

the note about this code being untested shouldn't be part of the commit
message; it should go after the '---' separator or in the cover letter, I think.

> 
> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
> ---
>  arch/arm/include/asm/kvm_mmu.h         |  8 +++++
>  arch/arm/kvm/mmu.c                     | 64 ++++++++++++++++++++++++++++++++--
>  arch/arm64/include/asm/kvm_mmu.h       |  9 +++++
>  arch/arm64/include/asm/pgtable-hwdef.h |  3 ++
>  4 files changed, 81 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index dda0046..703d04d 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -133,6 +133,14 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
>  	return (pmd_val(*pmd) & L_PMD_S2_RDWR) == L_PMD_S2_RDONLY;
>  }
>  
> +static inline void kvm_set_s2pud_readonly(pud_t *pud)
> +{
> +}
> +
> +static inline bool kvm_s2pud_readonly(pud_t *pud)
> +{
> +	return false;
> +}
>  
>  /* Open coded p*d_addr_end that can deal with 64bit addresses */
>  #define kvm_pgd_addr_end(addr, end)					\
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 59003df..35840fb 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -109,6 +109,55 @@ void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
>  	}
>  }
>  
> +/**
> +  * stage2_find_pud() - find a PUD entry
> +  * @kvm:	pointer to kvm structure.
> +  * @addr:	IPA address
> +  *
> +  * Return address of PUD entry or NULL if not allocated.
> +  */
> +static pud_t *stage2_find_pud(struct kvm *kvm, phys_addr_t addr)

why can't you reuse stage2_get_pud here?

> +{
> +	pgd_t *pgd;
> +
> +	pgd = kvm->arch.pgd + pgd_index(addr);
> +	if (pgd_none(*pgd))
> +		return NULL;
> +
> +	return pud_offset(pgd, addr);
> +}
> +
> +/**
> + * stage2_dissolve_pud() - clear and flush huge PUD entry
> + * @kvm:	pointer to kvm structure.
> + * @addr	IPA
> + *
> + * Function clears a PUD entry, flushes addr 1st and 2nd stage TLBs. Marks all
> + * pages in the range dirty.
> + */
> +void stage2_dissolve_pud(struct kvm *kvm, phys_addr_t addr)
> +{
> +	pud_t *pud;
> +	gfn_t gfn;
> +	long i;
> +
> +	pud = stage2_find_pud(kvm, addr);
> +	if (pud && !pud_none(*pud) && kvm_pud_huge(*pud)) {

I'm just thinking here, why do we need to check if we get a valid pud
back here, but we don't need the equivalent check in dissolve_pmd from
patch 7?

I think the rationale is that it should never happen because we never
call these functions with the logging and iomap flags at the same
time...

> +		pud_clear(pud);
> +		kvm_tlb_flush_vmid_ipa(kvm, addr);
> +		put_page(virt_to_page(pud));
> +#ifdef CONFIG_SMP
> +		gfn = (addr & PUD_MASK) >> PAGE_SHIFT;
> +		/*
> +		 * Mark all pages in PUD range dirty, in case other
> +		 * CPUs are  writing to it.
> +		 */
> +		for (i = 0; i < PTRS_PER_PUD * PTRS_PER_PMD; i++)
> +			mark_page_dirty(kvm, gfn + i);
> +#endif
> +	}
> +}
> +
>  static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
>  				  int min, int max)
>  {
> @@ -761,6 +810,13 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
>  	unsigned long iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
>  	unsigned long logging_active = flags & KVM_S2PTE_FLAG_LOGGING_ACTIVE;
>  
> +	/*
> +	 * While dirty page logging - dissolve huge PUD, then continue on to
> +	 * allocate page.
> +	 */
> +	if (logging_active)
> +		stage2_dissolve_pud(kvm, addr);
> +

I know I asked for this, but what's the purpose really when we never set
a huge stage-2 pud, shouldn't we just WARN/BUG if we encounter one?

Marc, you may have some thoughts here...

>  	/* Create stage-2 page table mapping - Levels 0 and 1 */
>  	pmd = stage2_get_pmd(kvm, cache, addr);
>  	if (!pmd) {
> @@ -964,9 +1020,11 @@ static void  stage2_wp_puds(pgd_t *pgd, phys_addr_t addr, phys_addr_t end)
>  	do {
>  		next = kvm_pud_addr_end(addr, end);
>  		if (!pud_none(*pud)) {
> -			/* TODO:PUD not supported, revisit later if supported */
> -			BUG_ON(kvm_pud_huge(*pud));
> -			stage2_wp_pmds(pud, addr, next);
> +			if (kvm_pud_huge(*pud)) {
> +				if (!kvm_s2pud_readonly(pud))
> +					kvm_set_s2pud_readonly(pud);

I guess the same question that I had above applies here as well (sorry
for making you go rounds on this one).

> +			} else
> +				stage2_wp_pmds(pud, addr, next);
>  		}
>  	} while (pud++, addr = next, addr != end);
>  }
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index f925e40..3b692c5 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -137,6 +137,15 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
>  	return (pmd_val(*pmd) & PMD_S2_RDWR) == PMD_S2_RDONLY;
>  }
>  
> +static inline void kvm_set_s2pud_readonly(pud_t *pud)
> +{
> +	pud_val(*pud) = (pud_val(*pud) & ~PUD_S2_RDWR) | PUD_S2_RDONLY;
> +}
> +
> +static inline bool kvm_s2pud_readonly(pud_t *pud)
> +{
> +	return (pud_val(*pud) & PUD_S2_RDWR) == PUD_S2_RDONLY;
> +}
>  
>  #define kvm_pgd_addr_end(addr, end)	pgd_addr_end(addr, end)
>  #define kvm_pud_addr_end(addr, end)	pud_addr_end(addr, end)
> diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
> index 5f930cc..1714c84 100644
> --- a/arch/arm64/include/asm/pgtable-hwdef.h
> +++ b/arch/arm64/include/asm/pgtable-hwdef.h
> @@ -122,6 +122,9 @@
>  #define PMD_S2_RDONLY		(_AT(pmdval_t, 1) << 6)   /* HAP[2:1] */
>  #define PMD_S2_RDWR		(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
>  
> +#define PUD_S2_RDONLY		(_AT(pudval_t, 1) << 6)   /* HAP[2:1] */
> +#define PUD_S2_RDWR		(_AT(pudval_t, 3) << 6)   /* HAP[2:1] */
> +
>  /*
>   * Memory Attribute override for Stage-2 (MemAttr[3:0])
>   */
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [PATCH v15 11/11] KVM: arm/arm64: Add support to dissolve huge PUD
@ 2015-01-07 13:05     ` Christoffer Dall
  0 siblings, 0 replies; 110+ messages in thread
From: Christoffer Dall @ 2015-01-07 13:05 UTC (permalink / raw)
  To: Mario Smarduch
  Cc: pbonzini, james.hogan, agraf, marc.zyngier, cornelia.huck,
	borntraeger, catalin.marinas, kvmarm, kvm, kvm-ppc, kvm-ia64,
	linux-arm-kernel, steve.capper, peter.maydell

On Sun, Dec 14, 2014 at 11:28:08PM -0800, Mario Smarduch wrote:
> This patch adds the same support for PUD huge page as for PMD. Huge PUD is 
> write protected for initial memory region write protection. Code to dissolve 
> huge PUD is supported in user_mem_abort(). At this time this code has not been 
> tested, but similar approach to current ARMv8 page logging test is in work,
> limiting kernel memory and mapping in 1 or 2GB into Guest address space on a 
> 4k page/48 bit host, some host kernel test code needs to be added to detect
> page fault to this region and side step general processing. Also similar to 
> PMD case all pages in range are marked dirty when PUD entry is cleared.

the note about this code being untested shouldn't be part of the commit
message but after the '---' separator or in the cover letter I think.
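
For reference, such a note would sit between the Signed-off-by tag and
the diffstat, e.g. (layout illustration only, wording made up):

	Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
	---
	Note: the huge PUD path is untested so far; see the cover letter
	for the planned test setup.

	 arch/arm/include/asm/kvm_mmu.h | 8 +++++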

> 
> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
> ---
>  arch/arm/include/asm/kvm_mmu.h         |  8 +++++
>  arch/arm/kvm/mmu.c                     | 64 ++++++++++++++++++++++++++++++++--
>  arch/arm64/include/asm/kvm_mmu.h       |  9 +++++
>  arch/arm64/include/asm/pgtable-hwdef.h |  3 ++
>  4 files changed, 81 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index dda0046..703d04d 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -133,6 +133,14 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
>  	return (pmd_val(*pmd) & L_PMD_S2_RDWR) == L_PMD_S2_RDONLY;
>  }
>  
> +static inline void kvm_set_s2pud_readonly(pud_t *pud)
> +{
> +}
> +
> +static inline bool kvm_s2pud_readonly(pud_t *pud)
> +{
> +	return false;
> +}
>  
>  /* Open coded p*d_addr_end that can deal with 64bit addresses */
>  #define kvm_pgd_addr_end(addr, end)					\
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index 59003df..35840fb 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -109,6 +109,55 @@ void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
>  	}
>  }
>  
> +/**
> +  * stage2_find_pud() - find a PUD entry
> +  * @kvm:	pointer to kvm structure.
> +  * @addr:	IPA address
> +  *
> +  * Return address of PUD entry or NULL if not allocated.
> +  */
> +static pud_t *stage2_find_pud(struct kvm *kvm, phys_addr_t addr)

why can't you reuse stage2_get_pud here?

> +{
> +	pgd_t *pgd;
> +
> +	pgd = kvm->arch.pgd + pgd_index(addr);
> +	if (pgd_none(*pgd))
> +		return NULL;
> +
> +	return pud_offset(pgd, addr);
> +}
> +
> +/**
> + * stage2_dissolve_pud() - clear and flush huge PUD entry
> + * @kvm:	pointer to kvm structure.
> + * @addr	IPA
> + *
> + * Function clears a PUD entry, flushes addr 1st and 2nd stage TLBs. Marks all
> + * pages in the range dirty.
> + */
> +void stage2_dissolve_pud(struct kvm *kvm, phys_addr_t addr)
> +{
> +	pud_t *pud;
> +	gfn_t gfn;
> +	long i;
> +
> +	pud = stage2_find_pud(kvm, addr);
> +	if (pud && !pud_none(*pud) && kvm_pud_huge(*pud)) {

I'm just thinking here, why do we need to check if we get a valid pud
back here, but we don't need the equivalent check in dissolve_pmd from
patch 7?

I think the rationale is that it should never happen because we never
call these functions with the logging and iomap flags at the same
time...

> +		pud_clear(pud);
> +		kvm_tlb_flush_vmid_ipa(kvm, addr);
> +		put_page(virt_to_page(pud));
> +#ifdef CONFIG_SMP
> +		gfn = (addr & PUD_MASK) >> PAGE_SHIFT;
> +		/*
> +		 * Mark all pages in PUD range dirty, in case other
> +		 * CPUs are  writing to it.
> +		 */
> +		for (i = 0; i < PTRS_PER_PUD * PTRS_PER_PMD; i++)
> +			mark_page_dirty(kvm, gfn + i);
> +#endif
> +	}
> +}
> +
>  static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
>  				  int min, int max)
>  {
> @@ -761,6 +810,13 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
>  	unsigned long iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
>  	unsigned long logging_active = flags & KVM_S2PTE_FLAG_LOGGING_ACTIVE;
>  
> +	/*
> +	 * While dirty page logging - dissolve huge PUD, then continue on to
> +	 * allocate page.
> +	 */
> +	if (logging_active)
> +		stage2_dissolve_pud(kvm, addr);
> +

I know I asked for this, but what's the purpose really when we never set
a huge stage-2 pud, shouldn't we just WARN/BUG if we encounter one?

Marc, you may have some thoughts here...

>  	/* Create stage-2 page table mapping - Levels 0 and 1 */
>  	pmd = stage2_get_pmd(kvm, cache, addr);
>  	if (!pmd) {
> @@ -964,9 +1020,11 @@ static void  stage2_wp_puds(pgd_t *pgd, phys_addr_t addr, phys_addr_t end)
>  	do {
>  		next = kvm_pud_addr_end(addr, end);
>  		if (!pud_none(*pud)) {
> -			/* TODO:PUD not supported, revisit later if supported */
> -			BUG_ON(kvm_pud_huge(*pud));
> -			stage2_wp_pmds(pud, addr, next);
> +			if (kvm_pud_huge(*pud)) {
> +				if (!kvm_s2pud_readonly(pud))
> +					kvm_set_s2pud_readonly(pud);

I guess the same question that I had above applies here as well (sorry
for making you go rounds on this one).

> +			} else
> +				stage2_wp_pmds(pud, addr, next);
>  		}
>  	} while (pud++, addr = next, addr != end);
>  }
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index f925e40..3b692c5 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -137,6 +137,15 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
>  	return (pmd_val(*pmd) & PMD_S2_RDWR) == PMD_S2_RDONLY;
>  }
>  
> +static inline void kvm_set_s2pud_readonly(pud_t *pud)
> +{
> +	pud_val(*pud) = (pud_val(*pud) & ~PUD_S2_RDWR) | PUD_S2_RDONLY;
> +}
> +
> +static inline bool kvm_s2pud_readonly(pud_t *pud)
> +{
> +	return (pud_val(*pud) & PUD_S2_RDWR) == PUD_S2_RDONLY;
> +}
>  
>  #define kvm_pgd_addr_end(addr, end)	pgd_addr_end(addr, end)
>  #define kvm_pud_addr_end(addr, end)	pud_addr_end(addr, end)
> diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
> index 5f930cc..1714c84 100644
> --- a/arch/arm64/include/asm/pgtable-hwdef.h
> +++ b/arch/arm64/include/asm/pgtable-hwdef.h
> @@ -122,6 +122,9 @@
>  #define PMD_S2_RDONLY		(_AT(pmdval_t, 1) << 6)   /* HAP[2:1] */
>  #define PMD_S2_RDWR		(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
>  
> +#define PUD_S2_RDONLY		(_AT(pudval_t, 1) << 6)   /* HAP[2:1] */
> +#define PUD_S2_RDWR		(_AT(pudval_t, 3) << 6)   /* HAP[2:1] */
> +
>  /*
>   * Memory Attribute override for Stage-2 (MemAttr[3:0])
>   */
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [PATCH v15 05/11] KVM: arm: Add initial dirty page locking support
  2014-12-15  7:28   ` Mario Smarduch
@ 2015-01-07 13:05     ` Christoffer Dall
  -1 siblings, 0 replies; 110+ messages in thread
From: Christoffer Dall @ 2015-01-07 13:05 UTC (permalink / raw)
  To: Mario Smarduch
  Cc: pbonzini, james.hogan, agraf, marc.zyngier, cornelia.huck,
	borntraeger, catalin.marinas, kvmarm, kvm, kvm-ppc, kvm-ia64,
	linux-arm-kernel, steve.capper, peter.maydell

On Sun, Dec 14, 2014 at 11:28:02PM -0800, Mario Smarduch wrote:
> Add support for initial write protection of VM memslots. This patch
> series assumes that huge PUDs will not be used in 2nd stage tables, which is
> always valid on ARMv7
> 
> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>

Acked-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [PATCH v15 06/11] KVM: arm: dirty logging write protect support
  2014-12-15  7:28   ` Mario Smarduch
@ 2015-01-07 13:05     ` Christoffer Dall
  -1 siblings, 0 replies; 110+ messages in thread
From: Christoffer Dall @ 2015-01-07 13:05 UTC (permalink / raw)
  To: Mario Smarduch
  Cc: pbonzini, james.hogan, agraf, marc.zyngier, cornelia.huck,
	borntraeger, catalin.marinas, kvmarm, kvm, kvm-ppc, kvm-ia64,
	linux-arm-kernel, steve.capper, peter.maydell

On Sun, Dec 14, 2014 at 11:28:03PM -0800, Mario Smarduch wrote:
> Add support to track dirty pages between user space KVM_GET_DIRTY_LOG ioctl
> calls. We call kvm_get_dirty_log_protect() function to do most of the work.
> 
> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>

Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RESEND PATCH v15 07/11] KVM: arm: page logging 2nd stage fault handling
  2015-01-07 12:38     ` Christoffer Dall
@ 2015-01-08  1:43       ` Mario Smarduch
  -1 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2015-01-08  1:43 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: marc.zyngier, kvmarm, kvm, linux-arm-kernel, pbonzini, catalin.marinas

Hi Christoffer,
  before going through your comments, I discovered that 3.18.0-rc2
added a generic __get_user_pages_fast(), which ARM now picks up. This
causes gfn_to_pfn_prot() to return a meaningful 'writable' value for a
read fault, provided the region is writable.

Prior to that the weak version returned 0, so 'writable' had no effect
on whether the pte/pmd was set RW on a read fault.

As a consequence dirty logging broke in 3.18 and I was seeing weird but
very intermittent issues. I put in a few additional lines to fix it:
prevent the pte from being made RW (read-only instead) on read faults
while logging a writable region.
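
A rough sketch of the idea (names follow the snippet further down in
this mail and are illustrative, not the final code):

	if (writable) {
		/*
		 * While logging, only map the page RW on an actual write
		 * fault, so reads of a writable region stay
		 * write-protected and later writes still fault and get
		 * logged.
		 */
		if (!logging_active || write_fault) {
			kvm_set_s2pte_writable(&new_pte);
			kvm_set_pfn_dirty(pfn);
		}
	}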

On 01/07/2015 04:38 AM, Christoffer Dall wrote:
> On Wed, Dec 17, 2014 at 06:07:29PM -0800, Mario Smarduch wrote:
>> This patch is a followup to v15 patch series, with following changes:
>> - When clearing/dissolving a huge, PMD mark huge page range dirty, since
>>   the state of whole range is unknown. After the huge page is dissolved 
>>   dirty page logging is at page granularity.
> 
> What is the sequence of events where you could have dirtied another page
> within the PMD range after the user initially requested dirty page
> logging?

No, there is none. My issue was the starting point for tracking dirty
pages, which is the second call to read the dirty log - not the first
call after the initial write protect, where any page in the range can
be assumed dirty. I'll remove this; I'm not sure there is any use case
for calling the dirty log read only once.

> 
>> - Correct comment due to misinterpreted test results
>>
>> Retested, everything appears to work fine. 
> 
> you should resend this with the proper commit message, and changelogs
> should go beneath the '---' separator.

Yes will do.
> 
>>   
>>
>> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
>> ---
>>  arch/arm/kvm/mmu.c |   86 +++++++++++++++++++++++++++++++++++++++++++++++-----
>>  1 file changed, 78 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>> index 73d506f..7e83a16 100644
>> --- a/arch/arm/kvm/mmu.c
>> +++ b/arch/arm/kvm/mmu.c
>> @@ -47,6 +47,18 @@ static phys_addr_t hyp_idmap_vector;
>>  #define kvm_pmd_huge(_x)	(pmd_huge(_x) || pmd_trans_huge(_x))
>>  #define kvm_pud_huge(_x)	pud_huge(_x)
>>  
>> +#define KVM_S2PTE_FLAG_IS_IOMAP		(1UL << 0)
>> +#define KVM_S2PTE_FLAG_LOGGING_ACTIVE	(1UL << 1)
>> +
>> +static bool kvm_get_logging_state(struct kvm_memory_slot *memslot)
> 
> nit: if you respin I think this would be slightly more clear if it was
> named something like memslot_is_logging() - I have a vague feeling I was
> the one who suggested this name in the past but now it annoys me
> slightly.

Yes, more intuitive.
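
i.e. just a rename of the helper, something like (sketch):

static bool memslot_is_logging(struct kvm_memory_slot *memslot)
{
	/* the dirty_bitmap is only allocated for memslots that have
	 * KVM_MEM_LOG_DIRTY_PAGES set, so its presence tells us whether
	 * logging is active */
	return !!memslot->dirty_bitmap;
}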

> 
>> +{
>> +#ifdef CONFIG_ARM
>> +	return !!memslot->dirty_bitmap;
>> +#else
>> +	return false;
>> +#endif
>> +}
>> +
>>  static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>>  {
>>  	/*
>> @@ -59,6 +71,37 @@ static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>>  		kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, kvm, ipa);
>>  }
>>  
>> +/**
>> + * stage2_dissolve_pmd() - clear and flush huge PMD entry
>> + * @kvm:	pointer to kvm structure.
>> + * @addr	IPA
>> + * @pmd	pmd pointer for IPA
>> + *
>> + * Function clears a PMD entry, flushes addr 1st and 2nd stage TLBs. Marks all
>> + * pages in the range dirty.
>> + */
>> +void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
> 
> this can be a static
> 
Missed it.
>> +{
>> +	gfn_t gfn;
>> +	int i;
>> +
>> +	if (kvm_pmd_huge(*pmd)) {
> 
> Can you invert this, so you return early if it's not a
> kvm_pmd_huge(*pmd) ?

will do.
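
i.e. roughly (a sketch of the inverted form, also made static per the
earlier comment; the mark_page_dirty loop is left out here since it is
being dropped anyway, see below):

static void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
{
	if (!kvm_pmd_huge(*pmd))
		return;

	/* clear the huge mapping and flush; subsequent faults repopulate
	 * this range with normal stage-2 pages */
	pmd_clear(pmd);
	kvm_tlb_flush_vmid_ipa(kvm, addr);
	put_page(virt_to_page(pmd));
}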
> 
>> +
>> +		pmd_clear(pmd);
>> +		kvm_tlb_flush_vmid_ipa(kvm, addr);
>> +		put_page(virt_to_page(pmd));
>> +
>> +		gfn = (addr & PMD_MASK) >> PAGE_SHIFT;
>> +
>> +		/*
>> +		 * The write is to a huge page, mark the whole page dirty
>> +		 * including this gfn.
>> +		 */
> 
> we need the explanation I'm asking for in the commit message as part of
> the comment here. Currently the comment explains what the code is quite
> obviously doing, but not *why*....

That was what I mentioned above, it will be gone.
> 
>> +		for (i = 0; i < PTRS_PER_PMD; i++)
>> +			mark_page_dirty(kvm, gfn + i);
>> +	}
>> +}
>> +
>>  static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
>>  				  int min, int max)
>>  {
>> @@ -703,10 +746,13 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
>>  }
>>  
>>  static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
>> -			  phys_addr_t addr, const pte_t *new_pte, bool iomap)
>> +			  phys_addr_t addr, const pte_t *new_pte,
>> +			  unsigned long flags)
>>  {
>>  	pmd_t *pmd;
>>  	pte_t *pte, old_pte;
>> +	unsigned long iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
>> +	unsigned long logging_active = flags & KVM_S2PTE_FLAG_LOGGING_ACTIVE;
> 
> why not declare these as bool?
Yes.
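
i.e. the two declarations would simply become (sketch):

	bool iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
	bool logging_active = flags & KVM_S2PTE_FLAG_LOGGING_ACTIVE;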
> 
>>  
>>  	/* Create stage-2 page table mapping - Levels 0 and 1 */
>>  	pmd = stage2_get_pmd(kvm, cache, addr);
>> @@ -718,6 +764,13 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
>>  		return 0;
>>  	}
>>  
>> +	/*
>> +	 * While dirty page logging - dissolve huge PMD, then continue on to
>> +	 * allocate page.
>> +	 */
>> +	if (logging_active)
>> +		stage2_dissolve_pmd(kvm, addr, pmd);
>> +
>>  	/* Create stage-2 page mappings - Level 2 */
>>  	if (pmd_none(*pmd)) {
>>  		if (!cache)
>> @@ -774,7 +827,8 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa,
>>  		if (ret)
>>  			goto out;
>>  		spin_lock(&kvm->mmu_lock);
>> -		ret = stage2_set_pte(kvm, &cache, addr, &pte, true);
>> +		ret = stage2_set_pte(kvm, &cache, addr, &pte,
>> +						KVM_S2PTE_FLAG_IS_IOMAP);
>>  		spin_unlock(&kvm->mmu_lock);
>>  		if (ret)
>>  			goto out;
>> @@ -1002,6 +1056,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  	pfn_t pfn;
>>  	pgprot_t mem_type = PAGE_S2;
>>  	bool fault_ipa_uncached;
>> +	unsigned long logging_active = 0;
> 
> can you change this to a bool and set the flag explicitly once you've
> declared flags further down?  I think that's more clear.
ok.
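
Something along these lines (sketch only; it assumes the helper ends up
renamed to memslot_is_logging() as discussed above):

	bool logging_active = memslot_is_logging(memslot) && write_fault;

	...

	/* later, where the stage2_set_pte() flags are assembled */
	unsigned long flags = logging_active ? KVM_S2PTE_FLAG_LOGGING_ACTIVE : 0;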
> 
>>  
>>  	write_fault = kvm_is_write_fault(vcpu);
>>  	if (fault_status == FSC_PERM && !write_fault) {
>> @@ -1009,6 +1064,9 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  		return -EFAULT;
>>  	}
>>  
>> +	if (kvm_get_logging_state(memslot) && write_fault)
>> +		logging_active = KVM_S2PTE_FLAG_LOGGING_ACTIVE;
>> +
Yes, I noticed non-writable regions were dissolved, but this was not
the way to go about it. Right now, after the call to gfn_to_pfn_prot(),
the following snippet runs and does nothing for non-writable regions:

if (kvm_get_logging_state(memslot) && writable) {
	logging_active = KVM_S2PTE_FLAG_LOGGING_ACTIVE;
	if (!write_fault)
		can_set_pte_rw = false;
	gfn = fault_ipa >> PAGE_SHIFT;
	force_pte = true;
}

if (!hugetlb && !force_pte)
 ...

> 
> so if the guest is faulting on a read of a huge page then we're going to
> map it as a huge page, but not if it's faulting on a write.  Why
> exactly?  A slight optimization?  Perhaps it's worth a comment.
> 
>>  	/* Let's check if we will get back a huge page backed by hugetlbfs */
>>  	down_read(&current->mm->mmap_sem);
>>  	vma = find_vma_intersection(current->mm, hva, hva + 1);
>> @@ -1018,7 +1076,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  		return -EFAULT;
>>  	}
>>  
>> -	if (is_vm_hugetlb_page(vma)) {
>> +	if (is_vm_hugetlb_page(vma) && !logging_active) {

These references to logging_active in conditional checks
are gone.

> 
> So I think this whole thing could look nicer if you set force_pte = true
> together with setting logging_active above, and then change this check
> to check && !force_pte here and get rid of the extra check of
> !logging_active for the THP check below.
> 
> Sorry to be a bit pedantic, but this code is really critical.
> 
>>  		hugetlb = true;
>>  		gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
>>  	} else {
>> @@ -1065,7 +1123,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  	spin_lock(&kvm->mmu_lock);
>>  	if (mmu_notifier_retry(kvm, mmu_seq))
>>  		goto out_unlock;
>> -	if (!hugetlb && !force_pte)
>> +	if (!hugetlb && !force_pte && !logging_active)
>>  		hugetlb = transparent_hugepage_adjust(&pfn, &fault_ipa);
>>  
>>  	fault_ipa_uncached = memslot->flags & KVM_MEMSLOT_INCOHERENT;
>> @@ -1082,17 +1140,22 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  		ret = stage2_set_pmd_huge(kvm, memcache, fault_ipa, &new_pmd);
>>  	} else {
>>  		pte_t new_pte = pfn_pte(pfn, mem_type);
>> +		unsigned long flags = logging_active;
>> +
>> +		if (pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE))
>> +			flags |= KVM_S2PTE_FLAG_IS_IOMAP;
>> +
>>  		if (writable) {
>>  			kvm_set_s2pte_writable(&new_pte);
>>  			kvm_set_pfn_dirty(pfn);
>>  		}
>>  		coherent_cache_guest_page(vcpu, hva, PAGE_SIZE,
>>  					  fault_ipa_uncached);
>> -		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte,
>> -			pgprot_val(mem_type) == pgprot_val(PAGE_S2_DEVICE));
>> +		ret = stage2_set_pte(kvm, memcache, fault_ipa, &new_pte, flags);
>>  	}
>>  
>> -
>> +	if (write_fault)
>> +		mark_page_dirty(kvm, gfn);
>>  out_unlock:
>>  	spin_unlock(&kvm->mmu_lock);
>>  	kvm_release_pfn_clean(pfn);
>> @@ -1242,7 +1305,14 @@ static void kvm_set_spte_handler(struct kvm *kvm, gpa_t gpa, void *data)
>>  {
>>  	pte_t *pte = (pte_t *)data;
>>  
>> -	stage2_set_pte(kvm, NULL, gpa, pte, false);
>> +	/*
>> +	 * We can always call stage2_set_pte with KVM_S2PTE_FLAG_LOGGING_ACTIVE
>> +	 * flag set because MMU notifiers will have unmapped a huge PMD before
> 
>                 ^^^ surely you mean 'clear', right?
Yes of course - the value is 0, so we can only deal with pages here.
> 
>> +	 * calling ->change_pte() (which in turn calls kvm_set_spte_hva()) and
>> +	 * therefore stage2_set_pte() never needs to clear out a huge PMD
>> +	 * through this calling path.
>> +	 */
>> +	stage2_set_pte(kvm, NULL, gpa, pte, 0);
>>  }
>>  
>>  
>> -- 
>> 1.7.9.5
>>
> 
> Thanks,
> -Christoffer
> 


^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [PATCH v15 10/11] KVM: arm/arm64: Enable Dirty Page logging for ARMv8
  2015-01-07 12:47     ` Christoffer Dall
@ 2015-01-08  1:51       ` Mario Smarduch
  -1 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2015-01-08  1:51 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: pbonzini, james.hogan, agraf, marc.zyngier, cornelia.huck,
	borntraeger, catalin.marinas, kvmarm, kvm, kvm-ppc, kvm-ia64,
	linux-arm-kernel, steve.capper, peter.maydell

On 01/07/2015 04:47 AM, Christoffer Dall wrote:
> On Sun, Dec 14, 2014 at 11:28:07PM -0800, Mario Smarduch wrote:
>> This patch enables ARMv8 ditry page logging support. Plugs ARMv8 into generic
> 
>                            dirty
yeah.
> 
>> layer through Kconfig symbol, and drops earlier ARM64 constraints to enable
>> logging at architecture layer.
>>
>> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
>> ---
>>  arch/arm/include/asm/kvm_host.h | 12 ------------
>>  arch/arm/kvm/arm.c              |  4 ----
>>  arch/arm/kvm/mmu.c              | 19 +++++++++++--------
>>  arch/arm64/kvm/Kconfig          |  2 ++
>>  4 files changed, 13 insertions(+), 24 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
>> index b138431..088ea87 100644
>> --- a/arch/arm/include/asm/kvm_host.h
>> +++ b/arch/arm/include/asm/kvm_host.h
>> @@ -223,18 +223,6 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
>>  	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
>>  }
>>  
>> -/**
>> - * kvm_flush_remote_tlbs() - flush all VM TLB entries
>> - * @kvm:	pointer to kvm structure.
>> - *
>> - * Interface to HYP function to flush all VM TLB entries without address
>> - * parameter.
>> - */
>> -static inline void kvm_flush_remote_tlbs(struct kvm *kvm)
>> -{
>> -	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
>> -}
>> -
>>  static inline int kvm_arch_dev_ioctl_check_extension(long ext)
>>  {
>>  	return 0;
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index 6e4290c..1b6577c 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -740,7 +740,6 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>>   */
>>  int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>>  {
>> -#ifdef CONFIG_ARM
>>  	bool is_dirty = false;
>>  	int r;
>>  
>> @@ -753,9 +752,6 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>>  
>>  	mutex_unlock(&kvm->slots_lock);
>>  	return r;
>> -#else /* arm64 */
>> -	return -EINVAL;
>> -#endif
>>  }
>>  
>>  static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>> index dc763bb..59003df 100644
>> --- a/arch/arm/kvm/mmu.c
>> +++ b/arch/arm/kvm/mmu.c
>> @@ -52,11 +52,18 @@ static phys_addr_t hyp_idmap_vector;
>>  
>>  static bool kvm_get_logging_state(struct kvm_memory_slot *memslot)
>>  {
>> -#ifdef CONFIG_ARM
>>  	return !!memslot->dirty_bitmap;
>> -#else
>> -	return false;
>> -#endif
>> +}
>> +
>> +/**
>> + * kvm_flush_remote_tlbs() - flush all VM TLB entries for v7/8
>> + * @kvm:	pointer to kvm structure.
>> + *
>> + * Interface to HYP function to flush all VM TLB entries
>> + */
>> +inline void kvm_flush_remote_tlbs(struct kvm *kvm)
> 
> did you intend for a non-static inline here?

Yes it's used in arm.c and mmu.c
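
In that case the 'inline' keyword on the out-of-line definition doesn't
buy anything; the usual arrangement would be a plain definition in mmu.c
plus a declaration visible to arm.c, e.g. (header choice illustrative):

	/* arch/arm/include/asm/kvm_host.h */
	void kvm_flush_remote_tlbs(struct kvm *kvm);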
> 
>> +{
>> +	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
>>  }
>>  
>>  static void kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
>> @@ -895,7 +902,6 @@ static bool kvm_is_device_pfn(unsigned long pfn)
>>  	return !pfn_valid(pfn);
>>  }
>>  
>> -#ifdef CONFIG_ARM
>>  /**
>>   * stage2_wp_ptes - write protect PMD range
>>   * @pmd:	pointer to pmd entry
>> @@ -1040,7 +1046,6 @@ void kvm_arch_mmu_write_protect_pt_masked(struct kvm *kvm,
>>  
>>  	stage2_wp_range(kvm, start, end);
>>  }
>> -#endif
>>  
>>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  			  struct kvm_memory_slot *memslot, unsigned long hva,
>> @@ -1445,7 +1450,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
>>  				   const struct kvm_memory_slot *old,
>>  				   enum kvm_mr_change change)
>>  {
>> -#ifdef CONFIG_ARM
>>  	/*
>>  	 * At this point memslot has been committed and there is an
>>  	 * allocated dirty_bitmap[], dirty pages will be be tracked while the
>> @@ -1453,7 +1457,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
>>  	 */
>>  	if (change != KVM_MR_DELETE && mem->flags & KVM_MEM_LOG_DIRTY_PAGES)
>>  		kvm_mmu_wp_memory_region(kvm, mem->slot);
>> -#endif
>>  }
>>  
>>  int kvm_arch_prepare_memory_region(struct kvm *kvm,
>> diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
>> index 8ba85e9..3ce389b 100644
>> --- a/arch/arm64/kvm/Kconfig
>> +++ b/arch/arm64/kvm/Kconfig
>> @@ -22,10 +22,12 @@ config KVM
>>  	select PREEMPT_NOTIFIERS
>>  	select ANON_INODES
>>  	select HAVE_KVM_CPU_RELAX_INTERCEPT
>> +	select HAVE_KVM_ARCH_TLB_FLUSH_ALL
>>  	select KVM_MMIO
>>  	select KVM_ARM_HOST
>>  	select KVM_ARM_VGIC
>>  	select KVM_ARM_TIMER
>> +	select KVM_GENERIC_DIRTYLOG_READ_PROTECT
>>  	---help---
>>  	  Support hosting virtualized guest machines.
>>  
>> -- 
>> 1.9.1
>>

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [PATCH v15 11/11] KVM: arm/arm64: Add support to dissolve huge PUD
  2015-01-07 13:05     ` Christoffer Dall
@ 2015-01-08  3:01       ` Mario Smarduch
  -1 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2015-01-08  3:01 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: pbonzini, james.hogan, agraf, marc.zyngier, cornelia.huck,
	borntraeger, catalin.marinas, kvmarm, kvm, kvm-ppc, kvm-ia64,
	linux-arm-kernel, steve.capper, peter.maydell

On 01/07/2015 05:05 AM, Christoffer Dall wrote:
> On Sun, Dec 14, 2014 at 11:28:08PM -0800, Mario Smarduch wrote:
>> This patch adds the same support for PUD huge page as for PMD. Huge PUD is 
>> write protected for initial memory region write protection. Code to dissolve 
>> huge PUD is supported in user_mem_abort(). At this time this code has not been 
>> tested, but similar approach to current ARMv8 page logging test is in work,
>> limiting kernel memory and mapping in 1 or 2GB into Guest address space on a 
>> 4k page/48 bit host, some host kernel test code needs to be added to detect
>> page fault to this region and side step general processing. Also similar to 
>> PMD case all pages in range are marked dirty when PUD entry is cleared.
> 
> the note about this code being untested shouldn't be part of the commit
> message but after the '---' separator or in the cover letter I think.

Ah ok.
> 
>>
>> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
>> ---
>>  arch/arm/include/asm/kvm_mmu.h         |  8 +++++
>>  arch/arm/kvm/mmu.c                     | 64 ++++++++++++++++++++++++++++++++--
>>  arch/arm64/include/asm/kvm_mmu.h       |  9 +++++
>>  arch/arm64/include/asm/pgtable-hwdef.h |  3 ++
>>  4 files changed, 81 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
>> index dda0046..703d04d 100644
>> --- a/arch/arm/include/asm/kvm_mmu.h
>> +++ b/arch/arm/include/asm/kvm_mmu.h
>> @@ -133,6 +133,14 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
>>  	return (pmd_val(*pmd) & L_PMD_S2_RDWR) == L_PMD_S2_RDONLY;
>>  }
>>  
>> +static inline void kvm_set_s2pud_readonly(pud_t *pud)
>> +{
>> +}
>> +
>> +static inline bool kvm_s2pud_readonly(pud_t *pud)
>> +{
>> +	return false;
>> +}
>>  
>>  /* Open coded p*d_addr_end that can deal with 64bit addresses */
>>  #define kvm_pgd_addr_end(addr, end)					\
>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>> index 59003df..35840fb 100644
>> --- a/arch/arm/kvm/mmu.c
>> +++ b/arch/arm/kvm/mmu.c
>> @@ -109,6 +109,55 @@ void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
>>  	}
>>  }
>>  
>> +/**
>> +  * stage2_find_pud() - find a PUD entry
>> +  * @kvm:	pointer to kvm structure.
>> +  * @addr:	IPA address
>> +  *
>> +  * Return address of PUD entry or NULL if not allocated.
>> +  */
>> +static pud_t *stage2_find_pud(struct kvm *kvm, phys_addr_t addr)
> 
> why can't you reuse stage2_get_pud here?

stage2_get_* allocate intermediate tables; when they're called
you know intermediate tables are needed to install a pmd or pte.
But currently there is no way to tell that we faulted in a PUD
region; this code just checks whether a PUD is set, and does not
allocate intermediate tables along the way.

Overall I'm not sure if this is in preparation for a new huge page (PUD sized)?
Besides developing a custom test, I'm not sure how to use this
and determine that we faulted in a PUD region. Generic 'gup'
code does handle PUDs, but perhaps some arch. has PUD-sized
huge pages.
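For contrast, a sketch of what the allocating walker has to do (not the
exact in-tree stage2_get_pud(), hence the _sketch name), which is why it
takes the per-fault memory cache:

	static pud_t *stage2_get_pud_sketch(struct kvm *kvm,
					    struct kvm_mmu_memory_cache *cache,
					    phys_addr_t addr)
	{
		pgd_t *pgd = kvm->arch.pgd + pgd_index(addr);

		if (pgd_none(*pgd)) {
			if (!cache)
				return NULL;
			/* pull a table page from the cache and install it */
			pgd_populate(NULL, pgd, mmu_memory_cache_alloc(cache));
			get_page(virt_to_page(pgd));
		}
		return pud_offset(pgd, addr);
	}

stage2_find_pud() above deliberately skips all of that and only reports
what is already mapped.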

> 
>> +{
>> +	pgd_t *pgd;
>> +
>> +	pgd = kvm->arch.pgd + pgd_index(addr);
>> +	if (pgd_none(*pgd))
>> +		return NULL;
>> +
>> +	return pud_offset(pgd, addr);
>> +}
>> +
>> +/**
>> + * stage2_dissolve_pud() - clear and flush huge PUD entry
>> + * @kvm:	pointer to kvm structure.
>> + * @addr	IPA
>> + *
>> + * Function clears a PUD entry, flushes addr 1st and 2nd stage TLBs. Marks all
>> + * pages in the range dirty.
>> + */
>> +void stage2_dissolve_pud(struct kvm *kvm, phys_addr_t addr)
>> +{
>> +	pud_t *pud;
>> +	gfn_t gfn;
>> +	long i;
>> +
>> +	pud = stage2_find_pud(kvm, addr);
>> +	if (pud && !pud_none(*pud) && kvm_pud_huge(*pud)) {
> 
> I'm just thinking here, why do we need to check if we get a valid pud
> back here, but we don't need the equivalent check in dissolve_pmd from
> patch 7?

kvm_pud_huge() doesn't check bit 0 for an invalid entry, but
pud_none() is not the right way to check either; maybe pud_bad()
should come first. Nothing is done in patch 7 since the pmd is
retrieved from stage2_get_pmd().

> 
> I think the rationale is that it should never happen because we never
> call these functions with the logging and iomap flags at the same
> time...

I'm a little lost here, not sure how it's related to the above.
But I think a VFIO device will have a memslot, and it would be
possible to enable logging. But to what end I'm not sure.

> 
>> +		pud_clear(pud);
>> +		kvm_tlb_flush_vmid_ipa(kvm, addr);
>> +		put_page(virt_to_page(pud));
>> +#ifdef CONFIG_SMP
>> +		gfn = (addr & PUD_MASK) >> PAGE_SHIFT;
>> +		/*
>> +		 * Mark all pages in PUD range dirty, in case other
>> +		 * CPUs are  writing to it.
>> +		 */
>> +		for (i = 0; i < PTRS_PER_PUD * PTRS_PER_PMD; i++)
>> +			mark_page_dirty(kvm, gfn + i);
>> +#endif
>> +	}
>> +}
>> +
>>  static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
>>  				  int min, int max)
>>  {
>> @@ -761,6 +810,13 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
>>  	unsigned long iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
>>  	unsigned long logging_active = flags & KVM_S2PTE_FLAG_LOGGING_ACTIVE;
>>  
>> +	/*
>> +	 * While dirty page logging - dissolve huge PUD, then continue on to
>> +	 * allocate page.
>> +	 */
>> +	if (logging_active)
>> +		stage2_dissolve_pud(kvm, addr);
>> +
> 
> I know I asked for this, but what's the purpose really when we never set
> a huge stage-2 pud, shouldn't we just WARN/BUG if we encounter one?
> 
> Marc, you may have some thoughts here...

Not sure myself what the vision for PUD support is.

> 
>>  	/* Create stage-2 page table mapping - Levels 0 and 1 */
>>  	pmd = stage2_get_pmd(kvm, cache, addr);
>>  	if (!pmd) {
>> @@ -964,9 +1020,11 @@ static void  stage2_wp_puds(pgd_t *pgd, phys_addr_t addr, phys_addr_t end)
>>  	do {
>>  		next = kvm_pud_addr_end(addr, end);
>>  		if (!pud_none(*pud)) {
>> -			/* TODO:PUD not supported, revisit later if supported */
>> -			BUG_ON(kvm_pud_huge(*pud));
>> -			stage2_wp_pmds(pud, addr, next);
>> +			if (kvm_pud_huge(*pud)) {
>> +				if (!kvm_s2pud_readonly(pud))
>> +					kvm_set_s2pud_readonly(pud);
> 
> I guess the same question that I had above applies here as well (sorry
> for making you go rounds on this one).
> 
>> +			} else
>> +				stage2_wp_pmds(pud, addr, next);
>>  		}
>>  	} while (pud++, addr = next, addr != end);
>>  }
>> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
>> index f925e40..3b692c5 100644
>> --- a/arch/arm64/include/asm/kvm_mmu.h
>> +++ b/arch/arm64/include/asm/kvm_mmu.h
>> @@ -137,6 +137,15 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
>>  	return (pmd_val(*pmd) & PMD_S2_RDWR) == PMD_S2_RDONLY;
>>  }
>>  
>> +static inline void kvm_set_s2pud_readonly(pud_t *pud)
>> +{
>> +	pud_val(*pud) = (pud_val(*pud) & ~PUD_S2_RDWR) | PUD_S2_RDONLY;
>> +}
>> +
>> +static inline bool kvm_s2pud_readonly(pud_t *pud)
>> +{
>> +	return (pud_val(*pud) & PUD_S2_RDWR) == PUD_S2_RDONLY;
>> +}
>>  
>>  #define kvm_pgd_addr_end(addr, end)	pgd_addr_end(addr, end)
>>  #define kvm_pud_addr_end(addr, end)	pud_addr_end(addr, end)
>> diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
>> index 5f930cc..1714c84 100644
>> --- a/arch/arm64/include/asm/pgtable-hwdef.h
>> +++ b/arch/arm64/include/asm/pgtable-hwdef.h
>> @@ -122,6 +122,9 @@
>>  #define PMD_S2_RDONLY		(_AT(pmdval_t, 1) << 6)   /* HAP[2:1] */
>>  #define PMD_S2_RDWR		(_AT(pmdval_t, 3) << 6)   /* HAP[2:1] */
>>  
>> +#define PUD_S2_RDONLY		(_AT(pudval_t, 1) << 6)   /* HAP[2:1] */
>> +#define PUD_S2_RDWR		(_AT(pudval_t, 3) << 6)   /* HAP[2:1] */
>> +
>>  /*
>>   * Memory Attribute override for Stage-2 (MemAttr[3:0])
>>   */
>> -- 
>> 1.9.1
>>

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RESEND PATCH v15 07/11] KVM: arm: page logging 2nd stage fault handling
  2015-01-08  1:43       ` Mario Smarduch
@ 2015-01-08 10:45         ` Christoffer Dall
  -1 siblings, 0 replies; 110+ messages in thread
From: Christoffer Dall @ 2015-01-08 10:45 UTC (permalink / raw)
  To: Mario Smarduch
  Cc: marc.zyngier, kvmarm, kvm, linux-arm-kernel, pbonzini, catalin.marinas

On Wed, Jan 07, 2015 at 05:43:18PM -0800, Mario Smarduch wrote:
> Hi Christoffer,
>   before going through your comments, I discovered that
> in 3.18.0-rc2 - a generic __get_user_pages_fast()
> was implemented, now ARM picks this up. This causes
> gfn_to_pfn_prot() to return meaningful 'writable'
> value for a read fault, provided the region is writable.
> 
> Prior to that the weak version returned 0 and 'writable'
> had no optimization effect to set pte/pmd - RW on
> a read fault.
> 
> As a consequence dirty logging broke in 3.18, I was seeing
> weird but very intermittent issues. I just put in the
> additional few lines to fix it, prevent pte RW (only R) on
> read faults  while  logging writable region.
> 
> On 01/07/2015 04:38 AM, Christoffer Dall wrote:
> > On Wed, Dec 17, 2014 at 06:07:29PM -0800, Mario Smarduch wrote:
> >> This patch is a followup to v15 patch series, with following changes:
> >> - When clearing/dissolving a huge, PMD mark huge page range dirty, since
> >>   the state of whole range is unknown. After the huge page is dissolved 
> >>   dirty page logging is at page granularity.
> > 
> > What is the sequence of events where you could have dirtied another page
> > within the PMD range after the user initially requested dirty page
> > logging?
> 
> No there is none. My issue was the start point for tracking dirty pages
> and that would be second call to dirty log read. Not first
> call after initial write protect where any page in range can
> be assumed dirty. I'll remove this, not sure if there would be any
> use case to call dirty log only once.
> 

Calling dirty log once cannot give you anything meaningful, right? You
must assume all memory is 'dirty' at this point, no?

-Christoffer
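For reference, the "additional few lines" described above amount to
something like this in user_mem_abort() (a sketch using the series'
logging_active/write_fault/writable locals, not the exact hunk that was
applied):

	if (logging_active) {
		/*
		 * gfn_to_pfn_prot() may now report the page as writable even
		 * on a read fault; while logging, map it read-only so the
		 * first guest write still faults and gets marked dirty.
		 */
		if (!write_fault)
			writable = false;
	}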

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [PATCH v15 10/11] KVM: arm/arm64: Enable Dirty Page logging for ARMv8
  2015-01-08  1:51       ` Mario Smarduch
@ 2015-01-08 10:56         ` Christoffer Dall
  -1 siblings, 0 replies; 110+ messages in thread
From: Christoffer Dall @ 2015-01-08 10:56 UTC (permalink / raw)
  To: Mario Smarduch
  Cc: pbonzini, james.hogan, agraf, marc.zyngier, cornelia.huck,
	borntraeger, catalin.marinas, kvmarm, kvm, kvm-ppc, kvm-ia64,
	linux-arm-kernel, steve.capper, peter.maydell

On Wed, Jan 07, 2015 at 05:51:15PM -0800, Mario Smarduch wrote:
> On 01/07/2015 04:47 AM, Christoffer Dall wrote:
> > On Sun, Dec 14, 2014 at 11:28:07PM -0800, Mario Smarduch wrote:
> >> This patch enables ARMv8 ditry page logging support. Plugs ARMv8 into generic
> > 
> >                            dirty
> yeah.
> > 
> >> layer through Kconfig symbol, and drops earlier ARM64 constraints to enable
> >> logging at architecture layer.
> >>
> >> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
> >> ---
> >>  arch/arm/include/asm/kvm_host.h | 12 ------------
> >>  arch/arm/kvm/arm.c              |  4 ----
> >>  arch/arm/kvm/mmu.c              | 19 +++++++++++--------
> >>  arch/arm64/kvm/Kconfig          |  2 ++
> >>  4 files changed, 13 insertions(+), 24 deletions(-)
> >>
> >> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> >> index b138431..088ea87 100644
> >> --- a/arch/arm/include/asm/kvm_host.h
> >> +++ b/arch/arm/include/asm/kvm_host.h
> >> @@ -223,18 +223,6 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
> >>  	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
> >>  }
> >>  
> >> -/**
> >> - * kvm_flush_remote_tlbs() - flush all VM TLB entries
> >> - * @kvm:	pointer to kvm structure.
> >> - *
> >> - * Interface to HYP function to flush all VM TLB entries without address
> >> - * parameter.
> >> - */
> >> -static inline void kvm_flush_remote_tlbs(struct kvm *kvm)
> >> -{
> >> -	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
> >> -}
> >> -
> >>  static inline int kvm_arch_dev_ioctl_check_extension(long ext)
> >>  {
> >>  	return 0;
> >> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> >> index 6e4290c..1b6577c 100644
> >> --- a/arch/arm/kvm/arm.c
> >> +++ b/arch/arm/kvm/arm.c
> >> @@ -740,7 +740,6 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
> >>   */
> >>  int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
> >>  {
> >> -#ifdef CONFIG_ARM
> >>  	bool is_dirty = false;
> >>  	int r;
> >>  
> >> @@ -753,9 +752,6 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
> >>  
> >>  	mutex_unlock(&kvm->slots_lock);
> >>  	return r;
> >> -#else /* arm64 */
> >> -	return -EINVAL;
> >> -#endif
> >>  }
> >>  
> >>  static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
> >> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> >> index dc763bb..59003df 100644
> >> --- a/arch/arm/kvm/mmu.c
> >> +++ b/arch/arm/kvm/mmu.c
> >> @@ -52,11 +52,18 @@ static phys_addr_t hyp_idmap_vector;
> >>  
> >>  static bool kvm_get_logging_state(struct kvm_memory_slot *memslot)
> >>  {
> >> -#ifdef CONFIG_ARM
> >>  	return !!memslot->dirty_bitmap;
> >> -#else
> >> -	return false;
> >> -#endif
> >> +}
> >> +
> >> +/**
> >> + * kvm_flush_remote_tlbs() - flush all VM TLB entries for v7/8
> >> + * @kvm:	pointer to kvm structure.
> >> + *
> >> + * Interface to HYP function to flush all VM TLB entries
> >> + */
> >> +inline void kvm_flush_remote_tlbs(struct kvm *kvm)
> > 
> > did you intend for a non-static inline here?
> 
> Yes it's used in arm.c and mmu.c

then why inline?

I'm not a compiler expert by any measure, but poking around I'm pretty
sure the inline keyword in this context is useless.  See for example:
http://www.cs.nyu.edu/~xiaojian/bookmark/c_programming/Inline_Functions.htm

So I suggest either making it a normal stand-alone function or keeping it
as duplicate static inlines in the header files if you're adamant about
this being inlined.
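The second option would simply keep a copy of the helper in each header,
e.g. (sketch, mirroring the code the patch removes):

	/* in both arch/arm/include/asm/kvm_host.h and
	 * arch/arm64/include/asm/kvm_host.h */
	static inline void kvm_flush_remote_tlbs(struct kvm *kvm)
	{
		kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
	}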

-Christoffer

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [PATCH v15 11/11] KVM: arm/arm64: Add support to dissolve huge PUD
  2015-01-08  3:01       ` Mario Smarduch
@ 2015-01-08 11:32         ` Christoffer Dall
  -1 siblings, 0 replies; 110+ messages in thread
From: Christoffer Dall @ 2015-01-08 11:32 UTC (permalink / raw)
  To: Mario Smarduch
  Cc: peter.maydell, james.hogan, kvm-ia64, kvm, steve.capper,
	marc.zyngier, catalin.marinas, Ard Biesheuvel, agraf, kvm-ppc,
	borntraeger, cornelia.huck, pbonzini, kvmarm, linux-arm-kernel

On Wed, Jan 07, 2015 at 07:01:10PM -0800, Mario Smarduch wrote:
> On 01/07/2015 05:05 AM, Christoffer Dall wrote:
> > On Sun, Dec 14, 2014 at 11:28:08PM -0800, Mario Smarduch wrote:
> >> This patch adds the same support for PUD huge page as for PMD. Huge PUD is 
> >> write protected for initial memory region write protection. Code to dissolve 
> >> huge PUD is supported in user_mem_abort(). At this time this code has not been 
> >> tested, but similar approach to current ARMv8 page logging test is in work,
> >> limiting kernel memory and mapping in 1 or 2GB into Guest address space on a 
> >> 4k page/48 bit host, some host kernel test code needs to be added to detect
> >> page fault to this region and side step general processing. Also similar to 
> >> PMD case all pages in range are marked dirty when PUD entry is cleared.
> > 
> > the note about this code being untested shouldn't be part of the commit
> > message but after the '---' separator or in the cover letter I think.
> 
> Ah ok.
> > 
> >>
> >> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
> >> ---
> >>  arch/arm/include/asm/kvm_mmu.h         |  8 +++++
> >>  arch/arm/kvm/mmu.c                     | 64 ++++++++++++++++++++++++++++++++--
> >>  arch/arm64/include/asm/kvm_mmu.h       |  9 +++++
> >>  arch/arm64/include/asm/pgtable-hwdef.h |  3 ++
> >>  4 files changed, 81 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> >> index dda0046..703d04d 100644
> >> --- a/arch/arm/include/asm/kvm_mmu.h
> >> +++ b/arch/arm/include/asm/kvm_mmu.h
> >> @@ -133,6 +133,14 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
> >>  	return (pmd_val(*pmd) & L_PMD_S2_RDWR) == L_PMD_S2_RDONLY;
> >>  }
> >>  
> >> +static inline void kvm_set_s2pud_readonly(pud_t *pud)
> >> +{
> >> +}
> >> +
> >> +static inline bool kvm_s2pud_readonly(pud_t *pud)
> >> +{
> >> +	return false;
> >> +}
> >>  
> >>  /* Open coded p*d_addr_end that can deal with 64bit addresses */
> >>  #define kvm_pgd_addr_end(addr, end)					\
> >> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> >> index 59003df..35840fb 100644
> >> --- a/arch/arm/kvm/mmu.c
> >> +++ b/arch/arm/kvm/mmu.c
> >> @@ -109,6 +109,55 @@ void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
> >>  	}
> >>  }
> >>  
> >> +/**
> >> +  * stage2_find_pud() - find a PUD entry
> >> +  * @kvm:	pointer to kvm structure.
> >> +  * @addr:	IPA address
> >> +  *
> >> +  * Return address of PUD entry or NULL if not allocated.
> >> +  */
> >> +static pud_t *stage2_find_pud(struct kvm *kvm, phys_addr_t addr)
> > 
> > why can't you reuse stage2_get_pud here?
> 
> stage2_get_* allocate intermediate tables, when they're called
> you know intermediate tables are needed to install a pmd or pte.
> But currently there is no way to tell we faulted in a PUD
> region, this code just checks if a PUD is set, and not
> allocate intermediate tables along the way.

hmmm, but if we get here it means that we are faulting on an address, so
we need to map something at that address regardless, so I don't see the
problem in using stage2_get_pud.
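
Roughly, reusing it would look like this (a sketch only; stage2_get_pud()'s
signature follows the 3.18-era mmu.c, and threading the fault's memory cache
through stage2_dissolve_pud() is an assumption):

static void stage2_dissolve_pud(struct kvm *kvm,
				struct kvm_mmu_memory_cache *cache,
				phys_addr_t addr)
{
	/* stage2_get_pud() allocates any missing intermediate table from 'cache' */
	pud_t *pud = stage2_get_pud(kvm, cache, addr);

	if (pud && kvm_pud_huge(*pud)) {
		pud_clear(pud);
		kvm_tlb_flush_vmid_ipa(kvm, addr);
		put_page(virt_to_page(pud));
		/* the mark_page_dirty() loop from the patch is elided here */
	}
}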

> 
> Overall not sure if this is in preparation for a new huge page (PUD sized)?
> Besides developing a custom test, not sure how to use this
> and determine we fault in a PUD region? Generic 'gup'
> code does handle PUDs but perhaps some arch. has PUD sized
> huge pages.
> 

When Marc and I discussed this we came to the conclusion that we wanted
code to support this code path for when huge PUDs were suddenly used,
but now when I see the code, I am realizing that adding huge PUD support
on the Stage-2 level requires a lot of changes to this file, so I really
don't think we need to handle it at this point after all.

> > 
> >> +{
> >> +	pgd_t *pgd;
> >> +
> >> +	pgd = kvm->arch.pgd + pgd_index(addr);
> >> +	if (pgd_none(*pgd))
> >> +		return NULL;
> >> +
> >> +	return pud_offset(pgd, addr);
> >> +}
> >> +
> >> +/**
> >> + * stage2_dissolve_pud() - clear and flush huge PUD entry
> >> + * @kvm:	pointer to kvm structure.
> >> + * @addr	IPA
> >> + *
> >> + * Function clears a PUD entry, flushes addr 1st and 2nd stage TLBs. Marks all
> >> + * pages in the range dirty.
> >> + */
> >> +void stage2_dissolve_pud(struct kvm *kvm, phys_addr_t addr)
> >> +{
> >> +	pud_t *pud;
> >> +	gfn_t gfn;
> >> +	long i;
> >> +
> >> +	pud = stage2_find_pud(kvm, addr);
> >> +	if (pud && !pud_none(*pud) && kvm_pud_huge(*pud)) {
> > 
> > I'm just thinking here, why do we need to check if we get a valid pud
> > back here, but we don't need the equivalent check in dissolve_pmd from
> > patch 7?
> 
> kvm_pud_huge() doesn't check bit 0 for invalid entry, but
> pud_none() is not the right way to check either, maybe pud_bad()
> first. Nothing is done in patch 7 since the pmd is retrieved from
> stage2_get_pmd().
> 

hmmm, but stage2_get_pmd() can return a NULL pointer if you have the
IOMAP flag set...

> > 
> > I think the rationale is that it should never happen because we never
> > call these functions with the logging and iomap flags at the same
> > time...
> 
> I'm a little lost here, not sure how it's related to the above.
> But I think a VFIO device will have a memslot and
> it would be possible to enable logging. But to what
> end I'm not sure.
> 

As I said above, if you call the set_s2pte function with the IOMAP and
LOGGING flags set, then you'll end up in a situation where you can get a
NULL pointer back from stage2_get_pmd() but you're never checking
against that.

Now, this raises an interesting point: we have now added code that
prevents faults from ever happening on device maps, but this patch introduces
a path where the user can set logging on a memslot with device memory
regions, which introduces write faults on such regions.  My gut feeling
is that we should prevent that from ever happening and not allow this
function to be called with both flags set; a rough sketch of such a guard
follows below.
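
A minimal sketch of that guard, assuming it sits at the top of
stage2_set_pte() and reuses the local flag variables from this patch:

	/*
	 * A device (IOMAP) region should never have dirty logging enabled,
	 * so refuse the combination outright rather than faulting on it.
	 */
	if (WARN_ON(iomap && logging_active))
		return -EINVAL;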

> > 
> >> +		pud_clear(pud);
> >> +		kvm_tlb_flush_vmid_ipa(kvm, addr);
> >> +		put_page(virt_to_page(pud));
> >> +#ifdef CONFIG_SMP
> >> +		gfn = (addr & PUD_MASK) >> PAGE_SHIFT;
> >> +		/*
> >> +		 * Mark all pages in PUD range dirty, in case other
> >> +		 * CPUs are  writing to it.
> >> +		 */
> >> +		for (i = 0; i < PTRS_PER_PUD * PTRS_PER_PMD; i++)
> >> +			mark_page_dirty(kvm, gfn + i);
> >> +#endif
> >> +	}
> >> +}
> >> +
> >>  static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
> >>  				  int min, int max)
> >>  {
> >> @@ -761,6 +810,13 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
> >>  	unsigned long iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
> >>  	unsigned long logging_active = flags & KVM_S2PTE_FLAG_LOGGING_ACTIVE;
> >>  
> >> +	/*
> >> +	 * While dirty page logging - dissolve huge PUD, then continue on to
> >> +	 * allocate page.
> >> +	 */
> >> +	if (logging_active)
> >> +		stage2_dissolve_pud(kvm, addr);
> >> +
> > 
> > I know I asked for this, but what's the purpose really when we never set
> > a huge stage-2 pud, shouldn't we just WARN/BUG if we encounter one?
> > 
> > Marc, you may have some thoughts here...
> 
> Not sure myself what's the vision for PUD support.
> 

with 4-level paging on aarch64, we use PUDs but we haven't added any
code to insert huge PUDs (only regular ones) on the stage-2 page tables,
even if the host kernel happens to suddenly support huge PUDs for the
stage-1 page tables, which is what I think we were trying to address.


So I really think we can drop this whole patch.  As I said, really sorry
about this one!

-Christoffer

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RESEND PATCH v15 07/11] KVM: arm: page logging 2nd stage fault handling
  2015-01-08 10:45         ` Christoffer Dall
@ 2015-01-08 16:28           ` Mario Smarduch
  -1 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2015-01-08 16:28 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: marc.zyngier, kvmarm, kvm, linux-arm-kernel, pbonzini, catalin.marinas

On 01/08/2015 02:45 AM, Christoffer Dall wrote:
> On Wed, Jan 07, 2015 at 05:43:18PM -0800, Mario Smarduch wrote:
>> Hi Christoffer,
>>   before going through your comments, I discovered that
>> in 3.18.0-rc2 - a generic __get_user_pages_fast()
>> was implemented, now ARM picks this up. This causes
>> gfn_to_pfn_prot() to return meaningful 'writable'
>> value for a read fault, provided the region is writable.
>>
>> Prior to that the weak version returned 0 and 'writable'
>> had no optimization effect to set pte/pmd - RW on
>> a read fault.
>>
>> As a consequence dirty logging broke in 3.18, I was seeing
Correction on this: the proper __get_user_pages_fast()
behavior exposed a bug in the page logging code.

>> weird but very intermittent issues. I just put in the
>> additional few lines to fix it, prevent pte RW (only R) on
>> read faults  while  logging writable region.
>>
>> On 01/07/2015 04:38 AM, Christoffer Dall wrote:
>>> On Wed, Dec 17, 2014 at 06:07:29PM -0800, Mario Smarduch wrote:
>>>> This patch is a followup to v15 patch series, with following changes:
>>>> - When clearing/dissolving a huge, PMD mark huge page range dirty, since
>>>>   the state of whole range is unknown. After the huge page is dissolved 
>>>>   dirty page logging is at page granularity.
>>>
>>> What is the sequence of events where you could have dirtied another page
>>> within the PMD range after the user initially requested dirty page
>>> logging?
>>
>> No there is none. My issue was the start point for tracking dirty pages
>> and that would be second call to dirty log read. Not first
>> call after initial write protect where any page in range can
>> be assumed dirty. I'll remove this, not sure if there would be any
>> use case to call dirty log only once.
>>
> 
> Calling dirty log once can not give you anything meaningful, right?  You
> must assume all memory is 'dirty' at this point, no?

There is the interval between enabling KVM_MEM_LOG_DIRTY_PAGES and the first
call to KVM_GET_DIRTY_LOG. I'm not sure of a concrete use case, but one could
enable logging, wait a while, do a single dirty log read, then disable logging
to get an accumulated snapshot of dirty page activity; a rough sketch of that
sequence is below.
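
A userspace sketch of that one-shot snapshot, assuming a memslot that was
registered via KVM_SET_USER_MEMORY_REGION and the standard KVM UAPI; the slot
id, bitmap allocation and error handling are omitted:

	struct kvm_dirty_log log = {
		.slot = slot_id,		/* hypothetical slot id */
		.dirty_bitmap = bitmap,		/* one bit per page in the slot */
	};

	/* 1. re-register the slot with KVM_MEM_LOG_DIRTY_PAGES set */
	/* 2. let the guest run for the interval of interest */
	sleep(interval);
	/* 3. single read: pages dirtied since logging was enabled */
	ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log);
	/* 4. re-register the slot without KVM_MEM_LOG_DIRTY_PAGES to stop logging */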

- Mario

> 
> -Christoffer
> 


^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [PATCH v15 10/11] KVM: arm/arm64: Enable Dirty Page logging for ARMv8
  2015-01-08 10:56         ` Christoffer Dall
@ 2015-01-08 16:30           ` Mario Smarduch
  -1 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2015-01-08 16:30 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: pbonzini, james.hogan, agraf, marc.zyngier, cornelia.huck,
	borntraeger, catalin.marinas, kvmarm, kvm, kvm-ppc, kvm-ia64,
	linux-arm-kernel, steve.capper, peter.maydell

On 01/08/2015 02:56 AM, Christoffer Dall wrote:
> On Wed, Jan 07, 2015 at 05:51:15PM -0800, Mario Smarduch wrote:
>> On 01/07/2015 04:47 AM, Christoffer Dall wrote:
>>> On Sun, Dec 14, 2014 at 11:28:07PM -0800, Mario Smarduch wrote:
>>>> This patch enables ARMv8 ditry page logging support. Plugs ARMv8 into generic
>>>
>>>                            dirty
>> yeah.
>>>
>>>> layer through Kconfig symbol, and drops earlier ARM64 constraints to enable
>>>> logging at architecture layer.
>>>>
>>>> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
>>>> ---
>>>>  arch/arm/include/asm/kvm_host.h | 12 ------------
>>>>  arch/arm/kvm/arm.c              |  4 ----
>>>>  arch/arm/kvm/mmu.c              | 19 +++++++++++--------
>>>>  arch/arm64/kvm/Kconfig          |  2 ++
>>>>  4 files changed, 13 insertions(+), 24 deletions(-)
>>>>
>>>> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
>>>> index b138431..088ea87 100644
>>>> --- a/arch/arm/include/asm/kvm_host.h
>>>> +++ b/arch/arm/include/asm/kvm_host.h
>>>> @@ -223,18 +223,6 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
>>>>  	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
>>>>  }
>>>>  
>>>> -/**
>>>> - * kvm_flush_remote_tlbs() - flush all VM TLB entries
>>>> - * @kvm:	pointer to kvm structure.
>>>> - *
>>>> - * Interface to HYP function to flush all VM TLB entries without address
>>>> - * parameter.
>>>> - */
>>>> -static inline void kvm_flush_remote_tlbs(struct kvm *kvm)
>>>> -{
>>>> -	kvm_call_hyp(__kvm_tlb_flush_vmid, kvm);
>>>> -}
>>>> -
>>>>  static inline int kvm_arch_dev_ioctl_check_extension(long ext)
>>>>  {
>>>>  	return 0;
>>>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>>>> index 6e4290c..1b6577c 100644
>>>> --- a/arch/arm/kvm/arm.c
>>>> +++ b/arch/arm/kvm/arm.c
>>>> @@ -740,7 +740,6 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>>>>   */
>>>>  int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>>>>  {
>>>> -#ifdef CONFIG_ARM
>>>>  	bool is_dirty = false;
>>>>  	int r;
>>>>  
>>>> @@ -753,9 +752,6 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>>>>  
>>>>  	mutex_unlock(&kvm->slots_lock);
>>>>  	return r;
>>>> -#else /* arm64 */
>>>> -	return -EINVAL;
>>>> -#endif
>>>>  }
>>>>  
>>>>  static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
>>>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>>>> index dc763bb..59003df 100644
>>>> --- a/arch/arm/kvm/mmu.c
>>>> +++ b/arch/arm/kvm/mmu.c
>>>> @@ -52,11 +52,18 @@ static phys_addr_t hyp_idmap_vector;
>>>>  
>>>>  static bool kvm_get_logging_state(struct kvm_memory_slot *memslot)
>>>>  {
>>>> -#ifdef CONFIG_ARM
>>>>  	return !!memslot->dirty_bitmap;
>>>> -#else
>>>> -	return false;
>>>> -#endif
>>>> +}
>>>> +
>>>> +/**
>>>> + * kvm_flush_remote_tlbs() - flush all VM TLB entries for v7/8
>>>> + * @kvm:	pointer to kvm structure.
>>>> + *
>>>> + * Interface to HYP function to flush all VM TLB entries
>>>> + */
>>>> +inline void kvm_flush_remote_tlbs(struct kvm *kvm)
>>>
>>> did you intend for a non-static inline here?
>>
>> Yes it's used in arm.c and mmu.c
> 
> then why inline?
> 
> I'm not a compiler expert by any measure, but poking around I'm pretty
> sure the inline keyword in this context is useless.  See for example:
> http://www.cs.nyu.edu/~xiaojian/bookmark/c_programming/Inline_Functions.htm
> 
> So I suggest either make it a normal stand-alone function or keep it as
> duplicate static inlines in the header files if you're adamant about
> this being inlined.

Sorry about that; I should have given this a closer look so it
wouldn't require another turnaround from you.

> 
> -Christoffer
> 

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [PATCH v15 11/11] KVM: arm/arm64: Add support to dissolve huge PUD
  2015-01-08 11:32         ` Christoffer Dall
@ 2015-01-08 16:41           ` Mario Smarduch
  -1 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2015-01-08 16:41 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: pbonzini, james.hogan, agraf, marc.zyngier, cornelia.huck,
	borntraeger, catalin.marinas, kvmarm, kvm, kvm-ppc, kvm-ia64,
	linux-arm-kernel, steve.capper, peter.maydell, Ard Biesheuvel

On 01/08/2015 03:32 AM, Christoffer Dall wrote:
> On Wed, Jan 07, 2015 at 07:01:10PM -0800, Mario Smarduch wrote:
>> On 01/07/2015 05:05 AM, Christoffer Dall wrote:
>>> On Sun, Dec 14, 2014 at 11:28:08PM -0800, Mario Smarduch wrote:
>>>> This patch adds the same support for PUD huge page as for PMD. Huge PUD is 
>>>> write protected for initial memory region write protection. Code to dissolve 
>>>> huge PUD is supported in user_mem_abort(). At this time this code has not been 
>>>> tested, but similar approach to current ARMv8 page logging test is in work,
>>>> limiting kernel memory and mapping in 1 or 2GB into Guest address space on a 
>>>> 4k page/48 bit host, some host kernel test code needs to be added to detect
>>>> page fault to this region and side step general processing. Also similar to 
>>>> PMD case all pages in range are marked dirty when PUD entry is cleared.
>>>
>>> the note about this code being untested shouldn't be part of the commit
>>> message but after the '---' separator or in the cover letter I think.
>>
>> Ah ok.
>>>
>>>>
>>>> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
>>>> ---
>>>>  arch/arm/include/asm/kvm_mmu.h         |  8 +++++
>>>>  arch/arm/kvm/mmu.c                     | 64 ++++++++++++++++++++++++++++++++--
>>>>  arch/arm64/include/asm/kvm_mmu.h       |  9 +++++
>>>>  arch/arm64/include/asm/pgtable-hwdef.h |  3 ++
>>>>  4 files changed, 81 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
>>>> index dda0046..703d04d 100644
>>>> --- a/arch/arm/include/asm/kvm_mmu.h
>>>> +++ b/arch/arm/include/asm/kvm_mmu.h
>>>> @@ -133,6 +133,14 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
>>>>  	return (pmd_val(*pmd) & L_PMD_S2_RDWR) == L_PMD_S2_RDONLY;
>>>>  }
>>>>  
>>>> +static inline void kvm_set_s2pud_readonly(pud_t *pud)
>>>> +{
>>>> +}
>>>> +
>>>> +static inline bool kvm_s2pud_readonly(pud_t *pud)
>>>> +{
>>>> +	return false;
>>>> +}
>>>>  
>>>>  /* Open coded p*d_addr_end that can deal with 64bit addresses */
>>>>  #define kvm_pgd_addr_end(addr, end)					\
>>>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>>>> index 59003df..35840fb 100644
>>>> --- a/arch/arm/kvm/mmu.c
>>>> +++ b/arch/arm/kvm/mmu.c
>>>> @@ -109,6 +109,55 @@ void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
>>>>  	}
>>>>  }
>>>>  
>>>> +/**
>>>> +  * stage2_find_pud() - find a PUD entry
>>>> +  * @kvm:	pointer to kvm structure.
>>>> +  * @addr:	IPA address
>>>> +  *
>>>> +  * Return address of PUD entry or NULL if not allocated.
>>>> +  */
>>>> +static pud_t *stage2_find_pud(struct kvm *kvm, phys_addr_t addr)
>>>
>>> why can't you reuse stage2_get_pud here?
>>
>> stage2_get_* allocate intermediate tables, when they're called
>> you know intermediate tables are needed to install a pmd or pte.
>> But currently there is no way to tell we faulted in a PUD
>> region, this code just checks if a PUD is set, and not
>> allocate intermediate tables along the way.
> 
> hmmm, but if we get here it means that we are faulting on an address, so
> we need to map something at that address regardless, so I don't see the
> problem in using stage2_get_pud.
> 
>>
>> Overall, not sure if this is in preparation for a new huge page (PUD sized)?
>> Besides developing a custom test, I'm not sure how to exercise this
>> and determine that we fault in a PUD region. Generic 'gup'
>> code does handle PUDs, but perhaps some arch has PUD sized
>> huge pages.
>>
> 
> When Marc and I discussed this we came to the conclusion that we wanted
> code to support this code path for when huge PUDs were suddenly used,
> but now that I see the code, I am realizing that adding huge PUD support
> at the Stage-2 level requires a lot of changes to this file, so I really
> don't think we need to handle it at this point after all.
> 
>>>
>>>> +{
>>>> +	pgd_t *pgd;
>>>> +
>>>> +	pgd = kvm->arch.pgd + pgd_index(addr);
>>>> +	if (pgd_none(*pgd))
>>>> +		return NULL;
>>>> +
>>>> +	return pud_offset(pgd, addr);
>>>> +}
>>>> +
>>>> +/**
>>>> + * stage2_dissolve_pud() - clear and flush huge PUD entry
>>>> + * @kvm:	pointer to kvm structure.
>>>> + * @addr	IPA
>>>> + *
>>>> + * Function clears a PUD entry, flushes addr 1st and 2nd stage TLBs. Marks all
>>>> + * pages in the range dirty.
>>>> + */
>>>> +void stage2_dissolve_pud(struct kvm *kvm, phys_addr_t addr)
>>>> +{
>>>> +	pud_t *pud;
>>>> +	gfn_t gfn;
>>>> +	long i;
>>>> +
>>>> +	pud = stage2_find_pud(kvm, addr);
>>>> +	if (pud && !pud_none(*pud) && kvm_pud_huge(*pud)) {
>>>
>>> I'm just thinking here, why do we need to check if we get a valid pud
>>> back here, but we don't need the equivalent check in dissolve_pmd from
>>> patch 7?
>>
>> kvm_pud_huge() doesn't check bit 0 for invalid entry, but
>> pud_none() is not the right way to check either, maybe pud_bad()
>> first. Nothing is done in patch 7 since the pmd is retrieved from
>> stage2_get_pmd().
>>
> 
> hmmm, but stage2_get_pmd() can return a NULL pointer if you have the
> IOMAP flag set...
> 
>>>
>>> I think the rationale is that it should never happen because we never
>>> call these functions with the logging and iomap flags at the same
>>> time...
>>
>> I'm a little lost here, not sure how it's related to the above.
>> But I think a VFIO device will have a memslot and
>> it would be possible to enable logging. But to what
>> end I'm not sure.
>>
> 
> As I said above, if you call the set_s2pte function with the IOMAP and
> LOGGING flags set, then you'll end up in a situation where you can get a
> NULL pointer back from stage2_get_pmd() but you're never checking
> against that.

I see what you're saying now.
> 
> Now, this raises an interesting point: we have now added code that
> prevents faults from ever happening on device maps, but we are introducing
> a path here where the user can set logging on a memslot with device memory
> regions, which introduces write faults on such regions.  My gut feeling
> is that we should prevent that from ever happening, and not allow this
> function to be called with both flags set.

Maybe kvm_arch_prepare_memory_region() can check if
KVM_MEM_LOG_DIRTY_PAGES is being enabled for an IO region
and disallow it.
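
Something along these lines, as a rough sketch only (the placement inside
the VM_PFNMAP handling of kvm_arch_prepare_memory_region() and the
surrounding variable names are assumptions, not the final patch):

	if (vma->vm_flags & VM_PFNMAP) {
		/*
		 * Sketch: refuse dirty page logging on memslots backed by
		 * device (VM_PFNMAP) mappings, so the IOMAP and LOGGING
		 * flags can never reach stage2_set_pte() together.
		 */
		if (memslot->flags & KVM_MEM_LOG_DIRTY_PAGES)
			return -EINVAL;
	}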

- Mario
> 
>>>
>>>> +		pud_clear(pud);
>>>> +		kvm_tlb_flush_vmid_ipa(kvm, addr);
>>>> +		put_page(virt_to_page(pud));
>>>> +#ifdef CONFIG_SMP
>>>> +		gfn = (addr & PUD_MASK) >> PAGE_SHIFT;
>>>> +		/*
>>>> +		 * Mark all pages in PUD range dirty, in case other
>>>> +		 * CPUs are  writing to it.
>>>> +		 */
>>>> +		for (i = 0; i < PTRS_PER_PUD * PTRS_PER_PMD; i++)
>>>> +			mark_page_dirty(kvm, gfn + i);
>>>> +#endif
>>>> +	}
>>>> +}
>>>> +
>>>>  static int mmu_topup_memory_cache(struct kvm_mmu_memory_cache *cache,
>>>>  				  int min, int max)
>>>>  {
>>>> @@ -761,6 +810,13 @@ static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
>>>>  	unsigned long iomap = flags & KVM_S2PTE_FLAG_IS_IOMAP;
>>>>  	unsigned long logging_active = flags & KVM_S2PTE_FLAG_LOGGING_ACTIVE;
>>>>  
>>>> +	/*
>>>> +	 * While dirty page logging - dissolve huge PUD, then continue on to
>>>> +	 * allocate page.
>>>> +	 */
>>>> +	if (logging_active)
>>>> +		stage2_dissolve_pud(kvm, addr);
>>>> +
>>>
>>> I know I asked for this, but what's the purpose really when we never set
>>> a huge stage-2 pud, shouldn't we just WARN/BUG if we encounter one?
>>>
>>> Marc, you may have some thoughts here...
>>
>> Not sure myself what's the vision for PUD support.
>>
> 
> with 4-level paging on aarch64, we use PUDs but we haven't added any
> code to insert huge PUDs (only regular ones) on the stage-2 page tables,
> even if the host kernel happens to suddenly support huge PUDs for the
> stage-1 page tables, which is what I think we were trying to address.
> 
> 
> So I really think we can drop this whole patch.  As I said, really sorry
> about this one!
> 
> -Christoffer
> 

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [PATCH v15 11/11] KVM: arm/arm64: Add support to dissolve huge PUD
  2015-01-08 11:32         ` Christoffer Dall
@ 2015-01-08 16:42           ` Mario Smarduch
  -1 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2015-01-08 16:42 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: pbonzini, james.hogan, agraf, marc.zyngier, cornelia.huck,
	borntraeger, catalin.marinas, kvmarm, kvm, kvm-ppc, kvm-ia64,
	linux-arm-kernel, steve.capper, peter.maydell, Ard Biesheuvel

On 01/08/2015 03:32 AM, Christoffer Dall wrote:
[...]
>> Not sure myself what's the vision for PUD support.
>>
> 
> with 4-level paging on aarch64, we use PUDs but we haven't added any
> code to insert huge PUDs (only regular ones) on the stage-2 page tables,
> even if the host kernel happens to suddenly support huge PUDs for the
> stage-1 page tables, which is what I think we were trying to address.
> 
> 
> So I really think we can drop this whole patch.  As I said, really sorry
> about this one!
No problem I'll drop this patch.
> 
> -Christoffer
> 


^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [PATCH v15 11/11] KVM: arm/arm64: Add support to dissolve huge PUD
  2015-01-08 16:41           ` Mario Smarduch
@ 2015-01-09 10:23             ` Christoffer Dall
  -1 siblings, 0 replies; 110+ messages in thread
From: Christoffer Dall @ 2015-01-09 10:23 UTC (permalink / raw)
  To: Mario Smarduch
  Cc: pbonzini, james.hogan, agraf, marc.zyngier, cornelia.huck,
	borntraeger, catalin.marinas, kvmarm, kvm, kvm-ppc, kvm-ia64,
	linux-arm-kernel, steve.capper, peter.maydell, Ard Biesheuvel

On Thu, Jan 08, 2015 at 08:41:15AM -0800, Mario Smarduch wrote:

[...]

> >>>
> >>> I'm just thinking here, why do we need to check if we get a valid pud
> >>> back here, but we don't need the equivalent check in dissolve_pmd from
> >>> patch 7?
> >>
> >> kvm_pud_huge() doesn't check bit 0 for invalid entry, but
> >> pud_none() is not the right way to check either, maybe pud_bad()
> >> first. Nothing is done in patch 7 since the pmd is retrieved from
> >> stage2_get_pmd().
> >>
> > 
> > hmmm, but stage2_get_pmd() can return a NULL pointer if you have the
> > IOMAP flag set...
> > 
> >>>
> >>> I think the rationale is that it should never happen because we never
> >>> call these functions with the logging and iomap flags at the same
> >>> time...
> >>
> >> I'm a little lost here, not sure how it's related to the above.
> >> But I think a VFIO device will have a memslot and
> >> it would be possible to enable logging. But to what
> >> end I'm not sure.
> >>
> > 
> > As I said above, if you call the set_s2pte function with the IOMAP and
> > LOGGING flags set, then you'll end up in a situation where you can get a
> > NULL pointer back from stage2_get_pmd() but you're never checking
> > against that.
> 
> I see what you're saying now.
> > 
> > Now, this raises an interesting point, we have now added code that
> > prevents faults from ever happening on device maps, but introducing a
> > path here where the user can set logging on a memslot with device memory
> > regions, which introduces write faults on such regions.  My gut feeling
> > is that we should avoid that from ever happening, and not allow this
> > function to be called with both flags set.
> 
> Maybe kvm_arch_prepare_memory_region() can check if
> KVM_MEM_LOG_DIRTY_PAGES is being enabled for an IO region
> and don't allow it.
> 

Yeah, I think we need to add a check for that somewhere as part of this
series (patch 7 perhaps?).

-Christoffer

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RESEND PATCH v15 07/11] KVM: arm: page logging 2nd stage fault handling
  2015-01-08 16:28           ` Mario Smarduch
@ 2015-01-09 10:24             ` Christoffer Dall
  -1 siblings, 0 replies; 110+ messages in thread
From: Christoffer Dall @ 2015-01-09 10:24 UTC (permalink / raw)
  To: Mario Smarduch
  Cc: kvm, marc.zyngier, catalin.marinas, pbonzini, kvmarm, linux-arm-kernel

On Thu, Jan 08, 2015 at 08:28:46AM -0800, Mario Smarduch wrote:
> On 01/08/2015 02:45 AM, Christoffer Dall wrote:
> > On Wed, Jan 07, 2015 at 05:43:18PM -0800, Mario Smarduch wrote:
> >> Hi Christoffer,
> >>   before going through your comments, I discovered that
> >> in 3.18.0-rc2 - a generic __get_user_pages_fast()
> >> was implemented, now ARM picks this up. This causes
> >> gfn_to_pfn_prot() to return meaningful 'writable'
> >> value for a read fault, provided the region is writable.
> >>
> >> Prior to that the weak version returned 0 and 'writable'
> >> had no optimization effect to set pte/pmd - RW on
> >> a read fault.
> >>
> >> As a consequence dirty logging broke in 3.18, I was seeing
> Correction on this, proper __get_user_pages_fast()
> behavior exposed a bug in page logging code.
> 
> >> weird but very intermittent issues. I just put in the
> >> additional few lines to fix it, prevent pte RW (only R) on
> >> read faults  while  logging writable region.
> >>
> >> On 01/07/2015 04:38 AM, Christoffer Dall wrote:
> >>> On Wed, Dec 17, 2014 at 06:07:29PM -0800, Mario Smarduch wrote:
> >>>> This patch is a followup to v15 patch series, with following changes:
> >>>>>> - When clearing/dissolving a huge PMD, mark the huge page range dirty, since
> >>>>>>   the state of the whole range is unknown. After the huge page is dissolved
> >>>>>>   dirty page logging is at page granularity.
> >>>
> >>> What is the sequence of events where you could have dirtied another page
> >>> within the PMD range after the user initially requested dirty page
> >>> logging?
> >>
> >> No there is none. My issue was the start point for tracking dirty pages
> >> and that would be second call to dirty log read. Not first
> >> call after initial write protect where any page in range can
> >> be assumed dirty. I'll remove this, not sure if there would be any
> >> use case to call dirty log only once.
> >>
> > 
> > Calling dirty log once can not give you anything meaningful, right?  You
> > must assume all memory is 'dirty' at this point, no?
> 
> There is the interval between KVM_MEM_LOG_DIRTY_PAGES and first
> call to KVM_GET_DIRTY_LOG. Not sure of any use case, maybe enable
> >> logging, wait a while, do a dirty log read, disable logging.
> Get an accumulated snapshot of dirty page activity.
> 
ok, so from the time the user calls KVM_MEM_LOG_DIRTY_PAGES, then any
fault on any huge page will dissolve that huge page into pages, and each
dirty page will be logged accordingly for the first call to
KVM_GET_DIRTY_LOG, right?  What am I missing here?
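
For reference, roughly what I understand the dissolve step to look like
(a sketch modelled on the stage2_dissolve_pud() hunk in the 11/11 thread,
not necessarily the exact shape of the patch 7 helper):

static void stage2_dissolve_pmd(struct kvm *kvm, phys_addr_t addr, pmd_t *pmd)
{
	/* Only huge PMDs need to be broken up. */
	if (!kvm_pmd_huge(*pmd))
		return;

	/*
	 * Clear the block mapping and flush its TLB entry; later faults
	 * in this range are then taken at page granularity and can be
	 * logged per page.
	 */
	pmd_clear(pmd);
	kvm_tlb_flush_vmid_ipa(kvm, addr);
	put_page(virt_to_page(pmd));
}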

-Christoffer

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [RESEND PATCH v15 07/11] KVM: arm: page logging 2nd stage fault handling
  2015-01-09 10:24             ` Christoffer Dall
@ 2015-01-10  4:38               ` Mario Smarduch
  -1 siblings, 0 replies; 110+ messages in thread
From: Mario Smarduch @ 2015-01-10  4:38 UTC (permalink / raw)
  To: Christoffer Dall
  Cc: marc.zyngier, kvmarm, kvm, linux-arm-kernel, pbonzini, catalin.marinas

On 01/09/2015 02:24 AM, Christoffer Dall wrote:
> On Thu, Jan 08, 2015 at 08:28:46AM -0800, Mario Smarduch wrote:
>> On 01/08/2015 02:45 AM, Christoffer Dall wrote:
>>> On Wed, Jan 07, 2015 at 05:43:18PM -0800, Mario Smarduch wrote:
>>>> Hi Christoffer,
>>>>   before going through your comments, I discovered that
>>>> in 3.18.0-rc2 - a generic __get_user_pages_fast()
>>>> was implemented, now ARM picks this up. This causes
>>>> gfn_to_pfn_prot() to return meaningful 'writable'
>>>> value for a read fault, provided the region is writable.
>>>>
>>>> Prior to that the weak version returned 0 and 'writable'
>>>> had no optimization effect to set pte/pmd - RW on
>>>> a read fault.
>>>>
>>>> As a consequence dirty logging broke in 3.18, I was seeing
>> Correction on this, proper __get_user_pages_fast()
>> behavior exposed a bug in page logging code.
>>
>>>> weird but very intermittent issues. I just put in the
>>>> additional few lines to fix it, prevent pte RW (only R) on
>>>> read faults  while  logging writable region.
>>>>
>>>> On 01/07/2015 04:38 AM, Christoffer Dall wrote:
>>>>> On Wed, Dec 17, 2014 at 06:07:29PM -0800, Mario Smarduch wrote:
>>>>>> This patch is a followup to v15 patch series, with following changes:
>>>>>> - When clearing/dissolving a huge PMD, mark the huge page range dirty, since
>>>>>>   the state of the whole range is unknown. After the huge page is dissolved
>>>>>>   dirty page logging is at page granularity.
>>>>>
>>>>> What is the sequence of events where you could have dirtied another page
>>>>> within the PMD range after the user initially requested dirty page
>>>>> logging?
>>>>
>>>> No there is none. My issue was the start point for tracking dirty pages
>>>> and that would be second call to dirty log read. Not first
>>>> call after initial write protect where any page in range can
>>>> be assumed dirty. I'll remove this, not sure if there would be any
>>>> use case to call dirty log only once.
>>>>
>>>
>>> Calling dirty log once can not give you anything meaningful, right?  You
>>> must assume all memory is 'dirty' at this point, no?
>>
>> There is the interval between KVM_MEM_LOG_DIRTY_PAGES and first
>> call to KVM_GET_DIRTY_LOG. Not sure of any use case, maybe enable
>> logging, wait a while, do a dirty log read, disable logging.
>> Get an accumulated snapshot of dirty page activity.
>>
> ok, so from the time the user calls KVM_MEM_LOG_DIRTY_PAGES, then any
> fault on any huge page will dissolve that huge page into pages, and each
> dirty page will be logged accordingly for the first call to
> KVM_GET_DIRTY_LOG, right?  What am I missing here?

Yes, that's correct; this may or may not be meaningful in itself.
The original point was about first-time access to a huge page (on the
first or some later call) and whether we consider the whole range dirty.
Keeping track at page granularity plus the original image provides
everything needed to reconstruct the source, so it should
not matter.

I think I convoluted this issue a bit.
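
As an aside, the dirty-log read-and-copy pass discussed above looks
roughly like the sketch below (userspace side; the slot number, file
descriptors, sizes and destination buffer are all assumptions for
illustration):

#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * One pre-copy pass: fetch the dirty bitmap for 'slot' with
 * KVM_GET_DIRTY_LOG (which also clears it) and copy only the pages it
 * marks from the guest memory mapping into the destination buffer.
 */
static int copy_dirty_pages(int vm_fd, int slot, const char *guest_mem,
			    char *dest, size_t mem_size, size_t page_size)
{
	size_t nr_pages = mem_size / page_size;
	uint64_t bitmap[(nr_pages + 63) / 64];
	struct kvm_dirty_log log = {
		.slot = slot,
		.dirty_bitmap = bitmap,
	};
	size_t i;

	memset(bitmap, 0, sizeof(bitmap));
	if (ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log) < 0)
		return -1;

	for (i = 0; i < nr_pages; i++)
		if (bitmap[i / 64] & (1ULL << (i % 64)))
			memcpy(dest + i * page_size,
			       guest_mem + i * page_size, page_size);
	return 0;
}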

- Mario
> 
> -Christoffer
> 


^ permalink raw reply	[flat|nested] 110+ messages in thread

end of thread, other threads:[~2015-01-10  4:38 UTC | newest]

Thread overview: 110+ messages
2014-12-15  7:27 [PATCH v15 00/11] KVM//x86/arm/arm64: dirty page logging for ARMv7/8 (3.18.0-rc2) Mario Smarduch
2014-12-15  7:27 ` Mario Smarduch
2014-12-15  7:27 ` Mario Smarduch
2014-12-15  7:27 ` Mario Smarduch
2014-12-15  7:27 ` [PATCH v15 01/11] KVM: Add architecture-defined TLB flush support Mario Smarduch
2014-12-15  7:27   ` Mario Smarduch
2014-12-15  7:27   ` Mario Smarduch
2014-12-15  7:27   ` Mario Smarduch
2014-12-15  7:27 ` [PATCH v15 02/11] KVM: Add generic support for dirty page logging Mario Smarduch
2014-12-15  7:27   ` Mario Smarduch
2014-12-15  7:27   ` Mario Smarduch
2014-12-15  7:27   ` Mario Smarduch
2014-12-15  7:28 ` [PATCH v15 03/11] KVM: x86: switch to kvm_get_dirty_log_protect Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2014-12-15  7:28 ` [PATCH v15 04/11] KVM: arm: Add ARMv7 API to flush TLBs Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2014-12-15  7:28 ` [PATCH v15 05/11] KVM: arm: Add initial dirty page locking support Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2015-01-07 13:05   ` Christoffer Dall
2015-01-07 13:05     ` Christoffer Dall
2015-01-07 13:05     ` Christoffer Dall
2015-01-07 13:05     ` Christoffer Dall
2014-12-15  7:28 ` [PATCH v15 06/11] KVM: arm: dirty logging write protect support Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2015-01-07 13:05   ` Christoffer Dall
2015-01-07 13:05     ` Christoffer Dall
2015-01-07 13:05     ` Christoffer Dall
2015-01-07 13:05     ` Christoffer Dall
2014-12-15  7:28 ` [PATCH v15 07/11] KVM: arm: page logging 2nd stage fault handling Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2014-12-15  7:28 ` [PATCH v15 08/11] KVM: arm64: ARMv8 header changes for page logging Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2014-12-15  7:28 ` [PATCH v15 09/11] KVM: arm64: Add HYP interface to flush VM Stage 1/2 TLB entires Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2014-12-15  7:28 ` [PATCH v15 10/11] KVM: arm/arm64: Enable Dirty Page logging for ARMv8 Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2015-01-07 12:47   ` Christoffer Dall
2015-01-07 12:47     ` Christoffer Dall
2015-01-07 12:47     ` Christoffer Dall
2015-01-07 12:47     ` Christoffer Dall
2015-01-08  1:51     ` Mario Smarduch
2015-01-08  1:51       ` Mario Smarduch
2015-01-08  1:51       ` Mario Smarduch
2015-01-08  1:51       ` Mario Smarduch
2015-01-08 10:56       ` Christoffer Dall
2015-01-08 10:56         ` Christoffer Dall
2015-01-08 10:56         ` Christoffer Dall
2015-01-08 10:56         ` Christoffer Dall
2015-01-08 16:30         ` Mario Smarduch
2015-01-08 16:30           ` Mario Smarduch
2015-01-08 16:30           ` Mario Smarduch
2015-01-08 16:30           ` Mario Smarduch
2014-12-15  7:28 ` [PATCH v15 11/11] KVM: arm/arm64: Add support to dissolve huge PUD Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2014-12-15  7:28   ` Mario Smarduch
2015-01-07 13:05   ` Christoffer Dall
2015-01-07 13:05     ` Christoffer Dall
2015-01-07 13:05     ` Christoffer Dall
2015-01-07 13:05     ` Christoffer Dall
2015-01-08  3:01     ` Mario Smarduch
2015-01-08  3:01       ` Mario Smarduch
2015-01-08  3:01       ` Mario Smarduch
2015-01-08  3:01       ` Mario Smarduch
2015-01-08 11:32       ` Christoffer Dall
2015-01-08 11:32         ` Christoffer Dall
2015-01-08 11:32         ` Christoffer Dall
2015-01-08 11:32         ` Christoffer Dall
2015-01-08 16:41         ` Mario Smarduch
2015-01-08 16:41           ` Mario Smarduch
2015-01-08 16:41           ` Mario Smarduch
2015-01-08 16:41           ` Mario Smarduch
2015-01-09 10:23           ` Christoffer Dall
2015-01-09 10:23             ` Christoffer Dall
2015-01-09 10:23             ` Christoffer Dall
2015-01-09 10:23             ` Christoffer Dall
2015-01-08 16:42         ` Mario Smarduch
2015-01-08 16:42           ` Mario Smarduch
2015-01-08 16:42           ` Mario Smarduch
2015-01-08 16:42           ` Mario Smarduch
2014-12-18  2:07 ` [RESEND PATCH v15 07/11] KVM: arm: page logging 2nd stage fault handling Mario Smarduch
2014-12-18  2:07   ` Mario Smarduch
2015-01-07 12:38   ` Christoffer Dall
2015-01-07 12:38     ` Christoffer Dall
2015-01-08  1:43     ` Mario Smarduch
2015-01-08  1:43       ` Mario Smarduch
2015-01-08 10:45       ` Christoffer Dall
2015-01-08 10:45         ` Christoffer Dall
2015-01-08 16:28         ` Mario Smarduch
2015-01-08 16:28           ` Mario Smarduch
2015-01-09 10:24           ` Christoffer Dall
2015-01-09 10:24             ` Christoffer Dall
2015-01-10  4:38             ` Mario Smarduch
2015-01-10  4:38               ` Mario Smarduch
