All of lore.kernel.org
* [RFC 00/23] arm64: Add support for 64KB page granularity in Xen guest
@ 2015-05-14 17:00 ` Julien Grall
  0 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:00 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, david.vrabel, konrad.wilk, boris.ostrovsky,
	wei.liu2, roger.pau

Hi all,

ARM64 Linux supports both 4KB and 64KB page granularity. However, the Xen
hypercall interface and PV protocol are always based on 4KB page granularity.

Any attempt to boot a Linux guest with 64KB pages enabled will result in a
guest crash.

This series is a first attempt to allow such a Linux kernel to run with the
current hypercall interface and PV protocol.

This approach has been chosen because we want to run Linux with 64KB pages on
released Xen ARM versions and/or platforms that use an old version of Linux
for DOM0.

There is room for improvement, such as support for 64KB grants or modifying
the PV protocol to support different page sizes. These will be explored in a
separate patch series later.

TODO list:
    - Networking does not work well in the guest when DOM0 uses 64KB pages
    - swiotlb not yet converted to 64KB pages

Note that the first 9 patches of the series are code cleanups.

A branch based on 4.1-rc3 can be found here:

git://xenbits.xen.org/people/julieng/linux-arm.git branch xen-64k-v0

Comments and suggestions are welcome.

Sincerely yours,

Cc: david.vrabel@citrix.com
Cc: konrad.wilk@oracle.com
Cc: boris.ostrovsky@oracle.com
Cc: wei.liu2@citrix.com
Cc: roger.pau@citrix.com

Julien Grall (23):
  xen: Include xen/page.h rather than asm/xen/page.h
  xen/xenbus: client: Fix call of virt_to_mfn in xenbus_grant_ring
  xen/grant-table: Remove unused macro SPP
  block/xen-blkfront: Remove unused macro MAXIMUM_OUTSTANDING_BLOCK_REQS
  block/xen-blkfront: Remove invalid comment
  block/xen-blkback: s/nr_pages/nr_segs/
  net/xen-netfront: Correct printf format in xennet_get_responses
  net/xen-netback: Remove unused code in xenvif_rx_action
  arm/xen: Drop duplicate define mfn_to_virt
  xen/biomerge: WORKAROUND always says the biovec are not mergeable
  xen: Add Xen specific page definition
  xen: Extend page_to_mfn to take an offset in the page
  xen/xenbus: Use Xen page definition
  tty/hvc: xen: Use xen page definition
  xen/balloon: Don't rely on the page granularity is the same for Xen
    and Linux
  xen/events: fifo: Make it running on 64KB granularity
  xen/grant-table: Make it running on 64KB granularity
  block/xen-blkfront: Make it running on 64KB page granularity
  block/xen-blkback: Make it running on 64KB page granularity
  net/xen-netfront: Make it running on 64KB page granularity
  net/xen-netback: Make it running on 64KB page granularity
  xen/privcmd: Add support for Linux 64KB page granularity
  arm/xen: Add support for 64KB page granularity

 arch/arm/include/asm/xen/page.h     |  13 +-
 arch/arm/xen/enlighten.c            |   6 +-
 arch/arm/xen/mm.c                   |   2 +-
 arch/arm/xen/p2m.c                  |   8 +-
 drivers/block/xen-blkback/blkback.c |  15 +-
 drivers/block/xen-blkback/common.h  |  18 ++-
 drivers/block/xen-blkback/xenbus.c  |   6 +-
 drivers/block/xen-blkfront.c        | 267 +++++++++++++++++++++---------------
 drivers/net/xen-netback/common.h    |   7 +-
 drivers/net/xen-netback/netback.c   |  34 ++---
 drivers/net/xen-netfront.c          |  46 ++++---
 drivers/tty/hvc/hvc_xen.c           |   6 +-
 drivers/xen/balloon.c               |  93 ++++++++-----
 drivers/xen/biomerge.c              |   3 +
 drivers/xen/events/events_base.c    |   2 +-
 drivers/xen/events/events_fifo.c    |   4 +-
 drivers/xen/gntdev.c                |   2 +-
 drivers/xen/grant-table.c           |   7 +-
 drivers/xen/manage.c                |   2 +-
 drivers/xen/privcmd.c               |   8 +-
 drivers/xen/tmem.c                  |   2 +-
 drivers/xen/xenbus/xenbus_client.c  |   8 +-
 drivers/xen/xenbus/xenbus_probe.c   |   4 +-
 drivers/xen/xlate_mmu.c             |  31 +++--
 include/xen/page.h                  |  21 ++-
 25 files changed, 366 insertions(+), 249 deletions(-)

-- 
2.1.4


^ permalink raw reply	[flat|nested] 200+ messages in thread


* [RFC 01/23] xen: Include xen/page.h rather than asm/xen/page.h
  2015-05-14 17:00 ` Julien Grall
@ 2015-05-14 17:00   ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:00 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Wei Liu, Konrad Rzeszutek Wilk,
	Boris Ostrovsky, David Vrabel, netdev

Using xen/page.h will be necessary later for using the common Xen page
helpers.

As xen/page.h already includes asm/xen/page.h, always use the former.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: netdev@vger.kernel.org
---
 arch/arm/xen/mm.c                  | 2 +-
 arch/arm/xen/p2m.c                 | 2 +-
 drivers/net/xen-netback/netback.c  | 2 +-
 drivers/net/xen-netfront.c         | 1 -
 drivers/xen/events/events_base.c   | 2 +-
 drivers/xen/events/events_fifo.c   | 2 +-
 drivers/xen/gntdev.c               | 2 +-
 drivers/xen/manage.c               | 2 +-
 drivers/xen/tmem.c                 | 2 +-
 drivers/xen/xenbus/xenbus_client.c | 2 +-
 10 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/arch/arm/xen/mm.c b/arch/arm/xen/mm.c
index 4983250..03e75fe 100644
--- a/arch/arm/xen/mm.c
+++ b/arch/arm/xen/mm.c
@@ -15,10 +15,10 @@
 #include <xen/xen.h>
 #include <xen/interface/grant_table.h>
 #include <xen/interface/memory.h>
+#include <xen/page.h>
 #include <xen/swiotlb-xen.h>
 
 #include <asm/cacheflush.h>
-#include <asm/xen/page.h>
 #include <asm/xen/hypercall.h>
 #include <asm/xen/interface.h>
 
diff --git a/arch/arm/xen/p2m.c b/arch/arm/xen/p2m.c
index cb7a14c..887596c 100644
--- a/arch/arm/xen/p2m.c
+++ b/arch/arm/xen/p2m.c
@@ -10,10 +10,10 @@
 
 #include <xen/xen.h>
 #include <xen/interface/memory.h>
+#include <xen/page.h>
 #include <xen/swiotlb-xen.h>
 
 #include <asm/cacheflush.h>
-#include <asm/xen/page.h>
 #include <asm/xen/hypercall.h>
 #include <asm/xen/interface.h>
 
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 4de46aa..9c6a504 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -44,9 +44,9 @@
 #include <xen/xen.h>
 #include <xen/events.h>
 #include <xen/interface/memory.h>
+#include <xen/page.h>
 
 #include <asm/xen/hypercall.h>
-#include <asm/xen/page.h>
 
 /* Provide an option to disable split event channels at load time as
  * event channels are limited resource. Split event channels are
diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 3f45afd..ff88f31 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -45,7 +45,6 @@
 #include <linux/slab.h>
 #include <net/ip.h>
 
-#include <asm/xen/page.h>
 #include <xen/xen.h>
 #include <xen/xenbus.h>
 #include <xen/events.h>
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 2b8553b..704d36e 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -39,8 +39,8 @@
 #include <asm/irq.h>
 #include <asm/idle.h>
 #include <asm/io_apic.h>
-#include <asm/xen/page.h>
 #include <asm/xen/pci.h>
+#include <xen/page.h>
 #endif
 #include <asm/sync_bitops.h>
 #include <asm/xen/hypercall.h>
diff --git a/drivers/xen/events/events_fifo.c b/drivers/xen/events/events_fifo.c
index 417415d..ed673e1 100644
--- a/drivers/xen/events/events_fifo.c
+++ b/drivers/xen/events/events_fifo.c
@@ -44,13 +44,13 @@
 #include <asm/sync_bitops.h>
 #include <asm/xen/hypercall.h>
 #include <asm/xen/hypervisor.h>
-#include <asm/xen/page.h>
 
 #include <xen/xen.h>
 #include <xen/xen-ops.h>
 #include <xen/events.h>
 #include <xen/interface/xen.h>
 #include <xen/interface/event_channel.h>
+#include <xen/page.h>
 
 #include "events_internal.h"
 
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 8927485..67b9163 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -41,9 +41,9 @@
 #include <xen/balloon.h>
 #include <xen/gntdev.h>
 #include <xen/events.h>
+#include <xen/page.h>
 #include <asm/xen/hypervisor.h>
 #include <asm/xen/hypercall.h>
-#include <asm/xen/page.h>
 
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Derek G. Murray <Derek.Murray@cl.cam.ac.uk>, "
diff --git a/drivers/xen/manage.c b/drivers/xen/manage.c
index 9e6a851..d10effe 100644
--- a/drivers/xen/manage.c
+++ b/drivers/xen/manage.c
@@ -19,10 +19,10 @@
 #include <xen/grant_table.h>
 #include <xen/events.h>
 #include <xen/hvc-console.h>
+#include <xen/page.h>
 #include <xen/xen-ops.h>
 
 #include <asm/xen/hypercall.h>
-#include <asm/xen/page.h>
 #include <asm/xen/hypervisor.h>
 
 enum shutdown_state {
diff --git a/drivers/xen/tmem.c b/drivers/xen/tmem.c
index c4211a3..3718b4a 100644
--- a/drivers/xen/tmem.c
+++ b/drivers/xen/tmem.c
@@ -17,8 +17,8 @@
 
 #include <xen/xen.h>
 #include <xen/interface/xen.h>
+#include <xen/page.h>
 #include <asm/xen/hypercall.h>
-#include <asm/xen/page.h>
 #include <asm/xen/hypervisor.h>
 #include <xen/tmem.h>
 
diff --git a/drivers/xen/xenbus/xenbus_client.c b/drivers/xen/xenbus/xenbus_client.c
index 96b2011..a014016 100644
--- a/drivers/xen/xenbus/xenbus_client.c
+++ b/drivers/xen/xenbus/xenbus_client.c
@@ -37,7 +37,7 @@
 #include <linux/vmalloc.h>
 #include <linux/export.h>
 #include <asm/xen/hypervisor.h>
-#include <asm/xen/page.h>
+#include <xen/page.h>
 #include <xen/interface/xen.h>
 #include <xen/interface/event_channel.h>
 #include <xen/balloon.h>
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 200+ messages in thread


* [RFC 02/23] xen/xenbus: client: Fix call of virt_to_mfn in xenbus_grant_ring
  2015-05-14 17:00 ` Julien Grall
@ 2015-05-14 17:00   ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:00 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Wei Liu, Konrad Rzeszutek Wilk,
	Boris Ostrovsky, David Vrabel

virt_to_mfn should take a void * rather than an unsigned long. While it
doesn't really matter now, it would throw a compiler warning later when
virt_to_mfn enforces the type.

At the same time, avoid computing a new virtual address on every loop
iteration and instead increment the parameter directly, as it is not used
afterwards.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
---
 drivers/xen/xenbus/xenbus_client.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/xen/xenbus/xenbus_client.c b/drivers/xen/xenbus/xenbus_client.c
index a014016..d204562 100644
--- a/drivers/xen/xenbus/xenbus_client.c
+++ b/drivers/xen/xenbus/xenbus_client.c
@@ -379,16 +379,16 @@ int xenbus_grant_ring(struct xenbus_device *dev, void *vaddr,
 	int i, j;
 
 	for (i = 0; i < nr_pages; i++) {
-		unsigned long addr = (unsigned long)vaddr +
-			(PAGE_SIZE * i);
 		err = gnttab_grant_foreign_access(dev->otherend_id,
-						  virt_to_mfn(addr), 0);
+						  virt_to_mfn(vaddr), 0);
 		if (err < 0) {
 			xenbus_dev_fatal(dev, err,
 					 "granting access to ring page");
 			goto fail;
 		}
 		grefs[i] = err;
+
+		vaddr = (char *)vaddr + PAGE_SIZE;
 	}
 
 	return 0;
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 200+ messages in thread


* [RFC 03/23] xen/grant-table: Remove unused macro SPP
  2015-05-14 17:00 ` Julien Grall
  (?)
@ 2015-05-14 17:00   ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:00 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Konrad Rzeszutek Wilk, Boris Ostrovsky,
	David Vrabel

SPP was used by the grant table v2 code, which was removed in
commit 438b33c7145ca8a5131a30c36d8f59bce119a19a ("xen/grant-table:
remove support for V2 tables").

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
---
 drivers/xen/grant-table.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index b1c7170..62f591f 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -138,7 +138,6 @@ static struct gnttab_free_callback *gnttab_free_callback_list;
 static int gnttab_expand(unsigned int req_entries);
 
 #define RPP (PAGE_SIZE / sizeof(grant_ref_t))
-#define SPP (PAGE_SIZE / sizeof(grant_status_t))
 
 static inline grant_ref_t *__gnttab_entry(grant_ref_t entry)
 {
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [RFC 04/23] block/xen-blkfront: Remove unused macro MAXIMUM_OUTSTANDING_BLOCK_REQS
  2015-05-14 17:00 ` Julien Grall
@ 2015-05-14 17:00   ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:00 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Julien Grall, Konrad Rzeszutek Wilk,
	Roger Pau Monné,
	Boris Ostrovsky, David Vrabel

From: Julien Grall <julien.grall@linaro.org>

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
---
 drivers/block/xen-blkfront.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 2c61cf8..5c72c25 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -139,8 +139,6 @@ static unsigned int nr_minors;
 static unsigned long *minors;
 static DEFINE_SPINLOCK(minor_lock);
 
-#define MAXIMUM_OUTSTANDING_BLOCK_REQS \
-	(BLKIF_MAX_SEGMENTS_PER_REQUEST * BLK_RING_SIZE)
 #define GRANT_INVALID_REF	0
 
 #define PARTS_PER_DISK		16
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [RFC 05/23] block/xen-blkfront: Remove invalid comment
  2015-05-14 17:00 ` Julien Grall
  (?)
@ 2015-05-14 17:00   ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:00 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Julien Grall, Konrad Rzeszutek Wilk,
	Boris Ostrovsky, David Vrabel, Roger Pau Monné

From: Julien Grall <julien.grall@linaro.org>

Since commit b764915 "xen-blkfront: use a different scatterlist for each
request", the biovec has been replaced by a scatterlist when copying back
the data during request completion.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Roger Pau Monné <roger.pau@citrix.com>
---
 drivers/block/xen-blkfront.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 5c72c25..60cf1d6 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -1056,12 +1056,6 @@ static void blkif_completion(struct blk_shadow *s, struct blkfront_info *info,
 		s->req.u.indirect.nr_segments : s->req.u.rw.nr_segments;
 
 	if (bret->operation == BLKIF_OP_READ && info->feature_persistent) {
-		/*
-		 * Copy the data received from the backend into the bvec.
-		 * Since bv_offset can be different than 0, and bv_len different
-		 * than PAGE_SIZE, we have to keep track of the current offset,
-		 * to be sure we are copying the data from the right shared page.
-		 */
 		for_each_sg(s->sg, sg, nseg, i) {
 			BUG_ON(sg->offset + sg->length > PAGE_SIZE);
 			shared_data = kmap_atomic(
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [RFC 06/23] block/xen-blkback: s/nr_pages/nr_segs/
  2015-05-14 17:00 ` Julien Grall
  (?)
@ 2015-05-14 17:00   ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:00 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Julien Grall, Konrad Rzeszutek Wilk,
	Roger Pau Monné

From: Julien Grall <julien.grall@linaro.org>

Make the code less confusing to read now that Linux may not have the
same page size as Xen.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Roger Pau Monné <roger.pau@citrix.com>
---
 drivers/block/xen-blkback/blkback.c | 10 +++++-----
 drivers/block/xen-blkback/common.h  |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index 713fc9f..7049528 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -729,7 +729,7 @@ static void xen_blkbk_unmap_and_respond(struct pending_req *req)
 	struct grant_page **pages = req->segments;
 	unsigned int invcount;
 
-	invcount = xen_blkbk_unmap_prepare(blkif, pages, req->nr_pages,
+	invcount = xen_blkbk_unmap_prepare(blkif, pages, req->nr_segs,
 					   req->unmap, req->unmap_pages);
 
 	work->data = req;
@@ -915,7 +915,7 @@ static int xen_blkbk_map_seg(struct pending_req *pending_req)
 	int rc;
 
 	rc = xen_blkbk_map(pending_req->blkif, pending_req->segments,
-			   pending_req->nr_pages,
+			   pending_req->nr_segs,
 	                   (pending_req->operation != BLKIF_OP_READ));
 
 	return rc;
@@ -931,7 +931,7 @@ static int xen_blkbk_parse_indirect(struct blkif_request *req,
 	int indirect_grefs, rc, n, nseg, i;
 	struct blkif_request_segment *segments = NULL;
 
-	nseg = pending_req->nr_pages;
+	nseg = pending_req->nr_segs;
 	indirect_grefs = INDIRECT_PAGES(nseg);
 	BUG_ON(indirect_grefs > BLKIF_MAX_INDIRECT_PAGES_PER_REQUEST);
 
@@ -1251,7 +1251,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
 	pending_req->id        = req->u.rw.id;
 	pending_req->operation = req_operation;
 	pending_req->status    = BLKIF_RSP_OKAY;
-	pending_req->nr_pages  = nseg;
+	pending_req->nr_segs   = nseg;
 
 	if (req->operation != BLKIF_OP_INDIRECT) {
 		preq.dev               = req->u.rw.handle;
@@ -1372,7 +1372,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
 
  fail_flush:
 	xen_blkbk_unmap(blkif, pending_req->segments,
-	                pending_req->nr_pages);
+	                pending_req->nr_segs);
  fail_response:
 	/* Haven't submitted any bio's yet. */
 	make_response(blkif, req->u.rw.id, req_operation, BLKIF_RSP_ERROR);
diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h
index f620b5d..7a03e07 100644
--- a/drivers/block/xen-blkback/common.h
+++ b/drivers/block/xen-blkback/common.h
@@ -343,7 +343,7 @@ struct grant_page {
 struct pending_req {
 	struct xen_blkif	*blkif;
 	u64			id;
-	int			nr_pages;
+	int			nr_segs;
 	atomic_t		pendcnt;
 	unsigned short		operation;
 	int			status;
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [RFC 07/23] net/xen-netfront: Correct printf format in xennet_get_responses
  2015-05-14 17:00 ` Julien Grall
  (?)
  (?)
@ 2015-05-14 17:00   ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:00 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Konrad Rzeszutek Wilk, Boris Ostrovsky,
	David Vrabel, netdev

rx->status is an int16_t; print it using %d rather than %u so that the
logged value remains meaningful when the field is negative.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: netdev@vger.kernel.org
---
 drivers/net/xen-netfront.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index ff88f31..381d38f 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -732,7 +732,7 @@ static int xennet_get_responses(struct netfront_queue *queue,
 		if (unlikely(rx->status < 0 ||
 			     rx->offset + rx->status > PAGE_SIZE)) {
 			if (net_ratelimit())
-				dev_warn(dev, "rx->offset: %x, size: %u\n",
+				dev_warn(dev, "rx->offset: %x, size: %d\n",
 					 rx->offset, rx->status);
 			xennet_move_rx_slot(queue, skb, ref);
 			err = -EINVAL;
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [RFC 08/23] net/xen-netback: Remove unused code in xenvif_rx_action
  2015-05-14 17:00 ` Julien Grall
@ 2015-05-14 17:00   ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:00 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Wei Liu, netdev

The variables old_req_cons and ring_slots_used are assigned but never
used since commit 1650d5455bd2dc6b5ee134bd6fc1a3236c266b5b "xen-netback:
always fully coalesce guest Rx packets".

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: netdev@vger.kernel.org
---
 drivers/net/xen-netback/netback.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 9c6a504..9ae1d43 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -515,14 +515,9 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
 
 	while (xenvif_rx_ring_slots_available(queue, XEN_NETBK_RX_SLOTS_MAX)
 	       && (skb = xenvif_rx_dequeue(queue)) != NULL) {
-		RING_IDX old_req_cons;
-		RING_IDX ring_slots_used;
-
 		queue->last_rx_time = jiffies;
 
-		old_req_cons = queue->rx.req_cons;
 		XENVIF_RX_CB(skb)->meta_slots_used = xenvif_gop_skb(skb, &npo, queue);
-		ring_slots_used = queue->rx.req_cons - old_req_cons;
 
 		__skb_queue_tail(&rxq, skb);
 	}
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [RFC 09/23] arm/xen: Drop duplicate define mfn_to_virt
  2015-05-14 17:00 ` Julien Grall
@ 2015-05-14 17:00   ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:00 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Julien Grall

From: Julien Grall <julien.grall@linaro.org>

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
 arch/arm/include/asm/xen/page.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
index 0b579b2..1bee8ca 100644
--- a/arch/arm/include/asm/xen/page.h
+++ b/arch/arm/include/asm/xen/page.h
@@ -12,7 +12,6 @@
 #include <xen/interface/grant_table.h>
 
 #define phys_to_machine_mapping_valid(pfn) (1)
-#define mfn_to_virt(m)			(__va(mfn_to_pfn(m) << PAGE_SHIFT))
 
 #define pte_mfn	    pte_pfn
 #define mfn_pte	    pfn_pte
-- 
2.1.4




* [RFC 10/23] xen/biomerge: WORKAROUND always says the biovec are not mergeable
  2015-05-14 17:00 ` Julien Grall
@ 2015-05-14 17:00   ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:00 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Konrad Rzeszutek Wilk, Boris Ostrovsky,
	David Vrabel

When Linux is using 64K page granularity, every page will be split into
multiple non-contiguous 4K MFNs.

I'm not sure how to efficiently handle the check to know whether two
biovecs can be merged in such a case. So for now, always say that
biovecs are not mergeable.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
---
 drivers/xen/biomerge.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/xen/biomerge.c b/drivers/xen/biomerge.c
index 0edb91c..20387c2 100644
--- a/drivers/xen/biomerge.c
+++ b/drivers/xen/biomerge.c
@@ -9,6 +9,9 @@ bool xen_biovec_phys_mergeable(const struct bio_vec *vec1,
 	unsigned long mfn1 = pfn_to_mfn(page_to_pfn(vec1->bv_page));
 	unsigned long mfn2 = pfn_to_mfn(page_to_pfn(vec2->bv_page));
 
+	/* TODO: Implement it correctly */
+	return 0;
+
 	return __BIOVEC_PHYS_MERGEABLE(vec1, vec2) &&
 		((mfn1 == mfn2) || ((mfn1+1) == mfn2));
 }
-- 
2.1.4




* [RFC 11/23] xen: Add Xen specific page definition
  2015-05-14 17:00 ` Julien Grall
@ 2015-05-14 17:00   ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:00 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Konrad Rzeszutek Wilk, Boris Ostrovsky,
	David Vrabel

The Xen hypercall interface always uses 4K page granularity on both the
ARM and x86 architectures.

With the incoming support of 64K page granularity for ARM64 guests, it
won't be possible to re-use the Linux page definitions in Xen drivers.

Introduce Xen page definition helpers based on the Linux page
definitions. They have exactly the same names, but with a XEN_/xen_
prefix.

Also modify page_to_mfn to use the new Xen page definitions.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
---
 include/xen/page.h | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/include/xen/page.h b/include/xen/page.h
index c5ed20b..89ae01c 100644
--- a/include/xen/page.h
+++ b/include/xen/page.h
@@ -1,11 +1,28 @@
 #ifndef _XEN_PAGE_H
 #define _XEN_PAGE_H
 
+#include <asm/page.h>
+
+/* The hypercall interface supports only 4KB page */
+#define XEN_PAGE_SHIFT	12
+#define XEN_PAGE_SIZE	(_AC(1,UL) << XEN_PAGE_SHIFT)
+#define XEN_PAGE_MASK	(~(XEN_PAGE_SIZE-1))
+#define xen_offset_in_page(p)	((unsigned long)(p) & ~XEN_PAGE_MASK)
+#define xen_pfn_to_page(pfn)	\
+	((pfn_to_page(((unsigned long)(pfn) << XEN_PAGE_SHIFT) >> PAGE_SHIFT)))
+#define xen_page_to_pfn(page)	\
+	(((page_to_pfn(page)) << PAGE_SHIFT) >> XEN_PAGE_SHIFT)
+
+#define XEN_PFN_PER_PAGE	(PAGE_SIZE / XEN_PAGE_SIZE)
+
+#define XEN_PFN_DOWN(x)	((x) >> XEN_PAGE_SHIFT)
+#define XEN_PFN_PHYS(x)	((phys_addr_t)(x) << XEN_PAGE_SHIFT)
+
 #include <asm/xen/page.h>
 
 static inline unsigned long page_to_mfn(struct page *page)
 {
-	return pfn_to_mfn(page_to_pfn(page));
+	return pfn_to_mfn(xen_page_to_pfn(page));
 }
 
 struct xen_memory_region {
-- 
2.1.4




* [RFC 12/23] xen: Extend page_to_mfn to take an offset in the page
  2015-05-14 17:00 ` Julien Grall
@ 2015-05-14 17:00   ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:00 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Konrad Rzeszutek Wilk, Boris Ostrovsky,
	David Vrabel, netdev

With 64KB page granularity support in Linux, a page will be split across
multiple MFNs (Xen is using 4KB page granularity). Those MFNs may not be
contiguous.

With the offset in the page, the helper will be able to know which MFN
the driver needs to retrieve.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: netdev@vger.kernel.org
---
 drivers/net/xen-netfront.c | 2 +-
 include/xen/page.h         | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 381d38f..6a0e329 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -431,7 +431,7 @@ static struct xen_netif_tx_request *xennet_make_one_txreq(
 	BUG_ON((signed short)ref < 0);
 
 	gnttab_grant_foreign_access_ref(ref, queue->info->xbdev->otherend_id,
-					page_to_mfn(page), GNTMAP_readonly);
+					page_to_mfn(page, 0), GNTMAP_readonly);
 
 	queue->tx_skbs[id].skb = skb;
 	queue->grant_tx_page[id] = page;
diff --git a/include/xen/page.h b/include/xen/page.h
index 89ae01c..8848da1 100644
--- a/include/xen/page.h
+++ b/include/xen/page.h
@@ -20,9 +20,9 @@
 
 #include <asm/xen/page.h>
 
-static inline unsigned long page_to_mfn(struct page *page)
+static inline unsigned long page_to_mfn(struct page *page, unsigned int offset)
 {
-	return pfn_to_mfn(xen_page_to_pfn(page));
+	return pfn_to_mfn(xen_page_to_pfn(page) + (offset >> XEN_PAGE_SHIFT));
 }
 
 struct xen_memory_region {
-- 
2.1.4




* [RFC 13/23] xen/xenbus: Use Xen page definition
  2015-05-14 17:00 ` Julien Grall
@ 2015-05-14 17:00   ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:00 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Konrad Rzeszutek Wilk, Boris Ostrovsky,
	David Vrabel

The xenstore ring is always based on the page granularity of Xen.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
---
 drivers/xen/xenbus/xenbus_probe.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/xenbus/xenbus_probe.c b/drivers/xen/xenbus/xenbus_probe.c
index 5390a67..f99933d9 100644
--- a/drivers/xen/xenbus/xenbus_probe.c
+++ b/drivers/xen/xenbus/xenbus_probe.c
@@ -713,7 +713,7 @@ static int __init xenstored_local_init(void)
 
 	xen_store_mfn = xen_start_info->store_mfn =
 		pfn_to_mfn(virt_to_phys((void *)page) >>
-			   PAGE_SHIFT);
+			   XEN_PAGE_SHIFT);
 
 	/* Next allocate a local port which xenstored can bind to */
 	alloc_unbound.dom        = DOMID_SELF;
@@ -804,7 +804,7 @@ static int __init xenbus_init(void)
 			goto out_error;
 		xen_store_mfn = (unsigned long)v;
 		xen_store_interface =
-			xen_remap(xen_store_mfn << PAGE_SHIFT, PAGE_SIZE);
+			xen_remap(xen_store_mfn << XEN_PAGE_SHIFT, XEN_PAGE_SIZE);
 		break;
 	default:
 		pr_warn("Xenstore state unknown\n");
-- 
2.1.4




* [RFC 14/23] tty/hvc: xen: Use xen page definition
  2015-05-14 17:00 ` Julien Grall
@ 2015-05-14 17:00   ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:00 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Greg Kroah-Hartman, Jiri Slaby, David Vrabel,
	Boris Ostrovsky, linuxppc-dev

The console ring is always based on the page granularity of Xen.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: linuxppc-dev@lists.ozlabs.org
---
 drivers/tty/hvc/hvc_xen.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/tty/hvc/hvc_xen.c b/drivers/tty/hvc/hvc_xen.c
index 5bab1c6..a68d115 100644
--- a/drivers/tty/hvc/hvc_xen.c
+++ b/drivers/tty/hvc/hvc_xen.c
@@ -230,7 +230,7 @@ static int xen_hvm_console_init(void)
 	if (r < 0 || v == 0)
 		goto err;
 	mfn = v;
-	info->intf = xen_remap(mfn << PAGE_SHIFT, PAGE_SIZE);
+	info->intf = xen_remap(mfn << XEN_PAGE_SHIFT, XEN_PAGE_SIZE);
 	if (info->intf == NULL)
 		goto err;
 	info->vtermno = HVC_COOKIE;
@@ -392,7 +392,7 @@ static int xencons_connect_backend(struct xenbus_device *dev,
 	if (xen_pv_domain())
 		mfn = virt_to_mfn(info->intf);
 	else
-		mfn = __pa(info->intf) >> PAGE_SHIFT;
+		mfn = __pa(info->intf) >> XEN_PAGE_SHIFT;
 	ret = gnttab_alloc_grant_references(1, &gref_head);
 	if (ret < 0)
 		return ret;
@@ -476,7 +476,7 @@ static int xencons_resume(struct xenbus_device *dev)
 	struct xencons_info *info = dev_get_drvdata(&dev->dev);
 
 	xencons_disconnect_backend(info);
-	memset(info->intf, 0, PAGE_SIZE);
+	memset(info->intf, 0, XEN_PAGE_SIZE);
 	return xencons_connect_backend(dev, info);
 }
 
-- 
2.1.4




* [RFC 15/23] xen/balloon: Don't rely on the page granularity being the same for Xen and Linux
  2015-05-14 17:00 ` Julien Grall
@ 2015-05-14 17:00   ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:00 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Konrad Rzeszutek Wilk, Boris Ostrovsky,
	David Vrabel, Wei Liu

For ARM64 guests, Linux is able to support either 64KB or 4KB page
granularity. However, the hypercall interface is always based on 4KB
page granularity.

With 64KB page granularity, a single Linux page will be spread over
multiple Xen frames.

When a driver requests/frees a balloon page, the balloon driver will
have to split the Linux page into 4KB chunks before asking Xen to
add/remove the frames from the guest.

Note that this can work with any page granularity, assuming it's a
multiple of 4KB.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>

---

TODO/LIMITATIONS:
    - When CONFIG_XEN_HAVE_PVMMU is set, only 4K page granularity is
    supported
    - It may be possible to extend the concept to ballooning 2M/1G
    pages.
---
 drivers/xen/balloon.c | 93 +++++++++++++++++++++++++++++++++------------------
 1 file changed, 60 insertions(+), 33 deletions(-)

diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index fd93369..f0d8666 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -91,7 +91,7 @@ struct balloon_stats balloon_stats;
 EXPORT_SYMBOL_GPL(balloon_stats);
 
 /* We increase/decrease in batches which fit in a page */
-static xen_pfn_t frame_list[PAGE_SIZE / sizeof(unsigned long)];
+static xen_pfn_t frame_list[XEN_PAGE_SIZE / sizeof(unsigned long)];
 
 
 /* List of ballooned pages, threaded through the mem_map array. */
@@ -326,7 +326,7 @@ static enum bp_state reserve_additional_memory(long credit)
 static enum bp_state increase_reservation(unsigned long nr_pages)
 {
 	int rc;
-	unsigned long  pfn, i;
+	unsigned long  pfn, i, nr_frames;
 	struct page   *page;
 	struct xen_memory_reservation reservation = {
 		.address_bits = 0,
@@ -343,30 +343,43 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
 	}
 #endif
 
-	if (nr_pages > ARRAY_SIZE(frame_list))
-		nr_pages = ARRAY_SIZE(frame_list);
+	if (nr_pages > (ARRAY_SIZE(frame_list) / XEN_PFN_PER_PAGE))
+		nr_pages = ARRAY_SIZE(frame_list) / XEN_PFN_PER_PAGE;
+
+	nr_frames = nr_pages * XEN_PFN_PER_PAGE;
+
+	pfn = 0; /* make gcc happy */
 
 	page = list_first_entry_or_null(&ballooned_pages, struct page, lru);
-	for (i = 0; i < nr_pages; i++) {
-		if (!page) {
-			nr_pages = i;
-			break;
+	for (i = 0; i < nr_frames; i++) {
+		if (!(i % XEN_PFN_PER_PAGE)) {
+			if (!page) {
+				nr_frames = i;
+				break;
+			}
+			pfn = xen_page_to_pfn(page);
+			page = balloon_next_page(page);
 		}
-		frame_list[i] = page_to_pfn(page);
-		page = balloon_next_page(page);
+		frame_list[i] = pfn++;
 	}
 
 	set_xen_guest_handle(reservation.extent_start, frame_list);
-	reservation.nr_extents = nr_pages;
+	reservation.nr_extents = nr_frames;
 	rc = HYPERVISOR_memory_op(XENMEM_populate_physmap, &reservation);
 	if (rc <= 0)
 		return BP_EAGAIN;
 
 	for (i = 0; i < rc; i++) {
-		page = balloon_retrieve(false);
-		BUG_ON(page == NULL);
 
-		pfn = page_to_pfn(page);
+		/* TODO: Make this code cleaner to make CONFIG_XEN_HAVE_PVMMU
+		 * work with 64K pages
+		 */
+		if (!(i % XEN_PFN_PER_PAGE)) {
+			page = balloon_retrieve(false);
+			BUG_ON(page == NULL);
+
+			pfn = page_to_pfn(page);
+		}
 
 #ifdef CONFIG_XEN_HAVE_PVMMU
 		if (!xen_feature(XENFEAT_auto_translated_physmap)) {
@@ -385,7 +398,8 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
 #endif
 
 		/* Relinquish the page back to the allocator. */
-		__free_reserved_page(page);
+		if (!(i % XEN_PFN_PER_PAGE))
+			__free_reserved_page(page);
 	}
 
 	balloon_stats.current_pages += rc;
@@ -396,7 +410,7 @@ static enum bp_state increase_reservation(unsigned long nr_pages)
 static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
 {
 	enum bp_state state = BP_DONE;
-	unsigned long  pfn, i;
+	unsigned long  pfn, i, nr_frames;
 	struct page   *page;
 	int ret;
 	struct xen_memory_reservation reservation = {
@@ -414,19 +428,27 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
 	}
 #endif
 
-	if (nr_pages > ARRAY_SIZE(frame_list))
-		nr_pages = ARRAY_SIZE(frame_list);
+	if (nr_pages > (ARRAY_SIZE(frame_list) / XEN_PFN_PER_PAGE))
+		nr_pages = ARRAY_SIZE(frame_list) / XEN_PFN_PER_PAGE;
 
-	for (i = 0; i < nr_pages; i++) {
-		page = alloc_page(gfp);
-		if (page == NULL) {
-			nr_pages = i;
-			state = BP_EAGAIN;
-			break;
+	nr_frames = nr_pages * XEN_PFN_PER_PAGE;
+
+	pfn = 0; /* Make GCC happy */
+
+	for (i = 0; i < nr_frames; i++) {
+
+		if (!(i % XEN_PFN_PER_PAGE)) {
+			page = alloc_page(gfp);
+			if (page == NULL) {
+				nr_frames = i;
+				state = BP_EAGAIN;
+				break;
+			}
+			scrub_page(page);
+			pfn = xen_page_to_pfn(page);
 		}
-		scrub_page(page);
 
-		frame_list[i] = page_to_pfn(page);
+		frame_list[i] = pfn++;
 	}
 
 	/*
@@ -439,16 +461,20 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
 	kmap_flush_unused();
 
 	/* Update direct mapping, invalidate P2M, and add to balloon. */
-	for (i = 0; i < nr_pages; i++) {
+	for (i = 0; i < nr_frames; i++) {
 		pfn = frame_list[i];
 		frame_list[i] = pfn_to_mfn(pfn);
-		page = pfn_to_page(pfn);
+		page = xen_pfn_to_page(pfn);
+
+		/* TODO: Make this code cleaner to make CONFIG_XEN_HAVE_PVMMU
+		 * work with 64K pages
+		 */
 
 #ifdef CONFIG_XEN_HAVE_PVMMU
 		if (!xen_feature(XENFEAT_auto_translated_physmap)) {
 			if (!PageHighMem(page)) {
 				ret = HYPERVISOR_update_va_mapping(
-						(unsigned long)__va(pfn << PAGE_SHIFT),
+						(unsigned long)__va(pfn << XEN_PAGE_SHIFT),
 						__pte_ma(0), 0);
 				BUG_ON(ret);
 			}
@@ -456,17 +482,18 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
 		}
 #endif
 
-		balloon_append(page);
+		if (!(i % XEN_PFN_PER_PAGE))
+			balloon_append(page);
 	}
 
 	flush_tlb_all();
 
 	set_xen_guest_handle(reservation.extent_start, frame_list);
-	reservation.nr_extents   = nr_pages;
+	reservation.nr_extents   = nr_frames;
 	ret = HYPERVISOR_memory_op(XENMEM_decrease_reservation, &reservation);
-	BUG_ON(ret != nr_pages);
+	BUG_ON(ret != nr_frames);
 
-	balloon_stats.current_pages -= nr_pages;
+	balloon_stats.current_pages -= nr_frames * XEN_PFN_PER_PAGE;
 
 	return state;
 }
-- 
2.1.4




* [RFC 16/23] xen/events: fifo: Make it run with 64KB granularity
  2015-05-14 17:00 ` Julien Grall
  (?)
@ 2015-05-14 17:00   ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:00 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Konrad Rzeszutek Wilk, Boris Ostrovsky,
	David Vrabel

Only use the first 4KB of the page to store the event channel info. This
means that we will waste 60KB every time we allocate a page for:
     * control block: a page is allocated per CPU
     * event array: a page is allocated every time we need to expand it

I think we can reduce the memory waste for the two areas by:

    * control block: sharing it between multiple vCPUs. Although this
    will require some bookkeeping in order to not free the page while a
    CPU goes offline and the other CPUs sharing the page are still
    online

    * event array: always extending the event array by 64KB (i.e. 16 4KB
    chunks). That would require more care when we fail to expand the
    event array.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
---
 drivers/xen/events/events_base.c | 2 +-
 drivers/xen/events/events_fifo.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 704d36e..24b97bd 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -40,11 +40,11 @@
 #include <asm/idle.h>
 #include <asm/io_apic.h>
 #include <asm/xen/pci.h>
-#include <xen/page.h>
 #endif
 #include <asm/sync_bitops.h>
 #include <asm/xen/hypercall.h>
 #include <asm/xen/hypervisor.h>
+#include <xen/page.h>
 
 #include <xen/xen.h>
 #include <xen/hvm.h>
diff --git a/drivers/xen/events/events_fifo.c b/drivers/xen/events/events_fifo.c
index ed673e1..d53c297 100644
--- a/drivers/xen/events/events_fifo.c
+++ b/drivers/xen/events/events_fifo.c
@@ -54,7 +54,7 @@
 
 #include "events_internal.h"
 
-#define EVENT_WORDS_PER_PAGE (PAGE_SIZE / sizeof(event_word_t))
+#define EVENT_WORDS_PER_PAGE (XEN_PAGE_SIZE / sizeof(event_word_t))
 #define MAX_EVENT_ARRAY_PAGES (EVTCHN_FIFO_NR_CHANNELS / EVENT_WORDS_PER_PAGE)
 
 struct evtchn_fifo_queue {
-- 
2.1.4




* [RFC 17/23] xen/grant-table: Make it run with 64KB granularity
  2015-05-14 17:00 ` Julien Grall
@ 2015-05-14 17:00   ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:00 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Russell King, Konrad Rzeszutek Wilk,
	Boris Ostrovsky, David Vrabel

The Xen interface uses 4KB page granularity. This means that each
grant is 4KB.

The current implementation allocates a Linux page per grant. On Linux
using 64KB page granularity, only the first 4KB of the page will be
used.

We could decrease the memory wasted by sharing the page between
multiple grants. It would require some care with the
{Set,Clear}ForeignPage macros.

Note that no changes have been made in the x86 code because both Linux
and Xen will only use 4KB page granularity.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
---
 arch/arm/xen/p2m.c        | 6 +++---
 drivers/xen/grant-table.c | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/arm/xen/p2m.c b/arch/arm/xen/p2m.c
index 887596c..0ed01f2 100644
--- a/arch/arm/xen/p2m.c
+++ b/arch/arm/xen/p2m.c
@@ -93,8 +93,8 @@ int set_foreign_p2m_mapping(struct gnttab_map_grant_ref *map_ops,
 	for (i = 0; i < count; i++) {
 		if (map_ops[i].status)
 			continue;
-		set_phys_to_machine(map_ops[i].host_addr >> PAGE_SHIFT,
-				    map_ops[i].dev_bus_addr >> PAGE_SHIFT);
+		set_phys_to_machine(map_ops[i].host_addr >> XEN_PAGE_SHIFT,
+				    map_ops[i].dev_bus_addr >> XEN_PAGE_SHIFT);
 	}
 
 	return 0;
@@ -108,7 +108,7 @@ int clear_foreign_p2m_mapping(struct gnttab_unmap_grant_ref *unmap_ops,
 	int i;
 
 	for (i = 0; i < count; i++) {
-		set_phys_to_machine(unmap_ops[i].host_addr >> PAGE_SHIFT,
+		set_phys_to_machine(unmap_ops[i].host_addr >> XEN_PAGE_SHIFT,
 				    INVALID_P2M_ENTRY);
 	}
 
diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 62f591f..dc0a787 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -642,7 +642,7 @@ int gnttab_setup_auto_xlat_frames(phys_addr_t addr)
 	if (xen_auto_xlat_grant_frames.count)
 		return -EINVAL;
 
-	vaddr = xen_remap(addr, PAGE_SIZE * max_nr_gframes);
+	vaddr = xen_remap(addr, XEN_PAGE_SIZE * max_nr_gframes);
 	if (vaddr == NULL) {
 		pr_warn("Failed to ioremap gnttab share frames (addr=%pa)!\n",
 			&addr);
@@ -654,7 +654,7 @@ int gnttab_setup_auto_xlat_frames(phys_addr_t addr)
 		return -ENOMEM;
 	}
 	for (i = 0; i < max_nr_gframes; i++)
-		pfn[i] = PFN_DOWN(addr) + i;
+		pfn[i] = XEN_PFN_DOWN(addr) + i;
 
 	xen_auto_xlat_grant_frames.vaddr = vaddr;
 	xen_auto_xlat_grant_frames.pfn = pfn;
@@ -978,7 +978,7 @@ static void gnttab_request_version(void)
 {
 	/* Only version 1 is used, which will always be available. */
 	grant_table_version = 1;
-	grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
+	grefs_per_grant_frame = XEN_PAGE_SIZE / sizeof(struct grant_entry_v1);
 	gnttab_interface = &gnttab_v1_ops;
 
 	pr_info("Grant tables using version %d layout\n", grant_table_version);
-- 
2.1.4



* [RFC 17/23] xen/grant-table: Make it run with 64KB granularity
@ 2015-05-14 17:00   ` Julien Grall
  0 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:00 UTC (permalink / raw)
  To: linux-arm-kernel

The Xen interface uses 4KB page granularity. This means that each
grant is 4KB.

The current implementation allocates a Linux page per grant. On Linux
using 64KB page granularity, only the first 4KB of the page will be
used.

We could decrease the memory wasted by sharing the page between
multiple grants. It would require some care with the
{Set,Clear}ForeignPage macros.

Note that no changes have been made in the x86 code because both Linux
and Xen will only use 4KB page granularity.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
---
 arch/arm/xen/p2m.c        | 6 +++---
 drivers/xen/grant-table.c | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/arm/xen/p2m.c b/arch/arm/xen/p2m.c
index 887596c..0ed01f2 100644
--- a/arch/arm/xen/p2m.c
+++ b/arch/arm/xen/p2m.c
@@ -93,8 +93,8 @@ int set_foreign_p2m_mapping(struct gnttab_map_grant_ref *map_ops,
 	for (i = 0; i < count; i++) {
 		if (map_ops[i].status)
 			continue;
-		set_phys_to_machine(map_ops[i].host_addr >> PAGE_SHIFT,
-				    map_ops[i].dev_bus_addr >> PAGE_SHIFT);
+		set_phys_to_machine(map_ops[i].host_addr >> XEN_PAGE_SHIFT,
+				    map_ops[i].dev_bus_addr >> XEN_PAGE_SHIFT);
 	}
 
 	return 0;
@@ -108,7 +108,7 @@ int clear_foreign_p2m_mapping(struct gnttab_unmap_grant_ref *unmap_ops,
 	int i;
 
 	for (i = 0; i < count; i++) {
-		set_phys_to_machine(unmap_ops[i].host_addr >> PAGE_SHIFT,
+		set_phys_to_machine(unmap_ops[i].host_addr >> XEN_PAGE_SHIFT,
 				    INVALID_P2M_ENTRY);
 	}
 
diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 62f591f..dc0a787 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -642,7 +642,7 @@ int gnttab_setup_auto_xlat_frames(phys_addr_t addr)
 	if (xen_auto_xlat_grant_frames.count)
 		return -EINVAL;
 
-	vaddr = xen_remap(addr, PAGE_SIZE * max_nr_gframes);
+	vaddr = xen_remap(addr, XEN_PAGE_SIZE * max_nr_gframes);
 	if (vaddr == NULL) {
 		pr_warn("Failed to ioremap gnttab share frames (addr=%pa)!\n",
 			&addr);
@@ -654,7 +654,7 @@ int gnttab_setup_auto_xlat_frames(phys_addr_t addr)
 		return -ENOMEM;
 	}
 	for (i = 0; i < max_nr_gframes; i++)
-		pfn[i] = PFN_DOWN(addr) + i;
+		pfn[i] = XEN_PFN_DOWN(addr) + i;
 
 	xen_auto_xlat_grant_frames.vaddr = vaddr;
 	xen_auto_xlat_grant_frames.pfn = pfn;
@@ -978,7 +978,7 @@ static void gnttab_request_version(void)
 {
 	/* Only version 1 is used, which will always be available. */
 	grant_table_version = 1;
-	grefs_per_grant_frame = PAGE_SIZE / sizeof(struct grant_entry_v1);
+	grefs_per_grant_frame = XEN_PAGE_SIZE / sizeof(struct grant_entry_v1);
 	gnttab_interface = &gnttab_v1_ops;
 
 	pr_info("Grant tables using version %d layout\n", grant_table_version);
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [RFC 18/23] block/xen-blkfront: Make it run with 64KB page granularity
  2015-05-14 17:00 ` Julien Grall
  (?)
@ 2015-05-14 17:00   ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:00 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Julien Grall, Konrad Rzeszutek Wilk,
	Roger Pau Monné,
	Boris Ostrovsky, David Vrabel

From: Julien Grall <julien.grall@linaro.org>

The PV block protocol uses 4KB page granularity. The goal of this
patch is to allow a Linux guest using 64KB page granularity to use a
block device on an unmodified Xen.

The block API uses segments that should be at least the size of a
Linux page. Therefore, the driver has to break each page into 4KB
chunks before handing the page to the backend.

Breaking a 64KB segment into 4KB chunks can leave some chunks with no
data. As the PV protocol requires every chunk to contain data, we have
to count the number of Xen pages that will actually be in use and avoid
sending empty chunks.

Note that a pre-defined number of grants is reserved before preparing
the request, based on the number and the maximum size of the segments.
If each segment contains only a small amount of data, the driver may
reserve too many grants (16 grants are reserved per segment with 64KB
page granularity).

Furthermore, in the case of persistent grants we allocate one Linux
page per grant although only 4KB of the page is effectively used. This
could be improved by sharing the page between multiple grants.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>

---

Improvements such as 64KB grant support are not taken into consideration
in this patch because of the requirement to run a Linux guest using 64KB
pages on an unmodified Xen.
---
 drivers/block/xen-blkfront.c | 259 ++++++++++++++++++++++++++-----------------
 1 file changed, 156 insertions(+), 103 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 60cf1d6..c6537ed 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -77,6 +77,7 @@ struct blk_shadow {
 	struct grant **grants_used;
 	struct grant **indirect_grants;
 	struct scatterlist *sg;
+	unsigned int num_sg;
 };
 
 struct split_bio {
@@ -98,7 +99,7 @@ static unsigned int xen_blkif_max_segments = 32;
 module_param_named(max, xen_blkif_max_segments, int, S_IRUGO);
 MODULE_PARM_DESC(max, "Maximum amount of segments in indirect requests (default is 32)");
 
-#define BLK_RING_SIZE __CONST_RING_SIZE(blkif, PAGE_SIZE)
+#define BLK_RING_SIZE __CONST_RING_SIZE(blkif, XEN_PAGE_SIZE)
 
 /*
  * We have one of these per vbd, whether ide, scsi or 'other'.  They
@@ -131,6 +132,7 @@ struct blkfront_info
 	unsigned int discard_granularity;
 	unsigned int discard_alignment;
 	unsigned int feature_persistent:1;
+	/* Number of 4KB segments handled */
 	unsigned int max_indirect_segments;
 	int is_ready;
 };
@@ -158,10 +160,19 @@ static DEFINE_SPINLOCK(minor_lock);
 
 #define DEV_NAME	"xvd"	/* name in /dev */
 
-#define SEGS_PER_INDIRECT_FRAME \
-	(PAGE_SIZE/sizeof(struct blkif_request_segment))
-#define INDIRECT_GREFS(_segs) \
-	((_segs + SEGS_PER_INDIRECT_FRAME - 1)/SEGS_PER_INDIRECT_FRAME)
+/*
+ * Xen uses 4KB pages; the guest may use a different page size (4KB or 64KB).
+ * Number of Xen pages per segment:
+ */
+#define XEN_PAGES_PER_SEGMENT   (PAGE_SIZE / XEN_PAGE_SIZE)
+
+#define SEGS_PER_INDIRECT_FRAME	\
+	(XEN_PAGE_SIZE/sizeof(struct blkif_request_segment) / XEN_PAGES_PER_SEGMENT)
+#define XEN_PAGES_PER_INDIRECT_FRAME \
+	(XEN_PAGE_SIZE/sizeof(struct blkif_request_segment))
+
+#define INDIRECT_GREFS(_pages) \
+	((_pages + XEN_PAGES_PER_INDIRECT_FRAME - 1)/XEN_PAGES_PER_INDIRECT_FRAME)
 
 static int blkfront_setup_indirect(struct blkfront_info *info);
 
@@ -204,7 +215,7 @@ static int fill_grant_buffer(struct blkfront_info *info, int num)
 				kfree(gnt_list_entry);
 				goto out_of_memory;
 			}
-			gnt_list_entry->pfn = page_to_pfn(granted_page);
+			gnt_list_entry->pfn = xen_page_to_pfn(granted_page);
 		}
 
 		gnt_list_entry->gref = GRANT_INVALID_REF;
@@ -219,7 +230,7 @@ out_of_memory:
 	                         &info->grants, node) {
 		list_del(&gnt_list_entry->node);
 		if (info->feature_persistent)
-			__free_page(pfn_to_page(gnt_list_entry->pfn));
+			__free_page(xen_pfn_to_page(gnt_list_entry->pfn));
 		kfree(gnt_list_entry);
 		i--;
 	}
@@ -389,7 +400,8 @@ static int blkif_queue_request(struct request *req)
 	struct blkif_request *ring_req;
 	unsigned long id;
 	unsigned int fsect, lsect;
-	int i, ref, n;
+	unsigned int shared_off, shared_len, bvec_off, sg_total;
+	int i, ref, n, grant;
 	struct blkif_request_segment *segments = NULL;
 
 	/*
@@ -401,18 +413,19 @@ static int blkif_queue_request(struct request *req)
 	grant_ref_t gref_head;
 	struct grant *gnt_list_entry = NULL;
 	struct scatterlist *sg;
-	int nseg, max_grefs;
+	int nseg, max_grefs, nr_page;
+	unsigned long pfn;
 
 	if (unlikely(info->connected != BLKIF_STATE_CONNECTED))
 		return 1;
 
-	max_grefs = req->nr_phys_segments;
+	max_grefs = req->nr_phys_segments * XEN_PAGES_PER_SEGMENT;
 	if (max_grefs > BLKIF_MAX_SEGMENTS_PER_REQUEST)
 		/*
 		 * If we are using indirect segments we need to account
 		 * for the indirect grefs used in the request.
 		 */
-		max_grefs += INDIRECT_GREFS(req->nr_phys_segments);
+		max_grefs += INDIRECT_GREFS(req->nr_phys_segments * XEN_PAGES_PER_SEGMENT);
 
 	/* Check if we have enough grants to allocate a requests */
 	if (info->persistent_gnts_c < max_grefs) {
@@ -446,12 +459,19 @@ static int blkif_queue_request(struct request *req)
 			ring_req->u.discard.flag = 0;
 	} else {
 		BUG_ON(info->max_indirect_segments == 0 &&
-		       req->nr_phys_segments > BLKIF_MAX_SEGMENTS_PER_REQUEST);
+		       (XEN_PAGES_PER_SEGMENT * req->nr_phys_segments) > BLKIF_MAX_SEGMENTS_PER_REQUEST);
 		BUG_ON(info->max_indirect_segments &&
-		       req->nr_phys_segments > info->max_indirect_segments);
+		       (req->nr_phys_segments * XEN_PAGES_PER_SEGMENT) > info->max_indirect_segments);
 		nseg = blk_rq_map_sg(req->q, req, info->shadow[id].sg);
+		nr_page = 0;
+		/* Calculate the number of Xen pages used */
+		for_each_sg(info->shadow[id].sg, sg, nseg, i) {
+			nr_page += (round_up(sg->offset + sg->length, XEN_PAGE_SIZE) - round_down(sg->offset, XEN_PAGE_SIZE)) >> XEN_PAGE_SHIFT;
+		}
+
 		ring_req->u.rw.id = id;
-		if (nseg > BLKIF_MAX_SEGMENTS_PER_REQUEST) {
+		info->shadow[id].num_sg = nseg;
+		if (nr_page > BLKIF_MAX_SEGMENTS_PER_REQUEST) {
 			/*
 			 * The indirect operation can only be a BLKIF_OP_READ or
 			 * BLKIF_OP_WRITE
@@ -462,7 +482,7 @@ static int blkif_queue_request(struct request *req)
 				BLKIF_OP_WRITE : BLKIF_OP_READ;
 			ring_req->u.indirect.sector_number = (blkif_sector_t)blk_rq_pos(req);
 			ring_req->u.indirect.handle = info->handle;
-			ring_req->u.indirect.nr_segments = nseg;
+			ring_req->u.indirect.nr_segments = nr_page;
 		} else {
 			ring_req->u.rw.sector_number = (blkif_sector_t)blk_rq_pos(req);
 			ring_req->u.rw.handle = info->handle;
@@ -490,79 +510,95 @@ static int blkif_queue_request(struct request *req)
 					ring_req->operation = 0;
 				}
 			}
-			ring_req->u.rw.nr_segments = nseg;
+			ring_req->u.rw.nr_segments = nr_page;
 		}
+		grant = 0;
 		for_each_sg(info->shadow[id].sg, sg, nseg, i) {
-			fsect = sg->offset >> 9;
-			lsect = fsect + (sg->length >> 9) - 1;
-
-			if ((ring_req->operation == BLKIF_OP_INDIRECT) &&
-			    (i % SEGS_PER_INDIRECT_FRAME == 0)) {
-				unsigned long uninitialized_var(pfn);
-
-				if (segments)
-					kunmap_atomic(segments);
-
-				n = i / SEGS_PER_INDIRECT_FRAME;
-				if (!info->feature_persistent) {
-					struct page *indirect_page;
-
-					/* Fetch a pre-allocated page to use for indirect grefs */
-					BUG_ON(list_empty(&info->indirect_pages));
-					indirect_page = list_first_entry(&info->indirect_pages,
-					                                 struct page, lru);
-					list_del(&indirect_page->lru);
-					pfn = page_to_pfn(indirect_page);
+			sg_total = sg->length;
+			shared_off = xen_offset_in_page(sg->offset);
+			bvec_off = sg->offset;
+			pfn = xen_page_to_pfn(sg_page(sg)) + (sg->offset >> XEN_PAGE_SHIFT);
+
+			while (sg_total != 0) {
+				if ((ring_req->operation == BLKIF_OP_INDIRECT) &&
+				    (grant % XEN_PAGES_PER_INDIRECT_FRAME == 0)) {
+					unsigned long uninitialized_var(pfn);
+
+					if (segments)
+						kunmap_atomic(segments);
+
+					n = grant / XEN_PAGES_PER_INDIRECT_FRAME;
+					if (!info->feature_persistent) {
+						struct page *indirect_page;
+
+						/* Fetch a pre-allocated page to use for indirect grefs */
+						BUG_ON(list_empty(&info->indirect_pages));
+						indirect_page = list_first_entry(&info->indirect_pages,
+						                                 struct page, lru);
+						list_del(&indirect_page->lru);
+						pfn = xen_page_to_pfn(indirect_page);
+					}
+					gnt_list_entry = get_grant(&gref_head, pfn, info);
+					info->shadow[id].indirect_grants[n] = gnt_list_entry;
+					segments = kmap_atomic(xen_pfn_to_page(gnt_list_entry->pfn));
+					ring_req->u.indirect.indirect_grefs[n] = gnt_list_entry->gref;
 				}
-				gnt_list_entry = get_grant(&gref_head, pfn, info);
-				info->shadow[id].indirect_grants[n] = gnt_list_entry;
-				segments = kmap_atomic(pfn_to_page(gnt_list_entry->pfn));
-				ring_req->u.indirect.indirect_grefs[n] = gnt_list_entry->gref;
-			}
 
-			gnt_list_entry = get_grant(&gref_head, page_to_pfn(sg_page(sg)), info);
-			ref = gnt_list_entry->gref;
+				shared_len = min(sg_total, (unsigned)XEN_PAGE_SIZE - shared_off);
 
-			info->shadow[id].grants_used[i] = gnt_list_entry;
 
-			if (rq_data_dir(req) && info->feature_persistent) {
-				char *bvec_data;
-				void *shared_data;
+				gnt_list_entry = get_grant(&gref_head, pfn++, info);
+				ref = gnt_list_entry->gref;
 
-				BUG_ON(sg->offset + sg->length > PAGE_SIZE);
+				info->shadow[id].grants_used[grant] = gnt_list_entry;
 
-				shared_data = kmap_atomic(pfn_to_page(gnt_list_entry->pfn));
-				bvec_data = kmap_atomic(sg_page(sg));
+				if (rq_data_dir(req) && info->feature_persistent) {
+					char *bvec_data;
+					void *shared_data;
 
-				/*
-				 * this does not wipe data stored outside the
-				 * range sg->offset..sg->offset+sg->length.
-				 * Therefore, blkback *could* see data from
-				 * previous requests. This is OK as long as
-				 * persistent grants are shared with just one
-				 * domain. It may need refactoring if this
-				 * changes
-				 */
-				memcpy(shared_data + sg->offset,
-				       bvec_data   + sg->offset,
-				       sg->length);
+					BUG_ON(sg->offset + sg->length > PAGE_SIZE);
 
-				kunmap_atomic(bvec_data);
-				kunmap_atomic(shared_data);
-			}
-			if (ring_req->operation != BLKIF_OP_INDIRECT) {
-				ring_req->u.rw.seg[i] =
+					shared_data = kmap_atomic(xen_pfn_to_page(gnt_list_entry->pfn));
+					bvec_data = kmap_atomic(sg_page(sg));
+
+					/*
+					 * this does not wipe data stored outside the
+					 * range sg->offset..sg->offset+sg->length.
+					 * Therefore, blkback *could* see data from
+					 * previous requests. This is OK as long as
+					 * persistent grants are shared with just one
+					 * domain. It may need refactoring if this
+					 * changes
+					 */
+					memcpy(shared_data + shared_off,
+					       bvec_data   + bvec_off,
+					       shared_len);
+
+					kunmap_atomic(bvec_data);
+					kunmap_atomic(shared_data);
+					bvec_off += shared_len;
+				}
+
+				fsect = shared_off >> 9;
+				lsect = fsect + (shared_len >> 9) - 1;
+				if (ring_req->operation != BLKIF_OP_INDIRECT) {
+					ring_req->u.rw.seg[grant] =
+							(struct blkif_request_segment) {
+								.gref       = ref,
+								.first_sect = fsect,
+								.last_sect  = lsect };
+				} else {
+					n = grant % XEN_PAGES_PER_INDIRECT_FRAME;
+					segments[n] =
 						(struct blkif_request_segment) {
-							.gref       = ref,
-							.first_sect = fsect,
-							.last_sect  = lsect };
-			} else {
-				n = i % SEGS_PER_INDIRECT_FRAME;
-				segments[n] =
-					(struct blkif_request_segment) {
-							.gref       = ref,
-							.first_sect = fsect,
-							.last_sect  = lsect };
+								.gref       = ref,
+								.first_sect = fsect,
+								.last_sect  = lsect };
+				}
+
+				sg_total -= shared_len;
+				shared_off = 0;
+				grant++;
 			}
 		}
 		if (segments)
@@ -674,14 +710,14 @@ static int xlvbd_init_blk_queue(struct gendisk *gd, u16 sector_size,
 	/* Hard sector size and max sectors impersonate the equiv. hardware. */
 	blk_queue_logical_block_size(rq, sector_size);
 	blk_queue_physical_block_size(rq, physical_sector_size);
-	blk_queue_max_hw_sectors(rq, (segments * PAGE_SIZE) / 512);
+	blk_queue_max_hw_sectors(rq, (segments * XEN_PAGE_SIZE) / 512);
 
 	/* Each segment in a request is up to an aligned page in size. */
 	blk_queue_segment_boundary(rq, PAGE_SIZE - 1);
 	blk_queue_max_segment_size(rq, PAGE_SIZE);
 
 	/* Ensure a merged request will fit in a single I/O ring slot. */
-	blk_queue_max_segments(rq, segments);
+	blk_queue_max_segments(rq, segments / XEN_PAGES_PER_SEGMENT);
 
 	/* Make sure buffer addresses are sector-aligned. */
 	blk_queue_dma_alignment(rq, 511);
@@ -961,7 +997,7 @@ static void blkif_free(struct blkfront_info *info, int suspend)
 				info->persistent_gnts_c--;
 			}
 			if (info->feature_persistent)
-				__free_page(pfn_to_page(persistent_gnt->pfn));
+				__free_page(xen_pfn_to_page(persistent_gnt->pfn));
 			kfree(persistent_gnt);
 		}
 	}
@@ -996,7 +1032,7 @@ static void blkif_free(struct blkfront_info *info, int suspend)
 			persistent_gnt = info->shadow[i].grants_used[j];
 			gnttab_end_foreign_access(persistent_gnt->gref, 0, 0UL);
 			if (info->feature_persistent)
-				__free_page(pfn_to_page(persistent_gnt->pfn));
+				__free_page(xen_pfn_to_page(persistent_gnt->pfn));
 			kfree(persistent_gnt);
 		}
 
@@ -1010,7 +1046,7 @@ static void blkif_free(struct blkfront_info *info, int suspend)
 		for (j = 0; j < INDIRECT_GREFS(segs); j++) {
 			persistent_gnt = info->shadow[i].indirect_grants[j];
 			gnttab_end_foreign_access(persistent_gnt->gref, 0, 0UL);
-			__free_page(pfn_to_page(persistent_gnt->pfn));
+			__free_page(xen_pfn_to_page(persistent_gnt->pfn));
 			kfree(persistent_gnt);
 		}
 
@@ -1050,26 +1086,42 @@ static void blkif_completion(struct blk_shadow *s, struct blkfront_info *info,
 	struct scatterlist *sg;
 	char *bvec_data;
 	void *shared_data;
-	int nseg;
+	int nseg, nr_page;
+	unsigned int total, bvec_offset, shared_offset, length;
+	unsigned int grant = 0;
 
-	nseg = s->req.operation == BLKIF_OP_INDIRECT ?
+	nr_page = s->req.operation == BLKIF_OP_INDIRECT ?
 		s->req.u.indirect.nr_segments : s->req.u.rw.nr_segments;
+	nseg = s->num_sg;
 
 	if (bret->operation == BLKIF_OP_READ && info->feature_persistent) {
 		for_each_sg(s->sg, sg, nseg, i) {
 			BUG_ON(sg->offset + sg->length > PAGE_SIZE);
-			shared_data = kmap_atomic(
-				pfn_to_page(s->grants_used[i]->pfn));
+
+			bvec_offset = sg->offset;
+			shared_offset = xen_offset_in_page(sg->offset);
 			bvec_data = kmap_atomic(sg_page(sg));
-			memcpy(bvec_data   + sg->offset,
-			       shared_data + sg->offset,
-			       sg->length);
+			total = sg->length;
+
+			while (total != 0) {
+				length = min(total, (unsigned)XEN_PAGE_SIZE - shared_offset);
+				shared_data = kmap_atomic(
+					xen_pfn_to_page(s->grants_used[grant]->pfn));
+				memcpy(bvec_data   + bvec_offset,
+				       shared_data + shared_offset,
+				       length);
+				kunmap_atomic(shared_data);
+
+				shared_offset = 0;
+				bvec_offset += length;
+				total -= length;
+				grant++;
+			}
 			kunmap_atomic(bvec_data);
-			kunmap_atomic(shared_data);
 		}
 	}
 	/* Add the persistent grant into the list of free grants */
-	for (i = 0; i < nseg; i++) {
+	for (i = 0; i < nr_page; i++) {
 		if (gnttab_query_foreign_access(s->grants_used[i]->gref)) {
 			/*
 			 * If the grant is still mapped by the backend (the
@@ -1095,7 +1147,7 @@ static void blkif_completion(struct blk_shadow *s, struct blkfront_info *info,
 		}
 	}
 	if (s->req.operation == BLKIF_OP_INDIRECT) {
-		for (i = 0; i < INDIRECT_GREFS(nseg); i++) {
+		for (i = 0; i < INDIRECT_GREFS(nr_page); i++) {
 			if (gnttab_query_foreign_access(s->indirect_grants[i]->gref)) {
 				if (!info->feature_persistent)
 					pr_alert_ratelimited("backed has not unmapped grant: %u\n",
@@ -1110,7 +1162,7 @@ static void blkif_completion(struct blk_shadow *s, struct blkfront_info *info,
 				 * Add the used indirect page back to the list of
 				 * available pages for indirect grefs.
 				 */
-				indirect_page = pfn_to_page(s->indirect_grants[i]->pfn);
+				indirect_page = xen_pfn_to_page(s->indirect_grants[i]->pfn);
 				list_add(&indirect_page->lru, &info->indirect_pages);
 				s->indirect_grants[i]->gref = GRANT_INVALID_REF;
 				list_add_tail(&s->indirect_grants[i]->node, &info->grants);
@@ -1248,7 +1300,7 @@ static int setup_blkring(struct xenbus_device *dev,
 		return -ENOMEM;
 	}
 	SHARED_RING_INIT(sring);
-	FRONT_RING_INIT(&info->ring, sring, PAGE_SIZE);
+	FRONT_RING_INIT(&info->ring, sring, XEN_PAGE_SIZE);
 
 	err = xenbus_grant_ring(dev, info->ring.sring, 1, &gref);
 	if (err < 0) {
@@ -1562,8 +1614,8 @@ static int blkif_recover(struct blkfront_info *info)
 			atomic_set(&split_bio->pending, pending);
 			split_bio->bio = bio;
 			for (i = 0; i < pending; i++) {
-				offset = (i * segs * PAGE_SIZE) >> 9;
-				size = min((unsigned int)(segs * PAGE_SIZE) >> 9,
+				offset = (i * segs * XEN_PAGE_SIZE) >> 9;
+				size = min((unsigned int)(segs * XEN_PAGE_SIZE) >> 9,
 					   (unsigned int)bio_sectors(bio) - offset);
 				cloned_bio = bio_clone(bio, GFP_NOIO);
 				BUG_ON(cloned_bio == NULL);
@@ -1674,7 +1726,7 @@ static void blkfront_setup_discard(struct blkfront_info *info)
 
 static int blkfront_setup_indirect(struct blkfront_info *info)
 {
-	unsigned int indirect_segments, segs;
+	unsigned int indirect_segments, segs, nr_page;
 	int err, i;
 
 	err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
@@ -1682,14 +1734,15 @@ static int blkfront_setup_indirect(struct blkfront_info *info)
 			    NULL);
 	if (err) {
 		info->max_indirect_segments = 0;
-		segs = BLKIF_MAX_SEGMENTS_PER_REQUEST;
+		nr_page = BLKIF_MAX_SEGMENTS_PER_REQUEST;
 	} else {
 		info->max_indirect_segments = min(indirect_segments,
 						  xen_blkif_max_segments);
-		segs = info->max_indirect_segments;
+		nr_page = info->max_indirect_segments;
 	}
+	segs = nr_page / XEN_PAGES_PER_SEGMENT;
 
-	err = fill_grant_buffer(info, (segs + INDIRECT_GREFS(segs)) * BLK_RING_SIZE);
+	err = fill_grant_buffer(info, (nr_page + INDIRECT_GREFS(nr_page)) * BLK_RING_SIZE);
 	if (err)
 		goto out_of_memory;
 
@@ -1699,7 +1752,7 @@ static int blkfront_setup_indirect(struct blkfront_info *info)
 		 * grants, we need to allocate a set of pages that can be
 		 * used for mapping indirect grefs
 		 */
-		int num = INDIRECT_GREFS(segs) * BLK_RING_SIZE;
+		int num = INDIRECT_GREFS(nr_page) * BLK_RING_SIZE;
 
 		BUG_ON(!list_empty(&info->indirect_pages));
 		for (i = 0; i < num; i++) {
@@ -1712,13 +1765,13 @@ static int blkfront_setup_indirect(struct blkfront_info *info)
 
 	for (i = 0; i < BLK_RING_SIZE; i++) {
 		info->shadow[i].grants_used = kzalloc(
-			sizeof(info->shadow[i].grants_used[0]) * segs,
+			sizeof(info->shadow[i].grants_used[0]) * nr_page,
 			GFP_NOIO);
 		info->shadow[i].sg = kzalloc(sizeof(info->shadow[i].sg[0]) * segs, GFP_NOIO);
 		if (info->max_indirect_segments)
 			info->shadow[i].indirect_grants = kzalloc(
 				sizeof(info->shadow[i].indirect_grants[0]) *
-				INDIRECT_GREFS(segs),
+				INDIRECT_GREFS(nr_page),
 				GFP_NOIO);
 		if ((info->shadow[i].grants_used == NULL) ||
 			(info->shadow[i].sg == NULL) ||
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 200+ messages in thread

-				unsigned long uninitialized_var(pfn);
-
-				if (segments)
-					kunmap_atomic(segments);
-
-				n = i / SEGS_PER_INDIRECT_FRAME;
-				if (!info->feature_persistent) {
-					struct page *indirect_page;
-
-					/* Fetch a pre-allocated page to use for indirect grefs */
-					BUG_ON(list_empty(&info->indirect_pages));
-					indirect_page = list_first_entry(&info->indirect_pages,
-					                                 struct page, lru);
-					list_del(&indirect_page->lru);
-					pfn = page_to_pfn(indirect_page);
+			sg_total = sg->length;
+			shared_off = xen_offset_in_page(sg->offset);
+			bvec_off = sg->offset;
+			pfn = xen_page_to_pfn(sg_page(sg)) + (sg->offset >> XEN_PAGE_SHIFT);
+
+			while (sg_total != 0) {
+				if ((ring_req->operation == BLKIF_OP_INDIRECT) &&
+				    (grant % XEN_PAGES_PER_INDIRECT_FRAME == 0)) {
+					unsigned long uninitialized_var(pfn);
+
+					if (segments)
+						kunmap_atomic(segments);
+
+					n = grant / XEN_PAGES_PER_INDIRECT_FRAME;
+					if (!info->feature_persistent) {
+						struct page *indirect_page;
+
+						/* Fetch a pre-allocated page to use for indirect grefs */
+						BUG_ON(list_empty(&info->indirect_pages));
+						indirect_page = list_first_entry(&info->indirect_pages,
+						                                 struct page, lru);
+						list_del(&indirect_page->lru);
+						pfn = xen_page_to_pfn(indirect_page);
+					}
+					gnt_list_entry = get_grant(&gref_head, pfn, info);
+					info->shadow[id].indirect_grants[n] = gnt_list_entry;
+					segments = kmap_atomic(xen_pfn_to_page(gnt_list_entry->pfn));
+					ring_req->u.indirect.indirect_grefs[n] = gnt_list_entry->gref;
 				}
-				gnt_list_entry = get_grant(&gref_head, pfn, info);
-				info->shadow[id].indirect_grants[n] = gnt_list_entry;
-				segments = kmap_atomic(pfn_to_page(gnt_list_entry->pfn));
-				ring_req->u.indirect.indirect_grefs[n] = gnt_list_entry->gref;
-			}
 
-			gnt_list_entry = get_grant(&gref_head, page_to_pfn(sg_page(sg)), info);
-			ref = gnt_list_entry->gref;
+				shared_len = min(sg_total, (unsigned)XEN_PAGE_SIZE - shared_off);
 
-			info->shadow[id].grants_used[i] = gnt_list_entry;
 
-			if (rq_data_dir(req) && info->feature_persistent) {
-				char *bvec_data;
-				void *shared_data;
+				gnt_list_entry = get_grant(&gref_head, pfn++, info);
+				ref = gnt_list_entry->gref;
 
-				BUG_ON(sg->offset + sg->length > PAGE_SIZE);
+				info->shadow[id].grants_used[grant] = gnt_list_entry;
 
-				shared_data = kmap_atomic(pfn_to_page(gnt_list_entry->pfn));
-				bvec_data = kmap_atomic(sg_page(sg));
+				if (rq_data_dir(req) && info->feature_persistent) {
+					char *bvec_data;
+					void *shared_data;
 
-				/*
-				 * this does not wipe data stored outside the
-				 * range sg->offset..sg->offset+sg->length.
-				 * Therefore, blkback *could* see data from
-				 * previous requests. This is OK as long as
-				 * persistent grants are shared with just one
-				 * domain. It may need refactoring if this
-				 * changes
-				 */
-				memcpy(shared_data + sg->offset,
-				       bvec_data   + sg->offset,
-				       sg->length);
+					BUG_ON(sg->offset + sg->length > PAGE_SIZE);
 
-				kunmap_atomic(bvec_data);
-				kunmap_atomic(shared_data);
-			}
-			if (ring_req->operation != BLKIF_OP_INDIRECT) {
-				ring_req->u.rw.seg[i] =
+					shared_data = kmap_atomic(xen_pfn_to_page(gnt_list_entry->pfn));
+					bvec_data = kmap_atomic(sg_page(sg));
+
+					/*
+					 * this does not wipe data stored outside the
+					 * range sg->offset..sg->offset+sg->length.
+					 * Therefore, blkback *could* see data from
+					 * previous requests. This is OK as long as
+					 * persistent grants are shared with just one
+					 * domain. It may need refactoring if this
+					 * changes
+					 */
+					memcpy(shared_data + shared_off,
+					       bvec_data   + bvec_off,
+					       sg->length);
+
+					kunmap_atomic(bvec_data);
+					kunmap_atomic(shared_data);
+					bvec_off += shared_off;
+				}
+
+				fsect = shared_off >> 9;
+				lsect = fsect + (shared_len >> 9) - 1;
+				if (ring_req->operation != BLKIF_OP_INDIRECT) {
+					ring_req->u.rw.seg[grant] =
+							(struct blkif_request_segment) {
+								.gref       = ref,
+								.first_sect = fsect,
+								.last_sect  = lsect };
+				} else {
+					n = grant % XEN_PAGES_PER_INDIRECT_FRAME;
+					segments[n] =
 						(struct blkif_request_segment) {
-							.gref       = ref,
-							.first_sect = fsect,
-							.last_sect  = lsect };
-			} else {
-				n = i % SEGS_PER_INDIRECT_FRAME;
-				segments[n] =
-					(struct blkif_request_segment) {
-							.gref       = ref,
-							.first_sect = fsect,
-							.last_sect  = lsect };
+								.gref       = ref,
+								.first_sect = fsect,
+								.last_sect  = lsect };
+				}
+
+				sg_total -= shared_len;
+				shared_off = 0;
+				grant++;
 			}
 		}
 		if (segments)
@@ -674,14 +710,14 @@ static int xlvbd_init_blk_queue(struct gendisk *gd, u16 sector_size,
 	/* Hard sector size and max sectors impersonate the equiv. hardware. */
 	blk_queue_logical_block_size(rq, sector_size);
 	blk_queue_physical_block_size(rq, physical_sector_size);
-	blk_queue_max_hw_sectors(rq, (segments * PAGE_SIZE) / 512);
+	blk_queue_max_hw_sectors(rq, (segments * XEN_PAGE_SIZE) / 512);
 
 	/* Each segment in a request is up to an aligned page in size. */
 	blk_queue_segment_boundary(rq, PAGE_SIZE - 1);
 	blk_queue_max_segment_size(rq, PAGE_SIZE);
 
 	/* Ensure a merged request will fit in a single I/O ring slot. */
-	blk_queue_max_segments(rq, segments);
+	blk_queue_max_segments(rq, segments / XEN_PAGES_PER_SEGMENT);
 
 	/* Make sure buffer addresses are sector-aligned. */
 	blk_queue_dma_alignment(rq, 511);
@@ -961,7 +997,7 @@ static void blkif_free(struct blkfront_info *info, int suspend)
 				info->persistent_gnts_c--;
 			}
 			if (info->feature_persistent)
-				__free_page(pfn_to_page(persistent_gnt->pfn));
+				__free_page(xen_pfn_to_page(persistent_gnt->pfn));
 			kfree(persistent_gnt);
 		}
 	}
@@ -996,7 +1032,7 @@ static void blkif_free(struct blkfront_info *info, int suspend)
 			persistent_gnt = info->shadow[i].grants_used[j];
 			gnttab_end_foreign_access(persistent_gnt->gref, 0, 0UL);
 			if (info->feature_persistent)
-				__free_page(pfn_to_page(persistent_gnt->pfn));
+				__free_page(xen_pfn_to_page(persistent_gnt->pfn));
 			kfree(persistent_gnt);
 		}
 
@@ -1010,7 +1046,7 @@ static void blkif_free(struct blkfront_info *info, int suspend)
 		for (j = 0; j < INDIRECT_GREFS(segs); j++) {
 			persistent_gnt = info->shadow[i].indirect_grants[j];
 			gnttab_end_foreign_access(persistent_gnt->gref, 0, 0UL);
-			__free_page(pfn_to_page(persistent_gnt->pfn));
+			__free_page(xen_pfn_to_page(persistent_gnt->pfn));
 			kfree(persistent_gnt);
 		}
 
@@ -1050,26 +1086,42 @@ static void blkif_completion(struct blk_shadow *s, struct blkfront_info *info,
 	struct scatterlist *sg;
 	char *bvec_data;
 	void *shared_data;
-	int nseg;
+	int nseg, nr_page;
+	unsigned int total, bvec_offset, shared_offset, length;
+	unsigned int grant = 0;
 
-	nseg = s->req.operation == BLKIF_OP_INDIRECT ?
+	nr_page = s->req.operation == BLKIF_OP_INDIRECT ?
 		s->req.u.indirect.nr_segments : s->req.u.rw.nr_segments;
+	nseg = s->num_sg;
 
 	if (bret->operation == BLKIF_OP_READ && info->feature_persistent) {
 		for_each_sg(s->sg, sg, nseg, i) {
 			BUG_ON(sg->offset + sg->length > PAGE_SIZE);
-			shared_data = kmap_atomic(
-				pfn_to_page(s->grants_used[i]->pfn));
+
+			bvec_offset = sg->offset;
+			shared_offset = xen_offset_in_page(sg->offset);
 			bvec_data = kmap_atomic(sg_page(sg));
-			memcpy(bvec_data   + sg->offset,
-			       shared_data + sg->offset,
-			       sg->length);
+			total = sg->length;
+
+			while (total != 0) {
+				length = min(total, (unsigned)XEN_PAGE_SIZE + shared_offset);
+				shared_data = kmap_atomic(
+					xen_pfn_to_page(s->grants_used[grant]->pfn));
+				memcpy(bvec_data   + bvec_offset,
+				       shared_data + shared_offset,
+				       length);
+				kunmap_atomic(shared_data);
+
+				shared_offset = 0;
+				bvec_offset += length;
+				total -= length;
+				grant++;
+			}
 			kunmap_atomic(bvec_data);
-			kunmap_atomic(shared_data);
 		}
 	}
 	/* Add the persistent grant into the list of free grants */
-	for (i = 0; i < nseg; i++) {
+	for (i = 0; i < nr_page; i++) {
 		if (gnttab_query_foreign_access(s->grants_used[i]->gref)) {
 			/*
 			 * If the grant is still mapped by the backend (the
@@ -1095,7 +1147,7 @@ static void blkif_completion(struct blk_shadow *s, struct blkfront_info *info,
 		}
 	}
 	if (s->req.operation == BLKIF_OP_INDIRECT) {
-		for (i = 0; i < INDIRECT_GREFS(nseg); i++) {
+		for (i = 0; i < INDIRECT_GREFS(nr_page); i++) {
 			if (gnttab_query_foreign_access(s->indirect_grants[i]->gref)) {
 				if (!info->feature_persistent)
 					pr_alert_ratelimited("backed has not unmapped grant: %u\n",
@@ -1110,7 +1162,7 @@ static void blkif_completion(struct blk_shadow *s, struct blkfront_info *info,
 				 * Add the used indirect page back to the list of
 				 * available pages for indirect grefs.
 				 */
-				indirect_page = pfn_to_page(s->indirect_grants[i]->pfn);
+				indirect_page = xen_pfn_to_page(s->indirect_grants[i]->pfn);
 				list_add(&indirect_page->lru, &info->indirect_pages);
 				s->indirect_grants[i]->gref = GRANT_INVALID_REF;
 				list_add_tail(&s->indirect_grants[i]->node, &info->grants);
@@ -1248,7 +1300,7 @@ static int setup_blkring(struct xenbus_device *dev,
 		return -ENOMEM;
 	}
 	SHARED_RING_INIT(sring);
-	FRONT_RING_INIT(&info->ring, sring, PAGE_SIZE);
+	FRONT_RING_INIT(&info->ring, sring, XEN_PAGE_SIZE);
 
 	err = xenbus_grant_ring(dev, info->ring.sring, 1, &gref);
 	if (err < 0) {
@@ -1562,8 +1614,8 @@ static int blkif_recover(struct blkfront_info *info)
 			atomic_set(&split_bio->pending, pending);
 			split_bio->bio = bio;
 			for (i = 0; i < pending; i++) {
-				offset = (i * segs * PAGE_SIZE) >> 9;
-				size = min((unsigned int)(segs * PAGE_SIZE) >> 9,
+				offset = (i * segs * XEN_PAGE_SIZE) >> 9;
+				size = min((unsigned int)(segs * XEN_PAGE_SIZE) >> 9,
 					   (unsigned int)bio_sectors(bio) - offset);
 				cloned_bio = bio_clone(bio, GFP_NOIO);
 				BUG_ON(cloned_bio == NULL);
@@ -1674,7 +1726,7 @@ static void blkfront_setup_discard(struct blkfront_info *info)
 
 static int blkfront_setup_indirect(struct blkfront_info *info)
 {
-	unsigned int indirect_segments, segs;
+	unsigned int indirect_segments, segs, nr_page;
 	int err, i;
 
 	err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
@@ -1682,14 +1734,15 @@ static int blkfront_setup_indirect(struct blkfront_info *info)
 			    NULL);
 	if (err) {
 		info->max_indirect_segments = 0;
-		segs = BLKIF_MAX_SEGMENTS_PER_REQUEST;
+		nr_page = BLKIF_MAX_SEGMENTS_PER_REQUEST;
 	} else {
 		info->max_indirect_segments = min(indirect_segments,
 						  xen_blkif_max_segments);
-		segs = info->max_indirect_segments;
+	    nr_page = info->max_indirect_segments;
 	}
+	segs = nr_page / XEN_PAGES_PER_SEGMENT;
 
-	err = fill_grant_buffer(info, (segs + INDIRECT_GREFS(segs)) * BLK_RING_SIZE);
+	err = fill_grant_buffer(info, (nr_page + INDIRECT_GREFS(nr_page)) * BLK_RING_SIZE);
 	if (err)
 		goto out_of_memory;
 
@@ -1699,7 +1752,7 @@ static int blkfront_setup_indirect(struct blkfront_info *info)
 		 * grants, we need to allocate a set of pages that can be
 		 * used for mapping indirect grefs
 		 */
-		int num = INDIRECT_GREFS(segs) * BLK_RING_SIZE;
+		int num = INDIRECT_GREFS(nr_page) * BLK_RING_SIZE;
 
 		BUG_ON(!list_empty(&info->indirect_pages));
 		for (i = 0; i < num; i++) {
@@ -1712,13 +1765,13 @@ static int blkfront_setup_indirect(struct blkfront_info *info)
 
 	for (i = 0; i < BLK_RING_SIZE; i++) {
 		info->shadow[i].grants_used = kzalloc(
-			sizeof(info->shadow[i].grants_used[0]) * segs,
+			sizeof(info->shadow[i].grants_used[0]) * nr_page,
 			GFP_NOIO);
 		info->shadow[i].sg = kzalloc(sizeof(info->shadow[i].sg[0]) * segs, GFP_NOIO);
 		if (info->max_indirect_segments)
 			info->shadow[i].indirect_grants = kzalloc(
 				sizeof(info->shadow[i].indirect_grants[0]) *
-				INDIRECT_GREFS(segs),
+				INDIRECT_GREFS(nr_page),
 				GFP_NOIO);
 		if ((info->shadow[i].grants_used == NULL) ||
 			(info->shadow[i].sg == NULL) ||
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [RFC 18/23] block/xen-blkfront: Make it running on 64KB page granularity
@ 2015-05-14 17:00   ` Julien Grall
  0 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:00 UTC (permalink / raw)
  To: xen-devel
  Cc: ian.campbell, stefano.stabellini, Julien Grall, tim,
	linux-kernel, Julien Grall, David Vrabel, Boris Ostrovsky,
	linux-arm-kernel, Roger Pau Monné

From: Julien Grall <julien.grall@linaro.org>

The PV block protocol uses 4KB page granularity. The goal of this
patch is to allow a Linux guest using 64KB page granularity to use
block devices on an unmodified Xen.

The block API uses segments that are at least the size of a Linux
page. Therefore, the driver has to break each page into 4KB chunks
before handing it to the backend.

Breaking a 64KB segment into 4KB chunks can leave some chunks with no
data. As the PV protocol requires every chunk to contain data, we have
to count the number of Xen pages actually in use and avoid sending
empty chunks.

Note that a pre-defined number of grants is reserved before preparing
the request. This number is based on the number and the maximum size
of the segments. If each segment contains only a small amount of data,
the driver may reserve too many grants (16 grants are reserved per
segment with 64KB page granularity).

Furthermore, in the case of persistent grants we allocate one Linux
page per grant although only 4KB of the page is effectively used. This
could be improved by sharing the page among multiple grants.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>

---

Improvements such as 64KB grant support are not taken into
consideration in this patch because we have the requirement to run
Linux with 64KB pages on an unmodified Xen.
---
 drivers/block/xen-blkfront.c | 259 ++++++++++++++++++++++++++-----------------
 1 file changed, 156 insertions(+), 103 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 60cf1d6..c6537ed 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -77,6 +77,7 @@ struct blk_shadow {
 	struct grant **grants_used;
 	struct grant **indirect_grants;
 	struct scatterlist *sg;
+	unsigned int num_sg;
 };
 
 struct split_bio {
@@ -98,7 +99,7 @@ static unsigned int xen_blkif_max_segments = 32;
 module_param_named(max, xen_blkif_max_segments, int, S_IRUGO);
 MODULE_PARM_DESC(max, "Maximum amount of segments in indirect requests (default is 32)");
 
-#define BLK_RING_SIZE __CONST_RING_SIZE(blkif, PAGE_SIZE)
+#define BLK_RING_SIZE __CONST_RING_SIZE(blkif, XEN_PAGE_SIZE)
 
 /*
  * We have one of these per vbd, whether ide, scsi or 'other'.  They
@@ -131,6 +132,7 @@ struct blkfront_info
 	unsigned int discard_granularity;
 	unsigned int discard_alignment;
 	unsigned int feature_persistent:1;
+	/* Number of 4KB segments handled */
 	unsigned int max_indirect_segments;
 	int is_ready;
 };
@@ -158,10 +160,19 @@ static DEFINE_SPINLOCK(minor_lock);
 
 #define DEV_NAME	"xvd"	/* name in /dev */
 
-#define SEGS_PER_INDIRECT_FRAME \
-	(PAGE_SIZE/sizeof(struct blkif_request_segment))
-#define INDIRECT_GREFS(_segs) \
-	((_segs + SEGS_PER_INDIRECT_FRAME - 1)/SEGS_PER_INDIRECT_FRAME)
+/*
+ * Xen uses 4KB pages. The guest may use a different page size (4KB or 64KB).
+ * Number of Xen pages per segment
+ */
+#define XEN_PAGES_PER_SEGMENT   (PAGE_SIZE / XEN_PAGE_SIZE)
+
+#define SEGS_PER_INDIRECT_FRAME	\
+	(XEN_PAGE_SIZE/sizeof(struct blkif_request_segment) / XEN_PAGES_PER_SEGMENT)
+#define XEN_PAGES_PER_INDIRECT_FRAME \
+	(XEN_PAGE_SIZE/sizeof(struct blkif_request_segment))
+
+#define INDIRECT_GREFS(_pages) \
+	((_pages + XEN_PAGES_PER_INDIRECT_FRAME - 1)/XEN_PAGES_PER_INDIRECT_FRAME)
 
 static int blkfront_setup_indirect(struct blkfront_info *info);
 
@@ -204,7 +215,7 @@ static int fill_grant_buffer(struct blkfront_info *info, int num)
 				kfree(gnt_list_entry);
 				goto out_of_memory;
 			}
-			gnt_list_entry->pfn = page_to_pfn(granted_page);
+			gnt_list_entry->pfn = xen_page_to_pfn(granted_page);
 		}
 
 		gnt_list_entry->gref = GRANT_INVALID_REF;
@@ -219,7 +230,7 @@ out_of_memory:
 	                         &info->grants, node) {
 		list_del(&gnt_list_entry->node);
 		if (info->feature_persistent)
-			__free_page(pfn_to_page(gnt_list_entry->pfn));
+			__free_page(xen_pfn_to_page(gnt_list_entry->pfn));
 		kfree(gnt_list_entry);
 		i--;
 	}
@@ -389,7 +400,8 @@ static int blkif_queue_request(struct request *req)
 	struct blkif_request *ring_req;
 	unsigned long id;
 	unsigned int fsect, lsect;
-	int i, ref, n;
+	unsigned int shared_off, shared_len, bvec_off, sg_total;
+	int i, ref, n, grant;
 	struct blkif_request_segment *segments = NULL;
 
 	/*
@@ -401,18 +413,19 @@ static int blkif_queue_request(struct request *req)
 	grant_ref_t gref_head;
 	struct grant *gnt_list_entry = NULL;
 	struct scatterlist *sg;
-	int nseg, max_grefs;
+	int nseg, max_grefs, nr_page;
+	unsigned long pfn;
 
 	if (unlikely(info->connected != BLKIF_STATE_CONNECTED))
 		return 1;
 
-	max_grefs = req->nr_phys_segments;
+	max_grefs = req->nr_phys_segments * XEN_PAGES_PER_SEGMENT;
 	if (max_grefs > BLKIF_MAX_SEGMENTS_PER_REQUEST)
 		/*
 		 * If we are using indirect segments we need to account
 		 * for the indirect grefs used in the request.
 		 */
-		max_grefs += INDIRECT_GREFS(req->nr_phys_segments);
+		max_grefs += INDIRECT_GREFS(req->nr_phys_segments * XEN_PAGES_PER_SEGMENT);
 
 	/* Check if we have enough grants to allocate a requests */
 	if (info->persistent_gnts_c < max_grefs) {
@@ -446,12 +459,19 @@ static int blkif_queue_request(struct request *req)
 			ring_req->u.discard.flag = 0;
 	} else {
 		BUG_ON(info->max_indirect_segments == 0 &&
-		       req->nr_phys_segments > BLKIF_MAX_SEGMENTS_PER_REQUEST);
+		       (XEN_PAGES_PER_SEGMENT * req->nr_phys_segments) > BLKIF_MAX_SEGMENTS_PER_REQUEST);
 		BUG_ON(info->max_indirect_segments &&
-		       req->nr_phys_segments > info->max_indirect_segments);
+		       (req->nr_phys_segments * XEN_PAGES_PER_SEGMENT) > info->max_indirect_segments);
 		nseg = blk_rq_map_sg(req->q, req, info->shadow[id].sg);
+		nr_page = 0;
+		/* Calculate the number of Xen pages used */
+		for_each_sg(info->shadow[id].sg, sg, nseg, i) {
+			nr_page += (round_up(sg->offset + sg->length, XEN_PAGE_SIZE) - round_down(sg->offset, XEN_PAGE_SIZE)) >> XEN_PAGE_SHIFT;
+		}
+
 		ring_req->u.rw.id = id;
-		if (nseg > BLKIF_MAX_SEGMENTS_PER_REQUEST) {
+		info->shadow[id].num_sg = nseg;
+		if (nr_page > BLKIF_MAX_SEGMENTS_PER_REQUEST) {
 			/*
 			 * The indirect operation can only be a BLKIF_OP_READ or
 			 * BLKIF_OP_WRITE
@@ -462,7 +482,7 @@ static int blkif_queue_request(struct request *req)
 				BLKIF_OP_WRITE : BLKIF_OP_READ;
 			ring_req->u.indirect.sector_number = (blkif_sector_t)blk_rq_pos(req);
 			ring_req->u.indirect.handle = info->handle;
-			ring_req->u.indirect.nr_segments = nseg;
+			ring_req->u.indirect.nr_segments = nr_page;
 		} else {
 			ring_req->u.rw.sector_number = (blkif_sector_t)blk_rq_pos(req);
 			ring_req->u.rw.handle = info->handle;
@@ -490,79 +510,95 @@ static int blkif_queue_request(struct request *req)
 					ring_req->operation = 0;
 				}
 			}
-			ring_req->u.rw.nr_segments = nseg;
+			ring_req->u.rw.nr_segments = nr_page;
 		}
+		grant = 0;
 		for_each_sg(info->shadow[id].sg, sg, nseg, i) {
-			fsect = sg->offset >> 9;
-			lsect = fsect + (sg->length >> 9) - 1;
-
-			if ((ring_req->operation == BLKIF_OP_INDIRECT) &&
-			    (i % SEGS_PER_INDIRECT_FRAME == 0)) {
-				unsigned long uninitialized_var(pfn);
-
-				if (segments)
-					kunmap_atomic(segments);
-
-				n = i / SEGS_PER_INDIRECT_FRAME;
-				if (!info->feature_persistent) {
-					struct page *indirect_page;
-
-					/* Fetch a pre-allocated page to use for indirect grefs */
-					BUG_ON(list_empty(&info->indirect_pages));
-					indirect_page = list_first_entry(&info->indirect_pages,
-					                                 struct page, lru);
-					list_del(&indirect_page->lru);
-					pfn = page_to_pfn(indirect_page);
+			sg_total = sg->length;
+			shared_off = xen_offset_in_page(sg->offset);
+			bvec_off = sg->offset;
+			pfn = xen_page_to_pfn(sg_page(sg)) + (sg->offset >> XEN_PAGE_SHIFT);
+
+			while (sg_total != 0) {
+				if ((ring_req->operation == BLKIF_OP_INDIRECT) &&
+				    (grant % XEN_PAGES_PER_INDIRECT_FRAME == 0)) {
+					unsigned long uninitialized_var(pfn);
+
+					if (segments)
+						kunmap_atomic(segments);
+
+					n = grant / XEN_PAGES_PER_INDIRECT_FRAME;
+					if (!info->feature_persistent) {
+						struct page *indirect_page;
+
+						/* Fetch a pre-allocated page to use for indirect grefs */
+						BUG_ON(list_empty(&info->indirect_pages));
+						indirect_page = list_first_entry(&info->indirect_pages,
+						                                 struct page, lru);
+						list_del(&indirect_page->lru);
+						pfn = xen_page_to_pfn(indirect_page);
+					}
+					gnt_list_entry = get_grant(&gref_head, pfn, info);
+					info->shadow[id].indirect_grants[n] = gnt_list_entry;
+					segments = kmap_atomic(xen_pfn_to_page(gnt_list_entry->pfn));
+					ring_req->u.indirect.indirect_grefs[n] = gnt_list_entry->gref;
 				}
-				gnt_list_entry = get_grant(&gref_head, pfn, info);
-				info->shadow[id].indirect_grants[n] = gnt_list_entry;
-				segments = kmap_atomic(pfn_to_page(gnt_list_entry->pfn));
-				ring_req->u.indirect.indirect_grefs[n] = gnt_list_entry->gref;
-			}
 
-			gnt_list_entry = get_grant(&gref_head, page_to_pfn(sg_page(sg)), info);
-			ref = gnt_list_entry->gref;
+				shared_len = min(sg_total, (unsigned)XEN_PAGE_SIZE - shared_off);
 
-			info->shadow[id].grants_used[i] = gnt_list_entry;
 
-			if (rq_data_dir(req) && info->feature_persistent) {
-				char *bvec_data;
-				void *shared_data;
+				gnt_list_entry = get_grant(&gref_head, pfn++, info);
+				ref = gnt_list_entry->gref;
 
-				BUG_ON(sg->offset + sg->length > PAGE_SIZE);
+				info->shadow[id].grants_used[grant] = gnt_list_entry;
 
-				shared_data = kmap_atomic(pfn_to_page(gnt_list_entry->pfn));
-				bvec_data = kmap_atomic(sg_page(sg));
+				if (rq_data_dir(req) && info->feature_persistent) {
+					char *bvec_data;
+					void *shared_data;
 
-				/*
-				 * this does not wipe data stored outside the
-				 * range sg->offset..sg->offset+sg->length.
-				 * Therefore, blkback *could* see data from
-				 * previous requests. This is OK as long as
-				 * persistent grants are shared with just one
-				 * domain. It may need refactoring if this
-				 * changes
-				 */
-				memcpy(shared_data + sg->offset,
-				       bvec_data   + sg->offset,
-				       sg->length);
+					BUG_ON(sg->offset + sg->length > PAGE_SIZE);
 
-				kunmap_atomic(bvec_data);
-				kunmap_atomic(shared_data);
-			}
-			if (ring_req->operation != BLKIF_OP_INDIRECT) {
-				ring_req->u.rw.seg[i] =
+					shared_data = kmap_atomic(xen_pfn_to_page(gnt_list_entry->pfn));
+					bvec_data = kmap_atomic(sg_page(sg));
+
+					/*
+					 * this does not wipe data stored outside the
+					 * range sg->offset..sg->offset+sg->length.
+					 * Therefore, blkback *could* see data from
+					 * previous requests. This is OK as long as
+					 * persistent grants are shared with just one
+					 * domain. It may need refactoring if this
+					 * changes
+					 */
+					memcpy(shared_data + shared_off,
+					       bvec_data   + bvec_off,
+					       shared_len);
+
+					kunmap_atomic(bvec_data);
+					kunmap_atomic(shared_data);
+					bvec_off += shared_len;
+				}
+
+				fsect = shared_off >> 9;
+				lsect = fsect + (shared_len >> 9) - 1;
+				if (ring_req->operation != BLKIF_OP_INDIRECT) {
+					ring_req->u.rw.seg[grant] =
+							(struct blkif_request_segment) {
+								.gref       = ref,
+								.first_sect = fsect,
+								.last_sect  = lsect };
+				} else {
+					n = grant % XEN_PAGES_PER_INDIRECT_FRAME;
+					segments[n] =
 						(struct blkif_request_segment) {
-							.gref       = ref,
-							.first_sect = fsect,
-							.last_sect  = lsect };
-			} else {
-				n = i % SEGS_PER_INDIRECT_FRAME;
-				segments[n] =
-					(struct blkif_request_segment) {
-							.gref       = ref,
-							.first_sect = fsect,
-							.last_sect  = lsect };
+								.gref       = ref,
+								.first_sect = fsect,
+								.last_sect  = lsect };
+				}
+
+				sg_total -= shared_len;
+				shared_off = 0;
+				grant++;
 			}
 		}
 		if (segments)
@@ -674,14 +710,14 @@ static int xlvbd_init_blk_queue(struct gendisk *gd, u16 sector_size,
 	/* Hard sector size and max sectors impersonate the equiv. hardware. */
 	blk_queue_logical_block_size(rq, sector_size);
 	blk_queue_physical_block_size(rq, physical_sector_size);
-	blk_queue_max_hw_sectors(rq, (segments * PAGE_SIZE) / 512);
+	blk_queue_max_hw_sectors(rq, (segments * XEN_PAGE_SIZE) / 512);
 
 	/* Each segment in a request is up to an aligned page in size. */
 	blk_queue_segment_boundary(rq, PAGE_SIZE - 1);
 	blk_queue_max_segment_size(rq, PAGE_SIZE);
 
 	/* Ensure a merged request will fit in a single I/O ring slot. */
-	blk_queue_max_segments(rq, segments);
+	blk_queue_max_segments(rq, segments / XEN_PAGES_PER_SEGMENT);
 
 	/* Make sure buffer addresses are sector-aligned. */
 	blk_queue_dma_alignment(rq, 511);
@@ -961,7 +997,7 @@ static void blkif_free(struct blkfront_info *info, int suspend)
 				info->persistent_gnts_c--;
 			}
 			if (info->feature_persistent)
-				__free_page(pfn_to_page(persistent_gnt->pfn));
+				__free_page(xen_pfn_to_page(persistent_gnt->pfn));
 			kfree(persistent_gnt);
 		}
 	}
@@ -996,7 +1032,7 @@ static void blkif_free(struct blkfront_info *info, int suspend)
 			persistent_gnt = info->shadow[i].grants_used[j];
 			gnttab_end_foreign_access(persistent_gnt->gref, 0, 0UL);
 			if (info->feature_persistent)
-				__free_page(pfn_to_page(persistent_gnt->pfn));
+				__free_page(xen_pfn_to_page(persistent_gnt->pfn));
 			kfree(persistent_gnt);
 		}
 
@@ -1010,7 +1046,7 @@ static void blkif_free(struct blkfront_info *info, int suspend)
 		for (j = 0; j < INDIRECT_GREFS(segs); j++) {
 			persistent_gnt = info->shadow[i].indirect_grants[j];
 			gnttab_end_foreign_access(persistent_gnt->gref, 0, 0UL);
-			__free_page(pfn_to_page(persistent_gnt->pfn));
+			__free_page(xen_pfn_to_page(persistent_gnt->pfn));
 			kfree(persistent_gnt);
 		}
 
@@ -1050,26 +1086,42 @@ static void blkif_completion(struct blk_shadow *s, struct blkfront_info *info,
 	struct scatterlist *sg;
 	char *bvec_data;
 	void *shared_data;
-	int nseg;
+	int nseg, nr_page;
+	unsigned int total, bvec_offset, shared_offset, length;
+	unsigned int grant = 0;
 
-	nseg = s->req.operation == BLKIF_OP_INDIRECT ?
+	nr_page = s->req.operation == BLKIF_OP_INDIRECT ?
 		s->req.u.indirect.nr_segments : s->req.u.rw.nr_segments;
+	nseg = s->num_sg;
 
 	if (bret->operation == BLKIF_OP_READ && info->feature_persistent) {
 		for_each_sg(s->sg, sg, nseg, i) {
 			BUG_ON(sg->offset + sg->length > PAGE_SIZE);
-			shared_data = kmap_atomic(
-				pfn_to_page(s->grants_used[i]->pfn));
+
+			bvec_offset = sg->offset;
+			shared_offset = xen_offset_in_page(sg->offset);
 			bvec_data = kmap_atomic(sg_page(sg));
-			memcpy(bvec_data   + sg->offset,
-			       shared_data + sg->offset,
-			       sg->length);
+			total = sg->length;
+
+			while (total != 0) {
+				length = min(total, (unsigned)XEN_PAGE_SIZE - shared_offset);
+				shared_data = kmap_atomic(
+					xen_pfn_to_page(s->grants_used[grant]->pfn));
+				memcpy(bvec_data   + bvec_offset,
+				       shared_data + shared_offset,
+				       length);
+				kunmap_atomic(shared_data);
+
+				shared_offset = 0;
+				bvec_offset += length;
+				total -= length;
+				grant++;
+			}
 			kunmap_atomic(bvec_data);
-			kunmap_atomic(shared_data);
 		}
 	}
 	/* Add the persistent grant into the list of free grants */
-	for (i = 0; i < nseg; i++) {
+	for (i = 0; i < nr_page; i++) {
 		if (gnttab_query_foreign_access(s->grants_used[i]->gref)) {
 			/*
 			 * If the grant is still mapped by the backend (the
@@ -1095,7 +1147,7 @@ static void blkif_completion(struct blk_shadow *s, struct blkfront_info *info,
 		}
 	}
 	if (s->req.operation == BLKIF_OP_INDIRECT) {
-		for (i = 0; i < INDIRECT_GREFS(nseg); i++) {
+		for (i = 0; i < INDIRECT_GREFS(nr_page); i++) {
 			if (gnttab_query_foreign_access(s->indirect_grants[i]->gref)) {
 				if (!info->feature_persistent)
 					pr_alert_ratelimited("backed has not unmapped grant: %u\n",
@@ -1110,7 +1162,7 @@ static void blkif_completion(struct blk_shadow *s, struct blkfront_info *info,
 				 * Add the used indirect page back to the list of
 				 * available pages for indirect grefs.
 				 */
-				indirect_page = pfn_to_page(s->indirect_grants[i]->pfn);
+				indirect_page = xen_pfn_to_page(s->indirect_grants[i]->pfn);
 				list_add(&indirect_page->lru, &info->indirect_pages);
 				s->indirect_grants[i]->gref = GRANT_INVALID_REF;
 				list_add_tail(&s->indirect_grants[i]->node, &info->grants);
@@ -1248,7 +1300,7 @@ static int setup_blkring(struct xenbus_device *dev,
 		return -ENOMEM;
 	}
 	SHARED_RING_INIT(sring);
-	FRONT_RING_INIT(&info->ring, sring, PAGE_SIZE);
+	FRONT_RING_INIT(&info->ring, sring, XEN_PAGE_SIZE);
 
 	err = xenbus_grant_ring(dev, info->ring.sring, 1, &gref);
 	if (err < 0) {
@@ -1562,8 +1614,8 @@ static int blkif_recover(struct blkfront_info *info)
 			atomic_set(&split_bio->pending, pending);
 			split_bio->bio = bio;
 			for (i = 0; i < pending; i++) {
-				offset = (i * segs * PAGE_SIZE) >> 9;
-				size = min((unsigned int)(segs * PAGE_SIZE) >> 9,
+				offset = (i * segs * XEN_PAGE_SIZE) >> 9;
+				size = min((unsigned int)(segs * XEN_PAGE_SIZE) >> 9,
 					   (unsigned int)bio_sectors(bio) - offset);
 				cloned_bio = bio_clone(bio, GFP_NOIO);
 				BUG_ON(cloned_bio == NULL);
@@ -1674,7 +1726,7 @@ static void blkfront_setup_discard(struct blkfront_info *info)
 
 static int blkfront_setup_indirect(struct blkfront_info *info)
 {
-	unsigned int indirect_segments, segs;
+	unsigned int indirect_segments, segs, nr_page;
 	int err, i;
 
 	err = xenbus_gather(XBT_NIL, info->xbdev->otherend,
@@ -1682,14 +1734,15 @@ static int blkfront_setup_indirect(struct blkfront_info *info)
 			    NULL);
 	if (err) {
 		info->max_indirect_segments = 0;
-		segs = BLKIF_MAX_SEGMENTS_PER_REQUEST;
+		nr_page = BLKIF_MAX_SEGMENTS_PER_REQUEST;
 	} else {
 		info->max_indirect_segments = min(indirect_segments,
 						  xen_blkif_max_segments);
-		segs = info->max_indirect_segments;
+		nr_page = info->max_indirect_segments;
 	}
+	segs = nr_page / XEN_PAGES_PER_SEGMENT;
 
-	err = fill_grant_buffer(info, (segs + INDIRECT_GREFS(segs)) * BLK_RING_SIZE);
+	err = fill_grant_buffer(info, (nr_page + INDIRECT_GREFS(nr_page)) * BLK_RING_SIZE);
 	if (err)
 		goto out_of_memory;
 
@@ -1699,7 +1752,7 @@ static int blkfront_setup_indirect(struct blkfront_info *info)
 		 * grants, we need to allocate a set of pages that can be
 		 * used for mapping indirect grefs
 		 */
-		int num = INDIRECT_GREFS(segs) * BLK_RING_SIZE;
+		int num = INDIRECT_GREFS(nr_page) * BLK_RING_SIZE;
 
 		BUG_ON(!list_empty(&info->indirect_pages));
 		for (i = 0; i < num; i++) {
@@ -1712,13 +1765,13 @@ static int blkfront_setup_indirect(struct blkfront_info *info)
 
 	for (i = 0; i < BLK_RING_SIZE; i++) {
 		info->shadow[i].grants_used = kzalloc(
-			sizeof(info->shadow[i].grants_used[0]) * segs,
+			sizeof(info->shadow[i].grants_used[0]) * nr_page,
 			GFP_NOIO);
 		info->shadow[i].sg = kzalloc(sizeof(info->shadow[i].sg[0]) * segs, GFP_NOIO);
 		if (info->max_indirect_segments)
 			info->shadow[i].indirect_grants = kzalloc(
 				sizeof(info->shadow[i].indirect_grants[0]) *
-				INDIRECT_GREFS(segs),
+				INDIRECT_GREFS(nr_page),
 				GFP_NOIO);
 		if ((info->shadow[i].grants_used == NULL) ||
 			(info->shadow[i].sg == NULL) ||
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [RFC 19/23] block/xen-blkback: Make it running on 64KB page granularity
  2015-05-14 17:00 ` Julien Grall
@ 2015-05-14 17:00   ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:00 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Konrad Rzeszutek Wilk, Roger Pau Monné,
	Boris Ostrovsky, David Vrabel

The PV block protocol uses 4KB page granularity. The goal of this
patch is to allow a Linux kernel using 64KB page granularity to act as
a block backend on an unmodified Xen.

It's only necessary to adapt the ring size and the number of requests
per indirect frame. The rest of the code relies on the grant table
code.

Note that the grant table code allocates a Linux page per grant, which
wastes 60KB for every grant when Linux uses 64KB page granularity.
This could be improved by sharing the page between multiple grants.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>

---

Improvements such as support for 64KB grants are not considered in
this patch because we have the requirement to run Linux with 64KB
pages on an unmodified Xen.

This has been tested only with a loop device. I plan to test passing a
hard drive partition, but I haven't yet converted the swiotlb code.
---
 drivers/block/xen-blkback/blkback.c |  5 +++--
 drivers/block/xen-blkback/common.h  | 16 +++++++++++++---
 drivers/block/xen-blkback/xenbus.c  |  6 +++---
 3 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index 7049528..1803c07 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -954,7 +954,7 @@ static int xen_blkbk_parse_indirect(struct blkif_request *req,
 		seg[n].nsec = segments[i].last_sect -
 			segments[i].first_sect + 1;
 		seg[n].offset = (segments[i].first_sect << 9);
-		if ((segments[i].last_sect >= (PAGE_SIZE >> 9)) ||
+		if ((segments[i].last_sect >= (XEN_PAGE_SIZE >> 9)) ||
 		    (segments[i].last_sect < segments[i].first_sect)) {
 			rc = -EINVAL;
 			goto unmap;
@@ -1203,6 +1203,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
 
 	req_operation = req->operation == BLKIF_OP_INDIRECT ?
 			req->u.indirect.indirect_op : req->operation;
+
 	if ((req->operation == BLKIF_OP_INDIRECT) &&
 	    (req_operation != BLKIF_OP_READ) &&
 	    (req_operation != BLKIF_OP_WRITE)) {
@@ -1261,7 +1262,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
 			seg[i].nsec = req->u.rw.seg[i].last_sect -
 				req->u.rw.seg[i].first_sect + 1;
 			seg[i].offset = (req->u.rw.seg[i].first_sect << 9);
-			if ((req->u.rw.seg[i].last_sect >= (PAGE_SIZE >> 9)) ||
+			if ((req->u.rw.seg[i].last_sect >= (XEN_PAGE_SIZE >> 9)) ||
 			    (req->u.rw.seg[i].last_sect <
 			     req->u.rw.seg[i].first_sect))
 				goto fail_response;
diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h
index 7a03e07..ef15ad4 100644
--- a/drivers/block/xen-blkback/common.h
+++ b/drivers/block/xen-blkback/common.h
@@ -39,6 +39,7 @@
 #include <asm/pgalloc.h>
 #include <asm/hypervisor.h>
 #include <xen/grant_table.h>
+#include <xen/page.h>
 #include <xen/xenbus.h>
 #include <xen/interface/io/ring.h>
 #include <xen/interface/io/blkif.h>
@@ -50,12 +51,21 @@
  */
 #define MAX_INDIRECT_SEGMENTS 256
 
-#define SEGS_PER_INDIRECT_FRAME \
-	(PAGE_SIZE/sizeof(struct blkif_request_segment))
+/*
+ * Xen use 4K pages. The guest may use different page size (4K or 64K)
+ * Number of Xen pages per segment
+ */
+#define XEN_PAGES_PER_SEGMENT   (PAGE_SIZE / XEN_PAGE_SIZE)
+
+#define SEGS_PER_INDIRECT_FRAME	\
+	(XEN_PAGE_SIZE/sizeof(struct blkif_request_segment) / XEN_PAGES_PER_SEGMENT)
+#define XEN_PAGES_PER_INDIRECT_FRAME \
+	(XEN_PAGE_SIZE/sizeof(struct blkif_request_segment))
+
 #define MAX_INDIRECT_PAGES \
 	((MAX_INDIRECT_SEGMENTS + SEGS_PER_INDIRECT_FRAME - 1)/SEGS_PER_INDIRECT_FRAME)
 #define INDIRECT_PAGES(_segs) \
-	((_segs + SEGS_PER_INDIRECT_FRAME - 1)/SEGS_PER_INDIRECT_FRAME)
+	((_segs + XEN_PAGES_PER_INDIRECT_FRAME - 1)/XEN_PAGES_PER_INDIRECT_FRAME)
 
 /* Not a real protocol.  Used to generate ring structs which contain
  * the elements common to all protocols only.  This way we get a
diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index 6ab69ad..2fcf24e 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -217,21 +217,21 @@ static int xen_blkif_map(struct xen_blkif *blkif, grant_ref_t gref,
 	{
 		struct blkif_sring *sring;
 		sring = (struct blkif_sring *)blkif->blk_ring;
-		BACK_RING_INIT(&blkif->blk_rings.native, sring, PAGE_SIZE);
+		BACK_RING_INIT(&blkif->blk_rings.native, sring, XEN_PAGE_SIZE);
 		break;
 	}
 	case BLKIF_PROTOCOL_X86_32:
 	{
 		struct blkif_x86_32_sring *sring_x86_32;
 		sring_x86_32 = (struct blkif_x86_32_sring *)blkif->blk_ring;
-		BACK_RING_INIT(&blkif->blk_rings.x86_32, sring_x86_32, PAGE_SIZE);
+		BACK_RING_INIT(&blkif->blk_rings.x86_32, sring_x86_32, XEN_PAGE_SIZE);
 		break;
 	}
 	case BLKIF_PROTOCOL_X86_64:
 	{
 		struct blkif_x86_64_sring *sring_x86_64;
 		sring_x86_64 = (struct blkif_x86_64_sring *)blkif->blk_ring;
-		BACK_RING_INIT(&blkif->blk_rings.x86_64, sring_x86_64, PAGE_SIZE);
+		BACK_RING_INIT(&blkif->blk_rings.x86_64, sring_x86_64, XEN_PAGE_SIZE);
 		break;
 	}
 	default:
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [RFC 20/23] net/xen-netfront: Make it running on 64KB page granularity
  2015-05-14 17:00 ` Julien Grall
@ 2015-05-14 17:01   ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:01 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Konrad Rzeszutek Wilk, Boris Ostrovsky,
	David Vrabel, netdev

The PV network protocol uses 4KB page granularity. The goal of this
patch is to allow a Linux kernel using 64KB page granularity to use
network devices on an unmodified Xen.

It's only necessary to adapt the ring size and break skb data into
small chunks of 4KB. The rest of the code relies on the grant table
code.

Note that we allocate a Linux page for each rx skb but only the first
4KB is used. We may improve the memory usage by extending the size of
the rx skb.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: netdev@vger.kernel.org

---

Improvements such as support for 64KB grants are not considered in
this patch because we have the requirement to run Linux with 64KB
pages on an unmodified Xen.

Tested with workloads such as ping, ssh, wget, git... I would be happy
if someone could give details on how to test all the paths.
---
 drivers/net/xen-netfront.c | 43 ++++++++++++++++++++++++++-----------------
 1 file changed, 26 insertions(+), 17 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 6a0e329..32a1cb2 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -74,8 +74,8 @@ struct netfront_cb {
 
 #define GRANT_INVALID_REF	0
 
-#define NET_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
-#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
+#define NET_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, XEN_PAGE_SIZE)
+#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, XEN_PAGE_SIZE)
 
 /* Minimum number of Rx slots (includes slot for GSO metadata). */
 #define NET_RX_SLOTS_MIN (XEN_NETIF_NR_SLOTS_MIN + 1)
@@ -267,7 +267,7 @@ static struct sk_buff *xennet_alloc_one_rx_buffer(struct netfront_queue *queue)
 		kfree_skb(skb);
 		return NULL;
 	}
-	skb_add_rx_frag(skb, 0, page, 0, 0, PAGE_SIZE);
+	skb_add_rx_frag(skb, 0, page, 0, 0, XEN_PAGE_SIZE);
 
 	/* Align ip header to a 16 bytes boundary */
 	skb_reserve(skb, NET_IP_ALIGN);
@@ -291,7 +291,7 @@ static void xennet_alloc_rx_buffers(struct netfront_queue *queue)
 		struct sk_buff *skb;
 		unsigned short id;
 		grant_ref_t ref;
-		unsigned long pfn;
+		unsigned long mfn;
 		struct xen_netif_rx_request *req;
 
 		skb = xennet_alloc_one_rx_buffer(queue);
@@ -307,12 +307,12 @@ static void xennet_alloc_rx_buffers(struct netfront_queue *queue)
 		BUG_ON((signed short)ref < 0);
 		queue->grant_rx_ref[id] = ref;
 
-		pfn = page_to_pfn(skb_frag_page(&skb_shinfo(skb)->frags[0]));
+		mfn = page_to_mfn(skb_frag_page(&skb_shinfo(skb)->frags[0]), 0);
 
 		req = RING_GET_REQUEST(&queue->rx, req_prod);
 		gnttab_grant_foreign_access_ref(ref,
 						queue->info->xbdev->otherend_id,
-						pfn_to_mfn(pfn),
+						mfn,
 						0);
 
 		req->id = id;
@@ -422,8 +422,10 @@ static struct xen_netif_tx_request *xennet_make_one_txreq(
 	unsigned int id;
 	struct xen_netif_tx_request *tx;
 	grant_ref_t ref;
+	unsigned int off_grant;
 
-	len = min_t(unsigned int, PAGE_SIZE - offset, len);
+	off_grant = offset & ~XEN_PAGE_MASK;
+	len = min_t(unsigned int, XEN_PAGE_SIZE - off_grant, len);
 
 	id = get_id_from_freelist(&queue->tx_skb_freelist, queue->tx_skbs);
 	tx = RING_GET_REQUEST(&queue->tx, queue->tx.req_prod_pvt++);
@@ -431,7 +433,8 @@ static struct xen_netif_tx_request *xennet_make_one_txreq(
 	BUG_ON((signed short)ref < 0);
 
 	gnttab_grant_foreign_access_ref(ref, queue->info->xbdev->otherend_id,
-					page_to_mfn(page, 0), GNTMAP_readonly);
+					page_to_mfn(page, offset),
+					GNTMAP_readonly);
 
 	queue->tx_skbs[id].skb = skb;
 	queue->grant_tx_page[id] = page;
@@ -439,7 +442,7 @@ static struct xen_netif_tx_request *xennet_make_one_txreq(
 
 	tx->id = id;
 	tx->gref = ref;
-	tx->offset = offset;
+	tx->offset = off_grant;
 	tx->size = len;
 	tx->flags = 0;
 
@@ -459,8 +462,11 @@ static struct xen_netif_tx_request *xennet_make_txreqs(
 		tx->flags |= XEN_NETTXF_more_data;
 		tx = xennet_make_one_txreq(queue, skb_get(skb),
 					   page, offset, len);
-		page++;
-		offset = 0;
+		offset += tx->size;
+		if (offset == PAGE_SIZE) {
+			page++;
+			offset = 0;
+		}
 		len -= tx->size;
 	}
 
@@ -567,8 +573,11 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	/* First request for the linear area. */
 	first_tx = tx = xennet_make_one_txreq(queue, skb,
 					      page, offset, len);
-	page++;
-	offset = 0;
+	offset += tx->size;
+	if (offset == PAGE_SIZE) {
+		page++;
+		offset = 0;
+	}
 	len -= tx->size;
 
 	if (skb->ip_summed == CHECKSUM_PARTIAL)
@@ -730,7 +739,7 @@ static int xennet_get_responses(struct netfront_queue *queue,
 
 	for (;;) {
 		if (unlikely(rx->status < 0 ||
-			     rx->offset + rx->status > PAGE_SIZE)) {
+			     rx->offset + rx->status > XEN_PAGE_SIZE)) {
 			if (net_ratelimit())
 				dev_warn(dev, "rx->offset: %x, size: %d\n",
 					 rx->offset, rx->status);
@@ -839,7 +848,7 @@ static RING_IDX xennet_fill_frags(struct netfront_queue *queue,
 		BUG_ON(shinfo->nr_frags >= MAX_SKB_FRAGS);
 
 		skb_add_rx_frag(skb, shinfo->nr_frags, skb_frag_page(nfrag),
-				rx->offset, rx->status, PAGE_SIZE);
+				rx->offset, rx->status, XEN_PAGE_SIZE);
 
 		skb_shinfo(nskb)->nr_frags = 0;
 		kfree_skb(nskb);
@@ -1497,7 +1506,7 @@ static int setup_netfront(struct xenbus_device *dev,
 		goto fail;
 	}
 	SHARED_RING_INIT(txs);
-	FRONT_RING_INIT(&queue->tx, txs, PAGE_SIZE);
+	FRONT_RING_INIT(&queue->tx, txs, XEN_PAGE_SIZE);
 
 	err = xenbus_grant_ring(dev, txs, 1, &gref);
 	if (err < 0)
@@ -1511,7 +1520,7 @@ static int setup_netfront(struct xenbus_device *dev,
 		goto alloc_rx_ring_fail;
 	}
 	SHARED_RING_INIT(rxs);
-	FRONT_RING_INIT(&queue->rx, rxs, PAGE_SIZE);
+	FRONT_RING_INIT(&queue->rx, rxs, XEN_PAGE_SIZE);
 
 	err = xenbus_grant_ring(dev, rxs, 1, &gref);
 	if (err < 0)
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [RFC 20/23] net/xen-netfront: Make it running on 64KB page granularity
@ 2015-05-14 17:01   ` Julien Grall
  0 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:01 UTC (permalink / raw)
  To: linux-arm-kernel

The PV network protocol is using 4KB page granularity. The goal of this
patch is to allow a Linux using 64KB page granularity using network
device on a non-modified Xen.

It's only necessary to adapt the ring size and break skb data in small
chunk of 4KB. The rest of the code is relying on the grant table code.

Note that we allocate a Linux page for each rx skb but only the first
4KB is used. We may improve the memory usage by extending the size of
the rx skb.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: netdev at vger.kernel.org

---

Improvement such as support of 64KB grant is not taken into
consideration in this patch because we have the requirement to run a Linux
using 64KB pages on a non-modified Xen.

Tested with workload such as ping, ssh, wget, git... I would happy if
someone give details how to test all the path.
---
 drivers/net/xen-netfront.c | 43 ++++++++++++++++++++++++++-----------------
 1 file changed, 26 insertions(+), 17 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 6a0e329..32a1cb2 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -74,8 +74,8 @@ struct netfront_cb {
 
 #define GRANT_INVALID_REF	0
 
-#define NET_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
-#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
+#define NET_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, XEN_PAGE_SIZE)
+#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, XEN_PAGE_SIZE)
 
 /* Minimum number of Rx slots (includes slot for GSO metadata). */
 #define NET_RX_SLOTS_MIN (XEN_NETIF_NR_SLOTS_MIN + 1)
@@ -267,7 +267,7 @@ static struct sk_buff *xennet_alloc_one_rx_buffer(struct netfront_queue *queue)
 		kfree_skb(skb);
 		return NULL;
 	}
-	skb_add_rx_frag(skb, 0, page, 0, 0, PAGE_SIZE);
+	skb_add_rx_frag(skb, 0, page, 0, 0, XEN_PAGE_SIZE);
 
 	/* Align ip header to a 16 bytes boundary */
 	skb_reserve(skb, NET_IP_ALIGN);
@@ -291,7 +291,7 @@ static void xennet_alloc_rx_buffers(struct netfront_queue *queue)
 		struct sk_buff *skb;
 		unsigned short id;
 		grant_ref_t ref;
-		unsigned long pfn;
+		unsigned long mfn;
 		struct xen_netif_rx_request *req;
 
 		skb = xennet_alloc_one_rx_buffer(queue);
@@ -307,12 +307,12 @@ static void xennet_alloc_rx_buffers(struct netfront_queue *queue)
 		BUG_ON((signed short)ref < 0);
 		queue->grant_rx_ref[id] = ref;
 
-		pfn = page_to_pfn(skb_frag_page(&skb_shinfo(skb)->frags[0]));
+		mfn = page_to_mfn(skb_frag_page(&skb_shinfo(skb)->frags[0]), 0);
 
 		req = RING_GET_REQUEST(&queue->rx, req_prod);
 		gnttab_grant_foreign_access_ref(ref,
 						queue->info->xbdev->otherend_id,
-						pfn_to_mfn(pfn),
+						mfn,
 						0);
 
 		req->id = id;
@@ -422,8 +422,10 @@ static struct xen_netif_tx_request *xennet_make_one_txreq(
 	unsigned int id;
 	struct xen_netif_tx_request *tx;
 	grant_ref_t ref;
+	unsigned int off_grant;
 
-	len = min_t(unsigned int, PAGE_SIZE - offset, len);
+	off_grant = offset & ~XEN_PAGE_MASK;
+	len = min_t(unsigned int, XEN_PAGE_SIZE - off_grant, len);
 
 	id = get_id_from_freelist(&queue->tx_skb_freelist, queue->tx_skbs);
 	tx = RING_GET_REQUEST(&queue->tx, queue->tx.req_prod_pvt++);
@@ -431,7 +433,8 @@ static struct xen_netif_tx_request *xennet_make_one_txreq(
 	BUG_ON((signed short)ref < 0);
 
 	gnttab_grant_foreign_access_ref(ref, queue->info->xbdev->otherend_id,
-					page_to_mfn(page, 0), GNTMAP_readonly);
+					page_to_mfn(page, offset),
+					GNTMAP_readonly);
 
 	queue->tx_skbs[id].skb = skb;
 	queue->grant_tx_page[id] = page;
@@ -439,7 +442,7 @@ static struct xen_netif_tx_request *xennet_make_one_txreq(
 
 	tx->id = id;
 	tx->gref = ref;
-	tx->offset = offset;
* [RFC 20/23] net/xen-netfront: Make it run on 64KB page granularity
  2015-05-14 17:00 ` Julien Grall
                   ` (28 preceding siblings ...)
@ 2015-05-14 17:01 ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:01 UTC (permalink / raw)
  To: xen-devel
  Cc: ian.campbell, stefano.stabellini, netdev, tim, linux-kernel,
	Julien Grall, David Vrabel, Boris Ostrovsky, linux-arm-kernel

The PV network protocol uses 4KB page granularity. The goal of this
patch is to allow a Linux guest using 64KB page granularity to use a
network device on unmodified Xen.

It is only necessary to adapt the ring size and to break the skb data
into small chunks of 4KB. The rest of the code relies on the grant
table code.
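The chunking above can be modelled by a small standalone sketch
(illustrative only; the constants and helper names below are
assumptions for the sketch, with PAGE_SIZE fixed at 64KB and
XEN_PAGE_SIZE at 4KB as in this series):

```c
#include <assert.h>

#define XEN_PAGE_SIZE 4096UL              /* granularity of the PV protocol */
#define XEN_PAGE_MASK (~(XEN_PAGE_SIZE - 1))
#define PAGE_SIZE     65536UL             /* 64KB Linux page, assumed here */

/* Length of one tx request: from 'offset' up to the next 4KB boundary,
 * capped at the remaining data (the shape of the length computation in
 * the one-request helper). */
static unsigned long txreq_len(unsigned long offset, unsigned long len)
{
	unsigned long off_grant = offset & ~XEN_PAGE_MASK;

	return len < XEN_PAGE_SIZE - off_grant ? len : XEN_PAGE_SIZE - off_grant;
}

/* Count how many 4KB-bounded requests a buffer needs, advancing the
 * offset and moving to the next Linux page only at PAGE_SIZE. */
static unsigned int count_txreqs(unsigned long offset, unsigned long len)
{
	unsigned int n = 0;

	while (len) {
		unsigned long chunk = txreq_len(offset, len);

		offset += chunk;
		if (offset == PAGE_SIZE)   /* next Linux page */
			offset = 0;
		len -= chunk;
		n++;
	}
	return n;
}
```

With these values, a 4KB buffer starting at an unaligned offset needs
two requests, and a full 64KB page needs sixteen.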

Note that we allocate a Linux page for each rx skb but only the first
4KB is used. We may improve memory usage by extending the size of the
rx skb.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Cc: netdev@vger.kernel.org

---

Improvements such as support for 64KB grants are not taken into
consideration in this patch because we have the requirement to run a
Linux using 64KB pages on unmodified Xen.

Tested with workloads such as ping, ssh, wget, git... I would be happy
if someone could give details on how to test all the paths.
---
 drivers/net/xen-netfront.c | 43 ++++++++++++++++++++++++++-----------------
 1 file changed, 26 insertions(+), 17 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index 6a0e329..32a1cb2 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -74,8 +74,8 @@ struct netfront_cb {
 
 #define GRANT_INVALID_REF	0
 
-#define NET_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
-#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
+#define NET_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, XEN_PAGE_SIZE)
+#define NET_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, XEN_PAGE_SIZE)
 
 /* Minimum number of Rx slots (includes slot for GSO metadata). */
 #define NET_RX_SLOTS_MIN (XEN_NETIF_NR_SLOTS_MIN + 1)
@@ -267,7 +267,7 @@ static struct sk_buff *xennet_alloc_one_rx_buffer(struct netfront_queue *queue)
 		kfree_skb(skb);
 		return NULL;
 	}
-	skb_add_rx_frag(skb, 0, page, 0, 0, PAGE_SIZE);
+	skb_add_rx_frag(skb, 0, page, 0, 0, XEN_PAGE_SIZE);
 
 	/* Align ip header to a 16 bytes boundary */
 	skb_reserve(skb, NET_IP_ALIGN);
@@ -291,7 +291,7 @@ static void xennet_alloc_rx_buffers(struct netfront_queue *queue)
 		struct sk_buff *skb;
 		unsigned short id;
 		grant_ref_t ref;
-		unsigned long pfn;
+		unsigned long mfn;
 		struct xen_netif_rx_request *req;
 
 		skb = xennet_alloc_one_rx_buffer(queue);
@@ -307,12 +307,12 @@ static void xennet_alloc_rx_buffers(struct netfront_queue *queue)
 		BUG_ON((signed short)ref < 0);
 		queue->grant_rx_ref[id] = ref;
 
-		pfn = page_to_pfn(skb_frag_page(&skb_shinfo(skb)->frags[0]));
+		mfn = page_to_mfn(skb_frag_page(&skb_shinfo(skb)->frags[0]), 0);
 
 		req = RING_GET_REQUEST(&queue->rx, req_prod);
 		gnttab_grant_foreign_access_ref(ref,
 						queue->info->xbdev->otherend_id,
-						pfn_to_mfn(pfn),
+						mfn,
 						0);
 
 		req->id = id;
@@ -422,8 +422,10 @@ static struct xen_netif_tx_request *xennet_make_one_txreq(
 	unsigned int id;
 	struct xen_netif_tx_request *tx;
 	grant_ref_t ref;
+	unsigned int off_grant;
 
-	len = min_t(unsigned int, PAGE_SIZE - offset, len);
+	off_grant = offset & ~XEN_PAGE_MASK;
+	len = min_t(unsigned int, XEN_PAGE_SIZE - off_grant, len);
 
 	id = get_id_from_freelist(&queue->tx_skb_freelist, queue->tx_skbs);
 	tx = RING_GET_REQUEST(&queue->tx, queue->tx.req_prod_pvt++);
@@ -431,7 +433,8 @@ static struct xen_netif_tx_request *xennet_make_one_txreq(
 	BUG_ON((signed short)ref < 0);
 
 	gnttab_grant_foreign_access_ref(ref, queue->info->xbdev->otherend_id,
-					page_to_mfn(page, 0), GNTMAP_readonly);
+					page_to_mfn(page, offset),
+					GNTMAP_readonly);
 
 	queue->tx_skbs[id].skb = skb;
 	queue->grant_tx_page[id] = page;
@@ -439,7 +442,7 @@ static struct xen_netif_tx_request *xennet_make_one_txreq(
 
 	tx->id = id;
 	tx->gref = ref;
-	tx->offset = offset;
+	tx->offset = off_grant;
 	tx->size = len;
 	tx->flags = 0;
 
@@ -459,8 +462,11 @@ static struct xen_netif_tx_request *xennet_make_txreqs(
 		tx->flags |= XEN_NETTXF_more_data;
 		tx = xennet_make_one_txreq(queue, skb_get(skb),
 					   page, offset, len);
-		page++;
-		offset = 0;
+		offset += tx->size;
+		if (offset == PAGE_SIZE) {
+			page++;
+			offset = 0;
+		}
 		len -= tx->size;
 	}
 
@@ -567,8 +573,11 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	/* First request for the linear area. */
 	first_tx = tx = xennet_make_one_txreq(queue, skb,
 					      page, offset, len);
-	page++;
-	offset = 0;
+	offset += tx->size;
+	if (offset == PAGE_SIZE) {
+		page++;
+		offset = 0;
+	}
 	len -= tx->size;
 
 	if (skb->ip_summed == CHECKSUM_PARTIAL)
@@ -730,7 +739,7 @@ static int xennet_get_responses(struct netfront_queue *queue,
 
 	for (;;) {
 		if (unlikely(rx->status < 0 ||
-			     rx->offset + rx->status > PAGE_SIZE)) {
+			     rx->offset + rx->status > XEN_PAGE_SIZE)) {
 			if (net_ratelimit())
 				dev_warn(dev, "rx->offset: %x, size: %d\n",
 					 rx->offset, rx->status);
@@ -839,7 +848,7 @@ static RING_IDX xennet_fill_frags(struct netfront_queue *queue,
 		BUG_ON(shinfo->nr_frags >= MAX_SKB_FRAGS);
 
 		skb_add_rx_frag(skb, shinfo->nr_frags, skb_frag_page(nfrag),
-				rx->offset, rx->status, PAGE_SIZE);
+				rx->offset, rx->status, XEN_PAGE_SIZE);
 
 		skb_shinfo(nskb)->nr_frags = 0;
 		kfree_skb(nskb);
@@ -1497,7 +1506,7 @@ static int setup_netfront(struct xenbus_device *dev,
 		goto fail;
 	}
 	SHARED_RING_INIT(txs);
-	FRONT_RING_INIT(&queue->tx, txs, PAGE_SIZE);
+	FRONT_RING_INIT(&queue->tx, txs, XEN_PAGE_SIZE);
 
 	err = xenbus_grant_ring(dev, txs, 1, &gref);
 	if (err < 0)
@@ -1511,7 +1520,7 @@ static int setup_netfront(struct xenbus_device *dev,
 		goto alloc_rx_ring_fail;
 	}
 	SHARED_RING_INIT(rxs);
-	FRONT_RING_INIT(&queue->rx, rxs, PAGE_SIZE);
+	FRONT_RING_INIT(&queue->rx, rxs, XEN_PAGE_SIZE);
 
 	err = xenbus_grant_ring(dev, rxs, 1, &gref);
 	if (err < 0)
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [RFC 21/23] net/xen-netback: Make it run on 64KB page granularity
  2015-05-14 17:00 ` Julien Grall
@ 2015-05-14 17:01   ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:01 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Wei Liu, netdev

The PV network protocol uses 4KB page granularity. The goal of this
patch is to allow a Linux using 64KB page granularity to work as a
network backend on unmodified Xen.

It is only necessary to adapt the ring size and to break the skb data
into small chunks of 4KB. The rest of the code relies on the grant
table code.
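The grant-local offset computation used throughout this patch can be
sketched as follows (a hedged model, not backend code; the struct and
helper names are invented for illustration):

```c
#include <assert.h>

#define XEN_PAGE_SHIFT 12
#define XEN_PAGE_SIZE  (1UL << XEN_PAGE_SHIFT)   /* 4KB Xen granularity */
#define XEN_PAGE_MASK  (~(XEN_PAGE_SIZE - 1))

/* Split an offset inside a (possibly 64KB) Linux page into the index
 * of the 4KB Xen frame it falls in and the offset within that frame --
 * the pair that ends up in a grant copy source (gmfn, offset). */
struct xen_pos {
	unsigned long frame;     /* 4KB frame index within the Linux page */
	unsigned long off_grant; /* offset inside that 4KB frame */
};

static struct xen_pos xen_pos_of(unsigned long offset)
{
	struct xen_pos p;

	p.frame = offset >> XEN_PAGE_SHIFT;
	p.off_grant = offset & ~XEN_PAGE_MASK;
	return p;
}

/* A grant copy must not cross a 4KB boundary: cap 'size' accordingly. */
static unsigned long copy_bytes(unsigned long offset, unsigned long size)
{
	unsigned long bytes = XEN_PAGE_SIZE - (offset & ~XEN_PAGE_MASK);

	return size < bytes ? size : bytes;
}
```

For example, offset 5000 inside a 64KB page falls 904 bytes into the
second 4KB frame, so at most 3192 bytes can be copied before the next
boundary.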

For now, only simple workloads are working (DHCP request, ping). If I
try to use wget in the guest, it stalls until a tcpdump is started on
the vif interface in DOM0. I wasn't able to find out why.

I have not modified XEN_NETBK_RX_SLOTS_MAX because I wasn't sure what
it is used for (I have limited knowledge of the network driver).

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: netdev@vger.kernel.org

---

Improvements such as support for 64KB grants are not taken into
consideration in this patch because we have the requirement to run a
Linux using 64KB pages on unmodified Xen.
---
 drivers/net/xen-netback/common.h  |  7 ++++---
 drivers/net/xen-netback/netback.c | 27 ++++++++++++++-------------
 2 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 8a495b3..0eda6e9 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -44,6 +44,7 @@
 #include <xen/interface/grant_table.h>
 #include <xen/grant_table.h>
 #include <xen/xenbus.h>
+#include <xen/page.h>
 #include <linux/debugfs.h>
 
 typedef unsigned int pending_ring_idx_t;
@@ -64,8 +65,8 @@ struct pending_tx_info {
 	struct ubuf_info callback_struct;
 };
 
-#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
-#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
+#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, XEN_PAGE_SIZE)
+#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, XEN_PAGE_SIZE)
 
 struct xenvif_rx_meta {
 	int id;
@@ -80,7 +81,7 @@ struct xenvif_rx_meta {
 /* Discriminate from any valid pending_idx value. */
 #define INVALID_PENDING_IDX 0xFFFF
 
-#define MAX_BUFFER_OFFSET PAGE_SIZE
+#define MAX_BUFFER_OFFSET XEN_PAGE_SIZE
 
 #define MAX_PENDING_REQS XEN_NETIF_TX_RING_SIZE
 
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 9ae1d43..ea5ce84 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -274,7 +274,7 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
 {
 	struct gnttab_copy *copy_gop;
 	struct xenvif_rx_meta *meta;
-	unsigned long bytes;
+	unsigned long bytes, off_grant;
 	int gso_type = XEN_NETIF_GSO_TYPE_NONE;
 
 	/* Data must not cross a page boundary. */
@@ -295,7 +295,8 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
 		if (npo->copy_off == MAX_BUFFER_OFFSET)
 			meta = get_next_rx_buffer(queue, npo);
 
-		bytes = PAGE_SIZE - offset;
+		off_grant = offset & ~XEN_PAGE_MASK;
+		bytes = XEN_PAGE_SIZE - off_grant;
 		if (bytes > size)
 			bytes = size;
 
@@ -314,9 +315,9 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
 		} else {
 			copy_gop->source.domid = DOMID_SELF;
 			copy_gop->source.u.gmfn =
-				virt_to_mfn(page_address(page));
+				virt_to_mfn(page_address(page) + offset);
 		}
-		copy_gop->source.offset = offset;
+		copy_gop->source.offset = off_grant;
 
 		copy_gop->dest.domid = queue->vif->domid;
 		copy_gop->dest.offset = npo->copy_off;
@@ -747,7 +748,7 @@ static int xenvif_count_requests(struct xenvif_queue *queue,
 		first->size -= txp->size;
 		slots++;
 
-		if (unlikely((txp->offset + txp->size) > PAGE_SIZE)) {
+		if (unlikely((txp->offset + txp->size) > XEN_PAGE_SIZE)) {
 			netdev_err(queue->vif->dev, "Cross page boundary, txp->offset: %x, size: %u\n",
 				 txp->offset, txp->size);
 			xenvif_fatal_tx_err(queue->vif);
@@ -1241,11 +1242,11 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
 		}
 
 		/* No crossing a page as the payload mustn't fragment. */
-		if (unlikely((txreq.offset + txreq.size) > PAGE_SIZE)) {
+		if (unlikely((txreq.offset + txreq.size) > XEN_PAGE_SIZE)) {
 			netdev_err(queue->vif->dev,
 				   "txreq.offset: %x, size: %u, end: %lu\n",
 				   txreq.offset, txreq.size,
-				   (txreq.offset&~PAGE_MASK) + txreq.size);
+				   (txreq.offset&~XEN_PAGE_MASK) + txreq.size);
 			xenvif_fatal_tx_err(queue->vif);
 			break;
 		}
@@ -1287,7 +1288,7 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
 			virt_to_mfn(skb->data);
 		queue->tx_copy_ops[*copy_ops].dest.domid = DOMID_SELF;
 		queue->tx_copy_ops[*copy_ops].dest.offset =
-			offset_in_page(skb->data);
+			offset_in_page(skb->data) & ~XEN_PAGE_MASK;
 
 		queue->tx_copy_ops[*copy_ops].len = data_len;
 		queue->tx_copy_ops[*copy_ops].flags = GNTCOPY_source_gref;
@@ -1366,8 +1367,8 @@ static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *s
 			return -ENOMEM;
 		}
 
-		if (offset + PAGE_SIZE < skb->len)
-			len = PAGE_SIZE;
+		if (offset + XEN_PAGE_SIZE < skb->len)
+			len = XEN_PAGE_SIZE;
 		else
 			len = skb->len - offset;
 		if (skb_copy_bits(skb, offset, page_address(page), len))
@@ -1396,7 +1397,7 @@ static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *s
 	/* Fill the skb with the new (local) frags. */
 	memcpy(skb_shinfo(skb)->frags, frags, i * sizeof(skb_frag_t));
 	skb_shinfo(skb)->nr_frags = i;
-	skb->truesize += i * PAGE_SIZE;
+	skb->truesize += i * XEN_PAGE_SIZE;
 
 	return 0;
 }
@@ -1780,7 +1781,7 @@ int xenvif_map_frontend_rings(struct xenvif_queue *queue,
 		goto err;
 
 	txs = (struct xen_netif_tx_sring *)addr;
-	BACK_RING_INIT(&queue->tx, txs, PAGE_SIZE);
+	BACK_RING_INIT(&queue->tx, txs, XEN_PAGE_SIZE);
 
 	err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(queue->vif),
 				     &rx_ring_ref, 1, &addr);
@@ -1788,7 +1789,7 @@ int xenvif_map_frontend_rings(struct xenvif_queue *queue,
 		goto err;
 
 	rxs = (struct xen_netif_rx_sring *)addr;
-	BACK_RING_INIT(&queue->rx, rxs, PAGE_SIZE);
+	BACK_RING_INIT(&queue->rx, rxs, XEN_PAGE_SIZE);
 
 	return 0;
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 200+ messages in thread


* [RFC 22/23] xen/privcmd: Add support for Linux 64KB page granularity
  2015-05-14 17:00 ` Julien Grall
@ 2015-05-14 17:01   ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:01 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Konrad Rzeszutek Wilk, Boris Ostrovsky,
	David Vrabel

The hypercall interface (as well as the toolstack) always uses 4KB
page granularity. When the toolstack asks to map a series of guest
PFNs in a batch, it expects the pages to be mapped contiguously in
its virtual memory.

When Linux uses 64KB page granularity, the privcmd driver has to map
multiple Xen PFNs into a single Linux page.

Note that this solution works for any page granularity that is a
multiple of 4KB.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
---
 drivers/xen/privcmd.c   |  8 +++++---
 drivers/xen/xlate_mmu.c | 31 ++++++++++++++++++++-----------
 2 files changed, 25 insertions(+), 14 deletions(-)

diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
index 5a29616..e8714b4 100644
--- a/drivers/xen/privcmd.c
+++ b/drivers/xen/privcmd.c
@@ -446,7 +446,7 @@ static long privcmd_ioctl_mmap_batch(void __user *udata, int version)
 		return -EINVAL;
 	}
 
-	nr_pages = m.num;
+	nr_pages = DIV_ROUND_UP_ULL(m.num, PAGE_SIZE / XEN_PAGE_SIZE);
 	if ((m.num <= 0) || (nr_pages > (LONG_MAX >> PAGE_SHIFT)))
 		return -EINVAL;
 
@@ -494,7 +494,7 @@ static long privcmd_ioctl_mmap_batch(void __user *udata, int version)
 			goto out_unlock;
 		}
 		if (xen_feature(XENFEAT_auto_translated_physmap)) {
-			ret = alloc_empty_pages(vma, m.num);
+			ret = alloc_empty_pages(vma, nr_pages);
 			if (ret < 0)
 				goto out_unlock;
 		} else
@@ -518,6 +518,7 @@ static long privcmd_ioctl_mmap_batch(void __user *udata, int version)
 	state.global_error  = 0;
 	state.version       = version;
 
+	BUILD_BUG_ON(((PAGE_SIZE / sizeof(xen_pfn_t)) % XEN_PFN_PER_PAGE) != 0);
 	/* mmap_batch_fn guarantees ret == 0 */
 	BUG_ON(traverse_pages_block(m.num, sizeof(xen_pfn_t),
 				    &pagelist, mmap_batch_fn, &state));
@@ -582,12 +583,13 @@ static void privcmd_close(struct vm_area_struct *vma)
 {
 	struct page **pages = vma->vm_private_data;
 	int numpgs = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
+	int nr_pfn = (vma->vm_end - vma->vm_start) >> XEN_PAGE_SHIFT;
 	int rc;
 
 	if (!xen_feature(XENFEAT_auto_translated_physmap) || !numpgs || !pages)
 		return;
 
-	rc = xen_unmap_domain_mfn_range(vma, numpgs, pages);
+	rc = xen_unmap_domain_mfn_range(vma, nr_pfn, pages);
 	if (rc == 0)
 		free_xenballooned_pages(numpgs, pages);
 	else
diff --git a/drivers/xen/xlate_mmu.c b/drivers/xen/xlate_mmu.c
index 58a5389..b9dfe1b 100644
--- a/drivers/xen/xlate_mmu.c
+++ b/drivers/xen/xlate_mmu.c
@@ -63,6 +63,7 @@ static int map_foreign_page(unsigned long lpfn, unsigned long fgmfn,
 
 struct remap_data {
 	xen_pfn_t *fgmfn; /* foreign domain's gmfn */
+	xen_pfn_t *egmfn; /* end foreign domain's gmfn */
 	pgprot_t prot;
 	domid_t  domid;
 	struct vm_area_struct *vma;
@@ -78,17 +79,23 @@ static int remap_pte_fn(pte_t *ptep, pgtable_t token, unsigned long addr,
 {
 	struct remap_data *info = data;
 	struct page *page = info->pages[info->index++];
-	unsigned long pfn = page_to_pfn(page);
-	pte_t pte = pte_mkspecial(pfn_pte(pfn, info->prot));
+	unsigned long pfn = xen_page_to_pfn(page);
+	pte_t pte = pte_mkspecial(pfn_pte(page_to_pfn(page), info->prot));
 	int rc;
-
-	rc = map_foreign_page(pfn, *info->fgmfn, info->domid);
-	*info->err_ptr++ = rc;
-	if (!rc) {
-		set_pte_at(info->vma->vm_mm, addr, ptep, pte);
-		info->mapped++;
+	uint32_t i;
+
+	for (i = 0; i < XEN_PFN_PER_PAGE; i++) {
+		if (info->fgmfn == info->egmfn)
+			break;
+
+		rc = map_foreign_page(pfn++, *info->fgmfn, info->domid);
+		*info->err_ptr++ = rc;
+		if (!rc) {
+			set_pte_at(info->vma->vm_mm, addr, ptep, pte);
+			info->mapped++;
+		}
+		info->fgmfn++;
 	}
-	info->fgmfn++;
 
 	return 0;
 }
@@ -102,13 +109,14 @@ int xen_xlate_remap_gfn_array(struct vm_area_struct *vma,
 {
 	int err;
 	struct remap_data data;
-	unsigned long range = nr << PAGE_SHIFT;
+	unsigned long range = round_up(nr, XEN_PFN_PER_PAGE) << XEN_PAGE_SHIFT;
 
 	/* Kept here for the purpose of making sure code doesn't break
 	   x86 PVOPS */
 	BUG_ON(!((vma->vm_flags & (VM_PFNMAP | VM_IO)) == (VM_PFNMAP | VM_IO)));
 
 	data.fgmfn = mfn;
+	data.egmfn = mfn + nr;
 	data.prot  = prot;
 	data.domid = domid;
 	data.vma   = vma;
@@ -132,7 +140,8 @@ int xen_xlate_unmap_gfn_range(struct vm_area_struct *vma,
 		struct xen_remove_from_physmap xrp;
 		unsigned long pfn;
 
-		pfn = page_to_pfn(pages[i]);
+		pfn = xen_page_to_pfn(pages[i / XEN_PFN_PER_PAGE]) +
+			(i % XEN_PFN_PER_PAGE);
 
 		xrp.domid = DOMID_SELF;
 		xrp.gpfn = pfn;
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* [RFC 23/23] arm/xen: Add support for 64KB page granularity
  2015-05-14 17:00 ` Julien Grall
  (?)
@ 2015-05-14 17:01   ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-14 17:01 UTC (permalink / raw)
  To: xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Russell King

The hypercall interface always uses 4KB page granularity. This
requires using the Xen page definition macros whenever we deal with
hypercalls.

Note that pfn_to_mfn works with a Xen PFN (i.e. 4KB). We may want to
rename pfn_to_mfn to make this explicit.

We also allocate a 64KB page for the shared page even though only the
first 4KB is used. I don't think this really matters for now, as it
helps to keep the pointer 4KB-aligned (XENMEM_add_to_physmap takes a
Xen PFN).

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Russell King <linux@arm.linux.org.uk>
---
 arch/arm/include/asm/xen/page.h | 12 ++++++------
 arch/arm/xen/enlighten.c        |  6 +++---
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
index 1bee8ca..ab6eb9a 100644
--- a/arch/arm/include/asm/xen/page.h
+++ b/arch/arm/include/asm/xen/page.h
@@ -56,19 +56,19 @@ static inline unsigned long mfn_to_pfn(unsigned long mfn)
 
 static inline xmaddr_t phys_to_machine(xpaddr_t phys)
 {
-	unsigned offset = phys.paddr & ~PAGE_MASK;
-	return XMADDR(PFN_PHYS(pfn_to_mfn(PFN_DOWN(phys.paddr))) | offset);
+	unsigned offset = phys.paddr & ~XEN_PAGE_MASK;
+	return XMADDR(XEN_PFN_PHYS(pfn_to_mfn(XEN_PFN_DOWN(phys.paddr))) | offset);
 }
 
 static inline xpaddr_t machine_to_phys(xmaddr_t machine)
 {
-	unsigned offset = machine.maddr & ~PAGE_MASK;
-	return XPADDR(PFN_PHYS(mfn_to_pfn(PFN_DOWN(machine.maddr))) | offset);
+	unsigned offset = machine.maddr & ~XEN_PAGE_MASK;
+	return XPADDR(XEN_PFN_PHYS(mfn_to_pfn(XEN_PFN_DOWN(machine.maddr))) | offset);
 }
 /* VIRT <-> MACHINE conversion */
 #define virt_to_machine(v)	(phys_to_machine(XPADDR(__pa(v))))
-#define virt_to_mfn(v)		(pfn_to_mfn(virt_to_pfn(v)))
-#define mfn_to_virt(m)		(__va(mfn_to_pfn(m) << PAGE_SHIFT))
+#define virt_to_mfn(v)		(pfn_to_mfn(virt_to_phys(v) >> XEN_PAGE_SHIFT))
+#define mfn_to_virt(m)		(__va(mfn_to_pfn(m) << XEN_PAGE_SHIFT))
 
 static inline xmaddr_t arbitrary_virt_to_machine(void *vaddr)
 {
diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
index 224081c..dcfe251 100644
--- a/arch/arm/xen/enlighten.c
+++ b/arch/arm/xen/enlighten.c
@@ -93,8 +93,8 @@ static void xen_percpu_init(void)
 	pr_info("Xen: initializing cpu%d\n", cpu);
 	vcpup = per_cpu_ptr(xen_vcpu_info, cpu);
 
-	info.mfn = __pa(vcpup) >> PAGE_SHIFT;
-	info.offset = offset_in_page(vcpup);
+	info.mfn = __pa(vcpup) >> XEN_PAGE_SHIFT;
+	info.offset = xen_offset_in_page(vcpup);
 
 	err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, cpu, &info);
 	BUG_ON(err);
@@ -204,7 +204,7 @@ static int __init xen_guest_init(void)
 	xatp.domid = DOMID_SELF;
 	xatp.idx = 0;
 	xatp.space = XENMAPSPACE_shared_info;
-	xatp.gpfn = __pa(shared_info_page) >> PAGE_SHIFT;
+	xatp.gpfn = __pa(shared_info_page) >> XEN_PAGE_SHIFT;
 	if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp))
 		BUG();
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 200+ messages in thread

* Re: [RFC 08/23] net/xen-netback: Remove unused code in xenvif_rx_action
  2015-05-14 17:00   ` Julien Grall
@ 2015-05-15  0:26     ` Wei Liu
  -1 siblings, 0 replies; 200+ messages in thread
From: Wei Liu @ 2015-05-15  0:26 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, linux-arm-kernel, ian.campbell, stefano.stabellini,
	linux-kernel, tim, Wei Liu, netdev

On Thu, May 14, 2015 at 06:00:48PM +0100, Julien Grall wrote:
> The variables old_req_cons and ring_slots_used are assigned but never
> used since commit 1650d5455bd2dc6b5ee134bd6fc1a3236c266b5b "xen-netback:
> always fully coalesce guest Rx packets".
> 
> Signed-off-by: Julien Grall <julien.grall@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: netdev@vger.kernel.org

Acked-by: Wei Liu <wei.liu2@citrix.com>

> ---
>  drivers/net/xen-netback/netback.c | 5 -----
>  1 file changed, 5 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index 9c6a504..9ae1d43 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -515,14 +515,9 @@ static void xenvif_rx_action(struct xenvif_queue *queue)
>  
>  	while (xenvif_rx_ring_slots_available(queue, XEN_NETBK_RX_SLOTS_MAX)
>  	       && (skb = xenvif_rx_dequeue(queue)) != NULL) {
> -		RING_IDX old_req_cons;
> -		RING_IDX ring_slots_used;
> -
>  		queue->last_rx_time = jiffies;
>  
> -		old_req_cons = queue->rx.req_cons;
>  		XENVIF_RX_CB(skb)->meta_slots_used = xenvif_gop_skb(skb, &npo, queue);
> -		ring_slots_used = queue->rx.req_cons - old_req_cons;
>  
>  		__skb_queue_tail(&rxq, skb);
>  	}
> -- 
> 2.1.4

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [RFC 21/23] net/xen-netback: Make it running on 64KB page granularity
  2015-05-14 17:01   ` Julien Grall
@ 2015-05-15  2:35     ` Wei Liu
  -1 siblings, 0 replies; 200+ messages in thread
From: Wei Liu @ 2015-05-15  2:35 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, linux-arm-kernel, ian.campbell, stefano.stabellini,
	linux-kernel, tim, Wei Liu, netdev

On Thu, May 14, 2015 at 06:01:01PM +0100, Julien Grall wrote:
> The PV network protocol is using 4KB page granularity. The goal of this
> patch is to allow a Linux using 64KB page granularity working as a
> network backend on a non-modified Xen.
> 
> It's only necessary to adapt the ring size and break skb data in small
> chunk of 4KB. The rest of the code is relying on the grant table code.
> 
> Only simple workloads are working so far (DHCP requests, ping). If I try
> to use wget in the guest, it stalls until a tcpdump is started on
> the vif interface in DOM0. I wasn't able to find out why.
> 

I think the wget workload is more likely to break 64K pages down into
4K pages. Some of your mfn and offset calculations might be wrong.

> I have not modified XEN_NETBK_RX_SLOTS_MAX because I wasn't sure what
> it's used for (I have limited knowledge on the network driver).
> 

This is the maximum number of slots a guest packet can use. AIUI the
protocol still works at 4K granularity (you break a 64K page into a
bunch of 4K pages), so you don't need to change this.

> Signed-off-by: Julien Grall <julien.grall@citrix.com>
> Cc: Ian Campbell <ian.campbell@citrix.com>
> Cc: Wei Liu <wei.liu2@citrix.com>
> Cc: netdev@vger.kernel.org
> 
> ---
> 
> Improvements such as support for 64KB grants are not taken into
> consideration in this patch because we have the requirement to run a
> Linux kernel using 64KB pages on an unmodified Xen.
> ---
>  drivers/net/xen-netback/common.h  |  7 ++++---
>  drivers/net/xen-netback/netback.c | 27 ++++++++++++++-------------
>  2 files changed, 18 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
> index 8a495b3..0eda6e9 100644
> --- a/drivers/net/xen-netback/common.h
> +++ b/drivers/net/xen-netback/common.h
> @@ -44,6 +44,7 @@
>  #include <xen/interface/grant_table.h>
>  #include <xen/grant_table.h>
>  #include <xen/xenbus.h>
> +#include <xen/page.h>
>  #include <linux/debugfs.h>
>  
>  typedef unsigned int pending_ring_idx_t;
> @@ -64,8 +65,8 @@ struct pending_tx_info {
>  	struct ubuf_info callback_struct;
>  };
>  
> -#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
> -#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
> +#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, XEN_PAGE_SIZE)
> +#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, XEN_PAGE_SIZE)
>  
>  struct xenvif_rx_meta {
>  	int id;
> @@ -80,7 +81,7 @@ struct xenvif_rx_meta {
>  /* Discriminate from any valid pending_idx value. */
>  #define INVALID_PENDING_IDX 0xFFFF
>  
> -#define MAX_BUFFER_OFFSET PAGE_SIZE
> +#define MAX_BUFFER_OFFSET XEN_PAGE_SIZE
>  
>  #define MAX_PENDING_REQS XEN_NETIF_TX_RING_SIZE
>  
> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> index 9ae1d43..ea5ce84 100644
> --- a/drivers/net/xen-netback/netback.c
> +++ b/drivers/net/xen-netback/netback.c
> @@ -274,7 +274,7 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
>  {
>  	struct gnttab_copy *copy_gop;
>  	struct xenvif_rx_meta *meta;
> -	unsigned long bytes;
> +	unsigned long bytes, off_grant;
>  	int gso_type = XEN_NETIF_GSO_TYPE_NONE;
>  
>  	/* Data must not cross a page boundary. */
> @@ -295,7 +295,8 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
>  		if (npo->copy_off == MAX_BUFFER_OFFSET)
>  			meta = get_next_rx_buffer(queue, npo);
>  
> -		bytes = PAGE_SIZE - offset;
> +		off_grant = offset & ~XEN_PAGE_MASK;
> +		bytes = XEN_PAGE_SIZE - off_grant;
>  		if (bytes > size)
>  			bytes = size;
>  
> @@ -314,9 +315,9 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
>  		} else {
>  			copy_gop->source.domid = DOMID_SELF;
>  			copy_gop->source.u.gmfn =
> -				virt_to_mfn(page_address(page));
> +				virt_to_mfn(page_address(page) + offset);
>  		}
> -		copy_gop->source.offset = offset;
> +		copy_gop->source.offset = off_grant;
>  
>  		copy_gop->dest.domid = queue->vif->domid;
>  		copy_gop->dest.offset = npo->copy_off;
> @@ -747,7 +748,7 @@ static int xenvif_count_requests(struct xenvif_queue *queue,
>  		first->size -= txp->size;
>  		slots++;
>  
> -		if (unlikely((txp->offset + txp->size) > PAGE_SIZE)) {
> +		if (unlikely((txp->offset + txp->size) > XEN_PAGE_SIZE)) {
>  			netdev_err(queue->vif->dev, "Cross page boundary, txp->offset: %x, size: %u\n",
>  				 txp->offset, txp->size);
>  			xenvif_fatal_tx_err(queue->vif);
> @@ -1241,11 +1242,11 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
>  		}
>  
>  		/* No crossing a page as the payload mustn't fragment. */
> -		if (unlikely((txreq.offset + txreq.size) > PAGE_SIZE)) {
> +		if (unlikely((txreq.offset + txreq.size) > XEN_PAGE_SIZE)) {
>  			netdev_err(queue->vif->dev,
>  				   "txreq.offset: %x, size: %u, end: %lu\n",
>  				   txreq.offset, txreq.size,
> -				   (txreq.offset&~PAGE_MASK) + txreq.size);
> +				   (txreq.offset&~XEN_PAGE_MASK) + txreq.size);
>  			xenvif_fatal_tx_err(queue->vif);
>  			break;
>  		}
> @@ -1287,7 +1288,7 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
>  			virt_to_mfn(skb->data);

You didn't change the calculation of the MFN here. I think it returns
the MFN of the first 4K sub-page of that 64K page. Am I missing anything?

>  		queue->tx_copy_ops[*copy_ops].dest.domid = DOMID_SELF;
>  		queue->tx_copy_ops[*copy_ops].dest.offset =
> -			offset_in_page(skb->data);
> +			offset_in_page(skb->data) & ~XEN_PAGE_MASK;
>  
>  		queue->tx_copy_ops[*copy_ops].len = data_len;
>  		queue->tx_copy_ops[*copy_ops].flags = GNTCOPY_source_gref;
> @@ -1366,8 +1367,8 @@ static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *s

This function coalesces the frag_list into a new SKB. It's completely
fine to use the natural granularity of the backend domain here. The way
you modified it can waste memory, i.e. you only use the first 4K of a
64K page.

>  			return -ENOMEM;
>  		}
>  
> -		if (offset + PAGE_SIZE < skb->len)
> -			len = PAGE_SIZE;
> +		if (offset + XEN_PAGE_SIZE < skb->len)
> +			len = XEN_PAGE_SIZE;
>  		else
>  			len = skb->len - offset;
>  		if (skb_copy_bits(skb, offset, page_address(page), len))
> @@ -1396,7 +1397,7 @@ static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *s
>  	/* Fill the skb with the new (local) frags. */
>  	memcpy(skb_shinfo(skb)->frags, frags, i * sizeof(skb_frag_t));
>  	skb_shinfo(skb)->nr_frags = i;
> -	skb->truesize += i * PAGE_SIZE;
> +	skb->truesize += i * XEN_PAGE_SIZE;

The truesize accounts for the actual memory occupied by this SKB. Since
the page is allocated with alloc_page, the granularity should be
PAGE_SIZE, not XEN_PAGE_SIZE.

>  
>  	return 0;
>  }
> @@ -1780,7 +1781,7 @@ int xenvif_map_frontend_rings(struct xenvif_queue *queue,
>  		goto err;
>  
>  	txs = (struct xen_netif_tx_sring *)addr;
> -	BACK_RING_INIT(&queue->tx, txs, PAGE_SIZE);
> +	BACK_RING_INIT(&queue->tx, txs, XEN_PAGE_SIZE);
>  
>  	err = xenbus_map_ring_valloc(xenvif_to_xenbus_device(queue->vif),
>  				     &rx_ring_ref, 1, &addr);
> @@ -1788,7 +1789,7 @@ int xenvif_map_frontend_rings(struct xenvif_queue *queue,
>  		goto err;
>  
>  	rxs = (struct xen_netif_rx_sring *)addr;
> -	BACK_RING_INIT(&queue->rx, rxs, PAGE_SIZE);
> +	BACK_RING_INIT(&queue->rx, rxs, XEN_PAGE_SIZE);
>  
>  	return 0;
>  
> -- 
> 2.1.4

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 21/23] net/xen-netback: Make it running on 64KB page granularity
  2015-05-15  2:35     ` Wei Liu
@ 2015-05-15 12:35       ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-15 12:35 UTC (permalink / raw)
  To: Wei Liu, Julien Grall
  Cc: ian.campbell, stefano.stabellini, netdev, tim, linux-kernel,
	xen-devel, linux-arm-kernel

Hi Wei,

Thank you for the review.

On 15/05/15 03:35, Wei Liu wrote:
> On Thu, May 14, 2015 at 06:01:01PM +0100, Julien Grall wrote:
>> The PV network protocol uses 4KB page granularity. The goal of this
>> patch is to allow a Linux kernel using 64KB page granularity to work
>> as a network backend on an unmodified Xen.
>>
>> It's only necessary to adapt the ring size and break skb data into
>> small 4KB chunks. The rest of the code relies on the grant table code.
>>
>> So far only simple workloads work (DHCP request, ping). If I try to
>> use wget in the guest, it stalls until a tcpdump is started on the vif
>> interface in DOM0. I wasn't able to find out why.
>>
> 
> I think in the wget workload you're more likely to break 64K pages down
> into 4K pages. Some of your calculations of mfn and offset might be wrong.

If so, why would tcpdump on the vif interface suddenly make wget work?
Does it make netback use a different path?

>> I have not modified XEN_NETBK_RX_SLOTS_MAX because I wasn't sure what
>> it's used for (I have limited knowledge on the network driver).
>>
> 
> This is the maximum number of slots a guest packet can use. AIUI the
> protocol still works at 4K granularity (you break a 64K page into a
> bunch of 4K pages), so you don't need to change this.

1 slot = 1 grant, right? If so, XEN_NETBK_RX_SLOTS_MAX is based on the
number of Linux pages, so we would have to scale it to the number of Xen
pages.

Although, I gave multiplying it by XEN_PFN_PER_PAGE (64KB/4KB = 16) a
try, but it got stuck in the loop.

>> ---
>>  drivers/net/xen-netback/common.h  |  7 ++++---
>>  drivers/net/xen-netback/netback.c | 27 ++++++++++++++-------------
>>  2 files changed, 18 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
>> index 8a495b3..0eda6e9 100644
>> --- a/drivers/net/xen-netback/common.h
>> +++ b/drivers/net/xen-netback/common.h
>> @@ -44,6 +44,7 @@
>>  #include <xen/interface/grant_table.h>
>>  #include <xen/grant_table.h>
>>  #include <xen/xenbus.h>
>> +#include <xen/page.h>
>>  #include <linux/debugfs.h>
>>  
>>  typedef unsigned int pending_ring_idx_t;
>> @@ -64,8 +65,8 @@ struct pending_tx_info {
>>  	struct ubuf_info callback_struct;
>>  };
>>  
>> -#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
>> -#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
>> +#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, XEN_PAGE_SIZE)
>> +#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, XEN_PAGE_SIZE)
>>  
>>  struct xenvif_rx_meta {
>>  	int id;
>> @@ -80,7 +81,7 @@ struct xenvif_rx_meta {
>>  /* Discriminate from any valid pending_idx value. */
>>  #define INVALID_PENDING_IDX 0xFFFF
>>  
>> -#define MAX_BUFFER_OFFSET PAGE_SIZE
>> +#define MAX_BUFFER_OFFSET XEN_PAGE_SIZE
>>  
>>  #define MAX_PENDING_REQS XEN_NETIF_TX_RING_SIZE
>>  
>> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
>> index 9ae1d43..ea5ce84 100644
>> --- a/drivers/net/xen-netback/netback.c
>> +++ b/drivers/net/xen-netback/netback.c
>> @@ -274,7 +274,7 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
>>  {
>>  	struct gnttab_copy *copy_gop;
>>  	struct xenvif_rx_meta *meta;
>> -	unsigned long bytes;
>> +	unsigned long bytes, off_grant;
>>  	int gso_type = XEN_NETIF_GSO_TYPE_NONE;
>>  
>>  	/* Data must not cross a page boundary. */
>> @@ -295,7 +295,8 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
>>  		if (npo->copy_off == MAX_BUFFER_OFFSET)
>>  			meta = get_next_rx_buffer(queue, npo);
>>  
>> -		bytes = PAGE_SIZE - offset;
>> +		off_grant = offset & ~XEN_PAGE_MASK;
>> +		bytes = XEN_PAGE_SIZE - off_grant;
>>  		if (bytes > size)
>>  			bytes = size;
>>  
>> @@ -314,9 +315,9 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
>>  		} else {
>>  			copy_gop->source.domid = DOMID_SELF;
>>  			copy_gop->source.u.gmfn =
>> -				virt_to_mfn(page_address(page));
>> +				virt_to_mfn(page_address(page) + offset);
>>  		}
>> -		copy_gop->source.offset = offset;
>> +		copy_gop->source.offset = off_grant;
>>  
>>  		copy_gop->dest.domid = queue->vif->domid;
>>  		copy_gop->dest.offset = npo->copy_off;
>> @@ -747,7 +748,7 @@ static int xenvif_count_requests(struct xenvif_queue *queue,
>>  		first->size -= txp->size;
>>  		slots++;
>>  
>> -		if (unlikely((txp->offset + txp->size) > PAGE_SIZE)) {
>> +		if (unlikely((txp->offset + txp->size) > XEN_PAGE_SIZE)) {
>>  			netdev_err(queue->vif->dev, "Cross page boundary, txp->offset: %x, size: %u\n",
>>  				 txp->offset, txp->size);
>>  			xenvif_fatal_tx_err(queue->vif);
>> @@ -1241,11 +1242,11 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
>>  		}
>>  
>>  		/* No crossing a page as the payload mustn't fragment. */
>> -		if (unlikely((txreq.offset + txreq.size) > PAGE_SIZE)) {
>> +		if (unlikely((txreq.offset + txreq.size) > XEN_PAGE_SIZE)) {
>>  			netdev_err(queue->vif->dev,
>>  				   "txreq.offset: %x, size: %u, end: %lu\n",
>>  				   txreq.offset, txreq.size,
>> -				   (txreq.offset&~PAGE_MASK) + txreq.size);
>> +				   (txreq.offset&~XEN_PAGE_MASK) + txreq.size);
>>  			xenvif_fatal_tx_err(queue->vif);
>>  			break;
>>  		}
>> @@ -1287,7 +1288,7 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
>>  			virt_to_mfn(skb->data);
> 
> You didn't change the calculation of the MFN here. I think it returns
> the MFN of the first 4K sub-page of that 64K page. Am I missing anything?

No change is required. On ARM, virt_to_mfn is implemented as:

pfn_to_mfn(virt_to_phys(v) >> XEN_PAGE_SHIFT)

which returns a 4KB-granularity PFN (see patch #23).

> 
>>  		queue->tx_copy_ops[*copy_ops].dest.domid = DOMID_SELF;
>>  		queue->tx_copy_ops[*copy_ops].dest.offset =
>> -			offset_in_page(skb->data);
>> +			offset_in_page(skb->data) & ~XEN_PAGE_MASK;
>>  
>>  		queue->tx_copy_ops[*copy_ops].len = data_len;
>>  		queue->tx_copy_ops[*copy_ops].flags = GNTCOPY_source_gref;
>> @@ -1366,8 +1367,8 @@ static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *s
> 
> This function coalesces the frag_list into a new SKB. It's completely
> fine to use the natural granularity of the backend domain here. The way
> you modified it can waste memory, i.e. you only use the first 4K of a
> 64K page.

Thanks for explaining. I wasn't sure how the function worked, so I
changed it to be safe. I will redo the change.

FWIW, I'm sure there are other places in netback where we waste memory
with 64KB page granularity (such as the grant table). I need to track
them down.

Let me know if you have any places in mind where memory usage can be
improved.

>>  			return -ENOMEM;
>>  		}
>>  
>> -		if (offset + PAGE_SIZE < skb->len)
>> -			len = PAGE_SIZE;
>> +		if (offset + XEN_PAGE_SIZE < skb->len)
>> +			len = XEN_PAGE_SIZE;
>>  		else
>>  			len = skb->len - offset;
>>  		if (skb_copy_bits(skb, offset, page_address(page), len))
>> @@ -1396,7 +1397,7 @@ static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *s
>>  	/* Fill the skb with the new (local) frags. */
>>  	memcpy(skb_shinfo(skb)->frags, frags, i * sizeof(skb_frag_t));
>>  	skb_shinfo(skb)->nr_frags = i;
>> -	skb->truesize += i * PAGE_SIZE;
>> +	skb->truesize += i * XEN_PAGE_SIZE;
> 
> The truesize accounts for the actual memory occupied by this SKB. Since
> the page is allocated with alloc_page, the granularity should be
> PAGE_SIZE, not XEN_PAGE_SIZE.

Ok. I will replace with PAGE_SIZE.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 200+ messages in thread



* Re: [Xen-devel] [RFC 21/23] net/xen-netback: Make it running on 64KB page granularity
  2015-05-15 12:35       ` Julien Grall
@ 2015-05-15 15:31         ` Wei Liu
  -1 siblings, 0 replies; 200+ messages in thread
From: Wei Liu @ 2015-05-15 15:31 UTC (permalink / raw)
  To: Julien Grall
  Cc: Wei Liu, ian.campbell, stefano.stabellini, netdev, tim,
	linux-kernel, xen-devel, linux-arm-kernel

On Fri, May 15, 2015 at 01:35:42PM +0100, Julien Grall wrote:
> Hi Wei,
> 
> Thank you for the review.
> 
> On 15/05/15 03:35, Wei Liu wrote:
> > On Thu, May 14, 2015 at 06:01:01PM +0100, Julien Grall wrote:
> >> The PV network protocol uses 4KB page granularity. The goal of this
> >> patch is to allow a Linux kernel using 64KB page granularity to work as a
> >> network backend on unmodified Xen.
> >>
> >> It's only necessary to adapt the ring size and break skb data into small
> >> chunks of 4KB. The rest of the code relies on the grant table code.
> >>
> >> For now, only simple workloads work (DHCP requests, ping). If I try
> >> to use wget in the guest, it stalls until a tcpdump is started on
> >> the vif interface in DOM0. I wasn't able to find out why.
> >>
> > 
> > I think with the wget workload you're more likely to break down 64K
> > pages into 4K pages. Some of your calculations of mfn and offset might be wrong.
> 
> If so, why would tcpdump on the vif interface suddenly make wget
> work? Does it make netback use a different path?

No, but it might make the core network components behave differently; this
is only my suspicion.

Do you see malformed packets with tcpdump?

> 
> >> I have not modified XEN_NETBK_RX_SLOTS_MAX because I wasn't sure what
> >> it's used for (I have limited knowledge on the network driver).
> >>
> > 
> > This is the maximum number of slots a guest packet can use. AIUI the
> > protocol still works on 4K granularity (you break a 64K page into a
> > bunch of 4K pages), so you don't need to change this.
> 
> 1 slot = 1 grant, right? If so, XEN_NETBK_RX_SLOTS_MAX is based on the
> number of Linux pages, so we would have to use the number of Xen pages.
> 

Yes, 1 slot = 1 grant. I see what you're up to now. You do need to
change this constant to match the underlying hypervisor page size.

> I did try multiplying it by XEN_PFN_PER_PAGE (64KB / 4KB = 16),
> but it gets stuck in a loop.
> 

I don't follow. What is the new #define? In which loop does it get stuck?

> >> ---
> >>  drivers/net/xen-netback/common.h  |  7 ++++---
> >>  drivers/net/xen-netback/netback.c | 27 ++++++++++++++-------------
> >>  2 files changed, 18 insertions(+), 16 deletions(-)
> >>
> >> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
> >> index 8a495b3..0eda6e9 100644
> >> --- a/drivers/net/xen-netback/common.h
> >> +++ b/drivers/net/xen-netback/common.h
> >> @@ -44,6 +44,7 @@
> >>  #include <xen/interface/grant_table.h>
> >>  #include <xen/grant_table.h>
> >>  #include <xen/xenbus.h>
> >> +#include <xen/page.h>
> >>  #include <linux/debugfs.h>
> >>  
> >>  typedef unsigned int pending_ring_idx_t;
> >> @@ -64,8 +65,8 @@ struct pending_tx_info {
> >>  	struct ubuf_info callback_struct;
> >>  };
> >>  
> >> -#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE)
> >> -#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE)
> >> +#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, XEN_PAGE_SIZE)
> >> +#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, XEN_PAGE_SIZE)
> >>  
> >>  struct xenvif_rx_meta {
> >>  	int id;
> >> @@ -80,7 +81,7 @@ struct xenvif_rx_meta {
> >>  /* Discriminate from any valid pending_idx value. */
> >>  #define INVALID_PENDING_IDX 0xFFFF
> >>  
> >> -#define MAX_BUFFER_OFFSET PAGE_SIZE
> >> +#define MAX_BUFFER_OFFSET XEN_PAGE_SIZE
> >>  
> >>  #define MAX_PENDING_REQS XEN_NETIF_TX_RING_SIZE
> >>  
> >> diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
> >> index 9ae1d43..ea5ce84 100644
> >> --- a/drivers/net/xen-netback/netback.c
> >> +++ b/drivers/net/xen-netback/netback.c
> >> @@ -274,7 +274,7 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
> >>  {
> >>  	struct gnttab_copy *copy_gop;
> >>  	struct xenvif_rx_meta *meta;
> >> -	unsigned long bytes;
> >> +	unsigned long bytes, off_grant;
> >>  	int gso_type = XEN_NETIF_GSO_TYPE_NONE;
> >>  
> >>  	/* Data must not cross a page boundary. */
> >> @@ -295,7 +295,8 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
> >>  		if (npo->copy_off == MAX_BUFFER_OFFSET)
> >>  			meta = get_next_rx_buffer(queue, npo);
> >>  
> >> -		bytes = PAGE_SIZE - offset;
> >> +		off_grant = offset & ~XEN_PAGE_MASK;
> >> +		bytes = XEN_PAGE_SIZE - off_grant;
> >>  		if (bytes > size)
> >>  			bytes = size;
> >>  
> >> @@ -314,9 +315,9 @@ static void xenvif_gop_frag_copy(struct xenvif_queue *queue, struct sk_buff *skb
> >>  		} else {
> >>  			copy_gop->source.domid = DOMID_SELF;
> >>  			copy_gop->source.u.gmfn =
> >> -				virt_to_mfn(page_address(page));
> >> +				virt_to_mfn(page_address(page) + offset);
> >>  		}
> >> -		copy_gop->source.offset = offset;
> >> +		copy_gop->source.offset = off_grant;
> >>  
> >>  		copy_gop->dest.domid = queue->vif->domid;
> >>  		copy_gop->dest.offset = npo->copy_off;
> >> @@ -747,7 +748,7 @@ static int xenvif_count_requests(struct xenvif_queue *queue,
> >>  		first->size -= txp->size;
> >>  		slots++;
> >>  
> >> -		if (unlikely((txp->offset + txp->size) > PAGE_SIZE)) {
> >> +		if (unlikely((txp->offset + txp->size) > XEN_PAGE_SIZE)) {
> >>  			netdev_err(queue->vif->dev, "Cross page boundary, txp->offset: %x, size: %u\n",
> >>  				 txp->offset, txp->size);
> >>  			xenvif_fatal_tx_err(queue->vif);
> >> @@ -1241,11 +1242,11 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
> >>  		}
> >>  
> >>  		/* No crossing a page as the payload mustn't fragment. */
> >> -		if (unlikely((txreq.offset + txreq.size) > PAGE_SIZE)) {
> >> +		if (unlikely((txreq.offset + txreq.size) > XEN_PAGE_SIZE)) {
> >>  			netdev_err(queue->vif->dev,
> >>  				   "txreq.offset: %x, size: %u, end: %lu\n",
> >>  				   txreq.offset, txreq.size,
> >> -				   (txreq.offset&~PAGE_MASK) + txreq.size);
> >> +				   (txreq.offset&~XEN_PAGE_MASK) + txreq.size);
> >>  			xenvif_fatal_tx_err(queue->vif);
> >>  			break;
> >>  		}
> >> @@ -1287,7 +1288,7 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
> >>  			virt_to_mfn(skb->data);
> > 
> > You didn't change the calculation of MFN here. I think it returns the
> > MFN of the first 4K sub-page of that 64K page.  Do I miss anything?
> 
> There is no change required. On ARM virt_to_mfn is implemented with:
> 
> pfn_to_mfn(virt_to_phys(v) >> XEN_PAGE_SHIFT)
> 
> which returns a 4KB-granularity frame number (see patch #23).
> 

OK. I missed that patch.

> > 
> >>  		queue->tx_copy_ops[*copy_ops].dest.domid = DOMID_SELF;
> >>  		queue->tx_copy_ops[*copy_ops].dest.offset =
> >> -			offset_in_page(skb->data);
> >> +			offset_in_page(skb->data) & ~XEN_PAGE_MASK;
> >>  
> >>  		queue->tx_copy_ops[*copy_ops].len = data_len;
> >>  		queue->tx_copy_ops[*copy_ops].flags = GNTCOPY_source_gref;
> >> @@ -1366,8 +1367,8 @@ static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *s
> > 
> > This function is to coalesce frag_list to a new SKB. It's completely
> > fine to use the natural granularity of backend domain. The way you
> > modified it can lead to waste of memory, i.e. you only use first 4K of a
> > 64K page.
> 
> Thanks for explaining. I wasn't sure how the function works, so I changed
> it for safety. I will redo the change.
> 
> FWIW, I'm sure there are other places in netback where we waste memory
> with 64KB page granularity (such as the grant table). I need to track
> them down.
> 
> Let me know if you have any places in mind where the memory usage could
> be improved.
> 

I was about to say that the mmap_pages array is an array of pages, but that
probably belongs to the grant table driver.

Wei.

> >>  			return -ENOMEM;
> >>  		}
> >>  
> >> -		if (offset + PAGE_SIZE < skb->len)
> >> -			len = PAGE_SIZE;
> >> +		if (offset + XEN_PAGE_SIZE < skb->len)
> >> +			len = XEN_PAGE_SIZE;
> >>  		else
> >>  			len = skb->len - offset;
> >>  		if (skb_copy_bits(skb, offset, page_address(page), len))
> >> @@ -1396,7 +1397,7 @@ static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *s
> >>  	/* Fill the skb with the new (local) frags. */
> >>  	memcpy(skb_shinfo(skb)->frags, frags, i * sizeof(skb_frag_t));
> >>  	skb_shinfo(skb)->nr_frags = i;
> >> -	skb->truesize += i * PAGE_SIZE;
> >> +	skb->truesize += i * XEN_PAGE_SIZE;
> > 
> > The true size accounts for the actual memory occupied by this SKB. Since
> > the page is allocated with alloc_page, the granularity should be
> > PAGE_SIZE not XEN_PAGE_SIZE.
> 
> Ok. I will replace with PAGE_SIZE.
> 
> Regards,
> 
> -- 
> Julien Grall

^ permalink raw reply	[flat|nested] 200+ messages in thread


* Re: [Xen-devel] [RFC 21/23] net/xen-netback: Make it running on 64KB page granularity
  2015-05-15 15:31         ` Wei Liu
@ 2015-05-15 15:41           ` Ian Campbell
  -1 siblings, 0 replies; 200+ messages in thread
From: Ian Campbell @ 2015-05-15 15:41 UTC (permalink / raw)
  To: Wei Liu
  Cc: Julien Grall, stefano.stabellini, netdev, tim, linux-kernel,
	xen-devel, linux-arm-kernel

On Fri, 2015-05-15 at 16:31 +0100, Wei Liu wrote:
> On Fri, May 15, 2015 at 01:35:42PM +0100, Julien Grall wrote:
> > Hi Wei,
> > 
> > Thank you for the review.
> > 
> > On 15/05/15 03:35, Wei Liu wrote:
> > > On Thu, May 14, 2015 at 06:01:01PM +0100, Julien Grall wrote:
> > >> The PV network protocol is using 4KB page granularity. The goal of this
> > >> patch is to allow a Linux using 64KB page granularity working as a
> > >> network backend on a non-modified Xen.
> > >>
> > >> It's only necessary to adapt the ring size and break skb data in small
> > >> chunk of 4KB. The rest of the code is relying on the grant table code.
> > >>
> > >> Although only simple workload is working (dhcp request, ping). If I try
> > >> to use wget in the guest, it will stall until a tcpdump is started on
> > >> the vif interface in DOM0. I wasn't able to find why.
> > >>
> > > 
> > > I think in wget workload you're more likely to break down 64K pages to
> > > 4K pages. Some of your calculation of mfn, offset might be wrong.
> > 
> > If so, why tcpdump on the vif interface would make wget suddenly
> > working? Does it make netback use a different path?
> 
> No, but it might make the core network component behave differently; this is
> only my suspicion.

Traffic being delivered to dom0 (as opposed to passing through a bridge
and going elsewhere) will get skb_orphan_frags called on it. Since
tcpdump ends up cloning the skb to go to two places, it's not out of the
question that this might have some impact (deliberate or otherwise) on
the other skb, which isn't going to dom0.

Ian.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 00/23] arm64: Add support for 64KB page granularity in Xen guest
  2015-05-14 17:00 ` Julien Grall
@ 2015-05-15 15:45   ` David Vrabel
  -1 siblings, 0 replies; 200+ messages in thread
From: David Vrabel @ 2015-05-15 15:45 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: wei.liu2, ian.campbell, stefano.stabellini, tim, linux-kernel,
	david.vrabel, boris.ostrovsky, linux-arm-kernel, roger.pau

On 14/05/15 18:00, Julien Grall wrote:
> Hi all,
> 
> ARM64 Linux supports both 4KB and 64KB page granularity. However, the Xen
> hypercall interface and PV protocol are always based on 4KB page granularity.
> 
> Any attempt to boot a Linux guest with 64KB pages enabled will result in a
> guest crash.
> 
> This series is a first attempt to allow such Linux kernels to run with the
> current hypercall interface and PV protocol.
> 
> This solution has been chosen because we want to run 64KB Linux on released
> Xen ARM versions and/or platforms using an old version of Linux DOM0.

The key problem I see with this approach is the confusion between guest
page size and Xen page size.  This is going to be particularly
problematic since the majority of development/usage will remain on x86
where PAGE_SIZE == XEN_PAGE_SIZE.

I think it would be nice to keep XEN_PAGE_SIZE etc out of front and
backend drivers.  Perhaps with a suitable set of helper functions?
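
One possible shape for the helpers David suggests is an iterator that splits a buffer into grant-sized (4KB) chunks so that driver code never touches XEN_PAGE_SIZE directly. The sketch below is purely illustrative: the names `gnttab_foreach_grant` and `count_cb`, and the constants, are hypothetical and not part of the series.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative constant: Xen grants are always 4KB frames, even when
 * the guest uses 64KB pages. */
#define XEN_PAGE_SHIFT 12
#define XEN_PAGE_SIZE  (1UL << XEN_PAGE_SHIFT)

typedef void (*grant_chunk_fn)(size_t offset, size_t len, void *data);

/* Hypothetical helper in the spirit of David's suggestion: walk a
 * (offset, len) buffer in grant-sized pieces (never crossing a 4KB
 * boundary) and hand each piece to a callback, hiding XEN_PAGE_SIZE
 * from frontend/backend drivers. */
static void gnttab_foreach_grant(size_t offset, size_t len,
                                 grant_chunk_fn fn, void *data)
{
    while (len) {
        /* Room left before the next 4KB grant boundary. */
        size_t room = XEN_PAGE_SIZE - (offset & (XEN_PAGE_SIZE - 1));
        size_t chunk = len < room ? len : room;

        fn(offset, chunk, data);
        offset += chunk;
        len -= chunk;
    }
}

/* Small callback counting how many grants a buffer would consume. */
static void count_cb(size_t offset, size_t len, void *data)
{
    (void)offset;
    (void)len;
    ++*(size_t *)data;
}
```

For example, a buffer starting at offset 100 with length 8192 spans three 4KB grants (3996 + 4096 + 100 bytes).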

David

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 00/23] arm64: Add support for 64KB page granularity in Xen guest
  2015-05-15 15:45   ` David Vrabel
@ 2015-05-15 15:51     ` Boris Ostrovsky
  -1 siblings, 0 replies; 200+ messages in thread
From: Boris Ostrovsky @ 2015-05-15 15:51 UTC (permalink / raw)
  To: David Vrabel, Julien Grall, xen-devel
  Cc: wei.liu2, ian.campbell, stefano.stabellini, tim, linux-kernel,
	linux-arm-kernel, roger.pau

On 05/15/2015 11:45 AM, David Vrabel wrote:
> On 14/05/15 18:00, Julien Grall wrote:
>> Hi all,
>>
>> ARM64 Linux supports both 4KB and 64KB page granularity. However, the Xen
>> hypercall interface and PV protocol are always based on 4KB page granularity.
>>
>> Any attempt to boot a Linux guest with 64KB pages enabled will result in a
>> guest crash.
>>
>> This series is a first attempt to allow such Linux kernels to run with the
>> current hypercall interface and PV protocol.
>>
>> This solution has been chosen because we want to run 64KB Linux on released
>> Xen ARM versions and/or platforms using an old version of Linux DOM0.
> The key problem I see with this approach is the confusion between guest
> page size and Xen page size.  This is going to be particularly
> problematic since the majority of development/usage will remain on x86
> where PAGE_SIZE == XEN_PAGE_SIZE.
>
> I think it would be nice to keep XEN_PAGE_SIZE etc out of front and
> backend drivers.  Perhaps with a suitable set of helper functions?

I am thinking exactly the same thing as I am going over these patches.

-boris

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [RFC 10/23] xen/biomerge: WORKAROUND always says the biovec are not mergeable
  2015-05-14 17:00   ` Julien Grall
@ 2015-05-15 15:54     ` Boris Ostrovsky
  -1 siblings, 0 replies; 200+ messages in thread
From: Boris Ostrovsky @ 2015-05-15 15:54 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Konrad Rzeszutek Wilk, David Vrabel

On 05/14/2015 01:00 PM, Julien Grall wrote:
> When Linux is using 64K page granularity, every page will be split into
> multiple non-contiguous 4K MFNs.
>
> I'm not sure how to efficiently handle the check for whether we can
> merge 2 biovecs in such a case. So for now, always say that biovecs are
> not mergeable.
>
> Signed-off-by: Julien Grall <julien.grall@citrix.com>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> Cc: David Vrabel <david.vrabel@citrix.com>
> ---
>   drivers/xen/biomerge.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/drivers/xen/biomerge.c b/drivers/xen/biomerge.c
> index 0edb91c..20387c2 100644
> --- a/drivers/xen/biomerge.c
> +++ b/drivers/xen/biomerge.c
> @@ -9,6 +9,9 @@ bool xen_biovec_phys_mergeable(const struct bio_vec *vec1,
>   	unsigned long mfn1 = pfn_to_mfn(page_to_pfn(vec1->bv_page));
>   	unsigned long mfn2 = pfn_to_mfn(page_to_pfn(vec2->bv_page));
>   
> +	/* TODO: Implement it correctly */
> +	return 0;
> +
>   	return __BIOVEC_PHYS_MERGEABLE(vec1, vec2) &&
>   		((mfn1 == mfn2) || ((mfn1+1) == mfn2));
>   }


I think this is a bit too blunt.  Perhaps check first whether page sizes 
are different in the hypervisor and the guest?

(And I am not sure we need __BIOVEC_PHYS_MERGEABLE() test here as it is 
already checked by BIOVEC_PHYS_MERGEABLE() which appears to be the only 
user of xen_biovec_phys_mergeable())
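
Boris's suggestion could look roughly like the sketch below: keep the existing MFN-contiguity test when guest and Xen page granularity match, and fall back to the blunt "never merge" workaround only when they differ. This is a standalone illustration, not the kernel code: the function name and the `page_size` parameter (standing in for the guest's PAGE_SIZE) are hypothetical.

```c
#include <assert.h>

/* Illustrative: grant/hypercall granularity is always 4KB. */
#define XEN_PAGE_SIZE 4096UL

/* Sketch of Boris's suggestion: only refuse to merge when guest page
 * size differs from Xen's; otherwise keep the original MFN check
 * (mergeable when both biovecs hit the same MFN or adjacent MFNs). */
static int xen_biovec_mergeable(unsigned long page_size,
                                unsigned long mfn1, unsigned long mfn2)
{
    if (page_size != XEN_PAGE_SIZE)
        return 0; /* e.g. 64KB guest pages: bail out for now */

    return mfn1 == mfn2 || mfn1 + 1 == mfn2;
}
```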


-boris

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 21/23] net/xen-netback: Make it running on 64KB page granularity
  2015-05-15 15:31         ` Wei Liu
@ 2015-05-18 12:11           ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-18 12:11 UTC (permalink / raw)
  To: Wei Liu, Julien Grall
  Cc: ian.campbell, stefano.stabellini, netdev, tim, linux-kernel,
	xen-devel, linux-arm-kernel

Hi Wei,

On 15/05/15 16:31, Wei Liu wrote:
> On Fri, May 15, 2015 at 01:35:42PM +0100, Julien Grall wrote:
>> On 15/05/15 03:35, Wei Liu wrote:
>>> On Thu, May 14, 2015 at 06:01:01PM +0100, Julien Grall wrote:
>>>> The PV network protocol is using 4KB page granularity. The goal of this
>>>> patch is to allow a Linux using 64KB page granularity working as a
>>>> network backend on a non-modified Xen.
>>>>
>>>> It's only necessary to adapt the ring size and break skb data in small
>>>> chunk of 4KB. The rest of the code is relying on the grant table code.
>>>>
>>>> Although only simple workload is working (dhcp request, ping). If I try
>>>> to use wget in the guest, it will stall until a tcpdump is started on
>>>> the vif interface in DOM0. I wasn't able to find why.
>>>>
>>>
>>> I think in wget workload you're more likely to break down 64K pages to
>>> 4K pages. Some of your calculation of mfn, offset might be wrong.
>>
>> If so, why tcpdump on the vif interface would make wget suddenly
>> working? Does it make netback use a different path?
> 
> No, but it might make the core network component behave differently; this is
> only my suspicion.
> 
> Do you see malformed packets with tcpdump?

I don't see any malformed packets with tcpdump. The connection is stalling
until tcpdump is started on the vif in dom0.

>>
>>>> I have not modified XEN_NETBK_RX_SLOTS_MAX because I wasn't sure what
>>>> it's used for (I have limited knowledge on the network driver).
>>>>
>>>
>>> This is the maximum slots a guest packet can use. AIUI the protocol
>>> still works on 4K granularity (you break 64K page to a bunch of 4K
>>> pages), you don't need to change this.
>>
>> 1 slot = 1 grant right? If so, XEN_NETBK_RX_SLOTS_MAX is based on the
>> number of Linux page. So we would have to get the number for Xen page.
>>
> 
> Yes, 1 slot = 1 grant. I see what you're up to now. Yes, you need to
> change this constant to match underlying HV page.
> 
>> Although, I tried multiplying by XEN_PFN_PER_PAGE (64KB/4KB = 16)
>> but it gets stuck in the loop.
>>
> 
> I don't follow. What is the new #define? Which loop does it get stuck?


diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 0eda6e9..c2a5402 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -204,7 +204,7 @@ struct xenvif_queue { /* Per-queue data for xenvif */
 /* Maximum number of Rx slots a to-guest packet may use, including the
  * slot needed for GSO meta-data.
  */
-#define XEN_NETBK_RX_SLOTS_MAX (MAX_SKB_FRAGS + 1)
+#define XEN_NETBK_RX_SLOTS_MAX ((MAX_SKB_FRAGS + 1) * XEN_PFN_PER_PAGE)
 
 enum state_bit_shift {
        /* This bit marks that the vif is connected */

The function xenvif_wait_for_rx_work never returns. I guess it's because there
are not enough slots available.

For 64KB page granularity we ask for 16 times more slots than for 4KB page
granularity, although it's very unlikely that all the slots will be used.

FWIW I pointed out the same problem on blkfront.


>>>
>>>>  		queue->tx_copy_ops[*copy_ops].dest.domid = DOMID_SELF;
>>>>  		queue->tx_copy_ops[*copy_ops].dest.offset =
>>>> -			offset_in_page(skb->data);
>>>> +			offset_in_page(skb->data) & ~XEN_PAGE_MASK;
>>>>  
>>>>  		queue->tx_copy_ops[*copy_ops].len = data_len;
>>>>  		queue->tx_copy_ops[*copy_ops].flags = GNTCOPY_source_gref;
>>>> @@ -1366,8 +1367,8 @@ static int xenvif_handle_frag_list(struct xenvif_queue *queue, struct sk_buff *s
>>>
>>> This function is to coalesce frag_list to a new SKB. It's completely
>>> fine to use the natural granularity of backend domain. The way you
>>> modified it can lead to waste of memory, i.e. you only use first 4K of a
>>> 64K page.
>>
>> Thanks for explaining. I wasn't sure how the function works so I change
>> it for safety. I will redo the change.
>>
>> FWIW, I'm sure there is other place in netback where we waste memory
>> with 64KB page granularity (such as grant table). I need to track them.
>>
>> Let me know if you have some place in mind where the memory usage can be
>> improved.
>>
> 
> I was about to say the mmap_pages array is an array of pages. But that
> probably belongs to grant table driver.

Yes, a lot of rework will be needed in the grant table driver in order
to avoid wasting memory.

Regards,

-- 
Julien Grall

^ permalink raw reply related	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 00/23] arm64: Add support for 64KB page granularity in Xen guest
  2015-05-15 15:45   ` David Vrabel
@ 2015-05-18 12:23     ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-18 12:23 UTC (permalink / raw)
  To: David Vrabel, Julien Grall, xen-devel
  Cc: wei.liu2, ian.campbell, stefano.stabellini, tim, linux-kernel,
	boris.ostrovsky, linux-arm-kernel, roger.pau

Hi David,

On 15/05/15 16:45, David Vrabel wrote:
> On 14/05/15 18:00, Julien Grall wrote:
>> Hi all,
>>
>> ARM64 Linux is supporting both 4KB and 64KB page granularity. Although, Xen
>> hypercall interface and PV protocol are always based on 4KB page granularity.
>>
>> Any attempt to boot a Linux guest with 64KB pages enabled will result to a
>> guest crash.
>>
>> This series is a first attempt to allow those Linux running with the current
>> hypercall interface and PV protocol.
>>
>> This solution has been chosen because we want to run Linux 64KB in released
>> Xen ARM version or/and platform using an old version of Linux DOM0.
> 
> The key problem I see with this approach is the confusion between guest
> page size and Xen page size.  This is going to be particularly
> problematic since the majority of development/usage will remain on x86
> where PAGE_SIZE == XEN_PAGE_SIZE.
> 
> I think it would be nice to keep XEN_PAGE_SIZE etc out of front and
> backend drivers.  Perhaps with a suitable set of helper functions?

Even with the helpers, we are not protected from any change in the
frontend/backend that will impact 64K. It won't be possible to remove
all the XEN_PAGE_* usage (there are lots of places where adding helpers
would not be possible) and we would still have to carefully review any
changes.

I think it may be possible to move the grant table splitting into helpers,
which would also help to support different grant sizes.

It would, however, require a significant amount of work, at least in blkfront.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 21/23] net/xen-netback: Make it running on 64KB page granularity
  2015-05-18 12:11           ` Julien Grall
@ 2015-05-18 12:54             ` Wei Liu
  -1 siblings, 0 replies; 200+ messages in thread
From: Wei Liu @ 2015-05-18 12:54 UTC (permalink / raw)
  To: Julien Grall
  Cc: Wei Liu, ian.campbell, stefano.stabellini, netdev, tim,
	linux-kernel, xen-devel, linux-arm-kernel

On Mon, May 18, 2015 at 01:11:26PM +0100, Julien Grall wrote:
> Hi Wei,
> 
> On 15/05/15 16:31, Wei Liu wrote:
> > On Fri, May 15, 2015 at 01:35:42PM +0100, Julien Grall wrote:
> >> On 15/05/15 03:35, Wei Liu wrote:
> >>> On Thu, May 14, 2015 at 06:01:01PM +0100, Julien Grall wrote:
> >>>> The PV network protocol is using 4KB page granularity. The goal of this
> >>>> patch is to allow a Linux using 64KB page granularity working as a
> >>>> network backend on a non-modified Xen.
> >>>>
> >>>> It's only necessary to adapt the ring size and break skb data in small
> >>>> chunk of 4KB. The rest of the code is relying on the grant table code.
> >>>>
> >>>> Although only simple workload is working (dhcp request, ping). If I try
> >>>> to use wget in the guest, it will stall until a tcpdump is started on
> >>>> the vif interface in DOM0. I wasn't able to find why.
> >>>>
> >>>
> >>> I think in wget workload you're more likely to break down 64K pages to
> >>> 4K pages. Some of your calculation of mfn, offset might be wrong.
> >>
> >> If so, why tcpdump on the vif interface would make wget suddenly
> >> working? Does it make netback use a different path?
> > 
> > No, but it might make the core network component behave differently; this is
> > only my suspicion.
> > 
> > Do you see malformed packets with tcpdump?
> 
> I don't see any malformed packets with tcpdump. The connection is stalling
> until tcpdump is started on the vif in dom0.
> 

Hmm... I don't have an immediate idea about this.

Ian said skb_orphan is called with tcpdump. If I remember correctly, that
would trigger the callback to release the slots in netback. It could be
that another part of Linux is holding onto the skbs for too long.

If you're wgetting from another host, I would suggest wgetting from Dom0
instead, to narrow the problem down to the path between Dom0 and DomU.

> >>
> >>>> I have not modified XEN_NETBK_RX_SLOTS_MAX because I wasn't sure what
> >>>> it's used for (I have limited knowledge on the network driver).
> >>>>
> >>>
> >>> This is the maximum slots a guest packet can use. AIUI the protocol
> >>> still works on 4K granularity (you break 64K page to a bunch of 4K
> >>> pages), you don't need to change this.
> >>
> >> 1 slot = 1 grant right? If so, XEN_NETBK_RX_SLOTS_MAX is based on the
> >> number of Linux page. So we would have to get the number for Xen page.
> >>
> > 
> > Yes, 1 slot = 1 grant. I see what you're up to now. Yes, you need to
> > change this constant to match underlying HV page.
> > 
> >> Although, I gave a try to multiple by XEN_PFN_PER_PAGE (4KB/64KB = 16)
> >> but it get stuck in the loop.
> >>
> > 
> > I don't follow. What is the new #define? Which loop does it get stuck?
> 
> 
> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
> index 0eda6e9..c2a5402 100644
> --- a/drivers/net/xen-netback/common.h
> +++ b/drivers/net/xen-netback/common.h
> @@ -204,7 +204,7 @@ struct xenvif_queue { /* Per-queue data for xenvif */
>  /* Maximum number of Rx slots a to-guest packet may use, including the
>   * slot needed for GSO meta-data.
>   */
> -#define XEN_NETBK_RX_SLOTS_MAX (MAX_SKB_FRAGS + 1)
> +#define XEN_NETBK_RX_SLOTS_MAX ((MAX_SKB_FRAGS + 1) * XEN_PFN_PER_PAGE)
>  
>  enum state_bit_shift {
>         /* This bit marks that the vif is connected */
> 
> The function xenvif_wait_for_rx_work never returns. I guess it's because there
> are not enough slots available.
> 
> For 64KB page granularity we ask for 16 times more slots than for 4KB page
> granularity, although it's very unlikely that all the slots will be used.
> 
> FWIW I pointed out the same problem on blkfront.
> 

This is not going to work. The ring in netfront / netback has only 256
slots, yet you now ask netback to reserve more than 256 slots -- (17 +
1) * (64 / 4) = 288, which can never be fulfilled. See the call to
xenvif_rx_ring_slots_available.

I think XEN_NETBK_RX_SLOTS_MAX is derived from the fact that each packet to
the guest cannot be larger than 64K. So you might be able to use

#define XEN_NETBK_RX_SLOTS_MAX ((65536 / XEN_PAGE_SIZE) + 1)

The blk driver may have a different story. But the default ring size (1
page) yields even fewer slots than net (given that sizeof(union(req/rsp))
is larger, IIRC).

Wei.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 01/23] xen: Include xen/page.h rather than asm/xen/page.h
  2015-05-14 17:00   ` Julien Grall
@ 2015-05-19 13:50     ` David Vrabel
  -1 siblings, 0 replies; 200+ messages in thread
From: David Vrabel @ 2015-05-19 13:50 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Wei Liu, ian.campbell, stefano.stabellini, netdev, tim,
	linux-kernel, David Vrabel, Boris Ostrovsky, linux-arm-kernel

On 14/05/15 18:00, Julien Grall wrote:
> Using xen/page.h will be necessary later for using common xen page
> helpers.
> 
> As xen/page.h already includes asm/xen/page.h, always use the former.

Reviewed-by: David Vrabel <david.vrabel@citrix.com>

David

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 02/23] xen/xenbus: client: Fix call of virt_to_mfn in xenbus_grant_ring
  2015-05-14 17:00   ` Julien Grall
@ 2015-05-19 13:51     ` David Vrabel
  -1 siblings, 0 replies; 200+ messages in thread
From: David Vrabel @ 2015-05-19 13:51 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Wei Liu, ian.campbell, stefano.stabellini, tim, linux-kernel,
	David Vrabel, Boris Ostrovsky, linux-arm-kernel

On 14/05/15 18:00, Julien Grall wrote:
> virt_to_mfn should take a void * rather than an unsigned long. While it
> doesn't really matter now, it would throw a compiler warning later once
> virt_to_mfn enforces the type.
> 
> At the same time, avoid computing a new virtual address on every loop
> iteration and directly increment the parameter, as we don't use it later.

Reviewed-by: David Vrabel <david.vrabel@citrix.com>

But...

> --- a/drivers/xen/xenbus/xenbus_client.c
> +++ b/drivers/xen/xenbus/xenbus_client.c
> @@ -379,16 +379,16 @@ int xenbus_grant_ring(struct xenbus_device *dev, void *vaddr,
>  	int i, j;
>  
>  	for (i = 0; i < nr_pages; i++) {
> -		unsigned long addr = (unsigned long)vaddr +
> -			(PAGE_SIZE * i);
>  		err = gnttab_grant_foreign_access(dev->otherend_id,
> -						  virt_to_mfn(addr), 0);
> +						  virt_to_mfn(vaddr), 0);
>  		if (err < 0) {
>  			xenbus_dev_fatal(dev, err,
>  					 "granting access to ring page");
>  			goto fail;
>  		}
>  		grefs[i] = err;
> +
> +		vaddr = (char *)vaddr + PAGE_SIZE;

You don't need the cast here since vaddr is a void *.

David

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 03/23] xen/grant-table: Remove unused macro SPP
  2015-05-14 17:00   ` Julien Grall
@ 2015-05-19 13:52     ` David Vrabel
  -1 siblings, 0 replies; 200+ messages in thread
From: David Vrabel @ 2015-05-19 13:52 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: ian.campbell, stefano.stabellini, tim, linux-kernel,
	David Vrabel, Boris Ostrovsky, linux-arm-kernel

On 14/05/15 18:00, Julien Grall wrote:
> SPP was used by the grant table v2 code which has been removed in
> commit 438b33c7145ca8a5131a30c36d8f59bce119a19a "xen/grant-table:
> remove support for V2 tables".

Reviewed-by: David Vrabel <david.vrabel@citrix.com>

David

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 07/23] net/xen-netfront: Correct printf format in xennet_get_responses
  2015-05-14 17:00   ` Julien Grall
@ 2015-05-19 13:53     ` David Vrabel
  -1 siblings, 0 replies; 200+ messages in thread
From: David Vrabel @ 2015-05-19 13:53 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: ian.campbell, stefano.stabellini, netdev, tim, linux-kernel,
	David Vrabel, Boris Ostrovsky, linux-arm-kernel

On 14/05/15 18:00, Julien Grall wrote:
> rx->status is an int16_t, print it using %d rather than %u in order to
> have a meaningful value when the field is negative.

Reviewed-by: David Vrabel <david.vrabel@citrix.com>

David

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 12/23] xen: Extend page_to_mfn to take an offset in the page
  2015-05-14 17:00   ` Julien Grall
@ 2015-05-19 13:57     ` David Vrabel
  -1 siblings, 0 replies; 200+ messages in thread
From: David Vrabel @ 2015-05-19 13:57 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: ian.campbell, stefano.stabellini, netdev, tim, linux-kernel,
	David Vrabel, Boris Ostrovsky, linux-arm-kernel

On 14/05/15 18:00, Julien Grall wrote:
> With 64KB page granularity support in Linux, a page will be split across
> multiple MFNs (Xen uses 4KB page granularity). Those MFNs may not be
> contiguous.
> 
> With the offset in the page, the helper will be able to know which MFN
> the driver needs to retrieve.

I think a gnttab_grant_foreign_access_ref()-like helper that takes a
page would be better.

You will probably want this helper able to return/fill a set of refs for
64 KiB pages.

David

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 13/23] xen/xenbus: Use Xen page definition
  2015-05-14 17:00   ` Julien Grall
@ 2015-05-19 13:59     ` David Vrabel
  -1 siblings, 0 replies; 200+ messages in thread
From: David Vrabel @ 2015-05-19 13:59 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: ian.campbell, stefano.stabellini, tim, linux-kernel,
	David Vrabel, Boris Ostrovsky, linux-arm-kernel

On 14/05/15 18:00, Julien Grall wrote:
> The xenstore ring is always based on the page granularity of Xen.
[...]
> --- a/drivers/xen/xenbus/xenbus_probe.c
> +++ b/drivers/xen/xenbus/xenbus_probe.c
> @@ -713,7 +713,7 @@ static int __init xenstored_local_init(void)
>  
>  	xen_store_mfn = xen_start_info->store_mfn =
>  		pfn_to_mfn(virt_to_phys((void *)page) >>
> -			   PAGE_SHIFT);
> +			   XEN_PAGE_SHIFT);

This is page_to_mfn() that you adjusted in the previous patch.

David

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 02/23] xen/xenbus: client: Fix call of virt_to_mfn in xenbus_grant_ring
  2015-05-19 13:51     ` David Vrabel
@ 2015-05-19 14:12       ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-19 14:12 UTC (permalink / raw)
  To: David Vrabel, Julien Grall, xen-devel
  Cc: Wei Liu, ian.campbell, stefano.stabellini, tim, linux-kernel,
	Boris Ostrovsky, linux-arm-kernel

Hi David,

On 19/05/15 14:51, David Vrabel wrote:
>> --- a/drivers/xen/xenbus/xenbus_client.c
>> +++ b/drivers/xen/xenbus/xenbus_client.c
>> @@ -379,16 +379,16 @@ int xenbus_grant_ring(struct xenbus_device *dev, void *vaddr,
>>  	int i, j;
>>  
>>  	for (i = 0; i < nr_pages; i++) {
>> -		unsigned long addr = (unsigned long)vaddr +
>> -			(PAGE_SIZE * i);
>>  		err = gnttab_grant_foreign_access(dev->otherend_id,
>> -						  virt_to_mfn(addr), 0);
>> +						  virt_to_mfn(vaddr), 0);
>>  		if (err < 0) {
>>  			xenbus_dev_fatal(dev, err,
>>  					 "granting access to ring page");
>>  			goto fail;
>>  		}
>>  		grefs[i] = err;
>> +
>> +		vaddr = (char *)vaddr + PAGE_SIZE;
> 
> You don't need the cast here since vaddr is a void *.

Arithmetic on void pointers is a GCC extension [1]. I wasn't sure what
the Linux policy on it is.

Regards,

[1] https://gcc.gnu.org/onlinedocs/gcc/Pointer-Arith.html#Pointer-Arith


-- 
Julien Grall

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [RFC 10/23] xen/biomerge: WORKAROUND always says the biovec are not mergeable
  2015-05-15 15:54     ` Boris Ostrovsky
@ 2015-05-19 14:16       ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-19 14:16 UTC (permalink / raw)
  To: Boris Ostrovsky, Julien Grall, xen-devel
  Cc: ian.campbell, Konrad Rzeszutek Wilk, stefano.stabellini, tim,
	linux-kernel, David Vrabel, linux-arm-kernel

Hi Boris,

On 15/05/15 16:54, Boris Ostrovsky wrote:
> On 05/14/2015 01:00 PM, Julien Grall wrote:
>> When Linux is using 64K page granularity, every page will be split into
>> multiple non-contiguous 4K MFNs.
>>
>> I'm not sure how to efficiently handle the check to know whether we can
>> merge 2 biovecs in such a case. So for now, always say that biovecs are
>> not mergeable.
>>
>> Signed-off-by: Julien Grall <julien.grall@citrix.com>
>> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
>> Cc: David Vrabel <david.vrabel@citrix.com>
>> ---
>>   drivers/xen/biomerge.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/drivers/xen/biomerge.c b/drivers/xen/biomerge.c
>> index 0edb91c..20387c2 100644
>> --- a/drivers/xen/biomerge.c
>> +++ b/drivers/xen/biomerge.c
>> @@ -9,6 +9,9 @@ bool xen_biovec_phys_mergeable(const struct bio_vec
>> *vec1,
>>       unsigned long mfn1 = pfn_to_mfn(page_to_pfn(vec1->bv_page));
>>       unsigned long mfn2 = pfn_to_mfn(page_to_pfn(vec2->bv_page));
>>   +    /* TODO: Implement it correctly */
>> +    return 0;
>> +
>>       return __BIOVEC_PHYS_MERGEABLE(vec1, vec2) &&
>>           ((mfn1 == mfn2) || ((mfn1+1) == mfn2));
>>   }
> 
> 
> I think this is a bit too blunt.  Perhaps check first whether page sizes
> are different in the hypervisor and the guest?

Sounds good.

> 
> (And I am not sure we need __BIOVEC_PHYS_MERGEABLE() test here as it is
> already checked by BIOVEC_PHYS_MERGEABLE() which appears to be the only
> user of xen_biovec_phys_mergeable())

I can send a patch to drop __BIOVEC_PHYS_MERGEABLE.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 12/23] xen: Extend page_to_mfn to take an offset in the page
  2015-05-19 13:57     ` David Vrabel
@ 2015-05-19 14:18       ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-19 14:18 UTC (permalink / raw)
  To: David Vrabel, Julien Grall, xen-devel
  Cc: ian.campbell, stefano.stabellini, netdev, tim, linux-kernel,
	Boris Ostrovsky, linux-arm-kernel

Hi David,

On 19/05/15 14:57, David Vrabel wrote:
> On 14/05/15 18:00, Julien Grall wrote:
>> With 64KB page granularity support in Linux, a page will be split across
>> multiple MFNs (Xen uses 4KB page granularity). Those MFNs may not be
>> contiguous.
>>
>> With the offset in the page, the helper will be able to know which MFN
>> the driver needs to retrieve.
> 
> I think a gnttab_grant_foreign_access_ref()-like helper that takes a
> page would be better.
>
> You will probably want this helper able to return/fill a set of refs for
> 64 KiB pages.

I will see what I can do.

Although I think this patch is still valid, to avoid wrong usage by the
caller with 64KB page granularity.

The developer may think that MFNs are contiguous, which is not always true.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 13/23] xen/xenbus: Use Xen page definition
  2015-05-19 13:59     ` David Vrabel
@ 2015-05-19 14:19       ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-19 14:19 UTC (permalink / raw)
  To: David Vrabel, Julien Grall, xen-devel
  Cc: ian.campbell, stefano.stabellini, tim, linux-kernel,
	Boris Ostrovsky, linux-arm-kernel

Hi David,

On 19/05/15 14:59, David Vrabel wrote:
> On 14/05/15 18:00, Julien Grall wrote:
>> The xenstore ring is always based on the page granularity of Xen.
> [...]
>> --- a/drivers/xen/xenbus/xenbus_probe.c
>> +++ b/drivers/xen/xenbus/xenbus_probe.c
>> @@ -713,7 +713,7 @@ static int __init xenstored_local_init(void)
>>  
>>  	xen_store_mfn = xen_start_info->store_mfn =
>>  		pfn_to_mfn(virt_to_phys((void *)page) >>
>> -			   PAGE_SHIFT);
>> +			   XEN_PAGE_SHIFT);
> 
> This is page_to_mfn() that you adjusted in the previous patch.

Right, I think there are a couple of other places where page_to_mfn can
be used.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 15/23] xen/balloon: Don't rely on the page granularity is the same for Xen and Linux
  2015-05-14 17:00   ` Julien Grall
@ 2015-05-19 15:23     ` David Vrabel
  -1 siblings, 0 replies; 200+ messages in thread
From: David Vrabel @ 2015-05-19 15:23 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Wei Liu, ian.campbell, stefano.stabellini, tim, linux-kernel,
	David Vrabel, Boris Ostrovsky, linux-arm-kernel

On 14/05/15 18:00, Julien Grall wrote:
> For ARM64 guests, Linux is able to support either 64K or 4K page
> granularity. However, the hypercall interface is always based on 4K
> page granularity.
> 
> With 64K page granularity, a single page will be spread over multiple
> Xen frames.
> 
> When a driver requests/frees a balloon page, the balloon driver will have
> to split the Linux page into 4K chunks before asking Xen to add/remove the
> frames from the guest.
> 
> Note that this can work on any page granularity assuming it's a multiple
> of 4K.
[...]
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -91,7 +91,7 @@ struct balloon_stats balloon_stats;
>  EXPORT_SYMBOL_GPL(balloon_stats);
>  
>  /* We increase/decrease in batches which fit in a page */
> -static xen_pfn_t frame_list[PAGE_SIZE / sizeof(unsigned long)];
> +static xen_pfn_t frame_list[XEN_PAGE_SIZE / sizeof(unsigned long)];

PAGE_SIZE is appropriate here, since this is a guest-side array.

> +		if (!(i % XEN_PFN_PER_PAGE)) {

Ick.  Can you refactor this into a loop per page calling a function that
loops per MFN.

Also similar tests elsewhere.

David

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 16/23] xen/events: fifo: Make it running on 64KB granularity
  2015-05-14 17:00   ` Julien Grall
@ 2015-05-19 15:25     ` David Vrabel
  -1 siblings, 0 replies; 200+ messages in thread
From: David Vrabel @ 2015-05-19 15:25 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: ian.campbell, stefano.stabellini, tim, linux-kernel,
	David Vrabel, Boris Ostrovsky, linux-arm-kernel

On 14/05/15 18:00, Julien Grall wrote:
> Only use the first 4KB of the page to store the event channel info. It
> means that we will waste 60KB every time we allocate a page for:
>      * control block: a page is allocated per CPU
>      * event array: a page is allocated every time we need to expand it
> 
> I think we can reduce the memory waste for the 2 areas by:
> 
>     * control block: sharing between multiple vCPUs. Although it will
>     require some bookkeeping in order to not free the page when the CPU
>     goes offline while the other CPUs sharing the page are still there
> 
>     * event array: always extending the event array by 64K (i.e. 16 4K
>     chunks). That would require more care when we fail to expand the
>     event channel.

I think you want an xen_alloc_page_for_xen() or similar to allocate a 4
KiB size/aligned block.

But as-is:

Reviewed-by: David Vrabel <david.vrabel@citrix.com>

David

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 17/23] xen/grant-table: Make it running on 64KB granularity
  2015-05-14 17:00   ` Julien Grall
@ 2015-05-19 15:27     ` David Vrabel
  -1 siblings, 0 replies; 200+ messages in thread
From: David Vrabel @ 2015-05-19 15:27 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: Russell King, ian.campbell, stefano.stabellini, tim,
	linux-kernel, David Vrabel, Boris Ostrovsky, linux-arm-kernel

On 14/05/15 18:00, Julien Grall wrote:
> The Xen interface is using 4KB page granularity. This means that each
> grant is 4KB.
> 
> The current implementation allocates a Linux page per grant. On Linux
> using 64KB page granularity, only the first 4KB of the page will be
> used.
> 
> We could decrease the memory wasted by sharing the page with multiple
> grants. It will require some care with the {Set,Clear}ForeignPage macro.
> 
> Note that no changes have been made in the x86 code because both Linux
> and Xen will only use 4KB page granularity.

Reviewed-by: David Vrabel <david.vrabel@citrix.com>

David

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 22/23] xen/privcmd: Add support for Linux 64KB page granularity
  2015-05-14 17:01   ` Julien Grall
@ 2015-05-19 15:39     ` David Vrabel
  -1 siblings, 0 replies; 200+ messages in thread
From: David Vrabel @ 2015-05-19 15:39 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: ian.campbell, stefano.stabellini, tim, linux-kernel,
	David Vrabel, Boris Ostrovsky, linux-arm-kernel

On 14/05/15 18:01, Julien Grall wrote:
> The hypercall interface (as well as the toolstack) is always using 4KB
> page granularity. When the toolstack asks to map a series of guest
> PFNs in a batch, it expects the pages to be mapped contiguously in
> its virtual memory.
> 
> When Linux is using 64KB page granularity, the privcmd driver will have
> to map multiple Xen PFNs in a single Linux page.
> 
> Note that this solution works on page granularity which is a multiple of
> 4KB.
[...]
> --- a/drivers/xen/xlate_mmu.c
> +++ b/drivers/xen/xlate_mmu.c
> @@ -63,6 +63,7 @@ static int map_foreign_page(unsigned long lpfn, unsigned long fgmfn,
>  
>  struct remap_data {
>  	xen_pfn_t *fgmfn; /* foreign domain's gmfn */
> +	xen_pfn_t *egmfn; /* end foreign domain's gmfn */

I don't know what you mean by "end foreign domain".

>  	pgprot_t prot;
>  	domid_t  domid;
>  	struct vm_area_struct *vma;
> @@ -78,17 +79,23 @@ static int remap_pte_fn(pte_t *ptep, pgtable_t token, unsigned long addr,
>  {
>  	struct remap_data *info = data;
>  	struct page *page = info->pages[info->index++];
> -	unsigned long pfn = page_to_pfn(page);
> -	pte_t pte = pte_mkspecial(pfn_pte(pfn, info->prot));
> +	unsigned long pfn = xen_page_to_pfn(page);
> +	pte_t pte = pte_mkspecial(pfn_pte(page_to_pfn(page), info->prot));
>  	int rc;
> -
> -	rc = map_foreign_page(pfn, *info->fgmfn, info->domid);
> -	*info->err_ptr++ = rc;
> -	if (!rc) {
> -		set_pte_at(info->vma->vm_mm, addr, ptep, pte);
> -		info->mapped++;
> +	uint32_t i;
> +
> +	for (i = 0; i < XEN_PFN_PER_PAGE; i++) {
> +		if (info->fgmfn == info->egmfn)
> +			break;
> +
> +		rc = map_foreign_page(pfn++, *info->fgmfn, info->domid);
> +		*info->err_ptr++ = rc;
> +		if (!rc) {
> +			set_pte_at(info->vma->vm_mm, addr, ptep, pte);
> +			info->mapped++;
> +		}
> +		info->fgmfn++;

This doesn't make any sense to me.  Don't you need to gather the foreign
GFNs into batches of PAGE_SIZE / XEN_PAGE_SIZE and map these all at once
into a 64 KiB page?  I don't see how you can have a set_pte_at() for
each foreign GFN.

David

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 21/23] net/xen-netback: Make it running on 64KB page granularity
  2015-05-18 12:54             ` Wei Liu
@ 2015-05-19 22:56               ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-19 22:56 UTC (permalink / raw)
  To: Wei Liu, ian.campbell
  Cc: Julien Grall, stefano.stabellini, netdev, tim, linux-kernel,
	xen-devel, linux-arm-kernel

Hi,

On 18/05/2015 13:54, Wei Liu wrote:
> On Mon, May 18, 2015 at 01:11:26PM +0100, Julien Grall wrote:
>> On 15/05/15 16:31, Wei Liu wrote:
>>> On Fri, May 15, 2015 at 01:35:42PM +0100, Julien Grall wrote:
>>>> On 15/05/15 03:35, Wei Liu wrote:
>>>>> On Thu, May 14, 2015 at 06:01:01PM +0100, Julien Grall wrote:
>>>>>> The PV network protocol is using 4KB page granularity. The goal of this
>>>>>> patch is to allow a Linux using 64KB page granularity working as a
>>>>>> network backend on a non-modified Xen.
>>>>>>
>>>>>> It's only necessary to adapt the ring size and break skb data in small
>>>>>> chunk of 4KB. The rest of the code is relying on the grant table code.
>>>>>>
>>>>>> Although only simple workload is working (dhcp request, ping). If I try
>>>>>> to use wget in the guest, it will stall until a tcpdump is started on
>>>>>> the vif interface in DOM0. I wasn't able to find why.
>>>>>>
>>>>>
>>>>> I think in wget workload you're more likely to break down 64K pages to
>>>>> 4K pages. Some of your calculation of mfn, offset might be wrong.
>>>>
>>>> If so, why tcpdump on the vif interface would make wget suddenly
>>>> working? Does it make netback use a different path?
>>>
>>> No, but if might make core network component behave differently, this is
>>> only my suspicion.
>>>
>>> Do you see malformed packets with tcpdump?
>>
>> I don't see any malformed packets with tcpdump. The connection is stalling
>> until tcpdump is started on the vif in dom0.
>>
>
> Hmm... Don't have immediate idea about this.
>
> Ian said skb_orphan is called with tcpdump. If I remember correct that
> would trigger the callback to release the slots in netback. It could be
> that other part of Linux is holding onto the skbs for too long.
>
> If you're wgetting from another host, I would suggest wgetting from Dom0
> to limit the problem between Dom0 and DomU.

Thanks to Wei, I was able to narrow the problem. It looks like the 
problem is not coming from netback but somewhere else down in the 
network stack: wget/ssh between Dom0 64KB and DomU is working fine.

However, wget/ssh between a guest and an external host doesn't work 
when Dom0 is using 64KB page granularity unless I start a tcpdump on 
the vif in DOM0. Does anyone have an idea?

I have no issue with wget/ssh from DOM0 to an external host, and the same 
kernel with 4KB page granularity (i.e. same source code but rebuilt with 
4KB) doesn't show any issue with wget/ssh in the guest.

This has been tested on AMD Seattle, the guest kernel is the same on 
every test (4KB page granularity).

I'm planning to give it a try tomorrow on X-Gene (an ARM64 board where I 
think 64KB page granularity is supported) to see if I can reproduce the bug.

>> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
>> index 0eda6e9..c2a5402 100644
>> --- a/drivers/net/xen-netback/common.h
>> +++ b/drivers/net/xen-netback/common.h
>> @@ -204,7 +204,7 @@ struct xenvif_queue { /* Per-queue data for xenvif */
>>   /* Maximum number of Rx slots a to-guest packet may use, including the
>>    * slot needed for GSO meta-data.
>>    */
>> -#define XEN_NETBK_RX_SLOTS_MAX (MAX_SKB_FRAGS + 1)
>> +#define XEN_NETBK_RX_SLOTS_MAX ((MAX_SKB_FRAGS + 1) * XEN_PFN_PER_PAGE)
>>
>>   enum state_bit_shift {
>>          /* This bit marks that the vif is connected */
>>
>> The function xenvif_wait_for_rx_work never returns. I guess it's because there
>> is not enough slot available.
>>
>> For 64KB page granularity we ask for 16 times more slots than 4KB page
>> granularity. Although, it's very unlikely that all the slot will be used.
>>
>> FWIW I pointed out the same problem on blkfront.
>>
>
> This is not going to work. The ring in netfront / netback has only 256
> slots. Now you ask for netback to reserve more than 256 slots -- (17 +
> 1) * (64 / 4) = 288, which can never be fulfilled. See the call to
> xenvif_rx_ring_slots_available.
>
> I think XEN_NETBK_RX_SLOTS_MAX derived from the fact the each packet to
> the guest cannot be larger than 64K. So you might be able to
>
> #define XEN_NETBK_RX_SLOTS_MAX ((65536 / XEN_PAGE_SIZE) + 1)

I didn't know that a packet cannot be larger than 64KB. That simplifies 
the problem a lot.

>
> Blk driver may have a different story. But the default ring size (1
> page) yields even less slots than net (given that sizeof(union(req/rsp))
> is larger IIRC).

I will see with Roger for Blkback.


-- 
Julien Grall

^ permalink raw reply	[flat|nested] 200+ messages in thread

* [Xen-devel] [RFC 21/23] net/xen-netback: Make it running on 64KB page granularity
@ 2015-05-19 22:56               ` Julien Grall
  0 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-19 22:56 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 18/05/2015 13:54, Wei Liu wrote:
> On Mon, May 18, 2015 at 01:11:26PM +0100, Julien Grall wrote:
>> On 15/05/15 16:31, Wei Liu wrote:
>>> On Fri, May 15, 2015 at 01:35:42PM +0100, Julien Grall wrote:
>>>> On 15/05/15 03:35, Wei Liu wrote:
>>>>> On Thu, May 14, 2015 at 06:01:01PM +0100, Julien Grall wrote:
>>>>>> The PV network protocol is using 4KB page granularity. The goal of this
>>>>>> patch is to allow a Linux using 64KB page granularity working as a
>>>>>> network backend on a non-modified Xen.
>>>>>>
>>>>>> It's only necessary to adapt the ring size and break skb data in small
>>>>>> chunk of 4KB. The rest of the code is relying on the grant table code.
>>>>>>
>>>>>> Although only simple workload is working (dhcp request, ping). If I try
>>>>>> to use wget in the guest, it will stall until a tcpdump is started on
>>>>>> the vif interface in DOM0. I wasn't able to find why.
>>>>>>
>>>>>
>>>>> I think in wget workload you're more likely to break down 64K pages to
>>>>> 4K pages. Some of your calculation of mfn, offset might be wrong.
>>>>
>>>> If so, why tcpdump on the vif interface would make wget suddenly
>>>> working? Does it make netback use a different path?
>>>
>>> No, but if might make core network component behave differently, this is
>>> only my suspicion.
>>>
>>> Do you see malformed packets with tcpdump?
>>
>> I don't see any malformed packets with tcpdump. The connection is stalling
>> until tcpdump is started on the vif in dom0.
>>
>
> Hmm... Don't have immediate idea about this.
>
> Ian said skb_orphan is called with tcpdump. If I remember correct that
> would trigger the callback to release the slots in netback. It could be
> that other part of Linux is holding onto the skbs for too long.
>
> If you're wgetting from another host, I would suggest wgetting from Dom0
> to limit the problem between Dom0 and DomU.

Thanks to Wei, I was able to narrow the problem. It looks like the 
problem is not coming from netback but somewhere else down in the 
network stack: wget/ssh between Dom0 64KB and DomU is working fine.

Although, wget/ssh between a guest and an external host doesn't work 
when Dom0 is using 64KB page granularity unless if I start a tcpdump on 
the vif in DOM0. Anyone an idea?

I have no issue to wget/ssh in DOM0 to an external host and the same 
kernel with 4KB page granularity (i.e same source code but rebuilt with 
4KB) doesn't show any issue with wget/ssh in the guest.

This has been tested on AMD Seattle, the guest kernel is the same on 
every test (4KB page granularity).

I'm planning to give a try tomorrow on X-gene (ARM64 board and I think 
64KB page granularity is supported) to see if I can reproduce the bug.

>> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
>> index 0eda6e9..c2a5402 100644
>> --- a/drivers/net/xen-netback/common.h
>> +++ b/drivers/net/xen-netback/common.h
>> @@ -204,7 +204,7 @@ struct xenvif_queue { /* Per-queue data for xenvif */
>>   /* Maximum number of Rx slots a to-guest packet may use, including the
>>    * slot needed for GSO meta-data.
>>    */
>> -#define XEN_NETBK_RX_SLOTS_MAX (MAX_SKB_FRAGS + 1)
>> +#define XEN_NETBK_RX_SLOTS_MAX ((MAX_SKB_FRAGS + 1) * XEN_PFN_PER_PAGE)
>>
>>   enum state_bit_shift {
>>          /* This bit marks that the vif is connected */
>>
>> The function xenvif_wait_for_rx_work never returns. I guess it's because there
>> is not enough slot available.
>>
>> For 64KB page granularity we ask for 16 times more slots than 4KB page
>> granularity. Although, it's very unlikely that all the slot will be used.
>>
>> FWIW I pointed out the same problem on blkfront.
>>
>
> This is not going to work. The ring in netfront / netback has only 256
> slots. Now you ask for netback to reserve more than 256 slots -- (17 +
> 1) * (64 / 4) = 288, which can never be fulfilled. See the call to
> xenvif_rx_ring_slots_available.
>
> I think XEN_NETBK_RX_SLOTS_MAX derived from the fact the each packet to
> the guest cannot be larger than 64K. So you might be able to
>
> #define XEN_NETBK_RX_SLOTS_MAX ((65536 / XEN_PAGE_SIZE) + 1)

I didn't know that packet cannot be larger than 64KB. That's simply a 
lot the problem.

>
> Blk driver may have a different story. But the default ring size (1
> page) yields even less slots than net (given that sizeof(union(req/rsp))
> is larger IIRC).

I will see with Roger for Blkback.


-- 
Julien Grall

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [RFC 21/23] net/xen-netback: Make it running on 64KB page granularity
  2015-05-18 12:54             ` Wei Liu
  (?)
@ 2015-05-19 22:56             ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-19 22:56 UTC (permalink / raw)
  To: Wei Liu, ian.campbell
  Cc: stefano.stabellini, netdev, tim, linux-kernel, Julien Grall,
	xen-devel, linux-arm-kernel

Hi,

On 18/05/2015 13:54, Wei Liu wrote:
> On Mon, May 18, 2015 at 01:11:26PM +0100, Julien Grall wrote:
>> On 15/05/15 16:31, Wei Liu wrote:
>>> On Fri, May 15, 2015 at 01:35:42PM +0100, Julien Grall wrote:
>>>> On 15/05/15 03:35, Wei Liu wrote:
>>>>> On Thu, May 14, 2015 at 06:01:01PM +0100, Julien Grall wrote:
>>>>>> The PV network protocol is using 4KB page granularity. The goal of this
>>>>>> patch is to allow a Linux using 64KB page granularity working as a
>>>>>> network backend on a non-modified Xen.
>>>>>>
>>>>>> It's only necessary to adapt the ring size and break skb data in small
>>>>>> chunk of 4KB. The rest of the code is relying on the grant table code.
>>>>>>
>>>>>> Although only simple workload is working (dhcp request, ping). If I try
>>>>>> to use wget in the guest, it will stall until a tcpdump is started on
>>>>>> the vif interface in DOM0. I wasn't able to find why.
>>>>>>
>>>>>
>>>>> I think in wget workload you're more likely to break down 64K pages to
>>>>> 4K pages. Some of your calculation of mfn, offset might be wrong.
>>>>
>>>> If so, why tcpdump on the vif interface would make wget suddenly
>>>> working? Does it make netback use a different path?
>>>
>>> No, but if might make core network component behave differently, this is
>>> only my suspicion.
>>>
>>> Do you see malformed packets with tcpdump?
>>
>> I don't see any malformed packets with tcpdump. The connection is stalling
>> until tcpdump is started on the vif in dom0.
>>
>
> Hmm... Don't have immediate idea about this.
>
> Ian said skb_orphan is called with tcpdump. If I remember correct that
> would trigger the callback to release the slots in netback. It could be
> that other part of Linux is holding onto the skbs for too long.
>
> If you're wgetting from another host, I would suggest wgetting from Dom0
> to limit the problem between Dom0 and DomU.

Thanks to Wei's suggestion, I was able to narrow down the problem. It 
looks like the problem is not coming from netback but from somewhere 
else down in the network stack: wget/ssh between Dom0 64KB and the DomU 
is working fine.

However, wget/ssh between a guest and an external host doesn't work 
when Dom0 is using 64KB page granularity, unless I start a tcpdump on 
the vif in DOM0. Does anyone have an idea?

I have no issue using wget/ssh from DOM0 to an external host, and the 
same kernel with 4KB page granularity (i.e. the same source code but 
rebuilt with 4KB pages) doesn't show any issue with wget/ssh in the 
guest.

This has been tested on AMD Seattle; the guest kernel is the same in 
every test (4KB page granularity).

I'm planning to give it a try tomorrow on X-Gene (an ARM64 board which 
I think supports 64KB page granularity) to see if I can reproduce the 
bug.

>> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
>> index 0eda6e9..c2a5402 100644
>> --- a/drivers/net/xen-netback/common.h
>> +++ b/drivers/net/xen-netback/common.h
>> @@ -204,7 +204,7 @@ struct xenvif_queue { /* Per-queue data for xenvif */
>>   /* Maximum number of Rx slots a to-guest packet may use, including the
>>    * slot needed for GSO meta-data.
>>    */
>> -#define XEN_NETBK_RX_SLOTS_MAX (MAX_SKB_FRAGS + 1)
>> +#define XEN_NETBK_RX_SLOTS_MAX ((MAX_SKB_FRAGS + 1) * XEN_PFN_PER_PAGE)
>>
>>   enum state_bit_shift {
>>          /* This bit marks that the vif is connected */
>>
>> The function xenvif_wait_for_rx_work never returns. I guess it's because there
>> is not enough slot available.
>>
>> For 64KB page granularity we ask for 16 times more slots than 4KB page
>> granularity. Although, it's very unlikely that all the slot will be used.
>>
>> FWIW I pointed out the same problem on blkfront.
>>
>
> This is not going to work. The ring in netfront / netback has only 256
> slots. Now you ask for netback to reserve more than 256 slots -- (17 +
> 1) * (64 / 4) = 288, which can never be fulfilled. See the call to
> xenvif_rx_ring_slots_available.
>
> I think XEN_NETBK_RX_SLOTS_MAX derived from the fact the each packet to
> the guest cannot be larger than 64K. So you might be able to
>
> #define XEN_NETBK_RX_SLOTS_MAX ((65536 / XEN_PAGE_SIZE) + 1)

I didn't know that a packet cannot be larger than 64KB. That 
simplifies the problem a lot.

>
> Blk driver may have a different story. But the default ring size (1
> page) yields even less slots than net (given that sizeof(union(req/rsp))
> is larger IIRC).

I will see with Roger for Blkback.


-- 
Julien Grall

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 21/23] net/xen-netback: Make it running on 64KB page granularity
  2015-05-19 22:56               ` Julien Grall
@ 2015-05-20  8:26                 ` Wei Liu
  -1 siblings, 0 replies; 200+ messages in thread
From: Wei Liu @ 2015-05-20  8:26 UTC (permalink / raw)
  To: Julien Grall
  Cc: Wei Liu, ian.campbell, stefano.stabellini, netdev, tim,
	linux-kernel, xen-devel, linux-arm-kernel

On Tue, May 19, 2015 at 11:56:39PM +0100, Julien Grall wrote:

> 
> >>diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
> >>index 0eda6e9..c2a5402 100644
> >>--- a/drivers/net/xen-netback/common.h
> >>+++ b/drivers/net/xen-netback/common.h
> >>@@ -204,7 +204,7 @@ struct xenvif_queue { /* Per-queue data for xenvif */
> >>  /* Maximum number of Rx slots a to-guest packet may use, including the
> >>   * slot needed for GSO meta-data.
> >>   */
> >>-#define XEN_NETBK_RX_SLOTS_MAX (MAX_SKB_FRAGS + 1)
> >>+#define XEN_NETBK_RX_SLOTS_MAX ((MAX_SKB_FRAGS + 1) * XEN_PFN_PER_PAGE)
> >>
> >>  enum state_bit_shift {
> >>         /* This bit marks that the vif is connected */
> >>
> >>The function xenvif_wait_for_rx_work never returns. I guess it's because there
> >>is not enough slot available.
> >>
> >>For 64KB page granularity we ask for 16 times more slots than 4KB page
> >>granularity. Although, it's very unlikely that all the slot will be used.
> >>
> >>FWIW I pointed out the same problem on blkfront.
> >>
> >
> >This is not going to work. The ring in netfront / netback has only 256
> >slots. Now you ask for netback to reserve more than 256 slots -- (17 +
> >1) * (64 / 4) = 288, which can never be fulfilled. See the call to
> >xenvif_rx_ring_slots_available.
> >
> >I think XEN_NETBK_RX_SLOTS_MAX derived from the fact the each packet to
> >the guest cannot be larger than 64K. So you might be able to
> >
> >#define XEN_NETBK_RX_SLOTS_MAX ((65536 / XEN_PAGE_SIZE) + 1)
> 
> I didn't know that packet cannot be larger than 64KB. That's simply a lot
> the problem.
> 

Thinking about this some more: you will need one more slot for GSO
information, so make it ((65536 / XEN_PAGE_SIZE) + 1 + 1).

> >
> >Blk driver may have a different story. But the default ring size (1
> >page) yields even less slots than net (given that sizeof(union(req/rsp))
> >is larger IIRC).
> 
> I will see with Roger for Blkback.
> 
> 
> -- 
> Julien Grall

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 21/23] net/xen-netback: Make it running on 64KB page granularity
  2015-05-20  8:26                 ` Wei Liu
@ 2015-05-20 14:26                   ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-20 14:26 UTC (permalink / raw)
  To: Wei Liu, Julien Grall
  Cc: ian.campbell, stefano.stabellini, netdev, tim, linux-kernel,
	xen-devel, linux-arm-kernel

On 20/05/15 09:26, Wei Liu wrote:
> On Tue, May 19, 2015 at 11:56:39PM +0100, Julien Grall wrote:
> 
>>
>>>> diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
>>>> index 0eda6e9..c2a5402 100644
>>>> --- a/drivers/net/xen-netback/common.h
>>>> +++ b/drivers/net/xen-netback/common.h
>>>> @@ -204,7 +204,7 @@ struct xenvif_queue { /* Per-queue data for xenvif */
>>>>  /* Maximum number of Rx slots a to-guest packet may use, including the
>>>>   * slot needed for GSO meta-data.
>>>>   */
>>>> -#define XEN_NETBK_RX_SLOTS_MAX (MAX_SKB_FRAGS + 1)
>>>> +#define XEN_NETBK_RX_SLOTS_MAX ((MAX_SKB_FRAGS + 1) * XEN_PFN_PER_PAGE)
>>>>
>>>>  enum state_bit_shift {
>>>>         /* This bit marks that the vif is connected */
>>>>
>>>> The function xenvif_wait_for_rx_work never returns. I guess it's because there
>>>> is not enough slot available.
>>>>
>>>> For 64KB page granularity we ask for 16 times more slots than 4KB page
>>>> granularity. Although, it's very unlikely that all the slot will be used.
>>>>
>>>> FWIW I pointed out the same problem on blkfront.
>>>>
>>>
>>> This is not going to work. The ring in netfront / netback has only 256
>>> slots. Now you ask for netback to reserve more than 256 slots -- (17 +
>>> 1) * (64 / 4) = 288, which can never be fulfilled. See the call to
>>> xenvif_rx_ring_slots_available.
>>>
>>> I think XEN_NETBK_RX_SLOTS_MAX derived from the fact the each packet to
>>> the guest cannot be larger than 64K. So you might be able to
>>>
>>> #define XEN_NETBK_RX_SLOTS_MAX ((65536 / XEN_PAGE_SIZE) + 1)
>>
>> I didn't know that packet cannot be larger than 64KB. That's simply a lot
>> the problem.
>>
> 
> I think about this more, you will need one more slot for GSO
> information, so make it ((65536 / XEN_PAGE_SIZE) + 1 + 1).

I have introduced XEN_MAX_SKB_FRAGS, defined as (65536 / XEN_PAGE_SIZE + 1),
because it's required in another place.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 21/23] net/xen-netback: Make it running on 64KB page granularity
  2015-05-19 22:56               ` Julien Grall
@ 2015-05-20 14:29                 ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-05-20 14:29 UTC (permalink / raw)
  To: Julien Grall, Wei Liu, ian.campbell
  Cc: stefano.stabellini, netdev, tim, linux-kernel, xen-devel,
	linux-arm-kernel, Suravee Suthikulpanit


On 19/05/15 23:56, Julien Grall wrote:
>> If you're wgetting from another host, I would suggest wgetting from Dom0
>> to limit the problem between Dom0 and DomU.
> 
> Thanks to Wei, I was able to narrow the problem. It looks like the
> problem is not coming from netback but somewhere else down in the
> network stack: wget/ssh between Dom0 64KB and DomU is working fine.
> 
> Although, wget/ssh between a guest and an external host doesn't work
> when Dom0 is using 64KB page granularity unless if I start a tcpdump on
> the vif in DOM0. Anyone an idea?
> 
> I have no issue to wget/ssh in DOM0 to an external host and the same
> kernel with 4KB page granularity (i.e same source code but rebuilt with
> 4KB) doesn't show any issue with wget/ssh in the guest.
> 
> This has been tested on AMD Seattle, the guest kernel is the same on
> every test (4KB page granularity).
> 
> I'm planning to give a try tomorrow on X-gene (ARM64 board and I think
> 64KB page granularity is supported) to see if I can reproduce the bug.

It's working on X-gene with the same kernel and configuration. I guess
we can deduce that it's a bug in the AMD network driver.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [RFC 04/23] block/xen-blkfront: Remove unused macro MAXIMUM_OUTSTANDING_BLOCK_REQS
  2015-05-14 17:00   ` Julien Grall
@ 2015-05-20 14:36     ` Roger Pau Monné
  -1 siblings, 0 replies; 200+ messages in thread
From: Roger Pau Monné @ 2015-05-20 14:36 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Konrad Rzeszutek Wilk, Boris Ostrovsky,
	David Vrabel

El 14/05/15 a les 19.00, Julien Grall ha escrit:
> From: Julien Grall <julien.grall@linaro.org>
> 
> Signed-off-by: Julien Grall <julien.grall@citrix.com>

Acked-by: Roger Pau Monné <roger.pau@citrix.com>

Note that Bob's multipage ring patches also remove this.

Roger.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [RFC 05/23] block/xen-blkfront: Remove invalid comment
  2015-05-14 17:00   ` Julien Grall
  (?)
@ 2015-05-20 14:42     ` Roger Pau Monné
  -1 siblings, 0 replies; 200+ messages in thread
From: Roger Pau Monné @ 2015-05-20 14:42 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Konrad Rzeszutek Wilk, Boris Ostrovsky,
	David Vrabel

El 14/05/15 a les 19.00, Julien Grall ha escrit:
> From: Julien Grall <julien.grall@linaro.org>
> 
> Since commit b764915 "xen-blkfront: use a different scatterlist for each
> request", biovec has been replaced by scatterlist when copying back the
> data during a completion request.
> 
> Signed-off-by: Julien Grall <julien.grall@citrix.com>

Acked-by: Roger Pau Monné <roger.pau@citrix.com>


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [RFC 06/23] block/xen-blkback: s/nr_pages/nr_segs/
  2015-05-14 17:00   ` Julien Grall
@ 2015-05-20 14:54     ` Roger Pau Monné
  -1 siblings, 0 replies; 200+ messages in thread
From: Roger Pau Monné @ 2015-05-20 14:54 UTC (permalink / raw)
  To: Julien Grall, xen-devel
  Cc: linux-arm-kernel, ian.campbell, stefano.stabellini, linux-kernel,
	tim, Julien Grall, Konrad Rzeszutek Wilk

El 14/05/15 a les 19.00, Julien Grall ha escrit:
> From: Julien Grall <julien.grall@linaro.org>
> 
> Make the code less confusing to read now that Linux may not have the
> same page size as Xen.
> 
> Signed-off-by: Julien Grall <julien.grall@citrix.com>

Acked-by: Roger Pau Monné <roger.pau@citrix.com>


^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 22/23] xen/privcmd: Add support for Linux 64KB page granularity
  2015-05-19 15:39     ` David Vrabel
@ 2015-06-18 17:05       ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-06-18 17:05 UTC (permalink / raw)
  To: David Vrabel, Julien Grall, xen-devel
  Cc: ian.campbell, stefano.stabellini, tim, linux-kernel,
	Boris Ostrovsky, linux-arm-kernel

Hi David,

On 19/05/15 16:39, David Vrabel wrote:
> On 14/05/15 18:01, Julien Grall wrote:
>> The hypercall interface (as well as the toolstack) is always using 4KB
>> page granularity. When the toolstack is asking for mapping a series of
>> guest PFN in a batch, it expects to have the page map contiguously in
>> its virtual memory.
>>
>> When Linux is using 64KB page granularity, the privcmd driver will have
>> to map multiple Xen PFN in a single Linux page.
>>
>> Note that this solution works on page granularity which is a multiple of
>> 4KB.
> [...]
>> --- a/drivers/xen/xlate_mmu.c
>> +++ b/drivers/xen/xlate_mmu.c
>> @@ -63,6 +63,7 @@ static int map_foreign_page(unsigned long lpfn, unsigned long fgmfn,
>>  
>>  struct remap_data {
>>  	xen_pfn_t *fgmfn; /* foreign domain's gmfn */
>> +	xen_pfn_t *egmfn; /* end foreign domain's gmfn */
> 
> I don't know what you mean by "end foreign domain".

I meant the last gmfn to map. This is because the Linux page may not be
fully mapped.

>>  	pgprot_t prot;
>>  	domid_t  domid;
>>  	struct vm_area_struct *vma;
>> @@ -78,17 +79,23 @@ static int remap_pte_fn(pte_t *ptep, pgtable_t token, unsigned long addr,
>>  {
>>  	struct remap_data *info = data;
>>  	struct page *page = info->pages[info->index++];
>> -	unsigned long pfn = page_to_pfn(page);
>> -	pte_t pte = pte_mkspecial(pfn_pte(pfn, info->prot));
>> +	unsigned long pfn = xen_page_to_pfn(page);
>> +	pte_t pte = pte_mkspecial(pfn_pte(page_to_pfn(page), info->prot));
>>  	int rc;
>> -
>> -	rc = map_foreign_page(pfn, *info->fgmfn, info->domid);
>> -	*info->err_ptr++ = rc;
>> -	if (!rc) {
>> -		set_pte_at(info->vma->vm_mm, addr, ptep, pte);
>> -		info->mapped++;
>> +	uint32_t i;
>> +
>> +	for (i = 0; i < XEN_PFN_PER_PAGE; i++) {
>> +		if (info->fgmfn == info->egmfn)
>> +			break;
>> +
>> +		rc = map_foreign_page(pfn++, *info->fgmfn, info->domid);
>> +		*info->err_ptr++ = rc;
>> +		if (!rc) {
>> +			set_pte_at(info->vma->vm_mm, addr, ptep, pte);
>> +			info->mapped++;
>> +		}
>> +		info->fgmfn++;
> 
> This doesn't make any sense to me.  Don't you need to gather the foreign
> GFNs into batches of PAGE_SIZE / XEN_PAGE_SIZE and map these all at once
> into a 64 KiB page?  I don't see how you can have a set_pte_at() for
> each foreign GFN.

I will rework this code. I've noticed a few other errors in the
privcmd code too.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 200+ messages in thread

* [Xen-devel] [RFC 22/23] xen/privcmd: Add support for Linux 64KB page granularity
@ 2015-06-18 17:05       ` Julien Grall
  0 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-06-18 17:05 UTC (permalink / raw)
  To: linux-arm-kernel

Hi David,

On 19/05/15 16:39, David Vrabel wrote:
> On 14/05/15 18:01, Julien Grall wrote:
>> The hypercall interface (as well as the toolstack) is always using 4KB
>> page granularity. When the toolstack is asking for mapping a series of
>> guest PFN in a batch, it expects to have the page map contiguously in
>> its virtual memory.
>>
>> When Linux is using 64KB page granularity, the privcmd driver will have
>> to map multiple Xen PFN in a single Linux page.
>>
>> Note that this solution works on page granularity which is a multiple of
>> 4KB.
> [...]
>> --- a/drivers/xen/xlate_mmu.c
>> +++ b/drivers/xen/xlate_mmu.c
>> @@ -63,6 +63,7 @@ static int map_foreign_page(unsigned long lpfn, unsigned long fgmfn,
>>  
>>  struct remap_data {
>>  	xen_pfn_t *fgmfn; /* foreign domain's gmfn */
>> +	xen_pfn_t *egmfn; /* end foreign domain's gmfn */
> 
> I don't know what you mean by "end foreign domain".

I meant the last gmfn to map. It is needed because the last Linux page
may not be fully mapped.

>>  	pgprot_t prot;
>>  	domid_t  domid;
>>  	struct vm_area_struct *vma;
>> @@ -78,17 +79,23 @@ static int remap_pte_fn(pte_t *ptep, pgtable_t token, unsigned long addr,
>>  {
>>  	struct remap_data *info = data;
>>  	struct page *page = info->pages[info->index++];
>> -	unsigned long pfn = page_to_pfn(page);
>> -	pte_t pte = pte_mkspecial(pfn_pte(pfn, info->prot));
>> +	unsigned long pfn = xen_page_to_pfn(page);
>> +	pte_t pte = pte_mkspecial(pfn_pte(page_to_pfn(page), info->prot));
>>  	int rc;
>> -
>> -	rc = map_foreign_page(pfn, *info->fgmfn, info->domid);
>> -	*info->err_ptr++ = rc;
>> -	if (!rc) {
>> -		set_pte_at(info->vma->vm_mm, addr, ptep, pte);
>> -		info->mapped++;
>> +	uint32_t i;
>> +
>> +	for (i = 0; i < XEN_PFN_PER_PAGE; i++) {
>> +		if (info->fgmfn == info->egmfn)
>> +			break;
>> +
>> +		rc = map_foreign_page(pfn++, *info->fgmfn, info->domid);
>> +		*info->err_ptr++ = rc;
>> +		if (!rc) {
>> +			set_pte_at(info->vma->vm_mm, addr, ptep, pte);
>> +			info->mapped++;
>> +		}
>> +		info->fgmfn++;
> 
> This doesn't make any sense to me.  Don't you need to gather the foreign
> GFNs into batches of PAGE_SIZE / XEN_PAGE_SIZE and map these all at once
> into a 64 KiB page?  I don't see how you can have a set_pte_at() for
> each foreign GFN.

I will look at reworking this code. I've noticed a few other errors in
the privcmd code too.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [RFC 22/23] xen/privcmd: Add support for Linux 64KB page granularity
  2015-05-19 15:39     ` David Vrabel
  (?)
@ 2015-06-18 17:05     ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-06-18 17:05 UTC (permalink / raw)
  To: David Vrabel, Julien Grall, xen-devel
  Cc: ian.campbell, stefano.stabellini, tim, linux-kernel,
	Boris Ostrovsky, linux-arm-kernel

Hi David,

On 19/05/15 16:39, David Vrabel wrote:
> On 14/05/15 18:01, Julien Grall wrote:
>> The hypercall interface (as well as the toolstack) is always using 4KB
>> page granularity. When the toolstack is asking for mapping a series of
>> guest PFN in a batch, it expects to have the page map contiguously in
>> its virtual memory.
>>
>> When Linux is using 64KB page granularity, the privcmd driver will have
>> to map multiple Xen PFN in a single Linux page.
>>
>> Note that this solution works on page granularity which is a multiple of
>> 4KB.
> [...]
>> --- a/drivers/xen/xlate_mmu.c
>> +++ b/drivers/xen/xlate_mmu.c
>> @@ -63,6 +63,7 @@ static int map_foreign_page(unsigned long lpfn, unsigned long fgmfn,
>>  
>>  struct remap_data {
>>  	xen_pfn_t *fgmfn; /* foreign domain's gmfn */
>> +	xen_pfn_t *egmfn; /* end foreign domain's gmfn */
> 
> I don't know what you mean by "end foreign domain".

I meant the last gmfn to map. It is needed because the last Linux page
may not be fully mapped.

>>  	pgprot_t prot;
>>  	domid_t  domid;
>>  	struct vm_area_struct *vma;
>> @@ -78,17 +79,23 @@ static int remap_pte_fn(pte_t *ptep, pgtable_t token, unsigned long addr,
>>  {
>>  	struct remap_data *info = data;
>>  	struct page *page = info->pages[info->index++];
>> -	unsigned long pfn = page_to_pfn(page);
>> -	pte_t pte = pte_mkspecial(pfn_pte(pfn, info->prot));
>> +	unsigned long pfn = xen_page_to_pfn(page);
>> +	pte_t pte = pte_mkspecial(pfn_pte(page_to_pfn(page), info->prot));
>>  	int rc;
>> -
>> -	rc = map_foreign_page(pfn, *info->fgmfn, info->domid);
>> -	*info->err_ptr++ = rc;
>> -	if (!rc) {
>> -		set_pte_at(info->vma->vm_mm, addr, ptep, pte);
>> -		info->mapped++;
>> +	uint32_t i;
>> +
>> +	for (i = 0; i < XEN_PFN_PER_PAGE; i++) {
>> +		if (info->fgmfn == info->egmfn)
>> +			break;
>> +
>> +		rc = map_foreign_page(pfn++, *info->fgmfn, info->domid);
>> +		*info->err_ptr++ = rc;
>> +		if (!rc) {
>> +			set_pte_at(info->vma->vm_mm, addr, ptep, pte);
>> +			info->mapped++;
>> +		}
>> +		info->fgmfn++;
> 
> This doesn't make any sense to me.  Don't you need to gather the foreign
> GFNs into batches of PAGE_SIZE / XEN_PAGE_SIZE and map these all at once
> into a 64 KiB page?  I don't see how you can have a set_pte_at() for
> each foreign GFN.

I will look at reworking this code. I've noticed a few other errors in
the privcmd code too.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [RFC 09/23] arm/xen: Drop duplicate define mfn_to_virt
  2015-05-14 17:00   ` Julien Grall
@ 2015-06-23 13:25     ` Stefano Stabellini
  -1 siblings, 0 replies; 200+ messages in thread
From: Stefano Stabellini @ 2015-06-23 13:25 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, linux-arm-kernel, ian.campbell, stefano.stabellini,
	linux-kernel, tim, Julien Grall

On Thu, 14 May 2015, Julien Grall wrote:
> From: Julien Grall <julien.grall@linaro.org>
> 
> Signed-off-by: Julien Grall <julien.grall@citrix.com>
> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>


>  arch/arm/include/asm/xen/page.h | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
> index 0b579b2..1bee8ca 100644
> --- a/arch/arm/include/asm/xen/page.h
> +++ b/arch/arm/include/asm/xen/page.h
> @@ -12,7 +12,6 @@
>  #include <xen/interface/grant_table.h>
>  
>  #define phys_to_machine_mapping_valid(pfn) (1)
> -#define mfn_to_virt(m)			(__va(mfn_to_pfn(m) << PAGE_SHIFT))
>  
>  #define pte_mfn	    pte_pfn
>  #define mfn_pte	    pfn_pte
> -- 
> 2.1.4
> 

^ permalink raw reply	[flat|nested] 200+ messages in thread

* [RFC 09/23] arm/xen: Drop duplicate define mfn_to_virt
@ 2015-06-23 13:25     ` Stefano Stabellini
  0 siblings, 0 replies; 200+ messages in thread
From: Stefano Stabellini @ 2015-06-23 13:25 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 14 May 2015, Julien Grall wrote:
> From: Julien Grall <julien.grall@linaro.org>
> 
> Signed-off-by: Julien Grall <julien.grall@citrix.com>
> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>


>  arch/arm/include/asm/xen/page.h | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
> index 0b579b2..1bee8ca 100644
> --- a/arch/arm/include/asm/xen/page.h
> +++ b/arch/arm/include/asm/xen/page.h
> @@ -12,7 +12,6 @@
>  #include <xen/interface/grant_table.h>
>  
>  #define phys_to_machine_mapping_valid(pfn) (1)
> -#define mfn_to_virt(m)			(__va(mfn_to_pfn(m) << PAGE_SHIFT))
>  
>  #define pte_mfn	    pte_pfn
>  #define mfn_pte	    pfn_pte
> -- 
> 2.1.4
> 

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [RFC 09/23] arm/xen: Drop duplicate define mfn_to_virt
  2015-05-14 17:00   ` Julien Grall
  (?)
@ 2015-06-23 13:25   ` Stefano Stabellini
  -1 siblings, 0 replies; 200+ messages in thread
From: Stefano Stabellini @ 2015-06-23 13:25 UTC (permalink / raw)
  To: Julien Grall
  Cc: ian.campbell, stefano.stabellini, Julien Grall, tim,
	linux-kernel, xen-devel, linux-arm-kernel

On Thu, 14 May 2015, Julien Grall wrote:
> From: Julien Grall <julien.grall@linaro.org>
> 
> Signed-off-by: Julien Grall <julien.grall@citrix.com>
> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>


>  arch/arm/include/asm/xen/page.h | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
> index 0b579b2..1bee8ca 100644
> --- a/arch/arm/include/asm/xen/page.h
> +++ b/arch/arm/include/asm/xen/page.h
> @@ -12,7 +12,6 @@
>  #include <xen/interface/grant_table.h>
>  
>  #define phys_to_machine_mapping_valid(pfn) (1)
> -#define mfn_to_virt(m)			(__va(mfn_to_pfn(m) << PAGE_SHIFT))
>  
>  #define pte_mfn	    pte_pfn
>  #define mfn_pte	    pfn_pte
> -- 
> 2.1.4
> 

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 00/23] arm64: Add support for 64KB page granularity in Xen guest
  2015-05-18 12:23     ` Julien Grall
@ 2015-06-23 13:37       ` Stefano Stabellini
  -1 siblings, 0 replies; 200+ messages in thread
From: Stefano Stabellini @ 2015-06-23 13:37 UTC (permalink / raw)
  To: Julien Grall
  Cc: David Vrabel, xen-devel, wei.liu2, ian.campbell,
	stefano.stabellini, tim, linux-kernel, boris.ostrovsky,
	linux-arm-kernel, roger.pau

On Mon, 18 May 2015, Julien Grall wrote:
> Hi David,
> 
> On 15/05/15 16:45, David Vrabel wrote:
> > On 14/05/15 18:00, Julien Grall wrote:
> >> Hi all,
> >>
> >> ARM64 Linux is supporting both 4KB and 64KB page granularity. Although, Xen
> >> hypercall interface and PV protocol are always based on 4KB page granularity.
> >>
> >> Any attempt to boot a Linux guest with 64KB pages enabled will result to a
> >> guest crash.
> >>
> >> This series is a first attempt to allow those Linux running with the current
> >> hypercall interface and PV protocol.
> >>
> >> This solution has been chosen because we want to run Linux 64KB in released
> >> Xen ARM version or/and platform using an old version of Linux DOM0.
> > 
> > The key problem I see with this approach is the confusion between guest
> > page size and Xen page size.  This is going to be particularly
> > problematic since the majority of development/usage will remain on x86
> > where PAGE_SIZE == XEN_PAGE_SIZE.
> > 
> > I think it would be nice to keep XEN_PAGE_SIZE etc out of front and
> > backend drivers.  Perhaps with a suitable set of helper functions?
> 
> Even with the helpers, we are not protected from any change in the
> frontend/backend that will impact 64K. It won't be possible to remove
> all the XEN_PAGE_* usage (there is a lots of places where adding helpers
> would not be possible) and we would still have to carefully review any
> changes.

We could at least introduce a few asserts, so that an ARM64 kernel
build, which any x86 maintainer can easily and quickly do on their x86
machine, would spot these errors.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* [Xen-devel] [RFC 00/23] arm64: Add support for 64KB page granularity in Xen guest
@ 2015-06-23 13:37       ` Stefano Stabellini
  0 siblings, 0 replies; 200+ messages in thread
From: Stefano Stabellini @ 2015-06-23 13:37 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 18 May 2015, Julien Grall wrote:
> Hi David,
> 
> On 15/05/15 16:45, David Vrabel wrote:
> > On 14/05/15 18:00, Julien Grall wrote:
> >> Hi all,
> >>
> >> ARM64 Linux is supporting both 4KB and 64KB page granularity. Although, Xen
> >> hypercall interface and PV protocol are always based on 4KB page granularity.
> >>
> >> Any attempt to boot a Linux guest with 64KB pages enabled will result to a
> >> guest crash.
> >>
> >> This series is a first attempt to allow those Linux running with the current
> >> hypercall interface and PV protocol.
> >>
> >> This solution has been chosen because we want to run Linux 64KB in released
> >> Xen ARM version or/and platform using an old version of Linux DOM0.
> > 
> > The key problem I see with this approach is the confusion between guest
> > page size and Xen page size.  This is going to be particularly
> > problematic since the majority of development/usage will remain on x86
> > where PAGE_SIZE == XEN_PAGE_SIZE.
> > 
> > I think it would be nice to keep XEN_PAGE_SIZE etc out of front and
> > backend drivers.  Perhaps with a suitable set of helper functions?
> 
> Even with the helpers, we are not protected from any change in the
> frontend/backend that will impact 64K. It won't be possible to remove
> all the XEN_PAGE_* usage (there is a lots of places where adding helpers
> would not be possible) and we would still have to carefully review any
> changes.

We could at least introduce a few asserts, so that an ARM64 kernel
build, which any x86 maintainer can easily and quickly do on their x86
machine, would spot these errors.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [RFC 00/23] arm64: Add support for 64KB page granularity in Xen guest
  2015-05-18 12:23     ` Julien Grall
  (?)
@ 2015-06-23 13:37     ` Stefano Stabellini
  -1 siblings, 0 replies; 200+ messages in thread
From: Stefano Stabellini @ 2015-06-23 13:37 UTC (permalink / raw)
  To: Julien Grall
  Cc: wei.liu2, ian.campbell, stefano.stabellini, tim, linux-kernel,
	David Vrabel, xen-devel, boris.ostrovsky, linux-arm-kernel,
	roger.pau

On Mon, 18 May 2015, Julien Grall wrote:
> Hi David,
> 
> On 15/05/15 16:45, David Vrabel wrote:
> > On 14/05/15 18:00, Julien Grall wrote:
> >> Hi all,
> >>
> >> ARM64 Linux is supporting both 4KB and 64KB page granularity. Although, Xen
> >> hypercall interface and PV protocol are always based on 4KB page granularity.
> >>
> >> Any attempt to boot a Linux guest with 64KB pages enabled will result to a
> >> guest crash.
> >>
> >> This series is a first attempt to allow those Linux running with the current
> >> hypercall interface and PV protocol.
> >>
> >> This solution has been chosen because we want to run Linux 64KB in released
> >> Xen ARM version or/and platform using an old version of Linux DOM0.
> > 
> > The key problem I see with this approach is the confusion between guest
> > page size and Xen page size.  This is going to be particularly
> > problematic since the majority of development/usage will remain on x86
> > where PAGE_SIZE == XEN_PAGE_SIZE.
> > 
> > I think it would be nice to keep XEN_PAGE_SIZE etc out of front and
> > backend drivers.  Perhaps with a suitable set of helper functions?
> 
> Even with the helpers, we are not protected from any change in the
> frontend/backend that will impact 64K. It won't be possible to remove
> all the XEN_PAGE_* usage (there is a lots of places where adding helpers
> would not be possible) and we would still have to carefully review any
> changes.

We could at least introduce a few asserts, so that an ARM64 kernel
build, which any x86 maintainer can easily and quickly do on their x86
machine, would spot these errors.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 00/23] arm64: Add support for 64KB page granularity in Xen guest
  2015-06-23 13:37       ` Stefano Stabellini
@ 2015-06-23 13:41         ` Stefano Stabellini
  -1 siblings, 0 replies; 200+ messages in thread
From: Stefano Stabellini @ 2015-06-23 13:41 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: Julien Grall, David Vrabel, xen-devel, wei.liu2, ian.campbell,
	tim, linux-kernel, boris.ostrovsky, linux-arm-kernel, roger.pau

On Tue, 23 Jun 2015, Stefano Stabellini wrote:
> On Mon, 18 May 2015, Julien Grall wrote:
> > Hi David,
> > 
> > On 15/05/15 16:45, David Vrabel wrote:
> > > On 14/05/15 18:00, Julien Grall wrote:
> > >> Hi all,
> > >>
> > >> ARM64 Linux is supporting both 4KB and 64KB page granularity. Although, Xen
> > >> hypercall interface and PV protocol are always based on 4KB page granularity.
> > >>
> > >> Any attempt to boot a Linux guest with 64KB pages enabled will result to a
> > >> guest crash.
> > >>
> > >> This series is a first attempt to allow those Linux running with the current
> > >> hypercall interface and PV protocol.
> > >>
> > >> This solution has been chosen because we want to run Linux 64KB in released
> > >> Xen ARM version or/and platform using an old version of Linux DOM0.
> > > 
> > > The key problem I see with this approach is the confusion between guest
> > > page size and Xen page size.  This is going to be particularly
> > > problematic since the majority of development/usage will remain on x86
> > > where PAGE_SIZE == XEN_PAGE_SIZE.
> > > 
> > > I think it would be nice to keep XEN_PAGE_SIZE etc out of front and
> > > backend drivers.  Perhaps with a suitable set of helper functions?
> > 
> > Even with the helpers, we are not protected from any change in the
> > frontend/backend that will impact 64K. It won't be possible to remove
> > all the XEN_PAGE_* usage (there is a lots of places where adding helpers
> > would not be possible) and we would still have to carefully review any
> > changes.
> 
> We could at least introduce a few asserts, so that an ARM64 kernel
> build, that any x86 maintainers can easily and quickly do on their x86
> machines, would spot these errors.

I actually meant BUILD_BUG_ON

^ permalink raw reply	[flat|nested] 200+ messages in thread

* [Xen-devel] [RFC 00/23] arm64: Add support for 64KB page granularity in Xen guest
@ 2015-06-23 13:41         ` Stefano Stabellini
  0 siblings, 0 replies; 200+ messages in thread
From: Stefano Stabellini @ 2015-06-23 13:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, 23 Jun 2015, Stefano Stabellini wrote:
> On Mon, 18 May 2015, Julien Grall wrote:
> > Hi David,
> > 
> > On 15/05/15 16:45, David Vrabel wrote:
> > > On 14/05/15 18:00, Julien Grall wrote:
> > >> Hi all,
> > >>
> > >> ARM64 Linux is supporting both 4KB and 64KB page granularity. Although, Xen
> > >> hypercall interface and PV protocol are always based on 4KB page granularity.
> > >>
> > >> Any attempt to boot a Linux guest with 64KB pages enabled will result to a
> > >> guest crash.
> > >>
> > >> This series is a first attempt to allow those Linux running with the current
> > >> hypercall interface and PV protocol.
> > >>
> > >> This solution has been chosen because we want to run Linux 64KB in released
> > >> Xen ARM version or/and platform using an old version of Linux DOM0.
> > > 
> > > The key problem I see with this approach is the confusion between guest
> > > page size and Xen page size.  This is going to be particularly
> > > problematic since the majority of development/usage will remain on x86
> > > where PAGE_SIZE == XEN_PAGE_SIZE.
> > > 
> > > I think it would be nice to keep XEN_PAGE_SIZE etc out of front and
> > > backend drivers.  Perhaps with a suitable set of helper functions?
> > 
> > Even with the helpers, we are not protected from any change in the
> > frontend/backend that will impact 64K. It won't be possible to remove
> > all the XEN_PAGE_* usage (there is a lots of places where adding helpers
> > would not be possible) and we would still have to carefully review any
> > changes.
> 
> We could at least introduce a few asserts, so that an ARM64 kernel
> build, that any x86 maintainers can easily and quickly do on their x86
> machines, would spot these errors.

I actually meant BUILD_BUG_ON

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [RFC 00/23] arm64: Add support for 64KB page granularity in Xen guest
  2015-06-23 13:37       ` Stefano Stabellini
  (?)
@ 2015-06-23 13:41       ` Stefano Stabellini
  -1 siblings, 0 replies; 200+ messages in thread
From: Stefano Stabellini @ 2015-06-23 13:41 UTC (permalink / raw)
  To: Stefano Stabellini
  Cc: wei.liu2, ian.campbell, tim, linux-kernel, Julien Grall,
	David Vrabel, xen-devel, boris.ostrovsky, linux-arm-kernel,
	roger.pau

On Tue, 23 Jun 2015, Stefano Stabellini wrote:
> On Mon, 18 May 2015, Julien Grall wrote:
> > Hi David,
> > 
> > On 15/05/15 16:45, David Vrabel wrote:
> > > On 14/05/15 18:00, Julien Grall wrote:
> > >> Hi all,
> > >>
> > >> ARM64 Linux is supporting both 4KB and 64KB page granularity. Although, Xen
> > >> hypercall interface and PV protocol are always based on 4KB page granularity.
> > >>
> > >> Any attempt to boot a Linux guest with 64KB pages enabled will result to a
> > >> guest crash.
> > >>
> > >> This series is a first attempt to allow those Linux running with the current
> > >> hypercall interface and PV protocol.
> > >>
> > >> This solution has been chosen because we want to run Linux 64KB in released
> > >> Xen ARM version or/and platform using an old version of Linux DOM0.
> > > 
> > > The key problem I see with this approach is the confusion between guest
> > > page size and Xen page size.  This is going to be particularly
> > > problematic since the majority of development/usage will remain on x86
> > > where PAGE_SIZE == XEN_PAGE_SIZE.
> > > 
> > > I think it would be nice to keep XEN_PAGE_SIZE etc out of front and
> > > backend drivers.  Perhaps with a suitable set of helper functions?
> > 
> > Even with the helpers, we are not protected from any change in the
> > frontend/backend that will impact 64K. It won't be possible to remove
> > all the XEN_PAGE_* usage (there is a lots of places where adding helpers
> > would not be possible) and we would still have to carefully review any
> > changes.
> 
> We could at least introduce a few asserts, so that an ARM64 kernel
> build, that any x86 maintainers can easily and quickly do on their x86
> machines, would spot these errors.

I actually meant BUILD_BUG_ON

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [Xen-devel] [RFC 09/23] arm/xen: Drop duplicate define mfn_to_virt
  2015-06-23 13:25     ` Stefano Stabellini
@ 2015-06-23 13:53       ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-06-23 13:53 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall
  Cc: ian.campbell, Julien Grall, tim, linux-kernel, xen-devel,
	linux-arm-kernel

Hi,

On 23/06/15 14:25, Stefano Stabellini wrote:
> On Thu, 14 May 2015, Julien Grall wrote:
>> From: Julien Grall <julien.grall@linaro.org>
>>
>> Signed-off-by: Julien Grall <julien.grall@citrix.com>
>> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> 
> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

FYI, David already queued this patch in xentip for Linux 4.2.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 200+ messages in thread

* [Xen-devel] [RFC 09/23] arm/xen: Drop duplicate define mfn_to_virt
@ 2015-06-23 13:53       ` Julien Grall
  0 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-06-23 13:53 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 23/06/15 14:25, Stefano Stabellini wrote:
> On Thu, 14 May 2015, Julien Grall wrote:
>> From: Julien Grall <julien.grall@linaro.org>
>>
>> Signed-off-by: Julien Grall <julien.grall@citrix.com>
>> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> 
> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

FYI, David already queued this patch in xentip for Linux 4.2.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [RFC 09/23] arm/xen: Drop duplicate define mfn_to_virt
  2015-06-23 13:25     ` Stefano Stabellini
  (?)
@ 2015-06-23 13:53     ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-06-23 13:53 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall
  Cc: ian.campbell, Julien Grall, tim, linux-kernel, xen-devel,
	linux-arm-kernel

Hi,

On 23/06/15 14:25, Stefano Stabellini wrote:
> On Thu, 14 May 2015, Julien Grall wrote:
>> From: Julien Grall <julien.grall@linaro.org>
>>
>> Signed-off-by: Julien Grall <julien.grall@citrix.com>
>> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> 
> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

FYI, David already queued this patch in xentip for Linux 4.2.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [RFC 23/23] arm/xen: Add support for 64KB page granularity
  2015-05-14 17:01   ` Julien Grall
@ 2015-06-23 14:19     ` Stefano Stabellini
  -1 siblings, 0 replies; 200+ messages in thread
From: Stefano Stabellini @ 2015-06-23 14:19 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, linux-arm-kernel, ian.campbell, stefano.stabellini,
	linux-kernel, tim, Russell King

On Thu, 14 May 2015, Julien Grall wrote:
> The hypercall interface is always using 4KB page granularity. This is
> requiring to use xen page definition macro when we deal with hypercall.
> 
> Note that pfn_to_mfn is working with a Xen pfn (i.e 4KB). We may want to
> rename pfn_mfn to make this explicit.
> 
> We also allocate a 64KB page for the shared page even though only the
> first 4KB is used. I don't think this is really important for now as it
> helps to have the pointer 4KB aligned (XENMEM_add_to_physmap is taking a
> Xen PFN).
> 
> Signed-off-by: Julien Grall <julien.grall@citrix.com>
> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Cc: Russell King <linux@arm.linux.org.uk>
>
>  arch/arm/include/asm/xen/page.h | 12 ++++++------
>  arch/arm/xen/enlighten.c        |  6 +++---
>  2 files changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
> index 1bee8ca..ab6eb9a 100644
> --- a/arch/arm/include/asm/xen/page.h
> +++ b/arch/arm/include/asm/xen/page.h
> @@ -56,19 +56,19 @@ static inline unsigned long mfn_to_pfn(unsigned long mfn)
>  
>  static inline xmaddr_t phys_to_machine(xpaddr_t phys)
>  {
> -	unsigned offset = phys.paddr & ~PAGE_MASK;
> -	return XMADDR(PFN_PHYS(pfn_to_mfn(PFN_DOWN(phys.paddr))) | offset);
> +	unsigned offset = phys.paddr & ~XEN_PAGE_MASK;
> +	return XMADDR(XEN_PFN_PHYS(pfn_to_mfn(XEN_PFN_DOWN(phys.paddr))) | offset);
>  }
>  
>  static inline xpaddr_t machine_to_phys(xmaddr_t machine)
>  {
> -	unsigned offset = machine.maddr & ~PAGE_MASK;
> -	return XPADDR(PFN_PHYS(mfn_to_pfn(PFN_DOWN(machine.maddr))) | offset);
> +	unsigned offset = machine.maddr & ~XEN_PAGE_MASK;
> +	return XPADDR(XEN_PFN_PHYS(mfn_to_pfn(XEN_PFN_DOWN(machine.maddr))) | offset);
>  }
>  /* VIRT <-> MACHINE conversion */
>  #define virt_to_machine(v)	(phys_to_machine(XPADDR(__pa(v))))
> -#define virt_to_mfn(v)		(pfn_to_mfn(virt_to_pfn(v)))
> -#define mfn_to_virt(m)		(__va(mfn_to_pfn(m) << PAGE_SHIFT))
> +#define virt_to_mfn(v)		(pfn_to_mfn(virt_to_phys(v) >> XEN_PAGE_SHIFT))
> +#define mfn_to_virt(m)		(__va(mfn_to_pfn(m) << XEN_PAGE_SHIFT))
>  
>  static inline xmaddr_t arbitrary_virt_to_machine(void *vaddr)
>  {
> diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
> index 224081c..dcfe251 100644
> --- a/arch/arm/xen/enlighten.c
> +++ b/arch/arm/xen/enlighten.c
> @@ -93,8 +93,8 @@ static void xen_percpu_init(void)
>  	pr_info("Xen: initializing cpu%d\n", cpu);
>  	vcpup = per_cpu_ptr(xen_vcpu_info, cpu);
>  
> -	info.mfn = __pa(vcpup) >> PAGE_SHIFT;
> -	info.offset = offset_in_page(vcpup);
> +	info.mfn = __pa(vcpup) >> XEN_PAGE_SHIFT;
> +	info.offset = xen_offset_in_page(vcpup);
>  
>  	err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, cpu, &info);
>  	BUG_ON(err);
> @@ -204,7 +204,7 @@ static int __init xen_guest_init(void)
>  	xatp.domid = DOMID_SELF;
>  	xatp.idx = 0;
>  	xatp.space = XENMAPSPACE_shared_info;
> -	xatp.gpfn = __pa(shared_info_page) >> PAGE_SHIFT;
> +	xatp.gpfn = __pa(shared_info_page) >> XEN_PAGE_SHIFT;
>  	if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp))
>  		BUG();

What about xen_remap_domain_mfn_range? I guess we don't support that use
case on 64K guests? If so, I would appreciate an assert and/or an error
message.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* [RFC 23/23] arm/xen: Add support for 64KB page granularity
@ 2015-06-23 14:19     ` Stefano Stabellini
  0 siblings, 0 replies; 200+ messages in thread
From: Stefano Stabellini @ 2015-06-23 14:19 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 14 May 2015, Julien Grall wrote:
> The hypercall interface is always using 4KB page granularity. This is
> requiring to use xen page definition macro when we deal with hypercall.
> 
> Note that pfn_to_mfn is working with a Xen pfn (i.e 4KB). We may want to
> rename pfn_mfn to make this explicit.
> 
> We also allocate a 64KB page for the shared page even though only the
> first 4KB is used. I don't think this is really important for now as it
> helps to have the pointer 4KB aligned (XENMEM_add_to_physmap is taking a
> Xen PFN).
> 
> Signed-off-by: Julien Grall <julien.grall@citrix.com>
> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Cc: Russell King <linux@arm.linux.org.uk>
>
>  arch/arm/include/asm/xen/page.h | 12 ++++++------
>  arch/arm/xen/enlighten.c        |  6 +++---
>  2 files changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
> index 1bee8ca..ab6eb9a 100644
> --- a/arch/arm/include/asm/xen/page.h
> +++ b/arch/arm/include/asm/xen/page.h
> @@ -56,19 +56,19 @@ static inline unsigned long mfn_to_pfn(unsigned long mfn)
>  
>  static inline xmaddr_t phys_to_machine(xpaddr_t phys)
>  {
> -	unsigned offset = phys.paddr & ~PAGE_MASK;
> -	return XMADDR(PFN_PHYS(pfn_to_mfn(PFN_DOWN(phys.paddr))) | offset);
> +	unsigned offset = phys.paddr & ~XEN_PAGE_MASK;
> +	return XMADDR(XEN_PFN_PHYS(pfn_to_mfn(XEN_PFN_DOWN(phys.paddr))) | offset);
>  }
>  
>  static inline xpaddr_t machine_to_phys(xmaddr_t machine)
>  {
> -	unsigned offset = machine.maddr & ~PAGE_MASK;
> -	return XPADDR(PFN_PHYS(mfn_to_pfn(PFN_DOWN(machine.maddr))) | offset);
> +	unsigned offset = machine.maddr & ~XEN_PAGE_MASK;
> +	return XPADDR(XEN_PFN_PHYS(mfn_to_pfn(XEN_PFN_DOWN(machine.maddr))) | offset);
>  }
>  /* VIRT <-> MACHINE conversion */
>  #define virt_to_machine(v)	(phys_to_machine(XPADDR(__pa(v))))
> -#define virt_to_mfn(v)		(pfn_to_mfn(virt_to_pfn(v)))
> -#define mfn_to_virt(m)		(__va(mfn_to_pfn(m) << PAGE_SHIFT))
> +#define virt_to_mfn(v)		(pfn_to_mfn(virt_to_phys(v) >> XEN_PAGE_SHIFT))
> +#define mfn_to_virt(m)		(__va(mfn_to_pfn(m) << XEN_PAGE_SHIFT))
>  
>  static inline xmaddr_t arbitrary_virt_to_machine(void *vaddr)
>  {
> diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
> index 224081c..dcfe251 100644
> --- a/arch/arm/xen/enlighten.c
> +++ b/arch/arm/xen/enlighten.c
> @@ -93,8 +93,8 @@ static void xen_percpu_init(void)
>  	pr_info("Xen: initializing cpu%d\n", cpu);
>  	vcpup = per_cpu_ptr(xen_vcpu_info, cpu);
>  
> -	info.mfn = __pa(vcpup) >> PAGE_SHIFT;
> -	info.offset = offset_in_page(vcpup);
> +	info.mfn = __pa(vcpup) >> XEN_PAGE_SHIFT;
> +	info.offset = xen_offset_in_page(vcpup);
>  
>  	err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, cpu, &info);
>  	BUG_ON(err);
> @@ -204,7 +204,7 @@ static int __init xen_guest_init(void)
>  	xatp.domid = DOMID_SELF;
>  	xatp.idx = 0;
>  	xatp.space = XENMAPSPACE_shared_info;
> -	xatp.gpfn = __pa(shared_info_page) >> PAGE_SHIFT;
> +	xatp.gpfn = __pa(shared_info_page) >> XEN_PAGE_SHIFT;
>  	if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp))
>  		BUG();

What about xen_remap_domain_mfn_range? I guess we don't support that use
case on 64K guests? If so, I would appreciate an assert and/or an error
message.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [RFC 23/23] arm/xen: Add support for 64KB page granularity
  2015-05-14 17:01   ` Julien Grall
  (?)
  (?)
@ 2015-06-23 14:19   ` Stefano Stabellini
  -1 siblings, 0 replies; 200+ messages in thread
From: Stefano Stabellini @ 2015-06-23 14:19 UTC (permalink / raw)
  To: Julien Grall
  Cc: Russell King, ian.campbell, stefano.stabellini, tim,
	linux-kernel, xen-devel, linux-arm-kernel

On Thu, 14 May 2015, Julien Grall wrote:
> The hypercall interface always uses 4KB page granularity, so we have
> to use the Xen page definition macros whenever we deal with
> hypercalls.
> 
> Note that pfn_to_mfn works with a Xen pfn (i.e. 4KB). We may want to
> rename pfn_to_mfn to make this explicit.
> 
> We also allocate a 64KB page for the shared page even though only the
> first 4KB is used. I don't think this really matters for now, as it
> helps to keep the pointer 4KB aligned (XENMEM_add_to_physmap takes a
> Xen PFN).
> 
> Signed-off-by: Julien Grall <julien.grall@citrix.com>
> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Cc: Russell King <linux@arm.linux.org.uk>
>
>  arch/arm/include/asm/xen/page.h | 12 ++++++------
>  arch/arm/xen/enlighten.c        |  6 +++---
>  2 files changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/arm/include/asm/xen/page.h b/arch/arm/include/asm/xen/page.h
> index 1bee8ca..ab6eb9a 100644
> --- a/arch/arm/include/asm/xen/page.h
> +++ b/arch/arm/include/asm/xen/page.h
> @@ -56,19 +56,19 @@ static inline unsigned long mfn_to_pfn(unsigned long mfn)
>  
>  static inline xmaddr_t phys_to_machine(xpaddr_t phys)
>  {
> -	unsigned offset = phys.paddr & ~PAGE_MASK;
> -	return XMADDR(PFN_PHYS(pfn_to_mfn(PFN_DOWN(phys.paddr))) | offset);
> +	unsigned offset = phys.paddr & ~XEN_PAGE_MASK;
> +	return XMADDR(XEN_PFN_PHYS(pfn_to_mfn(XEN_PFN_DOWN(phys.paddr))) | offset);
>  }
>  
>  static inline xpaddr_t machine_to_phys(xmaddr_t machine)
>  {
> -	unsigned offset = machine.maddr & ~PAGE_MASK;
> -	return XPADDR(PFN_PHYS(mfn_to_pfn(PFN_DOWN(machine.maddr))) | offset);
> +	unsigned offset = machine.maddr & ~XEN_PAGE_MASK;
> +	return XPADDR(XEN_PFN_PHYS(mfn_to_pfn(XEN_PFN_DOWN(machine.maddr))) | offset);
>  }
>  /* VIRT <-> MACHINE conversion */
>  #define virt_to_machine(v)	(phys_to_machine(XPADDR(__pa(v))))
> -#define virt_to_mfn(v)		(pfn_to_mfn(virt_to_pfn(v)))
> -#define mfn_to_virt(m)		(__va(mfn_to_pfn(m) << PAGE_SHIFT))
> +#define virt_to_mfn(v)		(pfn_to_mfn(virt_to_phys(v) >> XEN_PAGE_SHIFT))
> +#define mfn_to_virt(m)		(__va(mfn_to_pfn(m) << XEN_PAGE_SHIFT))
>  
>  static inline xmaddr_t arbitrary_virt_to_machine(void *vaddr)
>  {
> diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
> index 224081c..dcfe251 100644
> --- a/arch/arm/xen/enlighten.c
> +++ b/arch/arm/xen/enlighten.c
> @@ -93,8 +93,8 @@ static void xen_percpu_init(void)
>  	pr_info("Xen: initializing cpu%d\n", cpu);
>  	vcpup = per_cpu_ptr(xen_vcpu_info, cpu);
>  
> -	info.mfn = __pa(vcpup) >> PAGE_SHIFT;
> -	info.offset = offset_in_page(vcpup);
> +	info.mfn = __pa(vcpup) >> XEN_PAGE_SHIFT;
> +	info.offset = xen_offset_in_page(vcpup);
>  
>  	err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, cpu, &info);
>  	BUG_ON(err);
> @@ -204,7 +204,7 @@ static int __init xen_guest_init(void)
>  	xatp.domid = DOMID_SELF;
>  	xatp.idx = 0;
>  	xatp.space = XENMAPSPACE_shared_info;
> -	xatp.gpfn = __pa(shared_info_page) >> PAGE_SHIFT;
> +	xatp.gpfn = __pa(shared_info_page) >> XEN_PAGE_SHIFT;
>  	if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp))
>  		BUG();

What about xen_remap_domain_mfn_range? I guess we don't support that use
case on 64K guests? If so, I would appreciate an assert and/or an error
message.

^ permalink raw reply	[flat|nested] 200+ messages in thread

* Re: [RFC 23/23] arm/xen: Add support for 64KB page granularity
  2015-06-23 14:19     ` Stefano Stabellini
@ 2015-06-23 14:37       ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-06-23 14:37 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall
  Cc: Russell King, ian.campbell, tim, linux-kernel, xen-devel,
	linux-arm-kernel

Hi,

On 23/06/15 15:19, Stefano Stabellini wrote:
>> diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
>> index 224081c..dcfe251 100644
>> --- a/arch/arm/xen/enlighten.c
>> +++ b/arch/arm/xen/enlighten.c
>> @@ -93,8 +93,8 @@ static void xen_percpu_init(void)
>>  	pr_info("Xen: initializing cpu%d\n", cpu);
>>  	vcpup = per_cpu_ptr(xen_vcpu_info, cpu);
>>  
>> -	info.mfn = __pa(vcpup) >> PAGE_SHIFT;
>> -	info.offset = offset_in_page(vcpup);
>> +	info.mfn = __pa(vcpup) >> XEN_PAGE_SHIFT;
>> +	info.offset = xen_offset_in_page(vcpup);
>>  
>>  	err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, cpu, &info);
>>  	BUG_ON(err);
>> @@ -204,7 +204,7 @@ static int __init xen_guest_init(void)
>>  	xatp.domid = DOMID_SELF;
>>  	xatp.idx = 0;
>>  	xatp.space = XENMAPSPACE_shared_info;
>> -	xatp.gpfn = __pa(shared_info_page) >> PAGE_SHIFT;
>> +	xatp.gpfn = __pa(shared_info_page) >> XEN_PAGE_SHIFT;
>>  	if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp))
>>  		BUG();
> 
> What about xen_remap_domain_mfn_range? I guess we don't support that use
> case on 64K guests? If so, I would appreciate an assert and/or an error
> message.

The implementation of xen_remap_domain_mfn_range returns -ENOSYS no
matter the page granularity.

This function is PV-specific and was added a few months ago only as a
stub. See the comment in the code:
"/* Not used by XENFEAT_auto_translated guests */"

Any logging/BUG_ON within this function is out of scope for this
series, and I don't think it would be really useful. Feel free to send
a patch for it.

Regards,

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 200+ messages in thread



* Re: [RFC 23/23] arm/xen: Add support for 64KB page granularity
  2015-06-23 14:37       ` Julien Grall
@ 2015-06-23 14:49         ` Stefano Stabellini
  -1 siblings, 0 replies; 200+ messages in thread
From: Stefano Stabellini @ 2015-06-23 14:49 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Russell King, ian.campbell, tim,
	linux-kernel, xen-devel, linux-arm-kernel

On Tue, 23 Jun 2015, Julien Grall wrote:
> Hi,
> 
> On 23/06/15 15:19, Stefano Stabellini wrote:
> >> diff --git a/arch/arm/xen/enlighten.c b/arch/arm/xen/enlighten.c
> >> index 224081c..dcfe251 100644
> >> --- a/arch/arm/xen/enlighten.c
> >> +++ b/arch/arm/xen/enlighten.c
> >> @@ -93,8 +93,8 @@ static void xen_percpu_init(void)
> >>  	pr_info("Xen: initializing cpu%d\n", cpu);
> >>  	vcpup = per_cpu_ptr(xen_vcpu_info, cpu);
> >>  
> >> -	info.mfn = __pa(vcpup) >> PAGE_SHIFT;
> >> -	info.offset = offset_in_page(vcpup);
> >> +	info.mfn = __pa(vcpup) >> XEN_PAGE_SHIFT;
> >> +	info.offset = xen_offset_in_page(vcpup);
> >>  
> >>  	err = HYPERVISOR_vcpu_op(VCPUOP_register_vcpu_info, cpu, &info);
> >>  	BUG_ON(err);
> >> @@ -204,7 +204,7 @@ static int __init xen_guest_init(void)
> >>  	xatp.domid = DOMID_SELF;
> >>  	xatp.idx = 0;
> >>  	xatp.space = XENMAPSPACE_shared_info;
> >> -	xatp.gpfn = __pa(shared_info_page) >> PAGE_SHIFT;
> >> +	xatp.gpfn = __pa(shared_info_page) >> XEN_PAGE_SHIFT;
> >>  	if (HYPERVISOR_memory_op(XENMEM_add_to_physmap, &xatp))
> >>  		BUG();
> > 
> > What about xen_remap_domain_mfn_range? I guess we don't support that use
> > case on 64K guests? If so, I would appreciate an assert and/or an error
> > message.
> 
> The implementation of xen_remap_domain_mfn_range returns -ENOSYS no
> matter the page granularity.
> 
> This function is PV-specific and was added a few months ago only as a
> stub. See the comment in the code:
> "/* Not used by XENFEAT_auto_translated guests */"
> 
> Any logging/BUG_ON within this function is out of scope for this
> series, and I don't think it would be really useful. Feel free to send
> a patch for it.

Yes, you are right, I was reading an older version of Linux that still
had xen_remap_domain_mfn_range properly implemented.

The new function is called xen_remap_domain_mfn_array, which calls
xen_xlate_remap_gfn_array.

I'll rephrase my question then: what about xen_remap_domain_mfn_array? I
guess we don't support that use case on 64K guests? If so, I would
appreciate an assert and/or an error message.

^ permalink raw reply	[flat|nested] 200+ messages in thread



* Re: [Xen-devel] [RFC 23/23] arm/xen: Add support for 64KB page granularity
  2015-06-23 14:49         ` Stefano Stabellini
@ 2015-06-23 15:02           ` Julien Grall
  -1 siblings, 0 replies; 200+ messages in thread
From: Julien Grall @ 2015-06-23 15:02 UTC (permalink / raw)
  To: Stefano Stabellini, Julien Grall
  Cc: Russell King, ian.campbell, tim, linux-kernel, xen-devel,
	linux-arm-kernel

On 23/06/15 15:49, Stefano Stabellini wrote:
> Yes, you are right, I was reading an older version of Linux that still
> had xen_remap_domain_mfn_range properly implemented.
> 
> The new function is called xen_remap_domain_mfn_array, which calls
> xen_xlate_remap_gfn_array.
> 
> I'll rephrase my question then: what about xen_remap_domain_mfn_array? I
> guess we don't support that use case on 64K guests? If so, I would
> appreciate an assert and/or an error message.

See https://lkml.org/lkml/2015/5/14/563

-- 
Julien Grall

^ permalink raw reply	[flat|nested] 200+ messages in thread



* Re: [Xen-devel] [RFC 23/23] arm/xen: Add support for 64KB page granularity
  2015-06-23 15:02           ` Julien Grall
@ 2015-06-23 16:12             ` Stefano Stabellini
  -1 siblings, 0 replies; 200+ messages in thread
From: Stefano Stabellini @ 2015-06-23 16:12 UTC (permalink / raw)
  To: Julien Grall
  Cc: Stefano Stabellini, Russell King, ian.campbell, tim,
	linux-kernel, xen-devel, linux-arm-kernel

On Tue, 23 Jun 2015, Julien Grall wrote:
> On 23/06/15 15:49, Stefano Stabellini wrote:
> > Yes, you are right, I was reading an older version of Linux that still
> > had xen_remap_domain_mfn_range properly implemented.
> > 
> > The new function is called xen_remap_domain_mfn_array, which calls
> > xen_xlate_remap_gfn_array.
> > 
> > I'll rephrase my question then: what about xen_remap_domain_mfn_array? I
> > guess we don't support that use case on 64K guests? If so, I would
> > appreciate an assert and/or an error message.
> 
> See https://lkml.org/lkml/2015/5/14/563

Ah, that's fantastic! In that case:

Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

^ permalink raw reply	[flat|nested] 200+ messages in thread



end of thread, other threads:[~2015-06-23 16:13 UTC | newest]

Thread overview: 200+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-14 17:00 [RFC 00/23] arm64: Add support for 64KB page granularity in Xen guest Julien Grall
2015-05-14 17:00 ` Julien Grall
2015-05-14 17:00 ` [RFC 01/23] xen: Include xen/page.h rather than asm/xen/page.h Julien Grall
2015-05-14 17:00 ` Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-19 13:50   ` [Xen-devel] " David Vrabel
2015-05-19 13:50     ` David Vrabel
2015-05-19 13:50   ` David Vrabel
2015-05-14 17:00 ` [RFC 02/23] xen/xenbus: client: Fix call of virt_to_mfn in xenbus_grant_ring Julien Grall
2015-05-14 17:00 ` Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-19 13:51   ` David Vrabel
2015-05-19 13:51   ` [Xen-devel] " David Vrabel
2015-05-19 13:51     ` David Vrabel
2015-05-19 14:12     ` Julien Grall
2015-05-19 14:12     ` [Xen-devel] " Julien Grall
2015-05-19 14:12       ` Julien Grall
2015-05-14 17:00 ` [RFC 03/23] xen/grant-table: Remove unused macro SPP Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-19 13:52   ` [Xen-devel] " David Vrabel
2015-05-19 13:52     ` David Vrabel
2015-05-19 13:52   ` David Vrabel
2015-05-14 17:00 ` [RFC 04/23] block/xen-blkfront: Remove unused macro MAXIMUM_OUTSTANDING_BLOCK_REQS Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-20 14:36   ` Roger Pau Monné
2015-05-20 14:36   ` Roger Pau Monné
2015-05-20 14:36     ` Roger Pau Monné
2015-05-14 17:00 ` Julien Grall
2015-05-14 17:00 ` [RFC 05/23] block/xen-blkfront: Remove invalid comment Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-20 14:42   ` Roger Pau Monné
2015-05-20 14:42     ` Roger Pau Monné
2015-05-20 14:42     ` Roger Pau Monné
2015-05-14 17:00 ` [RFC 06/23] block/xen-blkback: s/nr_pages/nr_segs/ Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-20 14:54   ` Roger Pau Monné
2015-05-20 14:54     ` Roger Pau Monné
2015-05-20 14:54   ` Roger Pau Monné
2015-05-14 17:00 ` [RFC 07/23] net/xen-netfront: Correct printf format in xennet_get_responses Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-19 13:53   ` [Xen-devel] " David Vrabel
2015-05-19 13:53     ` David Vrabel
2015-05-19 13:53   ` David Vrabel
2015-05-14 17:00 ` [RFC 08/23] net/xen-netback: Remove unused code in xenvif_rx_action Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-15  0:26   ` Wei Liu
2015-05-15  0:26   ` Wei Liu
2015-05-15  0:26     ` Wei Liu
2015-05-14 17:00 ` Julien Grall
2015-05-14 17:00 ` [RFC 09/23] arm/xen: Drop duplicate define mfn_to_virt Julien Grall
2015-05-14 17:00 ` Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-06-23 13:25   ` Stefano Stabellini
2015-06-23 13:25   ` Stefano Stabellini
2015-06-23 13:25     ` Stefano Stabellini
2015-06-23 13:53     ` Julien Grall
2015-06-23 13:53     ` [Xen-devel] " Julien Grall
2015-06-23 13:53       ` Julien Grall
2015-05-14 17:00 ` [RFC 10/23] xen/biomerge: WORKAROUND always says the biovec are not mergeable Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-15 15:54   ` Boris Ostrovsky
2015-05-15 15:54   ` Boris Ostrovsky
2015-05-15 15:54     ` Boris Ostrovsky
2015-05-19 14:16     ` Julien Grall
2015-05-19 14:16       ` Julien Grall
2015-05-19 14:16     ` Julien Grall
2015-05-14 17:00 ` [RFC 11/23] xen: Add Xen specific page definition Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-14 17:00 ` [RFC 12/23] xen: Extend page_to_mfn to take an offset in the page Julien Grall
2015-05-14 17:00 ` Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-19 13:57   ` David Vrabel
2015-05-19 13:57   ` [Xen-devel] " David Vrabel
2015-05-19 13:57     ` David Vrabel
2015-05-19 14:18     ` Julien Grall
2015-05-19 14:18       ` Julien Grall
2015-05-19 14:18     ` Julien Grall
2015-05-14 17:00 ` [RFC 13/23] xen/xenbus: Use Xen page definition Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-19 13:59   ` David Vrabel
2015-05-19 13:59   ` [Xen-devel] " David Vrabel
2015-05-19 13:59     ` David Vrabel
2015-05-19 14:19     ` Julien Grall
2015-05-19 14:19     ` [Xen-devel] " Julien Grall
2015-05-19 14:19       ` Julien Grall
2015-05-14 17:00 ` [RFC 14/23] tty/hvc: xen: Use xen " Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-14 17:00 ` [RFC 15/23] xen/balloon: Don't rely on the page granularity is the same for Xen and Linux Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-19 15:23   ` [Xen-devel] " David Vrabel
2015-05-19 15:23     ` David Vrabel
2015-05-19 15:23   ` David Vrabel
2015-05-14 17:00 ` Julien Grall
2015-05-14 17:00 ` [RFC 16/23] xen/events: fifo: Make it running on 64KB granularity Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-19 15:25   ` David Vrabel
2015-05-19 15:25   ` [Xen-devel] " David Vrabel
2015-05-19 15:25     ` David Vrabel
2015-05-14 17:00 ` [RFC 17/23] xen/grant-table: " Julien Grall
2015-05-14 17:00 ` Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-19 15:27   ` [Xen-devel] " David Vrabel
2015-05-19 15:27     ` David Vrabel
2015-05-19 15:27   ` David Vrabel
2015-05-14 17:00 ` [RFC 18/23] block/xen-blkfront: Make it running on 64KB page granularity Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-14 17:00 ` [RFC 19/23] block/xen-blkback: " Julien Grall
2015-05-14 17:00 ` Julien Grall
2015-05-14 17:00   ` Julien Grall
2015-05-14 17:01 ` [RFC 20/23] net/xen-netfront: " Julien Grall
2015-05-14 17:01 ` Julien Grall
2015-05-14 17:01   ` Julien Grall
2015-05-14 17:01 ` [RFC 21/23] net/xen-netback: " Julien Grall
2015-05-14 17:01   ` Julien Grall
2015-05-14 17:01   ` Julien Grall
2015-05-14 17:01   ` Julien Grall
2015-05-15  2:35   ` Wei Liu
2015-05-15  2:35     ` Wei Liu
2015-05-15 12:35     ` [Xen-devel] " Julien Grall
2015-05-15 12:35       ` Julien Grall
2015-05-15 15:31       ` Wei Liu
2015-05-15 15:31         ` Wei Liu
2015-05-15 15:41         ` Ian Campbell
2015-05-15 15:41           ` Ian Campbell
2015-05-15 15:41         ` Ian Campbell
2015-05-18 12:11         ` [Xen-devel] " Julien Grall
2015-05-18 12:11           ` Julien Grall
2015-05-18 12:54           ` Wei Liu
2015-05-18 12:54           ` [Xen-devel] " Wei Liu
2015-05-18 12:54             ` Wei Liu
2015-05-19 22:56             ` Julien Grall
2015-05-19 22:56             ` [Xen-devel] " Julien Grall
2015-05-19 22:56               ` Julien Grall
2015-05-20  8:26               ` Wei Liu
2015-05-20  8:26               ` [Xen-devel] " Wei Liu
2015-05-20  8:26                 ` Wei Liu
2015-05-20 14:26                 ` Julien Grall
2015-05-20 14:26                   ` Julien Grall
2015-05-20 14:26                 ` Julien Grall
2015-05-20 14:29               ` Julien Grall
2015-05-20 14:29               ` [Xen-devel] " Julien Grall
2015-05-20 14:29                 ` Julien Grall
2015-05-18 12:11         ` Julien Grall
2015-05-15 15:31       ` Wei Liu
2015-05-15 12:35     ` Julien Grall
2015-05-15  2:35   ` Wei Liu
2015-05-14 17:01 ` [RFC 22/23] xen/privcmd: Add support for Linux " Julien Grall
2015-05-14 17:01 ` Julien Grall
2015-05-14 17:01   ` Julien Grall
2015-05-19 15:39   ` David Vrabel
2015-05-19 15:39   ` [Xen-devel] " David Vrabel
2015-05-19 15:39     ` David Vrabel
2015-06-18 17:05     ` Julien Grall
2015-06-18 17:05     ` [Xen-devel] " Julien Grall
2015-06-18 17:05       ` Julien Grall
2015-05-14 17:01 ` [RFC 23/23] arm/xen: Add support for " Julien Grall
2015-05-14 17:01   ` Julien Grall
2015-05-14 17:01   ` Julien Grall
2015-06-23 14:19   ` Stefano Stabellini
2015-06-23 14:19   ` Stefano Stabellini
2015-06-23 14:19     ` Stefano Stabellini
2015-06-23 14:37     ` Julien Grall
2015-06-23 14:37     ` Julien Grall
2015-06-23 14:37       ` Julien Grall
2015-06-23 14:49       ` Stefano Stabellini
2015-06-23 14:49       ` Stefano Stabellini
2015-06-23 14:49         ` Stefano Stabellini
2015-06-23 15:02         ` [Xen-devel] " Julien Grall
2015-06-23 15:02           ` Julien Grall
2015-06-23 16:12           ` Stefano Stabellini
2015-06-23 16:12           ` [Xen-devel] " Stefano Stabellini
2015-06-23 16:12             ` Stefano Stabellini
2015-06-23 15:02         ` Julien Grall
2015-05-15 15:45 ` [RFC 00/23] arm64: Add support for 64KB page granularity in Xen guest David Vrabel
2015-05-15 15:45 ` [Xen-devel] " David Vrabel
2015-05-15 15:45   ` David Vrabel
2015-05-15 15:51   ` Boris Ostrovsky
2015-05-15 15:51   ` [Xen-devel] " Boris Ostrovsky
2015-05-15 15:51     ` Boris Ostrovsky
2015-05-18 12:23   ` Julien Grall
2015-05-18 12:23   ` [Xen-devel] " Julien Grall
2015-05-18 12:23     ` Julien Grall
2015-06-23 13:37     ` Stefano Stabellini
2015-06-23 13:37     ` [Xen-devel] " Stefano Stabellini
2015-06-23 13:37       ` Stefano Stabellini
2015-06-23 13:41       ` Stefano Stabellini
2015-06-23 13:41       ` [Xen-devel] " Stefano Stabellini
2015-06-23 13:41         ` Stefano Stabellini
