From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933238AbbENRSs (ORCPT ); Thu, 14 May 2015 13:18:48 -0400 Received: from smtp02.citrix.com ([66.165.176.63]:64243 "EHLO SMTP02.CITRIX.COM" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753562AbbENRSZ (ORCPT ); Thu, 14 May 2015 13:18:25 -0400 X-IronPort-AV: E=Sophos;i="5.13,430,1427760000"; d="scan'208";a="265173454" From: Julien Grall To: CC: , , , , , Julien Grall , Konrad Rzeszutek Wilk , =?UTF-8?q?Roger=20Pau=20Monn=C3=A9?= , Boris Ostrovsky , "David Vrabel" Subject: [RFC 19/23] block/xen-blkback: Make it running on 64KB page granularity Date: Thu, 14 May 2015 18:00:59 +0100 Message-ID: <1431622863-28575-20-git-send-email-julien.grall@citrix.com> X-Mailer: git-send-email 2.1.4 In-Reply-To: <1431622863-28575-1-git-send-email-julien.grall@citrix.com> References: <1431622863-28575-1-git-send-email-julien.grall@citrix.com> MIME-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 8bit X-DLP: MIA1 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org The PV block protocol is using 4KB page granularity. The goal of this patch is to allow a Linux using 64KB page granularity behaving as a block backend on a non-modified Xen. It's only necessary to adapt the ring size and the number of request per indirect frames. The rest of the code is relying on the grant table code. Note that the grant table code is allocating a Linux page per grant which will result to waste 6OKB for every grant when Linux is using 64KB page granularity. This could be improved by sharing the page between multiple grants. Signed-off-by: Julien Grall Cc: Konrad Rzeszutek Wilk Cc: "Roger Pau Monné" Cc: Boris Ostrovsky Cc: David Vrabel --- Improvement such as support of 64KB grant is not taken into consideration in this patch because we have the requirement to run a Linux using 64KB pages on a non-modified Xen. This has been tested only with a loop device. I plan to test passing hard drive partition but I didn't yet convert the swiotlb code. --- drivers/block/xen-blkback/blkback.c | 5 +++-- drivers/block/xen-blkback/common.h | 16 +++++++++++++--- drivers/block/xen-blkback/xenbus.c | 6 +++--- 3 files changed, 19 insertions(+), 8 deletions(-) diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c index 7049528..1803c07 100644 --- a/drivers/block/xen-blkback/blkback.c +++ b/drivers/block/xen-blkback/blkback.c @@ -954,7 +954,7 @@ static int xen_blkbk_parse_indirect(struct blkif_request *req, seg[n].nsec = segments[i].last_sect - segments[i].first_sect + 1; seg[n].offset = (segments[i].first_sect << 9); - if ((segments[i].last_sect >= (PAGE_SIZE >> 9)) || + if ((segments[i].last_sect >= (XEN_PAGE_SIZE >> 9)) || (segments[i].last_sect < segments[i].first_sect)) { rc = -EINVAL; goto unmap; @@ -1203,6 +1203,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, req_operation = req->operation == BLKIF_OP_INDIRECT ? req->u.indirect.indirect_op : req->operation; + if ((req->operation == BLKIF_OP_INDIRECT) && (req_operation != BLKIF_OP_READ) && (req_operation != BLKIF_OP_WRITE)) { @@ -1261,7 +1262,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, seg[i].nsec = req->u.rw.seg[i].last_sect - req->u.rw.seg[i].first_sect + 1; seg[i].offset = (req->u.rw.seg[i].first_sect << 9); - if ((req->u.rw.seg[i].last_sect >= (PAGE_SIZE >> 9)) || + if ((req->u.rw.seg[i].last_sect >= (XEN_PAGE_SIZE >> 9)) || (req->u.rw.seg[i].last_sect < req->u.rw.seg[i].first_sect)) goto fail_response; diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h index 7a03e07..ef15ad4 100644 --- a/drivers/block/xen-blkback/common.h +++ b/drivers/block/xen-blkback/common.h @@ -39,6 +39,7 @@ #include #include #include +#include #include #include #include @@ -50,12 +51,21 @@ */ #define MAX_INDIRECT_SEGMENTS 256 -#define SEGS_PER_INDIRECT_FRAME \ - (PAGE_SIZE/sizeof(struct blkif_request_segment)) +/* + * Xen use 4K pages. The guest may use different page size (4K or 64K) + * Number of Xen pages per segment + */ +#define XEN_PAGES_PER_SEGMENT (PAGE_SIZE / XEN_PAGE_SIZE) + +#define SEGS_PER_INDIRECT_FRAME \ + (XEN_PAGE_SIZE/sizeof(struct blkif_request_segment) / XEN_PAGES_PER_SEGMENT) +#define XEN_PAGES_PER_INDIRECT_FRAME \ + (XEN_PAGE_SIZE/sizeof(struct blkif_request_segment)) + #define MAX_INDIRECT_PAGES \ ((MAX_INDIRECT_SEGMENTS + SEGS_PER_INDIRECT_FRAME - 1)/SEGS_PER_INDIRECT_FRAME) #define INDIRECT_PAGES(_segs) \ - ((_segs + SEGS_PER_INDIRECT_FRAME - 1)/SEGS_PER_INDIRECT_FRAME) + ((_segs + XEN_PAGES_PER_INDIRECT_FRAME - 1)/XEN_PAGES_PER_INDIRECT_FRAME) /* Not a real protocol. Used to generate ring structs which contain * the elements common to all protocols only. This way we get a diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c index 6ab69ad..2fcf24e 100644 --- a/drivers/block/xen-blkback/xenbus.c +++ b/drivers/block/xen-blkback/xenbus.c @@ -217,21 +217,21 @@ static int xen_blkif_map(struct xen_blkif *blkif, grant_ref_t gref, { struct blkif_sring *sring; sring = (struct blkif_sring *)blkif->blk_ring; - BACK_RING_INIT(&blkif->blk_rings.native, sring, PAGE_SIZE); + BACK_RING_INIT(&blkif->blk_rings.native, sring, XEN_PAGE_SIZE); break; } case BLKIF_PROTOCOL_X86_32: { struct blkif_x86_32_sring *sring_x86_32; sring_x86_32 = (struct blkif_x86_32_sring *)blkif->blk_ring; - BACK_RING_INIT(&blkif->blk_rings.x86_32, sring_x86_32, PAGE_SIZE); + BACK_RING_INIT(&blkif->blk_rings.x86_32, sring_x86_32, XEN_PAGE_SIZE); break; } case BLKIF_PROTOCOL_X86_64: { struct blkif_x86_64_sring *sring_x86_64; sring_x86_64 = (struct blkif_x86_64_sring *)blkif->blk_ring; - BACK_RING_INIT(&blkif->blk_rings.x86_64, sring_x86_64, PAGE_SIZE); + BACK_RING_INIT(&blkif->blk_rings.x86_64, sring_x86_64, XEN_PAGE_SIZE); break; } default: -- 2.1.4 From mboxrd@z Thu Jan 1 00:00:00 1970 From: julien.grall@citrix.com (Julien Grall) Date: Thu, 14 May 2015 18:00:59 +0100 Subject: [RFC 19/23] block/xen-blkback: Make it running on 64KB page granularity In-Reply-To: <1431622863-28575-1-git-send-email-julien.grall@citrix.com> References: <1431622863-28575-1-git-send-email-julien.grall@citrix.com> Message-ID: <1431622863-28575-20-git-send-email-julien.grall@citrix.com> To: linux-arm-kernel@lists.infradead.org List-Id: linux-arm-kernel.lists.infradead.org The PV block protocol is using 4KB page granularity. The goal of this patch is to allow a Linux using 64KB page granularity behaving as a block backend on a non-modified Xen. It's only necessary to adapt the ring size and the number of request per indirect frames. The rest of the code is relying on the grant table code. Note that the grant table code is allocating a Linux page per grant which will result to waste 6OKB for every grant when Linux is using 64KB page granularity. This could be improved by sharing the page between multiple grants. Signed-off-by: Julien Grall Cc: Konrad Rzeszutek Wilk Cc: "Roger Pau Monn?" Cc: Boris Ostrovsky Cc: David Vrabel --- Improvement such as support of 64KB grant is not taken into consideration in this patch because we have the requirement to run a Linux using 64KB pages on a non-modified Xen. This has been tested only with a loop device. I plan to test passing hard drive partition but I didn't yet convert the swiotlb code. --- drivers/block/xen-blkback/blkback.c | 5 +++-- drivers/block/xen-blkback/common.h | 16 +++++++++++++--- drivers/block/xen-blkback/xenbus.c | 6 +++--- 3 files changed, 19 insertions(+), 8 deletions(-) diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c index 7049528..1803c07 100644 --- a/drivers/block/xen-blkback/blkback.c +++ b/drivers/block/xen-blkback/blkback.c @@ -954,7 +954,7 @@ static int xen_blkbk_parse_indirect(struct blkif_request *req, seg[n].nsec = segments[i].last_sect - segments[i].first_sect + 1; seg[n].offset = (segments[i].first_sect << 9); - if ((segments[i].last_sect >= (PAGE_SIZE >> 9)) || + if ((segments[i].last_sect >= (XEN_PAGE_SIZE >> 9)) || (segments[i].last_sect < segments[i].first_sect)) { rc = -EINVAL; goto unmap; @@ -1203,6 +1203,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, req_operation = req->operation == BLKIF_OP_INDIRECT ? req->u.indirect.indirect_op : req->operation; + if ((req->operation == BLKIF_OP_INDIRECT) && (req_operation != BLKIF_OP_READ) && (req_operation != BLKIF_OP_WRITE)) { @@ -1261,7 +1262,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, seg[i].nsec = req->u.rw.seg[i].last_sect - req->u.rw.seg[i].first_sect + 1; seg[i].offset = (req->u.rw.seg[i].first_sect << 9); - if ((req->u.rw.seg[i].last_sect >= (PAGE_SIZE >> 9)) || + if ((req->u.rw.seg[i].last_sect >= (XEN_PAGE_SIZE >> 9)) || (req->u.rw.seg[i].last_sect < req->u.rw.seg[i].first_sect)) goto fail_response; diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h index 7a03e07..ef15ad4 100644 --- a/drivers/block/xen-blkback/common.h +++ b/drivers/block/xen-blkback/common.h @@ -39,6 +39,7 @@ #include #include #include +#include #include #include #include @@ -50,12 +51,21 @@ */ #define MAX_INDIRECT_SEGMENTS 256 -#define SEGS_PER_INDIRECT_FRAME \ - (PAGE_SIZE/sizeof(struct blkif_request_segment)) +/* + * Xen use 4K pages. The guest may use different page size (4K or 64K) + * Number of Xen pages per segment + */ +#define XEN_PAGES_PER_SEGMENT (PAGE_SIZE / XEN_PAGE_SIZE) + +#define SEGS_PER_INDIRECT_FRAME \ + (XEN_PAGE_SIZE/sizeof(struct blkif_request_segment) / XEN_PAGES_PER_SEGMENT) +#define XEN_PAGES_PER_INDIRECT_FRAME \ + (XEN_PAGE_SIZE/sizeof(struct blkif_request_segment)) + #define MAX_INDIRECT_PAGES \ ((MAX_INDIRECT_SEGMENTS + SEGS_PER_INDIRECT_FRAME - 1)/SEGS_PER_INDIRECT_FRAME) #define INDIRECT_PAGES(_segs) \ - ((_segs + SEGS_PER_INDIRECT_FRAME - 1)/SEGS_PER_INDIRECT_FRAME) + ((_segs + XEN_PAGES_PER_INDIRECT_FRAME - 1)/XEN_PAGES_PER_INDIRECT_FRAME) /* Not a real protocol. Used to generate ring structs which contain * the elements common to all protocols only. This way we get a diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c index 6ab69ad..2fcf24e 100644 --- a/drivers/block/xen-blkback/xenbus.c +++ b/drivers/block/xen-blkback/xenbus.c @@ -217,21 +217,21 @@ static int xen_blkif_map(struct xen_blkif *blkif, grant_ref_t gref, { struct blkif_sring *sring; sring = (struct blkif_sring *)blkif->blk_ring; - BACK_RING_INIT(&blkif->blk_rings.native, sring, PAGE_SIZE); + BACK_RING_INIT(&blkif->blk_rings.native, sring, XEN_PAGE_SIZE); break; } case BLKIF_PROTOCOL_X86_32: { struct blkif_x86_32_sring *sring_x86_32; sring_x86_32 = (struct blkif_x86_32_sring *)blkif->blk_ring; - BACK_RING_INIT(&blkif->blk_rings.x86_32, sring_x86_32, PAGE_SIZE); + BACK_RING_INIT(&blkif->blk_rings.x86_32, sring_x86_32, XEN_PAGE_SIZE); break; } case BLKIF_PROTOCOL_X86_64: { struct blkif_x86_64_sring *sring_x86_64; sring_x86_64 = (struct blkif_x86_64_sring *)blkif->blk_ring; - BACK_RING_INIT(&blkif->blk_rings.x86_64, sring_x86_64, PAGE_SIZE); + BACK_RING_INIT(&blkif->blk_rings.x86_64, sring_x86_64, XEN_PAGE_SIZE); break; } default: -- 2.1.4