From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Jan Beulich"
Subject: Re: [PATCH V3 2/3] xen-blkfront: teach blkfront driver handle trim request
Date: Thu, 25 Aug 2011 08:03:32 +0100
Message-ID: <4E560FE4020000780005324F@nat28.tlf.novell.com>
References: <1314177825-22360-1-git-send-email-lidongyang@novell.com> <1314177825-22360-3-git-send-email-lidongyang@novell.com> <4E54F1C20200007800052DF8@victor.provo.novell.com>
To: Dong Yang Li
Cc: xen-devel@lists.xensource.com, owen.smith@citrix.com
List-Id: xen-devel@lists.xenproject.org

>>> On 25.08.11 at 08:37, Li Dongyang wrote:
> On Wed, Aug 24, 2011 at 6:42 PM, Jan Beulich wrote:
>>>>> On 24.08.11 at 11:23, Li Dongyang wrote:
>>> The blkfront driver now reads feature-trim from xenstore and sets up
>>> the request queue with trim parameters, so we can forward discard
>>> requests to the backend driver.
>>>
>>> Signed-off-by: Li Dongyang
>>> ---
>>>  drivers/block/xen-blkfront.c |  111 +++++++++++++++++++++++++++++++++----------
>>>  1 files changed, 88 insertions(+), 23 deletions(-)
>>>
>>> diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
>>> index 9ea8c25..aa3cede 100644
>>> --- a/drivers/block/xen-blkfront.c
>>> +++ b/drivers/block/xen-blkfront.c
>>> @@ -98,6 +98,9 @@ struct blkfront_info
>>>  	unsigned long shadow_free;
>>>  	unsigned int feature_flush;
>>>  	unsigned int flush_op;
>>> +	unsigned int feature_trim;
>>> +	unsigned int discard_granularity;
>>> +	unsigned int discard_alignment;
>>>  	int is_ready;
>>>  };
>>>
>>> @@ -302,29 +305,36 @@ static int blkif_queue_request(struct request *req)
>>>  		ring_req->operation = info->flush_op;
>>>  	}
>>>
>>> -	ring_req->nr_segments = blk_rq_map_sg(req->q, req, info->sg);
>>> -	BUG_ON(ring_req->nr_segments > BLKIF_MAX_SEGMENTS_PER_REQUEST);
>>> +	if (unlikely(req->cmd_flags & REQ_DISCARD)) {
>>> +		/* id, sector_number and handle are set above. */
>>> +		ring_req->operation = BLKIF_OP_TRIM;
>>> +		ring_req->nr_segments = 0;
>>> +		ring_req->u.trim.nr_sectors = blk_rq_sectors(req);
>>> +	} else {
>>> +		ring_req->nr_segments = blk_rq_map_sg(req->q, req, info->sg);
>>> +		BUG_ON(ring_req->nr_segments > BLKIF_MAX_SEGMENTS_PER_REQUEST);
>>>
>>> -	for_each_sg(info->sg, sg, ring_req->nr_segments, i) {
>>> -		buffer_mfn = pfn_to_mfn(page_to_pfn(sg_page(sg)));
>>> -		fsect = sg->offset >> 9;
>>> -		lsect = fsect + (sg->length >> 9) - 1;
>>> -		/* install a grant reference. */
>>> -		ref = gnttab_claim_grant_reference(&gref_head);
>>> -		BUG_ON(ref == -ENOSPC);
>>> +		for_each_sg(info->sg, sg, ring_req->nr_segments, i) {
>>> +			buffer_mfn = pfn_to_mfn(page_to_pfn(sg_page(sg)));
>>> +			fsect = sg->offset >> 9;
>>> +			lsect = fsect + (sg->length >> 9) - 1;
>>> +			/* install a grant reference. */
>>> +			ref = gnttab_claim_grant_reference(&gref_head);
>>> +			BUG_ON(ref == -ENOSPC);
>>>
>>> -		gnttab_grant_foreign_access_ref(
>>> -				ref,
>>> -				info->xbdev->otherend_id,
>>> -				buffer_mfn,
>>> -				rq_data_dir(req) );
>>> -
>>> -		info->shadow[id].frame[i] = mfn_to_pfn(buffer_mfn);
>>> -		ring_req->u.rw.seg[i] =
>>> -				(struct blkif_request_segment) {
>>> -					.gref = ref,
>>> -					.first_sect = fsect,
>>> -					.last_sect = lsect };
>>> +			gnttab_grant_foreign_access_ref(
>>> +					ref,
>>> +					info->xbdev->otherend_id,
>>> +					buffer_mfn,
>>> +					rq_data_dir(req));
>>> +
>>> +			info->shadow[id].frame[i] = mfn_to_pfn(buffer_mfn);
>>> +			ring_req->u.rw.seg[i] =
>>> +					(struct blkif_request_segment) {
>>> +						.gref = ref,
>>> +						.first_sect = fsect,
>>> +						.last_sect = lsect };
>>> +		}
>>>  	}
>>>
>>>  	info->ring.req_prod_pvt++;
>>> @@ -399,6 +409,7 @@ wait:
>>>  static int xlvbd_init_blk_queue(struct gendisk *gd, u16 sector_size)
>>>  {
>>>  	struct request_queue *rq;
>>> +	struct blkfront_info *info = gd->private_data;
>>>
>>>  	rq = blk_init_queue(do_blkif_request, &blkif_io_lock);
>>>  	if (rq == NULL)
>>> @@ -406,6 +417,13 @@ static int xlvbd_init_blk_queue(struct gendisk *gd, u16
>>> sector_size)
>>>
>>>  	queue_flag_set_unlocked(QUEUE_FLAG_VIRT, rq);
>>>
>>> +	if (info->feature_trim) {
>>> +		queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, rq);
>>> +		blk_queue_max_discard_sectors(rq, get_capacity(gd));
>>> +		rq->limits.discard_granularity = info->discard_granularity;
>>> +		rq->limits.discard_alignment = info->discard_alignment;
>>
>> Don't you also need to set rq->limits.max_discard_sectors here (since
>> when zero blkdev_issue_discard() doesn't do anything)? And wouldn't
>> that need to be propagated from the backend, too?
> the max_discard_sectors are set by blk_queue_max_discard_sectors() above ;-)

Oh, silly me to overlook that.
> rq->limits.max_discard_sectors is the full phy device size, and if we
> only assign a partition to the guest, the number is incorrect for the
> guest, so max_discard_sectors should be the capacity the guest will
> see. Thanks

Using the capacity seems right only for the file: case; I'd still think
the backend should pass the underlying disk's setting for the phy: one.

Jan
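For readers following the exchange, the limit-selection rule being argued for can be sketched in plain C. This is a hypothetical userspace model, not code from the patch: choose_max_discard_sectors, its parameters, and the phy/file flag are invented here to illustrate the point that a file: backend can simply use the virtual disk's capacity, while a phy: backend would need to advertise the underlying disk's own limit, clamped to the (possibly partition-sized) capacity the guest sees.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the max_discard_sectors choice discussed above.
 * For a file: backend, the virtual disk's own capacity is a safe upper
 * bound. For a phy: backend, honour the limit the backend advertises
 * for the underlying device, clamped to the capacity the guest
 * actually sees (e.g. when only one partition is assigned to it).
 */
static uint64_t choose_max_discard_sectors(bool phy_backend,
					   uint64_t guest_capacity,
					   uint64_t backend_limit)
{
	if (!phy_backend)
		return guest_capacity;
	return backend_limit < guest_capacity ? backend_limit
					      : guest_capacity;
}
```

Under this model the guest never issues a discard larger than either its own view of the disk or the backend device's advertised limit can express.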