From: Julien Grall
Date: Mon, 12 Oct 2015 19:00:11 +0100
To: Roger Pau Monné
CC: Boris Ostrovsky, David Vrabel
Subject: Re: [Xen-devel] [PATCH 2/2] block/xen-blkfront: Handle non-indirect grant with 64KB pages
Message-ID: <561BF52B.9020900@citrix.com>
In-Reply-To: <56139D0D.4020409@citrix.com>

On 06/10/15 11:06, Roger Pau Monné wrote:
> On 06/10/15 at 11:58, Julien Grall wrote:
>> Hi Roger,
>>
>> On 06/10/2015 10:39, Roger Pau Monné wrote:
>>> On 05/10/15 at 19:05, Julien Grall wrote:
>>>> On 05/10/15 17:01, Roger Pau Monné wrote:
>>>>> On 11/09/15 at 21:32, Julien Grall wrote:
>>>>>>  	ring_req->u.rw.nr_segments = num_grant;
>>>>>> +	if (unlikely(require_extra_req)) {
>>>>>> +		id2 = blkif_ring_get_request(info, req, &ring_req2);
>>>>>
>>>>> How can you guarantee that there's always going to be another free
>>>>> request? AFAICT blkif_queue_rq checks for RING_FULL, but you don't
>>>>> actually know if there's only one slot or more than one available.
>>>>
>>>> Because the depth of the queue is divided by 2 when the extra
>>>> request is used (see xlvbd_init_blk_queue).
>>
>> I just noticed that I didn't mention this restriction in the commit
>> message. I will do it in the next revision.
>>
>>> I see, that's quite restrictive, but I guess it's better than
>>> introducing a new ring macro in order to figure out if there are at
>>> least two free slots.
>>
>> I actually didn't think about your suggestion. I chose to divide by
>> two based on the assumption that the block framework will always try
>> to send a request with the maximum data possible.
>
> AFAIK that depends on the request itself; the block layer will try to
> merge requests if possible, but you can also expect that there are
> going to be requests that will just contain a single block.
>
>> I don't know if this assumption is correct as I'm not fully aware of
>> how the block framework works.
>>
>> If it's valid, in the case of a 64KB guest, the maximum size of a
>> request would be 64KB when indirect segments are not supported. So we
>> would end up with a lot of 64KB requests which will require 2 ring
>> requests.
>
> I guess you could add a counter in order to see how many requests were
> split vs the total number of requests.

So the number of 64KB requests is fairly small compared to the total
number of requests (277 out of 4687 requests) for general usage (i.e.
cd, find). However, as soon as I use dd, the block requests will be
merged.
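(For reference, the counter is essentially the sketch below; the dbg_*
field names are made up here for a debug-only hack and are obviously
not meant to go upstream:

    /* Debug only: hypothetical counters added to struct blkfront_info. */
    unsigned long dbg_total_reqs;   /* block requests queued */
    unsigned long dbg_split_reqs;   /* requests needing the extra ring request */

    /* In the request queueing path, next to the hunk quoted above: */
    info->dbg_total_reqs++;
    if (unlikely(require_extra_req)) {
            info->dbg_split_reqs++;
            id2 = blkif_ring_get_request(info, req, &ring_req2);
    }
)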
So I guess common usage will not provide enough data to fill a 64KB
request. However, as soon as I use dd with a block size of 64KB, most
of the requests fill 64KB and an extra ring request is required.

Note that I had to quickly implement xen_biovec_phys_mergeable for 64KB
pages, as I had left this aside. Without it, the biovecs won't be
merged, except with dd if you specify the block size (bs=64KB).

I've also looked at the code to see if it's possible to check whether 2
ring requests are free and, if not, wait until they become available.

Currently, we don't need to check whether a request is free because the
queue is sized according to the number of requests supported by the
ring. This means that the block layer handles the check and we will
always have space in the ring.

If we decide not to halve the number of requests enqueued by the block
layer, we would have to check ourselves whether 2 ring requests are
free. AFAICT, when BLK_MQ_RQ_QUEUE_BUSY is returned the block layer
will stop the queue. So we would need some logic in blkfront to know
when the 2 ring requests become free and restart the queue. I guess it
would be similar to gnttab_request_free_callback (rough sketch below
the signature).

I'd like your advice on whether this is worth implementing in blkfront,
given that it will only be used for 64KB guests with a backend that
doesn't support indirect grants.

Regards,

-- 
Julien Grall
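P.S. To make the last paragraph more concrete, here is the rough shape
of the restart logic I have in mind. It is untested and purely
illustrative: blkif_restart_queue_if_room() is a made-up helper, the
"2 free slots" threshold assumes the worst case of one extra ring
request per block request, and I'm assuming RING_FREE_REQUESTS() from
io/ring.h can be used directly for the check:

    /* In blkif_queue_rq(): bail out when the worst case (2 ring
     * requests) no longer fits, and let blk-mq stop the queue. */
    if (RING_FREE_REQUESTS(&info->ring) < 2) {
            blk_mq_stop_hw_queue(hctx);
            return BLK_MQ_RQ_QUEUE_BUSY;
    }

    /*
     * Made-up helper, called from the response path (e.g. at the end
     * of blkif_interrupt()) in the same spirit as
     * gnttab_request_free_callback(): restart the queue once enough
     * responses have been consumed.
     */
    static void blkif_restart_queue_if_room(struct blkfront_info *info)
    {
            if (RING_FREE_REQUESTS(&info->ring) >= 2)
                    blk_mq_start_stopped_hw_queues(info->rq, true);
    }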