From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:53210)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1cBjly-0005D3-KP for qemu-devel@nongnu.org;
	Tue, 29 Nov 2016 09:52:43 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1cBjlv-0003Od-Ei for qemu-devel@nongnu.org;
	Tue, 29 Nov 2016 09:52:42 -0500
Received: from mail.avalus.com ([89.16.176.221]:42806)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1cBjlv-0003OQ-5v for qemu-devel@nongnu.org;
	Tue, 29 Nov 2016 09:52:39 -0500
Content-Type: text/plain; charset=windows-1252
Mime-Version: 1.0 (Mac OS X Mail 9.3 \(3124\))
From: Alex Bligh
In-Reply-To:
Date: Tue, 29 Nov 2016 14:52:35 +0000
Content-Transfer-Encoding: quoted-printable
Message-Id: <11F2E6BC-D538-466B-9D80-541D146EF2A0@alex.org.uk>
References: <1480073296-6931-1-git-send-email-vsementsov@virtuozzo.com>
Subject: Re: [Qemu-devel] [PATCH v3] doc: Add NBD_CMD_BLOCK_STATUS extension
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Vladimir Sementsov-Ogievskiy
Cc: Alex Bligh , "nbd-general@lists.sourceforge.net" ,
	"qemu-devel@nongnu.org" , Kevin Wolf , Paolo Bonzini ,
	Pavel Borzenkov , "Stefan stefanha@redhat. com" ,
	"Denis V. Lunev" , Wouter Verhelst , Eric Blake ,
	mpa@pengutronix.de

Vladimir,

>>> 4. Q: Should not get_{allocated,dirty} be separate commands?
>>> cons: Two commands with almost same semantic and similar means?
>>> pros: However here is a good point of separating clearly defined and native
>>> for block devices GET_BLOCK_STATUS from user-driven and actually
>>> undefined data, called 'dirtyness'.
>> I'm suggesting one generic 'read bitmap' command like you.
>
> To support get_block_status in this general read_bitmap, we will need to define something like 'multibitmap', which allows several bits per chunk, as allocation data has two: zero and allocated.

I think you are saying that for an arbitrary 'bitmap' there might be more
than one state. For instance, one might (in an allocation 'bitmap') have
a hole, a non-hole-zero, or a non-hole-non-zero.

In the spec I'd suggest, for one 'bitmap', we represent the output as
extents. Each extent has a status. For the bitmap to be useful, at least
two statuses need to be possible, but the above would have three. This
could be internally implemented by the server as (a) a bitmap (with two
bits per entry), (b) two bitmaps (possibly with different granularity),
(c) something else (e.g. reading file extents, then, if the data is
allocated, manually comparing it against zero).

I should have put 'bitmap' in quotes in what I wrote because returning
extents (as you suggested) is a good idea, and there need not be an
actual bitmap.

>>> 5. Number of status descriptors, sent by server, should be restricted
>>> variants:
>>> 1: just allow server to restrict this as it wants (which was done in v3)
>>> 2: (not excluding 1). Client specifies somehow the maximum for number
>>> of descriptors.
>>> 2.1: add command flag, which will request only one descriptor
>>> (otherwise, no restrictions from the client)
>>> 2.2: again, introduce extended nbd requests, and add field to
>>> specify this maximum
>> I think some form of extended request is the way to go, but out of
>> interest, what's the issue with as many descriptors being sent as it
>> takes to encode the reply? The client can just consume the remainder
>> (without buffering) and reissue the request at a later point for
>> the areas it discarded.
>
> the issue is: too many descriptors possible. So, (1) solves it. (2) is optional, just to simplify/optimize client side.

I think I'd prefer the server to return what it was asked for, and the
client to deal with it.
So either the client should be able to specify a
maximum number of extents (and if we are extending the command
structure, that's possible) or we deal with the client consuming and
retrying unwanted extents. The reason for this is that it's unlikely the
server can know a priori the number of extents which is the appropriate
maximum for the client.

>>> + The list of block status descriptors within the
>>> + `NBD_REPLY_TYPE_BLOCK_STATUS` chunk represent consecutive portions
>>> + of the file starting from specified *offset*, and the sum of the
>>> + *length* fields of each descriptor MUST not be greater than the
>>> + overall *length* of the request. This means that the server MAY
>>> + return less data than required. However the server MUST return at
>>> + least one status descriptor
>> I'm not sure I understand why that's useful. What should the client
>> infer from the server refusing to provide information? We don't
>> permit short reads etc.
>
> if the bitmap is 010101010101 we will have too many descriptors. For example, 16tb disk, 64k granularity -> 2G of descriptors payload.

Yep. And the cost of consuming and retrying is quite high. One option
would be for the client to realise this is a possibility, and not
request the entire extent map for a 16TB disk, as it might be very
large! Even if the client worked at e.g. a 64MB level (where they'd get
a maximum of 1024 extents per reply), this isn't going to noticeably
increase the round trip timing. One issue here is that to determine a
'reasonable' size, the client needs to know the minimum length of any
extent.

I think the answer is probably a 'maximum number of extents' in the
request packet.

Of course, with statuses in extents, the final extent could be represented
as an 'I don't know, break this bit into a separate request' status.

-- 
Alex Bligh
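
To make the numbers in this thread concrete, here is a rough Python sketch of the extent encoding and of the worst-case arithmetic. It is an illustration only, not the proposed wire format: the status values HOLE/ZERO/DATA, the Extent type, the encode_extents and worst_case_extents helpers, the max_extents argument and the 8-byte descriptor size are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Hypothetical status values for the three states mentioned above.
HOLE, ZERO, DATA = 0, 1, 2

# Assumed descriptor size (32-bit length + 32-bit status); not taken from the spec.
DESCRIPTOR_BYTES = 8


@dataclass
class Extent:
    length: int  # bytes covered by this extent
    status: int  # one of HOLE, ZERO, DATA


def encode_extents(states: Sequence[int], granularity: int,
                   max_extents: int) -> Tuple[List[Extent], int]:
    """Run-length encode per-chunk states into at most max_extents extents.

    Returns the extents and the number of bytes they cover, so that a
    client can reissue the request for the remainder.
    """
    if max_extents < 1:
        raise ValueError("at least one extent must be returned")
    extents: List[Extent] = []
    for state in states:
        if extents and extents[-1].status == state:
            extents[-1].length += granularity   # extend the current run
        elif len(extents) < max_extents:
            extents.append(Extent(granularity, state))  # start a new run
        else:
            break                               # cap reached: short reply
    covered = sum(e.length for e in extents)
    return extents, covered


def worst_case_extents(request_length: int, granularity: int) -> int:
    """Extent count when every chunk differs from its neighbour (0101...)."""
    return -(-request_length // granularity)   # ceiling division


# 16 TiB disk at 64 KiB granularity: 2**28 extents, 2 GiB of descriptors.
whole_disk = worst_case_extents(16 * 2**40, 64 * 2**10)
print(whole_disk, whole_disk * DESCRIPTOR_BYTES)

# A 64 MiB request at the same granularity: at most 1024 extents.
print(worst_case_extents(64 * 2**20, 64 * 2**10))
```

With a cap in place, a client would simply advance its offset by the covered byte count and ask again, which is the 'consume and retry' behaviour discussed above without the cost of discarding a very large reply.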