From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39061) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aj3rt-0004iy-A6 for qemu-devel@nongnu.org; Thu, 24 Mar 2016 07:56:02 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aj3rq-0000Wi-1T for qemu-devel@nongnu.org; Thu, 24 Mar 2016 07:56:01 -0400 Received: from mx1.redhat.com ([209.132.183.28]:49538) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aj3rp-0000We-Px for qemu-devel@nongnu.org; Thu, 24 Mar 2016 07:55:57 -0400 References: <1458742562-30624-1-git-send-email-den@openvz.org> <1458742562-30624-3-git-send-email-den@openvz.org> <20160323175834.GC2467@grep.be> From: Paolo Bonzini Message-ID: <56F3D5C7.9070007@redhat.com> Date: Thu, 24 Mar 2016 12:55:51 +0100 MIME-Version: 1.0 In-Reply-To: <20160323175834.GC2467@grep.be> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [Nbd] [PATCH 2/2] NBD proto: add GET_LBA_STATUS extension List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Wouter Verhelst , "Denis V. Lunev" Cc: nbd-general@lists.sourceforge.net, Kevin Wolf , qemu-devel@nongnu.org, Stefan Hajnoczi On 23/03/2016 18:58, Wouter Verhelst wrote: >> +To provide such class of information, `GET_LBA_STATUS` extension adds= new >> +`NBD_CMD_GET_LBA_STATUS` command which returns a list of LBA ranges w= ith >> +their respective states. >> + >> +* `NBD_CMD_GET_LBA_STATUS` (7) >> + >> + An LBA range status query request. Length and offset define the r= ange >> + of interest. The server MUST reply with a reply header, followed >> + immediately by the following data: >=20 > As Eric noted, please expand LBA at least once. Let's just use "block" (e.g. NBD_CMD_GET_BLOCK_STATUS). >> + - 32 bits, length of parameter data that follow (unsigned) >> + - zero or more LBA status descriptors, each having the followin= g >> + structure: >> + >> + * 64 bits, offset (unsigned) >> + * 32 bits, length (unsigned) >> + * 16 bits, status (unsigned) >> + >> + unless an error condition has occurred. >> + Can we just return one descriptor? That would simplify the protocol a bi= t. >> + the provisioning state of the device. The following provisionnig = states >> + are defined for the command: >> + >> + - `NBD_STATE_ALLOCATED` (0x0), LBA extent is present on the blo= ck device; >> + - `NBD_STATE_ZEROED` (0x1), LBA extent is present on the block = device >> + and contains zeroes; >=20 > Presumably this should be "contains only zeroes"? Yes, or "reads as zeroes". > Also, this may end up being a fairly expensive call for the server to > process. Is it really useful? It's always okay for the server to omit NBD_STATE_ZERO, but it can be useful if the state is known for some reason. For example, file system holes are always zero, but unallocated blocks on a block device are not always zero. However, let's make these bits, so that NBD_STATE_ALLOCATED (0x1), LBA extent is present on the block device NBD_STATE_ZERO (0x2), LBA extent will read as zeroes and you can have NBD_STATE_ALLOCATED|NBD_STATE_ZERO. File systems do have the concept of unwritten extents which would be represented like that. The API to access the information (the FIEMAP ioctl) is broken, but perhaps in the future a non-broken API could be added---for example SEEK_ZERO and SEEK_NONZERO values for lseek's "whence" argument. >> + - `NBD_STATE_DEALLOCATED` (0x2), LBA extent is not present on t= he >> + block device. A client MUST NOT make any assumptions about th= e >> + contents of the extent. >> + >> + 2. Block dirtiness state >> + >> + Upon receiving an `NBD_CMD_GET_LBA_STATUS` command with command f= lags >> + field set to `NBD_FLAG_GET_DIRTY` (0x1), the server MUST return >> + the dirtiness status of the device. The following dirtiness state= s >> + are defined for the command: >> + >> + - `NBD_STATE_DIRTY` (0x0), LBA extent is dirty; >> + - `NBD_STATE_CLEAN` (0x1), LBA extent is clean. >> + >> + Generic NBD client implementation without knowledge of a particul= ar NBD >> + server operation MUST NOT make any assumption on the meaning of t= he >> + NBD_STATE_DIRTY or NBD_STATE_CLEAN states. >=20 > That makes it a useless call. A server can read /dev/random to decide > whether to send STATE_DIRTY or STATE_CLEAN, and still be compliant with > this spec. >=20 > Either the spec should define what it means for a block to be in a dirt= y > state, or it should not talk about it. Here is my attempt: This command is meant to operate in tandem with other (non-NBD) channels to the server. Generally, a "dirty" block is a block that has been written to by someone, but the exact meaning of "has been written" is left to the implementation. For example, a virtual machine monitor could provide a (non-NBD) command to start tracking blocks written by the virtual machine. A backup client then can connect to an NBD server provided by the virtual machine monitor and use NBD_CMD_GET_BLOCK_STATUS only read blocks that the virtual machine has changed. An implementation that doesn't track the "dirtiness" state of blocks MUST either fail this command with EINVAL, or mark all blocks as dirty in the descriptor that it returns. Paolo