All of lore.kernel.org
 help / color / mirror / Atom feed
From: Liu Yuan <namei.unix@gmail.com>
To: Teruaki Ishizaki <ishizaki.teruaki@lab.ntt.co.jp>
Cc: kwolf@redhat.com, mitake.hitoshi@lab.ntt.co.jp,
	sheepdog@lists.wpkg.org, qemu-devel@nongnu.org,
	stefanha@redhat.com
Subject: Re: [Qemu-devel] [PATCH v4] sheepdog: selectable object size support
Date: Tue, 10 Feb 2015 11:18:25 +0800	[thread overview]
Message-ID: <20150210031825.GC3859@ubuntu-trusty> (raw)
In-Reply-To: <20150210031051.GB3859@ubuntu-trusty>

On Tue, Feb 10, 2015 at 11:10:51AM +0800, Liu Yuan wrote:
> On Tue, Jan 27, 2015 at 05:35:27PM +0900, Teruaki Ishizaki wrote:
> > Previously, qemu block driver of sheepdog used hard-coded VDI object size.
> > This patch enables users to handle "block_size_shift" value for
> > calculating VDI object size.
> > 
> > When you start qemu, you don't need to specify additional command option.
> > 
> > But when you create the VDI which doesn't have default object size
> > with qemu-img command, you specify block_size_shift option.
> > 
> > If you want to create a VDI of 8MB(1 << 23) object size,
> > you need to specify following command option.
> > 
> >  # qemu-img create -o block_size_shift=23 sheepdog:test1 100M
> > 
> > In addition, when you don't specify qemu-img command option,
> > a default value of sheepdog cluster is used for creating VDI.
> > 
> >  # qemu-img create sheepdog:test2 100M
> > 
> > Signed-off-by: Teruaki Ishizaki <ishizaki.teruaki@lab.ntt.co.jp>
> > ---
> > V4:
> >  - Limit a read/write buffer size for creating a preallocated VDI.
> >  - Replace a parse function for the block_size_shift option.
> >  - Fix an error message.
> > 
> > V3:
> >  - Delete the needless operation of buffer.
> >  - Delete the needless operations of request header.
> >    for SD_OP_GET_CLUSTER_DEFAULT.
> >  - Fix coding style problems.
> > 
> > V2:
> >  - Fix coding style problem (white space).
> >  - Add members, store_policy and block_size_shift to struct SheepdogVdiReq.
> >  - Initialize request header to use block_size_shift specified by user.
> > ---
> >  block/sheepdog.c          |  138 ++++++++++++++++++++++++++++++++++++++-------
> >  include/block/block_int.h |    1 +
> >  2 files changed, 119 insertions(+), 20 deletions(-)
> > 
> > diff --git a/block/sheepdog.c b/block/sheepdog.c
> > index be3176f..a43b947 100644
> > --- a/block/sheepdog.c
> > +++ b/block/sheepdog.c
> > @@ -37,6 +37,7 @@
> >  #define SD_OP_READ_VDIS      0x15
> >  #define SD_OP_FLUSH_VDI      0x16
> >  #define SD_OP_DEL_VDI        0x17
> > +#define SD_OP_GET_CLUSTER_DEFAULT   0x18
> >  
> >  #define SD_FLAG_CMD_WRITE    0x01
> >  #define SD_FLAG_CMD_COW      0x02
> > @@ -167,7 +168,8 @@ typedef struct SheepdogVdiReq {
> >      uint32_t base_vdi_id;
> >      uint8_t copies;
> >      uint8_t copy_policy;
> > -    uint8_t reserved[2];
> > +    uint8_t store_policy;
> > +    uint8_t block_size_shift;
> >      uint32_t snapid;
> >      uint32_t type;
> >      uint32_t pad[2];
> > @@ -186,6 +188,21 @@ typedef struct SheepdogVdiRsp {
> >      uint32_t pad[5];
> >  } SheepdogVdiRsp;
> >  
> > +typedef struct SheepdogClusterRsp {
> > +    uint8_t proto_ver;
> > +    uint8_t opcode;
> > +    uint16_t flags;
> > +    uint32_t epoch;
> > +    uint32_t id;
> > +    uint32_t data_length;
> > +    uint32_t result;
> > +    uint8_t nr_copies;
> > +    uint8_t copy_policy;
> > +    uint8_t block_size_shift;
> > +    uint8_t __pad1;
> > +    uint32_t __pad2[6];
> > +} SheepdogClusterRsp;
> > +
> >  typedef struct SheepdogInode {
> >      char name[SD_MAX_VDI_LEN];
> >      char tag[SD_MAX_VDI_TAG_LEN];
> > @@ -1544,6 +1561,7 @@ static int do_sd_create(BDRVSheepdogState *s, uint32_t *vdi_id, int snapshot,
> >      hdr.vdi_size = s->inode.vdi_size;
> >      hdr.copy_policy = s->inode.copy_policy;
> >      hdr.copies = s->inode.nr_copies;
> > +    hdr.block_size_shift = s->inode.block_size_shift;
> >  
> >      ret = do_req(fd, s->aio_context, (SheepdogReq *)&hdr, buf, &wlen, &rlen);
> >  
> > @@ -1569,9 +1587,12 @@ static int do_sd_create(BDRVSheepdogState *s, uint32_t *vdi_id, int snapshot,
> >  static int sd_prealloc(const char *filename, Error **errp)
> >  {
> >      BlockDriverState *bs = NULL;
> > +    BDRVSheepdogState *base = NULL;
> > +    unsigned long buf_size;
> >      uint32_t idx, max_idx;
> > +    uint32_t object_size;
> >      int64_t vdi_size;
> > -    void *buf = g_malloc0(SD_DATA_OBJ_SIZE);
> > +    void *buf = NULL;
> >      int ret;
> >  
> >      ret = bdrv_open(&bs, filename, NULL, NULL, BDRV_O_RDWR | BDRV_O_PROTOCOL,
> > @@ -1585,18 +1606,24 @@ static int sd_prealloc(const char *filename, Error **errp)
> >          ret = vdi_size;
> >          goto out;
> >      }
> > -    max_idx = DIV_ROUND_UP(vdi_size, SD_DATA_OBJ_SIZE);
> > +
> > +    base = bs->opaque;
> > +    object_size = (UINT32_C(1) << base->inode.block_size_shift);
> > +    buf_size = MIN(object_size, SD_DATA_OBJ_SIZE);
> > +    buf = g_malloc0(buf_size);
> > +
> > +    max_idx = DIV_ROUND_UP(vdi_size, buf_size);
> >  
> >      for (idx = 0; idx < max_idx; idx++) {
> >          /*
> >           * The created image can be a cloned image, so we need to read
> >           * a data from the source image.
> >           */
> > -        ret = bdrv_pread(bs, idx * SD_DATA_OBJ_SIZE, buf, SD_DATA_OBJ_SIZE);
> > +        ret = bdrv_pread(bs, idx * buf_size, buf, buf_size);
> >          if (ret < 0) {
> >              goto out;
> >          }
> > -        ret = bdrv_pwrite(bs, idx * SD_DATA_OBJ_SIZE, buf, SD_DATA_OBJ_SIZE);
> > +        ret = bdrv_pwrite(bs, idx * buf_size, buf, buf_size);
> >          if (ret < 0) {
> >              goto out;
> >          }
> > @@ -1669,6 +1696,21 @@ static int parse_redundancy(BDRVSheepdogState *s, const char *opt)
> >      return 0;
> >  }
> >  
> > +static int parse_block_size_shift(BDRVSheepdogState *s, QemuOpts *opt)
> > +{
> > +    struct SheepdogInode *inode = &s->inode;
> > +    inode->block_size_shift =
> > +        (uint8_t)qemu_opt_get_number_del(opt, "block_size_shift", 0);
> > +    if (inode->block_size_shift == 0) {
> > +        /* block_size_shift is set for cluster default value by sheepdog */
> > +        return 0;
> > +    } else if (inode->block_size_shift < 20 || inode->block_size_shift > 31) {
> > +        return -EINVAL;
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> >  static int sd_create(const char *filename, QemuOpts *opts,
> >                       Error **errp)
> >  {
> > @@ -1679,6 +1721,7 @@ static int sd_create(const char *filename, QemuOpts *opts,
> >      BDRVSheepdogState *s;
> >      char tag[SD_MAX_VDI_TAG_LEN];
> >      uint32_t snapid;
> > +    uint64_t max_vdi_size;
> >      bool prealloc = false;
> >  
> >      s = g_new0(BDRVSheepdogState, 1);
> > @@ -1718,9 +1761,10 @@ static int sd_create(const char *filename, QemuOpts *opts,
> >          }
> >      }
> >  
> > -    if (s->inode.vdi_size > SD_MAX_VDI_SIZE) {
> > -        error_setg(errp, "too big image size");
> > -        ret = -EINVAL;
> > +    ret = parse_block_size_shift(s, opts);
> > +    if (ret < 0) {
> > +        error_setg(errp, "Invalid block_size_shift:"
> > +                         " '%s'. please use 20-31", buf);
> >          goto out;
> >      }
> >  
> > @@ -1757,6 +1801,47 @@ static int sd_create(const char *filename, QemuOpts *opts,
> >      }
> >  
> >      s->aio_context = qemu_get_aio_context();
> > +
> > +    /* if block_size_shift is not specified, get cluster default value */
> > +    if (s->inode.block_size_shift == 0) {
> 
> Seems that we don't keep backward compatibility for new QEMU and old sheep that
> doesn't support SD_OP_GET_CLUSTER_DEFAULT.
> 
> What will happen if old QEMU with new sheep that support block_size_shift? Most
> distributions will ship the old stable qemu that wouldn't be aware of
> block_size_shift.
> 
> I'd suggest we have 0 in block_size_shift represent 4MB to keep backward
> compatiblity.

How about make this additional field as the factor to multiply 4MB? that is,

(x + 1) * 4MB

qemu-img create -o object_size=8MB test 1G

will have x set as 1 and the formula will equal to 8MB.

-o object_size=4MB will set x as 0 and default value for x is 0 too to keep
backward compatiblity.

Thanks
Yuan

  reply	other threads:[~2015-02-10  3:18 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-01-27  8:35 [Qemu-devel] [PATCH v4] sheepdog: selectable object size support Teruaki Ishizaki
2015-02-02  6:52 ` Liu Yuan
2015-02-04  4:54   ` Teruaki Ishizaki
2015-02-06  2:18     ` Liu Yuan
2015-02-06  7:57       ` Teruaki Ishizaki
2015-02-09  3:08         ` Liu Yuan
2015-02-10  3:10 ` Liu Yuan
2015-02-10  3:18   ` Liu Yuan [this message]
2015-02-10  8:22   ` Teruaki Ishizaki
2015-02-10  8:58     ` Liu Yuan
2015-02-10  9:56       ` Teruaki Ishizaki
2015-02-10 10:35         ` Liu Yuan
2015-02-12  6:19           ` [Qemu-devel] [sheepdog] " Hitoshi Mitake
2015-02-12  7:00             ` Liu Yuan
2015-02-12  7:28               ` Hitoshi Mitake
2015-02-12  7:42                 ` Liu Yuan
2015-02-12  8:01                   ` Teruaki Ishizaki
2015-02-12  8:11                     ` Liu Yuan
2015-02-12  8:13                   ` Hitoshi Mitake
2015-02-12  8:16                     ` Liu Yuan
2015-02-10 11:12 ` [Qemu-devel] " Liu Yuan
2015-02-12  1:51   ` Teruaki Ishizaki
2015-02-12  2:19     ` Liu Yuan
2015-02-12  2:33       ` Teruaki Ishizaki
2015-02-12  2:55         ` Liu Yuan
2015-02-13  1:33           ` Teruaki Ishizaki
2015-02-13  2:01             ` Liu Yuan
2015-02-13  4:28               ` Teruaki Ishizaki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150210031825.GC3859@ubuntu-trusty \
    --to=namei.unix@gmail.com \
    --cc=ishizaki.teruaki@lab.ntt.co.jp \
    --cc=kwolf@redhat.com \
    --cc=mitake.hitoshi@lab.ntt.co.jp \
    --cc=qemu-devel@nongnu.org \
    --cc=sheepdog@lists.wpkg.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.