On 19/08/17 12:46, Alexey Kardashevskiy wrote: > On 19/08/17 01:18, Eric Blake wrote: >> On 08/18/2017 08:31 AM, Stefan Hajnoczi wrote: >>> Most qcow2 files are uncompressed so it is wasteful to allocate (32 + 1) >>> * cluster_size + 512 bytes upfront. Allocate s->cluster_cache and >>> s->cluster_data when the first read operation is performance on a >>> compressed cluster. >>> >>> The buffers are freed in .bdrv_close(). .bdrv_open() no longer has any >>> code paths that can allocate these buffers, so remove the free functions >>> in the error code path. >>> >>> Reported-by: Alexey Kardashevskiy >>> Cc: Kevin Wolf >>> Signed-off-by: Stefan Hajnoczi >>> --- >>> Alexey: Does this improve your memory profiling results? >> >> Is this a regression from earlier versions? > > Hm, I have not thought about this. > > So. I did bisect and this started happening from > 9a4c0e220d8a4f82b5665d0ee95ef94d8e1509d5 > "hw/virtio-pci: fix virtio behaviour" > > Before that, the very same command line would take less than 1GB of > resident memory. That thing basically enforces virtio-1.0 for QEMU <=2.6 > which means that upstream with "-machine pseries-2.6" works fine (less than > 1GB), "-machine pseries-2.7" does not (close to 7GB, sometime even 9GB). > > Then I tried bisecting again, with > "scsi=off,disable-modern=off,disable-legacy=on" on my 150 virtio-block > devices, started from > e266d421490e0 "virtio-pci: add flags to enable/disable legacy/modern" (it > added the disable-modern switch) which uses 2GB of memory. > > I ended up with ada434cd0b44 "virtio-pci: implement cfg capability". > > Then I removed proxy->modern_as on v2.10.0-rc3 (see below) and got 1.5GB of > used memory (yay!) > > I do not really know how to reinterpret all of this, do you? Anyone, ping? Should I move the conversation to the original thread? Any hacks to try with libc? > > > Note: 1GB..9GB numbers from below are the peak values from valgrind's > massif. This is pretty much resident memory used by QEMU process. In my > testing I did not enable KVM and I did not start the guest (i.e. used -S). > 150 virtio-block devices, 2GB RAM for the guest. > > Also, while bisecting, I only paid attention if it is 1..2GB or 6..9GB - > all tests did fit these 2 ranges, for any given sha1 the amount of memory > would be stable but among "good" commits it could change between 1GB and 2GB. > > > > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c > index 5d14bd6..7ad447a 100644 > --- a/hw/virtio/virtio-pci.c > +++ b/hw/virtio/virtio-pci.c > @@ -1783,6 +1783,7 @@ static void virtio_pci_realize(PCIDevice *pci_dev, > Error **errp) > /* PCI BAR regions must be powers of 2 */ > pow2ceil(proxy->notify.offset + proxy->notify.size)); > > +#if 0 > memory_region_init_alias(&proxy->modern_cfg, > OBJECT(proxy), > "virtio-pci-cfg", > @@ -1791,7 +1792,7 @@ static void virtio_pci_realize(PCIDevice *pci_dev, > Error **errp) > memory_region_size(&proxy->modern_bar)); > > address_space_init(&proxy->modern_as, &proxy->modern_cfg, > "virtio-pci-cfg-as"); > - > +#endif > if (proxy->disable_legacy == ON_OFF_AUTO_AUTO) { > proxy->disable_legacy = pcie_port ? ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF; > } > @@ -1860,10 +1861,10 @@ static void virtio_pci_realize(PCIDevice *pci_dev, > Error **errp) > > static void virtio_pci_exit(PCIDevice *pci_dev) > { > - VirtIOPCIProxy *proxy = VIRTIO_PCI(pci_dev); > + //VirtIOPCIProxy *proxy = VIRTIO_PCI(pci_dev); > > msix_uninit_exclusive_bar(pci_dev); > - address_space_destroy(&proxy->modern_as); > + //address_space_destroy(&proxy->modern_as); > } > > static void virtio_pci_reset(DeviceState *qdev) > > > > > >> Likely, it is NOT -rc4 >> material, and thus can wait for 2.11; but it should be fine for -stable >> as part of 2.10.1 down the road. >> >>> +++ b/block/qcow2-cluster.c >>> @@ -1516,6 +1516,23 @@ int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset) >>> nb_csectors = ((cluster_offset >> s->csize_shift) & s->csize_mask) + 1; >>> sector_offset = coffset & 511; >>> csize = nb_csectors * 512 - sector_offset; >>> + >>> + /* Allocate buffers on first decompress operation, most images are >>> + * uncompressed and the memory overhead can be avoided. The buffers >>> + * are freed in .bdrv_close(). >>> + */ >>> + if (!s->cluster_data) { >>> + /* one more sector for decompressed data alignment */ >>> + s->cluster_data = qemu_try_blockalign(bs->file->bs, >>> + QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size + 512); >>> + if (!s->cluster_data) { >>> + return -EIO; >> >> Is -ENOMEM any better than -EIO here? >> > > -- Alexey