All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Liu, Yuan1" <yuan1.liu@intel.com>
To: Peter Xu <peterx@redhat.com>
Cc: "farosas@suse.de" <farosas@suse.de>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"hao.xiang@bytedance.com" <hao.xiang@bytedance.com>,
	 "bryan.zhang@bytedance.com" <bryan.zhang@bytedance.com>,
	"Zou, Nanhai" <nanhai.zou@intel.com>
Subject: RE: [PATCH v5 1/7] docs/migration: add qpl compression feature
Date: Wed, 27 Mar 2024 02:14:56 +0000	[thread overview]
Message-ID: <PH7PR11MB5941D52A928691D234F5C981A3342@PH7PR11MB5941.namprd11.prod.outlook.com> (raw)
In-Reply-To: <ZgMMsFWPClvF5Gm6@x1n>


> -----Original Message-----
> From: Peter Xu <peterx@redhat.com>
> Sent: Wednesday, March 27, 2024 1:58 AM
> To: Liu, Yuan1 <yuan1.liu@intel.com>
> Cc: farosas@suse.de; qemu-devel@nongnu.org; hao.xiang@bytedance.com;
> bryan.zhang@bytedance.com; Zou, Nanhai <nanhai.zou@intel.com>
> Subject: Re: [PATCH v5 1/7] docs/migration: add qpl compression feature
> 
> On Wed, Mar 20, 2024 at 12:45:21AM +0800, Yuan Liu wrote:
> > add Intel Query Processing Library (QPL) compression method
> > introduction
> >
> > Signed-off-by: Yuan Liu <yuan1.liu@intel.com>
> > Reviewed-by: Nanhai Zou <nanhai.zou@intel.com>
> > ---
> >  docs/devel/migration/features.rst        |   1 +
> >  docs/devel/migration/qpl-compression.rst | 231 +++++++++++++++++++++++
> >  2 files changed, 232 insertions(+)
> >  create mode 100644 docs/devel/migration/qpl-compression.rst
> >
> > diff --git a/docs/devel/migration/features.rst
> b/docs/devel/migration/features.rst
> > index d5ca7b86d5..bc98b65075 100644
> > --- a/docs/devel/migration/features.rst
> > +++ b/docs/devel/migration/features.rst
> > @@ -12,3 +12,4 @@ Migration has plenty of features to support different
> use cases.
> >     virtio
> >     mapped-ram
> >     CPR
> > +   qpl-compression
> > diff --git a/docs/devel/migration/qpl-compression.rst
> b/docs/devel/migration/qpl-compression.rst
> > new file mode 100644
> > index 0000000000..42c7969d30
> > --- /dev/null
> > +++ b/docs/devel/migration/qpl-compression.rst
> > @@ -0,0 +1,231 @@
> > +===============
> > +QPL Compression
> > +===============
> > +The Intel Query Processing Library (Intel ``QPL``) is an open-source
> library to
> > +provide compression and decompression features and it is based on
> deflate
> > +compression algorithm (RFC 1951).
> > +
> > +The ``QPL`` compression relies on Intel In-Memory Analytics
> Accelerator(``IAA``)
> > +and Shared Virtual Memory(``SVM``) technology, they are new features
> supported
> > +from Intel 4th Gen Intel Xeon Scalable processors, codenamed Sapphire
> Rapids
> > +processor(``SPR``).
> > +
> > +For more ``QPL`` introduction, please refer to:
> > +
> >
> +https://intel.github.io/qpl/documentation/introduction_docs/introduction.
> html
> 
> There're a bunch of links in this page, please consider switching all of
> them to use the link formats of .rST:
> 
>   Please refer to `QPL introduction page <https://...>`_.

Sure, thanks for the suggestion

> > +
> > +QPL Compression Framework
> > +=========================
> > +
> > +::
> > +
> > +  +----------------+       +------------------+
> > +  | MultiFD Service|       |accel-config tool |
> > +  +-------+--------+       +--------+---------+
> > +          |                         |
> > +          |                         |
> > +  +-------+--------+                | Setup IAA
> > +  |  QPL library   |                | Resources
> > +  +-------+---+----+                |
> > +          |   |                     |
> > +          |   +-------------+-------+
> > +          |   Open IAA      |
> > +          |   Devices +-----+-----+
> > +          |           |idxd driver|
> > +          |           +-----+-----+
> > +          |                 |
> > +          |                 |
> > +          |           +-----+-----+
> > +          +-----------+IAA Devices|
> > +      Submit jobs     +-----------+
> > +      via enqcmd
> > +
> > +
> > +Intel In-Memory Analytics Accelerator (Intel IAA) Introduction
> > +================================================================
> > +
> > +Intel ``IAA`` is an accelerator that has been designed to help benefit
> > +in-memory databases and analytic workloads. There are three main areas
> > +that Intel ``IAA`` can assist with analytics primitives (scan, filter,
> etc.),
> > +sparse data compression and memory tiering.
> > +
> > +``IAA`` Manual Documentation:
> > +
> > +https://www.intel.com/content/www/us/en/content-details/721858/intel-
> in-memory-analytics-accelerator-architecture-specification
> > +
> > +IAA Device Enabling
> > +-------------------
> > +
> > +- Enabling ``IAA`` devices for platform configuration, please refer to:
> > +
> > +https://www.intel.com/content/www/us/en/content-details/780887/intel-
> in-memory-analytics-accelerator-intel-iaa.html
> > +
> > +- ``IAA`` device driver is ``Intel Data Accelerator Driver (idxd)``, it
> is
> > +  recommended that the minimum version of Linux kernel is 5.18.
> > +
> > +- Add ``"intel_iommu=on,sm_on"`` parameter to kernel command line
> > +  for ``SVM`` feature enabling.
> > +
> > +Here is an easy way to verify ``IAA`` device driver and ``SVM``, refer
> to:
> > +
> > +https://github.com/intel/idxd-config/tree/stable/test
> > +
> > +IAA Device Management
> > +---------------------
> > +
> > +The number of ``IAA`` devices will vary depending on the Xeon product
> model.
> > +On a ``SPR`` server, there can be a maximum of 8 ``IAA`` devices, with
> up to
> > +4 devices per socket.
> > +
> > +By default, all ``IAA`` devices are disabled and need to be configured
> and
> > +enabled by users manually.
> > +
> > +Check the number of devices through the following command
> > +
> > +.. code-block:: shell
> > +
> > +  # lspci -d 8086:0cfe
> > +  # 6a:02.0 System peripheral: Intel Corporation Device 0cfe
> > +  # 6f:02.0 System peripheral: Intel Corporation Device 0cfe
> > +  # 74:02.0 System peripheral: Intel Corporation Device 0cfe
> > +  # 79:02.0 System peripheral: Intel Corporation Device 0cfe
> > +  # e7:02.0 System peripheral: Intel Corporation Device 0cfe
> > +  # ec:02.0 System peripheral: Intel Corporation Device 0cfe
> > +  # f1:02.0 System peripheral: Intel Corporation Device 0cfe
> > +  # f6:02.0 System peripheral: Intel Corporation Device 0cfe
> > +
> > +IAA Device Configuration
> > +------------------------
> > +
> > +The ``accel-config`` tool is used to enable ``IAA`` devices and
> configure
> > +``IAA`` hardware resources(work queues and engines). One ``IAA`` device
> > +has 8 work queues and 8 processing engines, multiple engines can be
> assigned
> > +to a work queue via ``group`` attribute.
> > +
> > +One example of configuring and enabling an ``IAA`` device.
> > +
> > +.. code-block:: shell
> > +
> > +  # accel-config config-engine iax1/engine1.0 -g 0
> > +  # accel-config config-engine iax1/engine1.1 -g 0
> > +  # accel-config config-engine iax1/engine1.2 -g 0
> > +  # accel-config config-engine iax1/engine1.3 -g 0
> > +  # accel-config config-engine iax1/engine1.4 -g 0
> > +  # accel-config config-engine iax1/engine1.5 -g 0
> > +  # accel-config config-engine iax1/engine1.6 -g 0
> > +  # accel-config config-engine iax1/engine1.7 -g 0
> > +  # accel-config config-wq iax1/wq1.0 -g 0 -s 128 -p 10 -b 1 -t 128 -m
> shared -y user -n app1 -d user
> > +  # accel-config enable-device iax1
> > +  # accel-config enable-wq iax1/wq1.0
> > +
> > +.. note::
> > +   IAX is an early name for IAA
> > +
> > +- The ``IAA`` device index is 1, use ``ls -lh
> /sys/bus/dsa/devices/iax*``
> > +  command to query the ``IAA`` device index.
> > +
> > +- 8 engines and 1 work queue are configured in group 0, so all
> compression jobs
> > +  submitted to this work queue can be processed by all engines at the
> same time.
> > +
> > +- Set work queue attributes including the work mode, work queue size
> and so on.
> > +
> > +- Enable the ``IAA1`` device and work queue 1.0
> > +
> > +.. note::
> > +  Set work queue mode to shared mode, since ``QPL`` library only
> supports
> > +  shared mode
> > +
> > +For more detailed configuration, please refer to:
> > +
> > +https://github.com/intel/idxd-config/tree/stable/Documentation/accfg
> > +
> > +IAA Resources Allocation For Migration
> > +--------------------------------------
> > +
> > +There is no ``IAA`` resource configuration parameters for migration and
> > +``accel-config`` tool configuration cannot directly specify the ``IAA``
> > +resources used for migration.
> > +
> > +``QPL`` will use all work queues that are enabled and set to shared
> mode,
> > +and use all engines assigned to the work queues with shared mode.
> > +
> > +By default, ``QPL`` will only use the local ``IAA`` device for
> compression
> > +job processing. The local ``IAA`` device means that the CPU of the job
> > +submission and the ``IAA`` device are on the same socket, so one CPU
> > +can submit the jobs to up to 4 ``IAA`` devices.
> > +
> > +Shared Virtual Memory(SVM) Introduction
> > +=======================================
> > +
> > +An ability for an accelerator I/O device to operate in the same virtual
> > +memory space of applications on host processors. It also implies the
> > +ability to operate from pageable memory, avoiding functional
> requirements
> > +to pin memory for DMA operations.
> > +
> > +When using ``SVM`` technology, users do not need to reserve memory for
> the
> > +``IAA`` device and perform pin memory operation. The ``IAA`` device can
> > +directly access data using the virtual address of the process.
> > +
> > +For more ``SVM`` technology, please refer to:
> > +
> > +https://docs.kernel.org/next/x86/sva.html
> > +
> > +
> > +How To Use QPL Compression In Migration
> > +=======================================
> > +
> > +1 - Installation of ``accel-config`` tool and ``QPL`` library
> 
> We can drop "1 " and stick with:
> 
>   - item1
>     - item1.1
>     ...
>   - item2
> 

Yes, it is better.

> > +
> > +  - Install ``accel-config`` tool from https://github.com/intel/idxd-
> config
> > +  - Install ``QPL`` library from https://github.com/intel/qpl
> > +
> > +2 - Configure and enable ``IAA`` devices and work queues via ``accel-
> config``
> > +
> > +3 - Build ``Qemu`` with ``--enable-qpl`` parameter
> > +
> > +  E.g. configure --target-list=x86_64-softmmu --enable-kvm ``--enable-
> qpl``
> > +
> > +4 - Start VMs with ``sudo`` command or ``root`` permission
> > +
> > +  Use the ``sudo`` command or ``root`` privilege to start the source
> and
> > +  destination virtual machines, since migration service needs
> permission
> > +  to access ``IAA`` hardware resources.
> > +
> > +5 - Enable ``QPL`` compression during migration
> > +
> > +  Set ``migrate_set_parameter multifd-compression qpl`` when migrating,
> the
> > +  ``QPL`` compression does not support configuring the compression
> level, it
> > +  only supports one compression level.
> > +
> > +The Difference Between QPL And ZLIB
> > +===================================
> > +
> > +Although both ``QPL`` and ``ZLIB`` are based on the deflate compression
> > +algorithm, and ``QPL`` can support the header and tail of ``ZLIB``,
> ``QPL``
> > +is still not fully compatible with the ``ZLIB`` compression in the
> migration.
> > +
> > +``QPL`` only supports 4K history buffer, and ``ZLIB`` is 32K by
> default. The
> > +``ZLIB`` compressed data that ``QPL`` may not decompress correctly and
> > +vice versa.
> > +
> > +``QPL`` does not support the ``Z_SYNC_FLUSH`` operation in ``ZLIB``
> streaming
> > +compression, current ``ZLIB`` implementation uses ``Z_SYNC_FLUSH``, so
> each
> > +``multifd`` thread has a ``ZLIB`` streaming context, and all page
> compression
> > +and decompression are based on this stream. ``QPL`` cannot decompress
> such data
> > +and vice versa.
> > +
> > +The introduction for ``Z_SYNC_FLUSH``, please refer to:
> > +
> > +https://www.zlib.net/manual.html
> > +
> > +The Best Practices
> > +==================
> > +
> > +When the virtual machine's pages are not populated and the ``IAA``
> device is
> > +used, I/O page faults occur, which can impact performance due to a
> large number
> > +of flush ``IOTLB`` operations.
> 
> AFAIU the IOTLB issue is not expected and can be fixed, while per our
> discussion in the other thread, the DMA fault latency cannot.
> 
> I think we can mention the possibility of IOTLB flush issue but that
> shouldn't be the major cause of such suggestion.  Again, I think it'll be
> great to mention how slow a DMA page fault can be, it can be compared in
> migration performance difference, or just describe an average DMA page
> fault latency, v.s. a processor fault.  That might explain why we suggest
> prefault the pages (comparing to a generic setup where the pages are
> always
> faulted by processors).

Get it, I will explain this issue in depth and add some comparative data, 
including I/O page fault Vs. process fault performance comparison and the
performance impact of I/O page fault on the entire live migration. Help 
developers and users better understand why -mem- prealloc needs to be 
added currently.

> > +
> > +Since the normal pages on the source side are all populated, ``IOTLB``
> caused
> > +by I/O page fault will not occur. On the destination side, a large
> number
> > +of normal pages need to be loaded, so it is recommended to add ``-mem-
> prealloc``
> > +parameter on the destination side.
> > --
> > 2.39.3
> >
> 
> --
> Peter Xu


  reply	other threads:[~2024-03-27  2:16 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-19 16:45 [PATCH v5 0/7] Live Migration With IAA Yuan Liu
2024-03-19 16:45 ` [PATCH v5 1/7] docs/migration: add qpl compression feature Yuan Liu
2024-03-26 17:58   ` Peter Xu
2024-03-27  2:14     ` Liu, Yuan1 [this message]
2024-03-19 16:45 ` [PATCH v5 2/7] migration/multifd: put IOV initialization into compression method Yuan Liu
2024-03-20 15:18   ` Fabiano Rosas
2024-03-20 15:32     ` Liu, Yuan1
2024-03-19 16:45 ` [PATCH v5 3/7] configure: add --enable-qpl build option Yuan Liu
2024-03-20  8:55   ` Thomas Huth
2024-03-20  8:56     ` Thomas Huth
2024-03-20 14:34       ` Liu, Yuan1
2024-03-20 10:31   ` Daniel P. Berrangé
2024-03-20 14:42     ` Liu, Yuan1
2024-03-19 16:45 ` [PATCH v5 4/7] migration/multifd: add qpl compression method Yuan Liu
2024-03-27 19:49   ` Peter Xu
2024-03-28  3:03     ` Liu, Yuan1
2024-03-19 16:45 ` [PATCH v5 5/7] migration/multifd: implement initialization of qpl compression Yuan Liu
2024-03-20 10:42   ` Daniel P. Berrangé
2024-03-20 15:02     ` Liu, Yuan1
2024-03-20 15:20       ` Daniel P. Berrangé
2024-03-20 16:04         ` Liu, Yuan1
2024-03-20 15:34       ` Peter Xu
2024-03-20 16:23         ` Liu, Yuan1
2024-03-20 20:31           ` Peter Xu
2024-03-21  1:37             ` Liu, Yuan1
2024-03-21 15:28               ` Peter Xu
2024-03-22  2:06                 ` Liu, Yuan1
2024-03-22 14:47                   ` Liu, Yuan1
2024-03-22 16:40                     ` Peter Xu
2024-03-27 19:25                       ` Peter Xu
2024-03-28  2:32                         ` Liu, Yuan1
2024-03-28 15:16                           ` Peter Xu
2024-03-29  2:04                             ` Liu, Yuan1
2024-03-19 16:45 ` [PATCH v5 6/7] migration/multifd: implement qpl compression and decompression Yuan Liu
2024-03-19 16:45 ` [PATCH v5 7/7] tests/migration-test: add qpl compression test Yuan Liu
2024-03-20 10:45   ` Daniel P. Berrangé
2024-03-20 15:30     ` Liu, Yuan1
2024-03-20 15:39       ` Daniel P. Berrangé
2024-03-20 16:26         ` Liu, Yuan1
2024-03-26 20:30 ` [PATCH v5 0/7] Live Migration With IAA Peter Xu
2024-03-27  3:20   ` Liu, Yuan1
2024-03-27 19:46     ` Peter Xu
2024-03-28  3:02       ` Liu, Yuan1
2024-03-28 15:22         ` Peter Xu
2024-03-29  3:33           ` Liu, Yuan1

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=PH7PR11MB5941D52A928691D234F5C981A3342@PH7PR11MB5941.namprd11.prod.outlook.com \
    --to=yuan1.liu@intel.com \
    --cc=bryan.zhang@bytedance.com \
    --cc=farosas@suse.de \
    --cc=hao.xiang@bytedance.com \
    --cc=nanhai.zou@intel.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.