QEMU-Devel Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH 0/2] RFC: add -mem-shared option
@ 2019-11-28 14:15 Marc-André Lureau
  2019-11-28 14:15 ` [PATCH 1/2] memfd: add qemu_memfd_open() Marc-André Lureau
                   ` (6 more replies)
  0 siblings, 7 replies; 41+ messages in thread
From: Marc-André Lureau @ 2019-11-28 14:15 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, stefanha, Marc-André Lureau, Paolo Bonzini,
	Richard Henderson

Hi,

Setting up shared memory for vhost-user is a bit complicated from
command line, as it requires NUMA setup such as: m 4G -object
memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
node,memdev=mem.

Instead, I suggest to add a -mem-shared option for non-numa setups,
that will make the -mem-path or anonymouse memory shareable.

Comments welcome,

Marc-André Lureau (2):
  memfd: add qemu_memfd_open()
  Add -mem-shared option

 exec.c                  | 11 ++++++++++-
 hw/core/numa.c          | 16 +++++++++++++++-
 include/qemu/memfd.h    |  3 +++
 include/sysemu/sysemu.h |  1 +
 qemu-options.hx         | 10 ++++++++++
 util/memfd.c            | 39 +++++++++++++++++++++++++--------------
 vl.c                    |  4 ++++
 7 files changed, 68 insertions(+), 16 deletions(-)

-- 
2.24.0



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 1/2] memfd: add qemu_memfd_open()
  2019-11-28 14:15 [PATCH 0/2] RFC: add -mem-shared option Marc-André Lureau
@ 2019-11-28 14:15 ` Marc-André Lureau
  2019-11-28 14:15 ` [PATCH 2/2] Add -mem-shared option Marc-André Lureau
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 41+ messages in thread
From: Marc-André Lureau @ 2019-11-28 14:15 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, stefanha, Marc-André Lureau, Paolo Bonzini,
	Richard Henderson

Refactor qemu_memfd_alloc() to simply return the opened fd.

mmap() can be done later by the caller.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
 include/qemu/memfd.h |  3 +++
 util/memfd.c         | 39 +++++++++++++++++++++++++--------------
 2 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/include/qemu/memfd.h b/include/qemu/memfd.h
index 975b6bdb77..1642af9459 100644
--- a/include/qemu/memfd.h
+++ b/include/qemu/memfd.h
@@ -44,4 +44,7 @@ void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals,
 void qemu_memfd_free(void *ptr, size_t size, int fd);
 bool qemu_memfd_check(unsigned int flags);
 
+int qemu_memfd_open(const char *name, size_t size,
+                    unsigned int seals, Error **errp);
+
 #endif /* QEMU_MEMFD_H */
diff --git a/util/memfd.c b/util/memfd.c
index 4a3c07e0be..523b943b62 100644
--- a/util/memfd.c
+++ b/util/memfd.c
@@ -104,10 +104,9 @@ err:
  * memfd with sealing, but may fallback on other methods without
  * sealing.
  */
-void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals,
-                       int *fd, Error **errp)
+int qemu_memfd_open(const char *name, size_t size, unsigned int seals,
+                    Error **errp)
 {
-    void *ptr;
     int mfd = qemu_memfd_create(name, size, false, 0, seals, NULL);
 
     /* some systems have memfd without sealing */
@@ -124,26 +123,38 @@ void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals,
         unlink(fname);
         g_free(fname);
 
-        if (mfd == -1 ||
-            ftruncate(mfd, size) == -1) {
-            goto err;
+        if (mfd != -1 && ftruncate(mfd, size) == -1) {
+            close(mfd);
+            mfd = -1;
         }
     }
 
+    if (mfd == -1) {
+        error_setg_errno(errp, errno, "qemu_memfd_open() failed");
+    }
+
+    return mfd;
+}
+
+void *qemu_memfd_alloc(const char *name, size_t size, unsigned int seals,
+                       int *fd, Error **errp)
+{
+    int mfd = qemu_memfd_open(name, size, seals, errp);
+    void *ptr;
+
+    if (mfd == -1) {
+        return NULL;
+    }
+
     ptr = mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED, mfd, 0);
     if (ptr == MAP_FAILED) {
-        goto err;
+        error_setg_errno(errp, errno, "failed to allocate shared memory");
+        close(mfd);
+        return NULL;
     }
 
     *fd = mfd;
     return ptr;
-
-err:
-    error_setg_errno(errp, errno, "failed to allocate shared memory");
-    if (mfd >= 0) {
-        close(mfd);
-    }
-    return NULL;
 }
 
 void qemu_memfd_free(void *ptr, size_t size, int fd)
-- 
2.24.0



^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 2/2] Add -mem-shared option
  2019-11-28 14:15 [PATCH 0/2] RFC: add -mem-shared option Marc-André Lureau
  2019-11-28 14:15 ` [PATCH 1/2] memfd: add qemu_memfd_open() Marc-André Lureau
@ 2019-11-28 14:15 ` Marc-André Lureau
  2019-11-28 16:14   ` Eduardo Habkost
  2019-11-28 16:28   ` Igor Mammedov
  2019-11-28 16:10 ` [PATCH 0/2] RFC: add " Eduardo Habkost
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 41+ messages in thread
From: Marc-André Lureau @ 2019-11-28 14:15 UTC (permalink / raw)
  To: qemu-devel
  Cc: Eduardo Habkost, stefanha, Marc-André Lureau, Paolo Bonzini,
	Richard Henderson

Add an option to simplify shared memory / vhost-user setup.

Currently, using vhost-user requires NUMA setup such as:
-m 4G -object memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa node,memdev=mem

As there is no other way to allocate shareable RAM, afaik.

-mem-shared aims to have a simple way instead: -m 4G -mem-shared

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
 exec.c                  | 11 ++++++++++-
 hw/core/numa.c          | 16 +++++++++++++++-
 include/sysemu/sysemu.h |  1 +
 qemu-options.hx         | 10 ++++++++++
 vl.c                    |  4 ++++
 5 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/exec.c b/exec.c
index ffdb518535..4e53937eaf 100644
--- a/exec.c
+++ b/exec.c
@@ -72,6 +72,10 @@
 #include "qemu/mmap-alloc.h"
 #endif
 
+#ifdef CONFIG_POSIX
+#include "qemu/memfd.h"
+#endif
+
 #include "monitor/monitor.h"
 
 //#define DEBUG_SUBPAGE
@@ -2347,7 +2351,12 @@ RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
     bool created;
     RAMBlock *block;
 
-    fd = file_ram_open(mem_path, memory_region_name(mr), &created, errp);
+    if (mem_path) {
+        fd = file_ram_open(mem_path, memory_region_name(mr), &created, errp);
+    } else {
+        fd = qemu_memfd_open(mr->name, size,
+                             F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL, errp);
+    }
     if (fd < 0) {
         return NULL;
     }
diff --git a/hw/core/numa.c b/hw/core/numa.c
index e3332a984f..6f72cddb1c 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -493,7 +493,8 @@ static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
     if (mem_path) {
 #ifdef __linux__
         Error *err = NULL;
-        memory_region_init_ram_from_file(mr, owner, name, ram_size, 0, 0,
+        memory_region_init_ram_from_file(mr, owner, name, ram_size, 0,
+                                         mem_shared ? RAM_SHARED : 0,
                                          mem_path, &err);
         if (err) {
             error_report_err(err);
@@ -513,6 +514,19 @@ static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
 #else
         fprintf(stderr, "-mem-path not supported on this host\n");
         exit(1);
+#endif
+    } else if (mem_shared) {
+#ifdef CONFIG_POSIX
+        Error *err = NULL;
+        memory_region_init_ram_from_file(mr, owner, NULL, ram_size, 0,
+                                         RAM_SHARED, NULL, &err);
+        if (err) {
+            error_report_err(err);
+            exit(1);
+        }
+#else
+        fprintf(stderr, "-mem-shared not supported on this host\n");
+        exit(1);
 #endif
     } else {
         memory_region_init_ram_nomigrate(mr, owner, name, ram_size, &error_fatal);
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 80c57fdc4e..80db8465a9 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -55,6 +55,7 @@ extern bool enable_cpu_pm;
 extern QEMUClockType rtc_clock;
 extern const char *mem_path;
 extern int mem_prealloc;
+extern int mem_shared;
 
 #define MAX_OPTION_ROMS 16
 typedef struct QEMUOptionRom {
diff --git a/qemu-options.hx b/qemu-options.hx
index 65c9473b73..4c69b03ad3 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -394,6 +394,16 @@ STEXI
 Preallocate memory when using -mem-path.
 ETEXI
 
+DEF("mem-shared", 0, QEMU_OPTION_mem_shared,
+    "-mem-shared     allocate shared memory\n", QEMU_ARCH_ALL)
+STEXI
+@item -mem-shared
+@findex -mem-shared
+Allocate guest RAM with shared mapping.  Whether the allocation is
+anonymous or not (with -mem-path), QEMU will allocate a shared memory that
+can be shared by unrelated processes, such as vhost-user backends.
+ETEXI
+
 DEF("k", HAS_ARG, QEMU_OPTION_k,
     "-k language     use keyboard layout (for example 'fr' for French)\n",
     QEMU_ARCH_ALL)
diff --git a/vl.c b/vl.c
index 6a65a64bfd..53b1155455 100644
--- a/vl.c
+++ b/vl.c
@@ -143,6 +143,7 @@ const char* keyboard_layout = NULL;
 ram_addr_t ram_size;
 const char *mem_path = NULL;
 int mem_prealloc = 0; /* force preallocation of physical target memory */
+int mem_shared = 0;
 bool enable_mlock = false;
 bool enable_cpu_pm = false;
 int nb_nics;
@@ -3172,6 +3173,9 @@ int main(int argc, char **argv, char **envp)
             case QEMU_OPTION_mem_prealloc:
                 mem_prealloc = 1;
                 break;
+            case QEMU_OPTION_mem_shared:
+                mem_shared = 1;
+                break;
             case QEMU_OPTION_d:
                 log_mask = optarg;
                 break;
-- 
2.24.0



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] RFC: add -mem-shared option
  2019-11-28 14:15 [PATCH 0/2] RFC: add -mem-shared option Marc-André Lureau
  2019-11-28 14:15 ` [PATCH 1/2] memfd: add qemu_memfd_open() Marc-André Lureau
  2019-11-28 14:15 ` [PATCH 2/2] Add -mem-shared option Marc-André Lureau
@ 2019-11-28 16:10 ` " Eduardo Habkost
  2019-11-29  9:18   ` Igor Mammedov
  2019-11-29  9:31   ` Paolo Bonzini
  2019-11-28 16:59 ` Dr. David Alan Gilbert
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 41+ messages in thread
From: Eduardo Habkost @ 2019-11-28 16:10 UTC (permalink / raw)
  To: Marc-André Lureau
  Cc: Paolo Bonzini, stefanha, qemu-devel, Richard Henderson

On Thu, Nov 28, 2019 at 06:15:16PM +0400, Marc-André Lureau wrote:
> Hi,
> 
> Setting up shared memory for vhost-user is a bit complicated from
> command line, as it requires NUMA setup such as: m 4G -object
> memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> node,memdev=mem.
> 
> Instead, I suggest to add a -mem-shared option for non-numa setups,
> that will make the -mem-path or anonymouse memory shareable.

Can we make this be a "-m" option?

Or, even better: can we make "-m" options be automatically
translated to memory-backend-* options somehow?

-- 
Eduardo



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 2/2] Add -mem-shared option
  2019-11-28 14:15 ` [PATCH 2/2] Add -mem-shared option Marc-André Lureau
@ 2019-11-28 16:14   ` Eduardo Habkost
  2019-11-28 16:28   ` Igor Mammedov
  1 sibling, 0 replies; 41+ messages in thread
From: Eduardo Habkost @ 2019-11-28 16:14 UTC (permalink / raw)
  To: Marc-André Lureau
  Cc: qemu-devel, stefanha, Igor Mammedov, Paolo Bonzini, Richard Henderson

+Igor

On Thu, Nov 28, 2019 at 06:15:18PM +0400, Marc-André Lureau wrote:
> Add an option to simplify shared memory / vhost-user setup.
> 
> Currently, using vhost-user requires NUMA setup such as:
> -m 4G -object memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa node,memdev=mem
> 
> As there is no other way to allocate shareable RAM, afaik.
> 
> -mem-shared aims to have a simple way instead: -m 4G -mem-shared
> 
> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> ---
[...]
> diff --git a/hw/core/numa.c b/hw/core/numa.c
> index e3332a984f..6f72cddb1c 100644
> --- a/hw/core/numa.c
> +++ b/hw/core/numa.c
> @@ -493,7 +493,8 @@ static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
>      if (mem_path) {
>  #ifdef __linux__
>          Error *err = NULL;
> -        memory_region_init_ram_from_file(mr, owner, name, ram_size, 0, 0,
> +        memory_region_init_ram_from_file(mr, owner, name, ram_size, 0,
> +                                         mem_shared ? RAM_SHARED : 0,
>                                           mem_path, &err);
>          if (err) {
>              error_report_err(err);
> @@ -513,6 +514,19 @@ static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
>  #else
>          fprintf(stderr, "-mem-path not supported on this host\n");
>          exit(1);
> +#endif
> +    } else if (mem_shared) {
> +#ifdef CONFIG_POSIX
> +        Error *err = NULL;
> +        memory_region_init_ram_from_file(mr, owner, NULL, ram_size, 0,
> +                                         RAM_SHARED, NULL, &err);
> +        if (err) {
> +            error_report_err(err);
> +            exit(1);
> +        }
> +#else
> +        fprintf(stderr, "-mem-shared not supported on this host\n");
> +        exit(1);
>  #endif
>      } else {
>          memory_region_init_ram_nomigrate(mr, owner, name, ram_size, &error_fatal);

I'd really like make allocate_system_memory_nonnuma() just create
a memory backend object.  This way non-NUMA and NUMA
configuration would be able to use exactly the same set of
options.

I have the impression we tried to do this in the past.  Igor, do
you remember if we did?

-- 
Eduardo



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 2/2] Add -mem-shared option
  2019-11-28 14:15 ` [PATCH 2/2] Add -mem-shared option Marc-André Lureau
  2019-11-28 16:14   ` Eduardo Habkost
@ 2019-11-28 16:28   ` Igor Mammedov
  2019-11-28 20:31     ` Marc-André Lureau
  1 sibling, 1 reply; 41+ messages in thread
From: Igor Mammedov @ 2019-11-28 16:28 UTC (permalink / raw)
  To: Marc-André Lureau
  Cc: Paolo Bonzini, Richard Henderson, qemu-devel, stefanha, Eduardo Habkost

On Thu, 28 Nov 2019 18:15:18 +0400
Marc-André Lureau <marcandre.lureau@redhat.com> wrote:

> Add an option to simplify shared memory / vhost-user setup.
> 
> Currently, using vhost-user requires NUMA setup such as:
> -m 4G -object memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa node,memdev=mem
> 
> As there is no other way to allocate shareable RAM, afaik.
> 
> -mem-shared aims to have a simple way instead: -m 4G -mem-shared
User always can write a wrapper script if verbose CLI is too much,
and we won't have to deal with myriad permutations to maintain.

Also current -mem-path/prealloc in combination with memdevs is
the source of problems (as ram allocation uses 2 different paths).
It's possible to fix with a kludge but I'd rather fix it properly.
So during 5.0, I'm planning to consolidate -mem-path/prealloc
handling around memory backend internally (and possibly deprecate them),
so the only way to allocate RAM for guest would be via memdevs.
(reducing number of options an globals that they use)

So user who wants something non trivial could override default
non-numa behavior with
  -object memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on \
  -machine memdev=mem
or use any other backend that suits theirs needs.


> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> ---
>  exec.c                  | 11 ++++++++++-
>  hw/core/numa.c          | 16 +++++++++++++++-
>  include/sysemu/sysemu.h |  1 +
>  qemu-options.hx         | 10 ++++++++++
>  vl.c                    |  4 ++++
>  5 files changed, 40 insertions(+), 2 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index ffdb518535..4e53937eaf 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -72,6 +72,10 @@
>  #include "qemu/mmap-alloc.h"
>  #endif
>  
> +#ifdef CONFIG_POSIX
> +#include "qemu/memfd.h"
> +#endif
> +
>  #include "monitor/monitor.h"
>  
>  //#define DEBUG_SUBPAGE
> @@ -2347,7 +2351,12 @@ RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
>      bool created;
>      RAMBlock *block;
>  
> -    fd = file_ram_open(mem_path, memory_region_name(mr), &created, errp);
> +    if (mem_path) {
> +        fd = file_ram_open(mem_path, memory_region_name(mr), &created, errp);
> +    } else {
> +        fd = qemu_memfd_open(mr->name, size,
> +                             F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL, errp);
> +    }

that's what I'm mostly against, as it spills out memdev impl. details
into generic code.

>      if (fd < 0) {
>          return NULL;
>      }
> diff --git a/hw/core/numa.c b/hw/core/numa.c
> index e3332a984f..6f72cddb1c 100644
> --- a/hw/core/numa.c
> +++ b/hw/core/numa.c
> @@ -493,7 +493,8 @@ static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
>      if (mem_path) {
>  #ifdef __linux__
>          Error *err = NULL;
> -        memory_region_init_ram_from_file(mr, owner, name, ram_size, 0, 0,
> +        memory_region_init_ram_from_file(mr, owner, name, ram_size, 0,
> +                                         mem_shared ? RAM_SHARED : 0,
>                                           mem_path, &err);
this will be gone and replaced by memory region that memdev initializes.

>          if (err) {
>              error_report_err(err);
> @@ -513,6 +514,19 @@ static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
>  #else
>          fprintf(stderr, "-mem-path not supported on this host\n");
>          exit(1);
> +#endif
> +    } else if (mem_shared) {
> +#ifdef CONFIG_POSIX
> +        Error *err = NULL;
> +        memory_region_init_ram_from_file(mr, owner, NULL, ram_size, 0,
> +                                         RAM_SHARED, NULL, &err);
> +        if (err) {
> +            error_report_err(err);
> +            exit(1);
> +        }
> +#else
> +        fprintf(stderr, "-mem-shared not supported on this host\n");
> +        exit(1);
>  #endif
>      } else {
>          memory_region_init_ram_nomigrate(mr, owner, name, ram_size, &error_fatal);
> diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> index 80c57fdc4e..80db8465a9 100644
> --- a/include/sysemu/sysemu.h
> +++ b/include/sysemu/sysemu.h
> @@ -55,6 +55,7 @@ extern bool enable_cpu_pm;
>  extern QEMUClockType rtc_clock;
>  extern const char *mem_path;
>  extern int mem_prealloc;
> +extern int mem_shared;
>  
>  #define MAX_OPTION_ROMS 16
>  typedef struct QEMUOptionRom {
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 65c9473b73..4c69b03ad3 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -394,6 +394,16 @@ STEXI
>  Preallocate memory when using -mem-path.
>  ETEXI
>  
> +DEF("mem-shared", 0, QEMU_OPTION_mem_shared,
> +    "-mem-shared     allocate shared memory\n", QEMU_ARCH_ALL)
> +STEXI
> +@item -mem-shared
> +@findex -mem-shared
> +Allocate guest RAM with shared mapping.  Whether the allocation is
> +anonymous or not (with -mem-path), QEMU will allocate a shared memory that
> +can be shared by unrelated processes, such as vhost-user backends.
> +ETEXI
> +
>  DEF("k", HAS_ARG, QEMU_OPTION_k,
>      "-k language     use keyboard layout (for example 'fr' for French)\n",
>      QEMU_ARCH_ALL)
> diff --git a/vl.c b/vl.c
> index 6a65a64bfd..53b1155455 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -143,6 +143,7 @@ const char* keyboard_layout = NULL;
>  ram_addr_t ram_size;
>  const char *mem_path = NULL;
>  int mem_prealloc = 0; /* force preallocation of physical target memory */
> +int mem_shared = 0;
Also what happened to no more globals policy?

>  bool enable_mlock = false;
>  bool enable_cpu_pm = false;
>  int nb_nics;
> @@ -3172,6 +3173,9 @@ int main(int argc, char **argv, char **envp)
>              case QEMU_OPTION_mem_prealloc:
>                  mem_prealloc = 1;
>                  break;
> +            case QEMU_OPTION_mem_shared:
> +                mem_shared = 1;
> +                break;
>              case QEMU_OPTION_d:
>                  log_mask = optarg;
>                  break;



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] RFC: add -mem-shared option
  2019-11-28 14:15 [PATCH 0/2] RFC: add -mem-shared option Marc-André Lureau
                   ` (2 preceding siblings ...)
  2019-11-28 16:10 ` [PATCH 0/2] RFC: add " Eduardo Habkost
@ 2019-11-28 16:59 ` Dr. David Alan Gilbert
  2019-11-29  9:23   ` Igor Mammedov
  2019-11-29  4:37 ` no-reply
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 41+ messages in thread
From: Dr. David Alan Gilbert @ 2019-11-28 16:59 UTC (permalink / raw)
  To: Marc-André Lureau, imammedo
  Cc: Paolo Bonzini, Richard Henderson, qemu-devel, stefanha, Eduardo Habkost

* Marc-André Lureau (marcandre.lureau@redhat.com) wrote:
> Hi,
> 
> Setting up shared memory for vhost-user is a bit complicated from
> command line, as it requires NUMA setup such as: m 4G -object
> memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> node,memdev=mem.
> 
> Instead, I suggest to add a -mem-shared option for non-numa setups,
> that will make the -mem-path or anonymouse memory shareable.
> 
> Comments welcome,

It's worth checking with Igor (cc'd) - he said he was going to work on
something similar.

One other thing this fixes is that it lets you potentially do vhost-user
on s390, since it currently has no NUMA.

Dave

> Marc-André Lureau (2):
>   memfd: add qemu_memfd_open()
>   Add -mem-shared option
> 
>  exec.c                  | 11 ++++++++++-
>  hw/core/numa.c          | 16 +++++++++++++++-
>  include/qemu/memfd.h    |  3 +++
>  include/sysemu/sysemu.h |  1 +
>  qemu-options.hx         | 10 ++++++++++
>  util/memfd.c            | 39 +++++++++++++++++++++++++--------------
>  vl.c                    |  4 ++++
>  7 files changed, 68 insertions(+), 16 deletions(-)
> 
> -- 
> 2.24.0
> 
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 2/2] Add -mem-shared option
  2019-11-28 16:28   ` Igor Mammedov
@ 2019-11-28 20:31     ` Marc-André Lureau
  2019-11-29 10:07       ` Igor Mammedov
  0 siblings, 1 reply; 41+ messages in thread
From: Marc-André Lureau @ 2019-11-28 20:31 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Paolo Bonzini, Eduardo Habkost, QEMU, Stefan Hajnoczi, Richard Henderson

Hi

On Thu, Nov 28, 2019 at 9:25 PM Igor Mammedov <imammedo@redhat.com> wrote:
>
> On Thu, 28 Nov 2019 18:15:18 +0400
> Marc-André Lureau <marcandre.lureau@redhat.com> wrote:
>
> > Add an option to simplify shared memory / vhost-user setup.
> >
> > Currently, using vhost-user requires NUMA setup such as:
> > -m 4G -object memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa node,memdev=mem
> >
> > As there is no other way to allocate shareable RAM, afaik.
> >
> > -mem-shared aims to have a simple way instead: -m 4G -mem-shared
> User always can write a wrapper script if verbose CLI is too much,
> and we won't have to deal with myriad permutations to maintain.

Sure, but that's not exactly making it easier for the user,
documentation etc (or machine that do not support numa as David
mentionned).

>
> Also current -mem-path/prealloc in combination with memdevs is
> the source of problems (as ram allocation uses 2 different paths).
> It's possible to fix with a kludge but I'd rather fix it properly.

I agree, however I think it's a separate problems. We don't have to
fix both simultaneously. The semantic of a new CLI -mem-shared (or
shared=on etc) can be defined and implemented in a simple way, before
internal refactoring.

> So during 5.0, I'm planning to consolidate -mem-path/prealloc
> handling around memory backend internally (and possibly deprecate them),
> so the only way to allocate RAM for guest would be via memdevs.
> (reducing number of options an globals that they use)
>

That would be great indeed. I tried to look at that in the past, but
was a it overwhelmed by the amount of details and/or code complexity.

> So user who wants something non trivial could override default
> non-numa behavior with
>   -object memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on \
>   -machine memdev=mem
> or use any other backend that suits theirs needs.

That's nice, but not as friendly as a simple -mem-shared.

thanks

>
> > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> > ---
> >  exec.c                  | 11 ++++++++++-
> >  hw/core/numa.c          | 16 +++++++++++++++-
> >  include/sysemu/sysemu.h |  1 +
> >  qemu-options.hx         | 10 ++++++++++
> >  vl.c                    |  4 ++++
> >  5 files changed, 40 insertions(+), 2 deletions(-)
> >
> > diff --git a/exec.c b/exec.c
> > index ffdb518535..4e53937eaf 100644
> > --- a/exec.c
> > +++ b/exec.c
> > @@ -72,6 +72,10 @@
> >  #include "qemu/mmap-alloc.h"
> >  #endif
> >
> > +#ifdef CONFIG_POSIX
> > +#include "qemu/memfd.h"
> > +#endif
> > +
> >  #include "monitor/monitor.h"
> >
> >  //#define DEBUG_SUBPAGE
> > @@ -2347,7 +2351,12 @@ RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
> >      bool created;
> >      RAMBlock *block;
> >
> > -    fd = file_ram_open(mem_path, memory_region_name(mr), &created, errp);
> > +    if (mem_path) {
> > +        fd = file_ram_open(mem_path, memory_region_name(mr), &created, errp);
> > +    } else {
> > +        fd = qemu_memfd_open(mr->name, size,
> > +                             F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL, errp);
> > +    }
>
> that's what I'm mostly against, as it spills out memdev impl. details
> into generic code.
>
> >      if (fd < 0) {
> >          return NULL;
> >      }
> > diff --git a/hw/core/numa.c b/hw/core/numa.c
> > index e3332a984f..6f72cddb1c 100644
> > --- a/hw/core/numa.c
> > +++ b/hw/core/numa.c
> > @@ -493,7 +493,8 @@ static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
> >      if (mem_path) {
> >  #ifdef __linux__
> >          Error *err = NULL;
> > -        memory_region_init_ram_from_file(mr, owner, name, ram_size, 0, 0,
> > +        memory_region_init_ram_from_file(mr, owner, name, ram_size, 0,
> > +                                         mem_shared ? RAM_SHARED : 0,
> >                                           mem_path, &err);
> this will be gone and replaced by memory region that memdev initializes.
>
> >          if (err) {
> >              error_report_err(err);
> > @@ -513,6 +514,19 @@ static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
> >  #else
> >          fprintf(stderr, "-mem-path not supported on this host\n");
> >          exit(1);
> > +#endif
> > +    } else if (mem_shared) {
> > +#ifdef CONFIG_POSIX
> > +        Error *err = NULL;
> > +        memory_region_init_ram_from_file(mr, owner, NULL, ram_size, 0,
> > +                                         RAM_SHARED, NULL, &err);
> > +        if (err) {
> > +            error_report_err(err);
> > +            exit(1);
> > +        }
> > +#else
> > +        fprintf(stderr, "-mem-shared not supported on this host\n");
> > +        exit(1);
> >  #endif
> >      } else {
> >          memory_region_init_ram_nomigrate(mr, owner, name, ram_size, &error_fatal);
> > diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> > index 80c57fdc4e..80db8465a9 100644
> > --- a/include/sysemu/sysemu.h
> > +++ b/include/sysemu/sysemu.h
> > @@ -55,6 +55,7 @@ extern bool enable_cpu_pm;
> >  extern QEMUClockType rtc_clock;
> >  extern const char *mem_path;
> >  extern int mem_prealloc;
> > +extern int mem_shared;
> >
> >  #define MAX_OPTION_ROMS 16
> >  typedef struct QEMUOptionRom {
> > diff --git a/qemu-options.hx b/qemu-options.hx
> > index 65c9473b73..4c69b03ad3 100644
> > --- a/qemu-options.hx
> > +++ b/qemu-options.hx
> > @@ -394,6 +394,16 @@ STEXI
> >  Preallocate memory when using -mem-path.
> >  ETEXI
> >
> > +DEF("mem-shared", 0, QEMU_OPTION_mem_shared,
> > +    "-mem-shared     allocate shared memory\n", QEMU_ARCH_ALL)
> > +STEXI
> > +@item -mem-shared
> > +@findex -mem-shared
> > +Allocate guest RAM with shared mapping.  Whether the allocation is
> > +anonymous or not (with -mem-path), QEMU will allocate a shared memory that
> > +can be shared by unrelated processes, such as vhost-user backends.
> > +ETEXI
> > +
> >  DEF("k", HAS_ARG, QEMU_OPTION_k,
> >      "-k language     use keyboard layout (for example 'fr' for French)\n",
> >      QEMU_ARCH_ALL)
> > diff --git a/vl.c b/vl.c
> > index 6a65a64bfd..53b1155455 100644
> > --- a/vl.c
> > +++ b/vl.c
> > @@ -143,6 +143,7 @@ const char* keyboard_layout = NULL;
> >  ram_addr_t ram_size;
> >  const char *mem_path = NULL;
> >  int mem_prealloc = 0; /* force preallocation of physical target memory */
> > +int mem_shared = 0;
> Also what happened to no more globals policy?
>
> >  bool enable_mlock = false;
> >  bool enable_cpu_pm = false;
> >  int nb_nics;
> > @@ -3172,6 +3173,9 @@ int main(int argc, char **argv, char **envp)
> >              case QEMU_OPTION_mem_prealloc:
> >                  mem_prealloc = 1;
> >                  break;
> > +            case QEMU_OPTION_mem_shared:
> > +                mem_shared = 1;
> > +                break;
> >              case QEMU_OPTION_d:
> >                  log_mask = optarg;
> >                  break;
>
>


-- 
Marc-André Lureau


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] RFC: add -mem-shared option
  2019-11-28 14:15 [PATCH 0/2] RFC: add -mem-shared option Marc-André Lureau
                   ` (3 preceding siblings ...)
  2019-11-28 16:59 ` Dr. David Alan Gilbert
@ 2019-11-29  4:37 ` no-reply
  2019-11-29  5:34 ` no-reply
  2019-11-29  7:02 ` Gerd Hoffmann
  6 siblings, 0 replies; 41+ messages in thread
From: no-reply @ 2019-11-29  4:37 UTC (permalink / raw)
  To: marcandre.lureau
  Cc: ehabkost, qemu-devel, stefanha, pbonzini, marcandre.lureau, rth

Patchew URL: https://patchew.org/QEMU/20191128141518.628245-1-marcandre.lureau@redhat.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC      x86_64-softmmu/accel/tcg/tcg-runtime-gvec.o
  CC      x86_64-softmmu/accel/tcg/cpu-exec.o
/tmp/qemu-test/src/exec.c: In function 'qemu_ram_alloc_from_file':
/tmp/qemu-test/src/exec.c:2366:12: error: 'created' may be used uninitialized in this function [-Werror=maybe-uninitialized]
         if (created) {
            ^
cc1: all warnings being treated as errors
make[1]: *** [exec.o] Error 1
make[1]: *** Waiting for unfinished jobs....
  CC      aarch64-softmmu/disas.o
  GEN     aarch64-softmmu/gdbstub-xml.c
/tmp/qemu-test/src/exec.c: In function 'qemu_ram_alloc_from_file':
/tmp/qemu-test/src/exec.c:2366:12: error: 'created' may be used uninitialized in this function [-Werror=maybe-uninitialized]
         if (created) {
            ^
cc1: all warnings being treated as errors
make[1]: *** [exec.o] Error 1
make[1]: *** Waiting for unfinished jobs....
make: *** [x86_64-softmmu/all] Error 2
make: *** Waiting for unfinished jobs....
make: *** [aarch64-softmmu/all] Error 2
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 662, in <module>
    sys.exit(main())
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=ce348942cc144218ab665262b5e2bf34', '-u', '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-fts2mukp/src/docker-src.2019-11-28-23.35.08.15514:/var/tmp/qemu:z,ro', 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=ce348942cc144218ab665262b5e2bf34
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-fts2mukp/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    2m27.615s
user    0m8.146s


The full log is available at
http://patchew.org/logs/20191128141518.628245-1-marcandre.lureau@redhat.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] RFC: add -mem-shared option
  2019-11-28 14:15 [PATCH 0/2] RFC: add -mem-shared option Marc-André Lureau
                   ` (4 preceding siblings ...)
  2019-11-29  4:37 ` no-reply
@ 2019-11-29  5:34 ` no-reply
  2019-11-29  7:02 ` Gerd Hoffmann
  6 siblings, 0 replies; 41+ messages in thread
From: no-reply @ 2019-11-29  5:34 UTC (permalink / raw)
  To: marcandre.lureau
  Cc: ehabkost, qemu-devel, stefanha, pbonzini, marcandre.lureau, rth

Patchew URL: https://patchew.org/QEMU/20191128141518.628245-1-marcandre.lureau@redhat.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [PATCH 0/2] RFC: add -mem-shared option
Type: series
Message-id: 20191128141518.628245-1-marcandre.lureau@redhat.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
e2537da Add -mem-shared option
623d044 memfd: add qemu_memfd_open()

=== OUTPUT BEGIN ===
1/2 Checking commit 623d044023d1 (memfd: add qemu_memfd_open())
2/2 Checking commit e2537da34663 (Add -mem-shared option)
ERROR: do not initialise globals to 0 or NULL
#123: FILE: vl.c:146:
+int mem_shared = 0;

total: 1 errors, 0 warnings, 90 lines checked

Patch 2/2 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20191128141518.628245-1-marcandre.lureau@redhat.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] RFC: add -mem-shared option
  2019-11-28 14:15 [PATCH 0/2] RFC: add -mem-shared option Marc-André Lureau
                   ` (5 preceding siblings ...)
  2019-11-29  5:34 ` no-reply
@ 2019-11-29  7:02 ` Gerd Hoffmann
  2019-11-29  7:30   ` Marc-André Lureau
  6 siblings, 1 reply; 41+ messages in thread
From: Gerd Hoffmann @ 2019-11-29  7:02 UTC (permalink / raw)
  To: Marc-André Lureau
  Cc: Paolo Bonzini, Richard Henderson, qemu-devel, stefanha, Eduardo Habkost

On Thu, Nov 28, 2019 at 06:15:16PM +0400, Marc-André Lureau wrote:
> Hi,
> 
> Setting up shared memory for vhost-user is a bit complicated from
> command line, as it requires NUMA setup such as: m 4G -object
> memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> node,memdev=mem.
> 
> Instead, I suggest to add a -mem-shared option for non-numa setups,
> that will make the -mem-path or anonymouse memory shareable.

Is it an option to just use memfd by default (if available)?

cheers,
  Gerd



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] RFC: add -mem-shared option
  2019-11-29  7:02 ` Gerd Hoffmann
@ 2019-11-29  7:30   ` Marc-André Lureau
  2019-11-29  9:27     ` Daniel P. Berrangé
  0 siblings, 1 reply; 41+ messages in thread
From: Marc-André Lureau @ 2019-11-29  7:30 UTC (permalink / raw)
  To: Gerd Hoffmann
  Cc: Paolo Bonzini, Richard Henderson, qemu-devel, Stefan Hajnoczi,
	Eduardo Habkost

Hi

On Fri, Nov 29, 2019 at 11:03 AM Gerd Hoffmann <kraxel@redhat.com> wrote:
>
> On Thu, Nov 28, 2019 at 06:15:16PM +0400, Marc-André Lureau wrote:
> > Hi,
> >
> > Setting up shared memory for vhost-user is a bit complicated from
> > command line, as it requires NUMA setup such as: m 4G -object
> > memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> > node,memdev=mem.
> >
> > Instead, I suggest to add a -mem-shared option for non-numa setups,
> > that will make the -mem-path or anonymouse memory shareable.
>
> Is it an option to just use memfd by default (if available)?

Yes, with a fallback path currently using a temporary file under /tmp
(we may want to use shm_open() instead, or a different location such
as XDG_RUNTIME_DIR? - and use O_TMPFILE)



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] RFC: add -mem-shared option
  2019-11-28 16:10 ` [PATCH 0/2] RFC: add " Eduardo Habkost
@ 2019-11-29  9:18   ` Igor Mammedov
  2019-11-29  9:31   ` Paolo Bonzini
  1 sibling, 0 replies; 41+ messages in thread
From: Igor Mammedov @ 2019-11-29  9:18 UTC (permalink / raw)
  To: Eduardo Habkost
  Cc: Marc-André Lureau, Richard Henderson, qemu-devel, stefanha,
	Paolo Bonzini

On Thu, 28 Nov 2019 13:10:21 -0300
Eduardo Habkost <ehabkost@redhat.com> wrote:

> On Thu, Nov 28, 2019 at 06:15:16PM +0400, Marc-André Lureau wrote:
> > Hi,
> > 
> > Setting up shared memory for vhost-user is a bit complicated from
> > command line, as it requires NUMA setup such as: m 4G -object
> > memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> > node,memdev=mem.
> > 
> > Instead, I suggest to add a -mem-shared option for non-numa setups,
> > that will make the -mem-path or anonymouse memory shareable.  
> 
> Can we make this be a "-m" option?
> 
> Or, even better: can we make "-m" options be automatically
> translated to memory-backend-* options somehow?
-m SIZE will be translated to a one of memdev properties,
so new suboption potentially could be an aliased to
another memdev property.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] RFC: add -mem-shared option
  2019-11-28 16:59 ` Dr. David Alan Gilbert
@ 2019-11-29  9:23   ` Igor Mammedov
  0 siblings, 0 replies; 41+ messages in thread
From: Igor Mammedov @ 2019-11-29  9:23 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Eduardo Habkost, qemu-devel, stefanha, Paolo Bonzini,
	Marc-André Lureau, Richard Henderson

On Thu, 28 Nov 2019 16:59:33 +0000
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * Marc-André Lureau (marcandre.lureau@redhat.com) wrote:
> > Hi,
> > 
> > Setting up shared memory for vhost-user is a bit complicated from
> > command line, as it requires NUMA setup such as: m 4G -object
> > memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> > node,memdev=mem.
> > 
> > Instead, I suggest to add a -mem-shared option for non-numa setups,
> > that will make the -mem-path or anonymouse memory shareable.
> > 
> > Comments welcome,  
> 
> It's worth checking with Igor (cc'd) - he said he was going to work on
> something similar.
> 
> One other thing this fixes is that it lets you potentially do vhost-user
> on s390, since it currently has no NUMA.
Switching to memdev will let vhost-user on s390 work as well.
This is convenience option and workarounds inability to set main RAM
properties in current impl. 


> Dave
> 
> > Marc-André Lureau (2):
> >   memfd: add qemu_memfd_open()
> >   Add -mem-shared option
> > 
> >  exec.c                  | 11 ++++++++++-
> >  hw/core/numa.c          | 16 +++++++++++++++-
> >  include/qemu/memfd.h    |  3 +++
> >  include/sysemu/sysemu.h |  1 +
> >  qemu-options.hx         | 10 ++++++++++
> >  util/memfd.c            | 39 +++++++++++++++++++++++++--------------
> >  vl.c                    |  4 ++++
> >  7 files changed, 68 insertions(+), 16 deletions(-)
> > 
> > -- 
> > 2.24.0
> > 
> >   
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> 
> 



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] RFC: add -mem-shared option
  2019-11-29  7:30   ` Marc-André Lureau
@ 2019-11-29  9:27     ` Daniel P. Berrangé
  2019-11-29  9:31       ` Marc-André Lureau
  2019-11-29  9:33       ` Paolo Bonzini
  0 siblings, 2 replies; 41+ messages in thread
From: Daniel P. Berrangé @ 2019-11-29  9:27 UTC (permalink / raw)
  To: Marc-André Lureau
  Cc: Eduardo Habkost, qemu-devel, Gerd Hoffmann, Stefan Hajnoczi,
	Paolo Bonzini, Richard Henderson

On Fri, Nov 29, 2019 at 11:30:51AM +0400, Marc-André Lureau wrote:
> Hi
> 
> On Fri, Nov 29, 2019 at 11:03 AM Gerd Hoffmann <kraxel@redhat.com> wrote:
> >
> > On Thu, Nov 28, 2019 at 06:15:16PM +0400, Marc-André Lureau wrote:
> > > Hi,
> > >
> > > Setting up shared memory for vhost-user is a bit complicated from
> > > command line, as it requires NUMA setup such as: m 4G -object
> > > memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> > > node,memdev=mem.
> > >
> > > Instead, I suggest to add a -mem-shared option for non-numa setups,
> > > that will make the -mem-path or anonymouse memory shareable.
> >
> > Is it an option to just use memfd by default (if available)?
> 
> Yes, with a fallback path currently using a temporary file under /tmp
> (we may want to use shm_open() instead, or a different location such
> as XDG_RUNTIME_DIR? - and use O_TMPFILE)

We can't assume either /tmp or XDG_RUNTIME_DIR is on tmpfs as that is no
where near standard across all OS distros, and even if on tmpfs these
dirs can be size limited to a significant subset of available RAM. IOW
we can't safely use them unless explicitly told to.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] RFC: add -mem-shared option
  2019-11-29  9:27     ` Daniel P. Berrangé
@ 2019-11-29  9:31       ` Marc-André Lureau
  2019-11-29  9:42         ` Daniel P. Berrangé
  2019-11-29  9:33       ` Paolo Bonzini
  1 sibling, 1 reply; 41+ messages in thread
From: Marc-André Lureau @ 2019-11-29  9:31 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Eduardo Habkost, qemu-devel, Gerd Hoffmann, Stefan Hajnoczi,
	Paolo Bonzini, Richard Henderson

Hi

On Fri, Nov 29, 2019 at 1:27 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Fri, Nov 29, 2019 at 11:30:51AM +0400, Marc-André Lureau wrote:
> > Hi
> >
> > On Fri, Nov 29, 2019 at 11:03 AM Gerd Hoffmann <kraxel@redhat.com> wrote:
> > >
> > > On Thu, Nov 28, 2019 at 06:15:16PM +0400, Marc-André Lureau wrote:
> > > > Hi,
> > > >
> > > > Setting up shared memory for vhost-user is a bit complicated from
> > > > command line, as it requires NUMA setup such as: m 4G -object
> > > > memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> > > > node,memdev=mem.
> > > >
> > > > Instead, I suggest to add a -mem-shared option for non-numa setups,
> > > > that will make the -mem-path or anonymouse memory shareable.
> > >
> > > Is it an option to just use memfd by default (if available)?
> >
> > Yes, with a fallback path currently using a temporary file under /tmp
> > (we may want to use shm_open() instead, or a different location such
> > as XDG_RUNTIME_DIR? - and use O_TMPFILE)
>
> We can't assume either /tmp or XDG_RUNTIME_DIR is on tmpfs as that is no
> where near standard across all OS distros, and even if on tmpfs these
> dirs can be size limited to a significant subset of available RAM. IOW
> we can't safely use them unless explicitly told to.

Is shm_open() acceptable instead?

Imho, -mem-shared should do a best effort by default, but can fail.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] RFC: add -mem-shared option
  2019-11-28 16:10 ` [PATCH 0/2] RFC: add " Eduardo Habkost
  2019-11-29  9:18   ` Igor Mammedov
@ 2019-11-29  9:31   ` Paolo Bonzini
  2019-11-29 10:23     ` Igor Mammedov
  2019-11-29 20:21     ` Eduardo Habkost
  1 sibling, 2 replies; 41+ messages in thread
From: Paolo Bonzini @ 2019-11-29  9:31 UTC (permalink / raw)
  To: Eduardo Habkost, Marc-André Lureau
  Cc: stefanha, qemu-devel, Richard Henderson

On 28/11/19 17:10, Eduardo Habkost wrote:
> On Thu, Nov 28, 2019 at 06:15:16PM +0400, Marc-André Lureau wrote:
>> Hi,
>>
>> Setting up shared memory for vhost-user is a bit complicated from
>> command line, as it requires NUMA setup such as: m 4G -object
>> memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
>> node,memdev=mem.
>>
>> Instead, I suggest to add a -mem-shared option for non-numa setups,
>> that will make the -mem-path or anonymouse memory shareable.
> 
> Can we make this be a "-m" option?
> 
> Or, even better: can we make "-m" options be automatically
> translated to memory-backend-* options somehow?
> 

The original idea was to always support one NUMA node, so that you could
do "-numa node,memdev=..." to specify a memory backend with -object.
However, this is not possible anymore since

    if (!mc->cpu_index_to_instance_props ||
        !mc->get_default_cpu_node_id) {
        error_setg(errp, "NUMA is not supported by this machine-type");
        return;
    }

has been added to hw/core/numa.c.

Therefore, I think instead of -mem-shared we should add a "-m
memdev=..." option.  This option:

* would be mutually exclusive with both -mem-path

* would be handled from allocate_system_memory_nonnuma.

* could be mutually exclusive "-numa node", or could just be mutually
exclusive with "-numa node,memdev=..." (the logical conclusion of that
however would be an undeprecation of "-numa node,mem=...", so that has
to be taken into account as well).

Paolo



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] RFC: add -mem-shared option
  2019-11-29  9:27     ` Daniel P. Berrangé
  2019-11-29  9:31       ` Marc-André Lureau
@ 2019-11-29  9:33       ` Paolo Bonzini
  2019-11-29  9:39         ` Daniel P. Berrangé
  2019-11-29 10:13         ` Igor Mammedov
  1 sibling, 2 replies; 41+ messages in thread
From: Paolo Bonzini @ 2019-11-29  9:33 UTC (permalink / raw)
  To: Daniel P. Berrangé, Marc-André Lureau
  Cc: qemu-devel, Eduardo Habkost, Gerd Hoffmann, Stefan Hajnoczi,
	Richard Henderson

On 29/11/19 10:27, Daniel P. Berrangé wrote:
>> Yes, with a fallback path currently using a temporary file under /tmp
>> (we may want to use shm_open() instead, or a different location such
>> as XDG_RUNTIME_DIR? - and use O_TMPFILE)
> We can't assume either /tmp or XDG_RUNTIME_DIR is on tmpfs as that is no
> where near standard across all OS distros, and even if on tmpfs these
> dirs can be size limited to a significant subset of available RAM. IOW
> we can't safely use them unless explicitly told to.

Agreed, mkstemp+shm_open seems better.  Perhaps this could be done in
hostmem-memfd.c though, basically as a fallback option?  In principle
one could even use getmntent to search for a hugetlbfs mount.

Paolo



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] RFC: add -mem-shared option
  2019-11-29  9:33       ` Paolo Bonzini
@ 2019-11-29  9:39         ` Daniel P. Berrangé
  2019-11-29  9:52           ` Paolo Bonzini
  2019-11-29 10:13         ` Igor Mammedov
  1 sibling, 1 reply; 41+ messages in thread
From: Daniel P. Berrangé @ 2019-11-29  9:39 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Eduardo Habkost, qemu-devel, Gerd Hoffmann, Stefan Hajnoczi,
	Marc-André Lureau, Richard Henderson

On Fri, Nov 29, 2019 at 10:33:39AM +0100, Paolo Bonzini wrote:
> On 29/11/19 10:27, Daniel P. Berrangé wrote:
> >> Yes, with a fallback path currently using a temporary file under /tmp
> >> (we may want to use shm_open() instead, or a different location such
> >> as XDG_RUNTIME_DIR? - and use O_TMPFILE)
> > We can't assume either /tmp or XDG_RUNTIME_DIR is on tmpfs as that is no
> > where near standard across all OS distros, and even if on tmpfs these
> > dirs can be size limited to a significant subset of available RAM. IOW
> > we can't safely use them unless explicitly told to.
> 
> Agreed, mkstemp+shm_open seems better.  Perhaps this could be done in
> hostmem-memfd.c though, basically as a fallback option?  In principle
> one could even use getmntent to search for a hugetlbfs mount.

With mkstemp you still need to pick a location, and I don't think it
is clear there is a reliable choice that will always work.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] RFC: add -mem-shared option
  2019-11-29  9:31       ` Marc-André Lureau
@ 2019-11-29  9:42         ` Daniel P. Berrangé
  2019-11-29  9:45           ` Marc-André Lureau
  0 siblings, 1 reply; 41+ messages in thread
From: Daniel P. Berrangé @ 2019-11-29  9:42 UTC (permalink / raw)
  To: Marc-André Lureau
  Cc: Eduardo Habkost, qemu-devel, Gerd Hoffmann, Stefan Hajnoczi,
	Paolo Bonzini, Richard Henderson

On Fri, Nov 29, 2019 at 01:31:11PM +0400, Marc-André Lureau wrote:
> Hi
> 
> On Fri, Nov 29, 2019 at 1:27 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
> >
> > On Fri, Nov 29, 2019 at 11:30:51AM +0400, Marc-André Lureau wrote:
> > > Hi
> > >
> > > On Fri, Nov 29, 2019 at 11:03 AM Gerd Hoffmann <kraxel@redhat.com> wrote:
> > > >
> > > > On Thu, Nov 28, 2019 at 06:15:16PM +0400, Marc-André Lureau wrote:
> > > > > Hi,
> > > > >
> > > > > Setting up shared memory for vhost-user is a bit complicated from
> > > > > command line, as it requires NUMA setup such as: m 4G -object
> > > > > memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> > > > > node,memdev=mem.
> > > > >
> > > > > Instead, I suggest to add a -mem-shared option for non-numa setups,
> > > > > that will make the -mem-path or anonymouse memory shareable.
> > > >
> > > > Is it an option to just use memfd by default (if available)?
> > >
> > > Yes, with a fallback path currently using a temporary file under /tmp
> > > (we may want to use shm_open() instead, or a different location such
> > > as XDG_RUNTIME_DIR? - and use O_TMPFILE)
> >
> > We can't assume either /tmp or XDG_RUNTIME_DIR is on tmpfs as that is no
> > where near standard across all OS distros, and even if on tmpfs these
> > dirs can be size limited to a significant subset of available RAM. IOW
> > we can't safely use them unless explicitly told to.
> 
> Is shm_open() acceptable instead?
> 
> Imho, -mem-shared should do a best effort by default, but can fail.

Possibly, but if we're relying on shm_open choosing the path, then
its harder for users to know what files to clean up when QEMU crashes
or otherwise exits wthout a shm_unlink

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] RFC: add -mem-shared option
  2019-11-29  9:42         ` Daniel P. Berrangé
@ 2019-11-29  9:45           ` Marc-André Lureau
  2019-11-29 11:44             ` Gerd Hoffmann
  0 siblings, 1 reply; 41+ messages in thread
From: Marc-André Lureau @ 2019-11-29  9:45 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Eduardo Habkost, qemu-devel, Gerd Hoffmann, Stefan Hajnoczi,
	Paolo Bonzini, Richard Henderson

On Fri, Nov 29, 2019 at 1:42 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Fri, Nov 29, 2019 at 01:31:11PM +0400, Marc-André Lureau wrote:
> > Hi
> >
> > On Fri, Nov 29, 2019 at 1:27 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >
> > > On Fri, Nov 29, 2019 at 11:30:51AM +0400, Marc-André Lureau wrote:
> > > > Hi
> > > >
> > > > On Fri, Nov 29, 2019 at 11:03 AM Gerd Hoffmann <kraxel@redhat.com> wrote:
> > > > >
> > > > > On Thu, Nov 28, 2019 at 06:15:16PM +0400, Marc-André Lureau wrote:
> > > > > > Hi,
> > > > > >
> > > > > > Setting up shared memory for vhost-user is a bit complicated from
> > > > > > command line, as it requires NUMA setup such as: m 4G -object
> > > > > > memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> > > > > > node,memdev=mem.
> > > > > >
> > > > > > Instead, I suggest to add a -mem-shared option for non-numa setups,
> > > > > > that will make the -mem-path or anonymouse memory shareable.
> > > > >
> > > > > Is it an option to just use memfd by default (if available)?
> > > >
> > > > Yes, with a fallback path currently using a temporary file under /tmp
> > > > (we may want to use shm_open() instead, or a different location such
> > > > as XDG_RUNTIME_DIR? - and use O_TMPFILE)
> > >
> > > We can't assume either /tmp or XDG_RUNTIME_DIR is on tmpfs as that is no
> > > where near standard across all OS distros, and even if on tmpfs these
> > > dirs can be size limited to a significant subset of available RAM. IOW
> > > we can't safely use them unless explicitly told to.
> >
> > Is shm_open() acceptable instead?
> >
> > Imho, -mem-shared should do a best effort by default, but can fail.
>
> Possibly, but if we're relying on shm_open choosing the path, then
> its harder for users to know what files to clean up when QEMU crashes
> or otherwise exits wthout a shm_unlink

True, although I think you can call shm_unlink() right after
shm_open() that should limit the risk of those forgotten files.



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] RFC: add -mem-shared option
  2019-11-29  9:39         ` Daniel P. Berrangé
@ 2019-11-29  9:52           ` Paolo Bonzini
  0 siblings, 0 replies; 41+ messages in thread
From: Paolo Bonzini @ 2019-11-29  9:52 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Eduardo Habkost, qemu-devel, Gerd Hoffmann, Stefan Hajnoczi,
	Marc-André Lureau, Richard Henderson

On 29/11/19 10:39, Daniel P. Berrangé wrote:
> On Fri, Nov 29, 2019 at 10:33:39AM +0100, Paolo Bonzini wrote:
>> On 29/11/19 10:27, Daniel P. Berrangé wrote:
>>>> Yes, with a fallback path currently using a temporary file under /tmp
>>>> (we may want to use shm_open() instead, or a different location such
>>>> as XDG_RUNTIME_DIR? - and use O_TMPFILE)
>>> We can't assume either /tmp or XDG_RUNTIME_DIR is on tmpfs as that is no
>>> where near standard across all OS distros, and even if on tmpfs these
>>> dirs can be size limited to a significant subset of available RAM. IOW
>>> we can't safely use them unless explicitly told to.
>>
>> Agreed, mkstemp+shm_open seems better.  Perhaps this could be done in
>> hostmem-memfd.c though, basically as a fallback option?  In principle
>> one could even use getmntent to search for a hugetlbfs mount.
> 
> With mkstemp you still need to pick a location, and I don't think it
> is clear there is a reliable choice that will always work.

Sorry, I meant mktemp (which is almost never the right choice so my
brain fat-fingered it...) + shm_open(O_CREAT|O_EXCL).

Paolo



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 2/2] Add -mem-shared option
  2019-11-28 20:31     ` Marc-André Lureau
@ 2019-11-29 10:07       ` Igor Mammedov
  2019-11-29 10:11         ` Paolo Bonzini
  0 siblings, 1 reply; 41+ messages in thread
From: Igor Mammedov @ 2019-11-29 10:07 UTC (permalink / raw)
  To: Marc-André Lureau
  Cc: Paolo Bonzini, Eduardo Habkost, QEMU, Stefan Hajnoczi, Richard Henderson

On Fri, 29 Nov 2019 00:31:32 +0400
Marc-André Lureau <marcandre.lureau@gmail.com> wrote:

> Hi
> 
> On Thu, Nov 28, 2019 at 9:25 PM Igor Mammedov <imammedo@redhat.com> wrote:
> >
> > On Thu, 28 Nov 2019 18:15:18 +0400
> > Marc-André Lureau <marcandre.lureau@redhat.com> wrote:
> >  
> > > Add an option to simplify shared memory / vhost-user setup.
> > >
> > > Currently, using vhost-user requires NUMA setup such as:
> > > -m 4G -object memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa node,memdev=mem
> > >
> > > As there is no other way to allocate shareable RAM, afaik.
> > >
> > > -mem-shared aims to have a simple way instead: -m 4G -mem-shared  
> > User always can write a wrapper script if verbose CLI is too much,
> > and we won't have to deal with myriad permutations to maintain.  
> 
> Sure, but that's not exactly making it easier for the user,
> documentation etc (or machine that do not support numa as David
> mentionned).
on positive side it makes behavior consistent (and need to be documented
in one place (memdev)), i.e. user knows that he will get what he specified 
on CLI.

With global options like you propose, good luck with figuring out
(and documenting it in user documentation will be nightmare)
if your config actually does what it's supposed to do or not.
That's unfortunate the state we are in right now with
mem-path/prealloc.

> > Also current -mem-path/prealloc in combination with memdevs is
> > the source of problems (as ram allocation uses 2 different paths).
> > It's possible to fix with a kludge but I'd rather fix it properly.  
> 
> I agree, however I think it's a separate problems. We don't have to
> fix both simultaneously. The semantic of a new CLI -mem-shared (or
> shared=on etc) can be defined and implemented in a simple way, before
> internal refactoring.
Well, it adds more invariants to already complex RAM allocation
I have to untangle. So lets look into it after memdev re-factoring.
I'll plan to post it right after merge window is open.

> > So during 5.0, I'm planning to consolidate -mem-path/prealloc
> > handling around memory backend internally (and possibly deprecate them),
> > so the only way to allocate RAM for guest would be via memdevs.
> > (reducing number of options an globals that they use)
> >  
> 
> That would be great indeed. I tried to look at that in the past, but
> was a it overwhelmed by the amount of details and/or code complexity.
Complexity is only one side of issue, I'm mainly refactoring it because
different approaches to allocate RAM lead to subtle bugs that's hard to
spot and fix. That's why I'm against adding more invariants to the pile.

> > So user who wants something non trivial could override default
> > non-numa behavior with
> >   -object memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on \
> >   -machine memdev=mem
> > or use any other backend that suits theirs needs.  
> 
> That's nice, but not as friendly as a simple -mem-shared.
(I still do not like idea of convenience options but it won't
get onto the way much if implemented as "global property" to memdev,
so I won't object if there is real demand for it)

Simplicity in global options is deceiving once you have mixed
RAM (main and as -device), good luck with documenting how it
works in all possible configs and making it robust (so it
won't explode later) (I wouldn't take on such task).
And then poor user will have to figure it out as well.

Verbose way makes allocating backing storage consistent,
which is much more important (including end user).
The rest could be scripted or one could use higher level
mgmt tools if simplicity is desired.


> thanks
> 
> >  
> > > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> > > ---
> > >  exec.c                  | 11 ++++++++++-
> > >  hw/core/numa.c          | 16 +++++++++++++++-
> > >  include/sysemu/sysemu.h |  1 +
> > >  qemu-options.hx         | 10 ++++++++++
> > >  vl.c                    |  4 ++++
> > >  5 files changed, 40 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/exec.c b/exec.c
> > > index ffdb518535..4e53937eaf 100644
> > > --- a/exec.c
> > > +++ b/exec.c
> > > @@ -72,6 +72,10 @@
> > >  #include "qemu/mmap-alloc.h"
> > >  #endif
> > >
> > > +#ifdef CONFIG_POSIX
> > > +#include "qemu/memfd.h"
> > > +#endif
> > > +
> > >  #include "monitor/monitor.h"
> > >
> > >  //#define DEBUG_SUBPAGE
> > > @@ -2347,7 +2351,12 @@ RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr,
> > >      bool created;
> > >      RAMBlock *block;
> > >
> > > -    fd = file_ram_open(mem_path, memory_region_name(mr), &created, errp);
> > > +    if (mem_path) {
> > > +        fd = file_ram_open(mem_path, memory_region_name(mr), &created, errp);
> > > +    } else {
> > > +        fd = qemu_memfd_open(mr->name, size,
> > > +                             F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL, errp);
> > > +    }  
> >
> > that's what I'm mostly against, as it spills out memdev impl. details
> > into generic code.
> >  
> > >      if (fd < 0) {
> > >          return NULL;
> > >      }
> > > diff --git a/hw/core/numa.c b/hw/core/numa.c
> > > index e3332a984f..6f72cddb1c 100644
> > > --- a/hw/core/numa.c
> > > +++ b/hw/core/numa.c
> > > @@ -493,7 +493,8 @@ static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
> > >      if (mem_path) {
> > >  #ifdef __linux__
> > >          Error *err = NULL;
> > > -        memory_region_init_ram_from_file(mr, owner, name, ram_size, 0, 0,
> > > +        memory_region_init_ram_from_file(mr, owner, name, ram_size, 0,
> > > +                                         mem_shared ? RAM_SHARED : 0,
> > >                                           mem_path, &err);  
> > this will be gone and replaced by memory region that memdev initializes.
> >  
> > >          if (err) {
> > >              error_report_err(err);
> > > @@ -513,6 +514,19 @@ static void allocate_system_memory_nonnuma(MemoryRegion *mr, Object *owner,
> > >  #else
> > >          fprintf(stderr, "-mem-path not supported on this host\n");
> > >          exit(1);
> > > +#endif
> > > +    } else if (mem_shared) {
> > > +#ifdef CONFIG_POSIX
> > > +        Error *err = NULL;
> > > +        memory_region_init_ram_from_file(mr, owner, NULL, ram_size, 0,
> > > +                                         RAM_SHARED, NULL, &err);
> > > +        if (err) {
> > > +            error_report_err(err);
> > > +            exit(1);
> > > +        }
> > > +#else
> > > +        fprintf(stderr, "-mem-shared not supported on this host\n");
> > > +        exit(1);
> > >  #endif
> > >      } else {
> > >          memory_region_init_ram_nomigrate(mr, owner, name, ram_size, &error_fatal);
> > > diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
> > > index 80c57fdc4e..80db8465a9 100644
> > > --- a/include/sysemu/sysemu.h
> > > +++ b/include/sysemu/sysemu.h
> > > @@ -55,6 +55,7 @@ extern bool enable_cpu_pm;
> > >  extern QEMUClockType rtc_clock;
> > >  extern const char *mem_path;
> > >  extern int mem_prealloc;
> > > +extern int mem_shared;
> > >
> > >  #define MAX_OPTION_ROMS 16
> > >  typedef struct QEMUOptionRom {
> > > diff --git a/qemu-options.hx b/qemu-options.hx
> > > index 65c9473b73..4c69b03ad3 100644
> > > --- a/qemu-options.hx
> > > +++ b/qemu-options.hx
> > > @@ -394,6 +394,16 @@ STEXI
> > >  Preallocate memory when using -mem-path.
> > >  ETEXI
> > >
> > > +DEF("mem-shared", 0, QEMU_OPTION_mem_shared,
> > > +    "-mem-shared     allocate shared memory\n", QEMU_ARCH_ALL)
> > > +STEXI
> > > +@item -mem-shared
> > > +@findex -mem-shared
> > > +Allocate guest RAM with shared mapping.  Whether the allocation is
> > > +anonymous or not (with -mem-path), QEMU will allocate a shared memory that
> > > +can be shared by unrelated processes, such as vhost-user backends.
> > > +ETEXI
> > > +
> > >  DEF("k", HAS_ARG, QEMU_OPTION_k,
> > >      "-k language     use keyboard layout (for example 'fr' for French)\n",
> > >      QEMU_ARCH_ALL)
> > > diff --git a/vl.c b/vl.c
> > > index 6a65a64bfd..53b1155455 100644
> > > --- a/vl.c
> > > +++ b/vl.c
> > > @@ -143,6 +143,7 @@ const char* keyboard_layout = NULL;
> > >  ram_addr_t ram_size;
> > >  const char *mem_path = NULL;
> > >  int mem_prealloc = 0; /* force preallocation of physical target memory */
> > > +int mem_shared = 0;  
> > Also what happened to no more globals policy?
> >  
> > >  bool enable_mlock = false;
> > >  bool enable_cpu_pm = false;
> > >  int nb_nics;
> > > @@ -3172,6 +3173,9 @@ int main(int argc, char **argv, char **envp)
> > >              case QEMU_OPTION_mem_prealloc:
> > >                  mem_prealloc = 1;
> > >                  break;
> > > +            case QEMU_OPTION_mem_shared:
> > > +                mem_shared = 1;
> > > +                break;
> > >              case QEMU_OPTION_d:
> > >                  log_mask = optarg;
> > >                  break;  
> >
> >  
> 
> 



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 2/2] Add -mem-shared option
  2019-11-29 10:07       ` Igor Mammedov
@ 2019-11-29 10:11         ` Paolo Bonzini
  2019-11-29 12:01           ` Markus Armbruster
  2019-11-29 12:16           ` Igor Mammedov
  0 siblings, 2 replies; 41+ messages in thread
From: Paolo Bonzini @ 2019-11-29 10:11 UTC (permalink / raw)
  To: Igor Mammedov, Marc-André Lureau
  Cc: Eduardo Habkost, QEMU, Stefan Hajnoczi, Richard Henderson

On 29/11/19 11:07, Igor Mammedov wrote:
>>> So user who wants something non trivial could override default
>>> non-numa behavior with
>>>   -object memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on \
>>>   -machine memdev=mem
>>> or use any other backend that suits theirs needs.  
>> That's nice, but not as friendly as a simple -mem-shared.
> (I still do not like idea of convenience options but it won't
> get onto the way much if implemented as "global property" to memdev,
> so I won't object if there is real demand for it)

I agree with Igor, we should always think about the generic ("object
model") options and only then add convenience option.

It looks like the remaining point is to decide between "-m memdev" and
"-machine memdev".

Paolo



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] RFC: add -mem-shared option
  2019-11-29  9:33       ` Paolo Bonzini
  2019-11-29  9:39         ` Daniel P. Berrangé
@ 2019-11-29 10:13         ` Igor Mammedov
  2019-11-29 11:20           ` Paolo Bonzini
  1 sibling, 1 reply; 41+ messages in thread
From: Igor Mammedov @ 2019-11-29 10:13 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Daniel P. Berrangé,
	Eduardo Habkost, qemu-devel, Gerd Hoffmann, Stefan Hajnoczi,
	Marc-André Lureau, Richard Henderson

On Fri, 29 Nov 2019 10:33:39 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 29/11/19 10:27, Daniel P. Berrangé wrote:
> >> Yes, with a fallback path currently using a temporary file under /tmp
> >> (we may want to use shm_open() instead, or a different location such
> >> as XDG_RUNTIME_DIR? - and use O_TMPFILE)  
> > We can't assume either /tmp or XDG_RUNTIME_DIR is on tmpfs as that is no
> > where near standard across all OS distros, and even if on tmpfs these
> > dirs can be size limited to a significant subset of available RAM. IOW
> > we can't safely use them unless explicitly told to.  
> 
> Agreed, mkstemp+shm_open seems better.  Perhaps this could be done in
> hostmem-memfd.c though, basically as a fallback option?  In principle
> one could even use getmntent to search for a hugetlbfs mount.

So far fall backs proved to be a pain to deal with, as end users can't
be sure what machine they are getting eventually.
I'd prefer if we fail cleanly if asked config isn't possible and
let user fix vm configuration instead.

> Paolo
> 
> 



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] RFC: add -mem-shared option
  2019-11-29  9:31   ` Paolo Bonzini
@ 2019-11-29 10:23     ` Igor Mammedov
  2019-11-29 11:21       ` Paolo Bonzini
  2019-11-29 20:21     ` Eduardo Habkost
  1 sibling, 1 reply; 41+ messages in thread
From: Igor Mammedov @ 2019-11-29 10:23 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Marc-André Lureau, Richard Henderson, Eduardo Habkost,
	stefanha, qemu-devel

On Fri, 29 Nov 2019 10:31:36 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 28/11/19 17:10, Eduardo Habkost wrote:
> > On Thu, Nov 28, 2019 at 06:15:16PM +0400, Marc-André Lureau wrote:  
> >> Hi,
> >>
> >> Setting up shared memory for vhost-user is a bit complicated from
> >> command line, as it requires NUMA setup such as: m 4G -object
> >> memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> >> node,memdev=mem.
> >>
> >> Instead, I suggest to add a -mem-shared option for non-numa setups,
> >> that will make the -mem-path or anonymouse memory shareable.  
> > 
> > Can we make this be a "-m" option?
> > 
> > Or, even better: can we make "-m" options be automatically
> > translated to memory-backend-* options somehow?
> >   
[...]
> Therefore, I think instead of -mem-shared we should add a "-m
> memdev=..." option.  This option:
> 
> * would be mutually exclusive with both -mem-path
> 
> * would be handled from allocate_system_memory_nonnuma.
> 
> * could be mutually exclusive "-numa node", or could just be mutually
> exclusive with "-numa node,memdev=..." (the logical conclusion of that
> however would be an undeprecation of "-numa node,mem=...", so that has
> to be taken into account as well).

the plan was/still is to have memdev for main ram and deprecate
"-numa node,mem=..." for new machine types (can't be done for
old ones due to migration breakages and old libvirt which uses
it). So new machines will be forced to use "-numa memdev"
For old machines '-m memdev' provided memory region will used to
keep "-numa node,mem=..." working as it's now (however broken it is).


> 
> Paolo
> 
> 



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] RFC: add -mem-shared option
  2019-11-29 10:13         ` Igor Mammedov
@ 2019-11-29 11:20           ` Paolo Bonzini
  0 siblings, 0 replies; 41+ messages in thread
From: Paolo Bonzini @ 2019-11-29 11:20 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Daniel P. Berrangé,
	Eduardo Habkost, qemu-devel, Gerd Hoffmann, Stefan Hajnoczi,
	Marc-André Lureau, Richard Henderson

On 29/11/19 11:13, Igor Mammedov wrote:
>> Agreed, mkstemp+shm_open seems better.  Perhaps this could be done in
>> hostmem-memfd.c though, basically as a fallback option?  In principle
>> one could even use getmntent to search for a hugetlbfs mount.
> So far fall backs proved to be a pain to deal with, as end users can't
> be sure what machine they are getting eventually.
> I'd prefer if we fail cleanly if asked config isn't possible and
> let user fix vm configuration instead.
> 

As far as I know memfd vs. mktemp+shm_open+shm_unlink is pretty much the
same thing.  memfd provide additional features such as sealing, but
unless someone explicitly checks for memfd features, the two should look
the same.

Paolo



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] RFC: add -mem-shared option
  2019-11-29 10:23     ` Igor Mammedov
@ 2019-11-29 11:21       ` Paolo Bonzini
  0 siblings, 0 replies; 41+ messages in thread
From: Paolo Bonzini @ 2019-11-29 11:21 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Marc-André Lureau, Richard Henderson, Eduardo Habkost,
	stefanha, qemu-devel

On 29/11/19 11:23, Igor Mammedov wrote:
> [...]
>> Therefore, I think instead of -mem-shared we should add a "-m
>> memdev=..." option.  This option:
>>
>> * would be mutually exclusive with both -mem-path
>>
>> * would be handled from allocate_system_memory_nonnuma.
>>
>> * could be mutually exclusive "-numa node", or could just be mutually
>> exclusive with "-numa node,memdev=..." (the logical conclusion of that
>> however would be an undeprecation of "-numa node,mem=...", so that has
>> to be taken into account as well).
> the plan was/still is to have memdev for main ram and deprecate
> "-numa node,mem=..." for new machine types (can't be done for
> old ones due to migration breakages and old libvirt which uses
> it). So new machines will be forced to use "-numa memdev"
> For old machines '-m memdev' provided memory region will used to
> keep "-numa node,mem=..." working as it's now (however broken it is).

Ok, then "-m memdev=..." would only be mutually exclusive with
"-mem-path" and "-numa node,memdev=...".

Thanks!

Paolo



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] RFC: add -mem-shared option
  2019-11-29  9:45           ` Marc-André Lureau
@ 2019-11-29 11:44             ` Gerd Hoffmann
  0 siblings, 0 replies; 41+ messages in thread
From: Gerd Hoffmann @ 2019-11-29 11:44 UTC (permalink / raw)
  To: Marc-André Lureau
  Cc: Daniel P. Berrangé,
	Eduardo Habkost, qemu-devel, Stefan Hajnoczi, Paolo Bonzini,
	Richard Henderson

  Hi,

> > Possibly, but if we're relying on shm_open choosing the path, then
> > its harder for users to know what files to clean up when QEMU crashes
> > or otherwise exits wthout a shm_unlink
> 
> True, although I think you can call shm_unlink() right after
> shm_open() that should limit the risk of those forgotten files.

Yes, this is possible and pretty standard ...

cheers,
  Gerd



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 2/2] Add -mem-shared option
  2019-11-29 10:11         ` Paolo Bonzini
@ 2019-11-29 12:01           ` Markus Armbruster
  2019-11-29 20:31             ` Eduardo Habkost
  2019-11-29 12:16           ` Igor Mammedov
  1 sibling, 1 reply; 41+ messages in thread
From: Markus Armbruster @ 2019-11-29 12:01 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Eduardo Habkost, QEMU, Marc-André Lureau, Stefan Hajnoczi,
	Igor Mammedov, Richard Henderson

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 29/11/19 11:07, Igor Mammedov wrote:
>>>> So user who wants something non trivial could override default
>>>> non-numa behavior with
>>>>   -object memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on \
>>>>   -machine memdev=mem
>>>> or use any other backend that suits theirs needs.  
>>> That's nice, but not as friendly as a simple -mem-shared.
>> (I still do not like idea of convenience options but it won't
>> get onto the way much if implemented as "global property" to memdev,
>> so I won't object if there is real demand for it)
>
> I agree with Igor, we should always think about the generic ("object
> model") options and only then add convenience option.

+1

[...]



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 2/2] Add -mem-shared option
  2019-11-29 10:11         ` Paolo Bonzini
  2019-11-29 12:01           ` Markus Armbruster
@ 2019-11-29 12:16           ` Igor Mammedov
  2019-11-29 17:46             ` Paolo Bonzini
  1 sibling, 1 reply; 41+ messages in thread
From: Igor Mammedov @ 2019-11-29 12:16 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Eduardo Habkost, Marc-André Lureau, QEMU, Stefan Hajnoczi,
	Richard Henderson

On Fri, 29 Nov 2019 11:11:09 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 29/11/19 11:07, Igor Mammedov wrote:
> >>> So user who wants something non trivial could override default
> >>> non-numa behavior with
> >>>   -object memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on \
> >>>   -machine memdev=mem
> >>> or use any other backend that suits theirs needs.    
> >> That's nice, but not as friendly as a simple -mem-shared.  
> > (I still do not like idea of convenience options but it won't
> > get onto the way much if implemented as "global property" to memdev,
> > so I won't object if there is real demand for it)  
> 
> I agree with Igor, we should always think about the generic ("object
> model") options and only then add convenience option.
> 
> It looks like the remaining point is to decide between "-m memdev" and
> "-machine memdev".

I'm still entertaining idea, to use -device pc-dimm|some_ram_dev
for main RAM but that's not generic enough so I'd probably post
'-machine memdev' variant for now and think some more on -device
(we can add front-end re-factoring if necessary on top).

As for "-m", I'd make it just an alias that translates
 -m/mem-path/mem-prealloc
combination to appropriate '-object' for '-machine memdev' consumption.
That should cover compat purposes for old machines and the rest of
-m options (maxmem/slots) would be aliased to appropriate machine options.

That will allow us to get rid of ad-hoc '-m' parser. After that it would
be possible to deprecate '-m' in favor of machine properties, but that
probably will get quite a push back so unless I find compelling reason
to do it I won't care much as '-m' would be a lightweight shim over
machine properties.

> 
> Paolo
> 



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 2/2] Add -mem-shared option
  2019-11-29 12:16           ` Igor Mammedov
@ 2019-11-29 17:46             ` Paolo Bonzini
  2019-12-02  7:39               ` Igor Mammedov
  0 siblings, 1 reply; 41+ messages in thread
From: Paolo Bonzini @ 2019-11-29 17:46 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Thomas Huth, Eduardo Habkost, QEMU, Marc-André Lureau,
	Stefan Hajnoczi, Richard Henderson

On 29/11/19 13:16, Igor Mammedov wrote:
> As for "-m", I'd make it just an alias that translates
>  -m/mem-path/mem-prealloc

I think we should just deprecate -mem-path/-mem-prealloc in 5.0.  CCing
Thomas as mister deprecation. :)

> combination to appropriate '-object' for '-machine memdev' consumption.
> That should cover compat purposes for old machines and the rest of
> -m options (maxmem/slots) would be aliased to appropriate machine options.
> 
> That will allow us to get rid of ad-hoc '-m' parser. After that it would
> be possible to deprecate '-m' in favor of machine properties, but that
> probably will get quite a push back so unless I find compelling reason
> to do it I won't care much as '-m' would be a lightweight shim over
> machine properties.

Well, deprecation and ultimately removal is always a long path.
However, I understand your plan better now and having "-machine memdev"
makes sense if you want to also move/alias "-m maxmem"/"-m slots" from
"-m" to "-machine".

So ultimately you'd have two ways of configuring memory:

- -m N,maxmem=P,slots=Q
" -machine memdev=M,maxmem=P,slots=Q

Paolo



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] RFC: add -mem-shared option
  2019-11-29  9:31   ` Paolo Bonzini
  2019-11-29 10:23     ` Igor Mammedov
@ 2019-11-29 20:21     ` Eduardo Habkost
  2019-12-01 15:40       ` Marc-André Lureau
  1 sibling, 1 reply; 41+ messages in thread
From: Eduardo Habkost @ 2019-11-29 20:21 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Marc-André Lureau, stefanha, qemu-devel, Richard Henderson

On Fri, Nov 29, 2019 at 10:31:36AM +0100, Paolo Bonzini wrote:
> On 28/11/19 17:10, Eduardo Habkost wrote:
> > On Thu, Nov 28, 2019 at 06:15:16PM +0400, Marc-André Lureau wrote:
> >> Hi,
> >>
> >> Setting up shared memory for vhost-user is a bit complicated from
> >> command line, as it requires NUMA setup such as: m 4G -object
> >> memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> >> node,memdev=mem.
> >>
> >> Instead, I suggest to add a -mem-shared option for non-numa setups,
> >> that will make the -mem-path or anonymouse memory shareable.
> > 
> > Can we make this be a "-m" option?
> > 
> > Or, even better: can we make "-m" options be automatically
> > translated to memory-backend-* options somehow?
> > 
> 
> The original idea was to always support one NUMA node, so that you could
> do "-numa node,memdev=..." to specify a memory backend with -object.
> However, this is not possible anymore since
> 
>     if (!mc->cpu_index_to_instance_props ||
>         !mc->get_default_cpu_node_id) {
>         error_setg(errp, "NUMA is not supported by this machine-type");
>         return;
>     }
> 
> has been added to hw/core/numa.c.
> 
> Therefore, I think instead of -mem-shared we should add a "-m
> memdev=..." option.  This option:
> 
> * would be mutually exclusive with both -mem-path
> 
> * would be handled from allocate_system_memory_nonnuma.
> 
> * could be mutually exclusive "-numa node", or could just be mutually
> exclusive with "-numa node,memdev=..." (the logical conclusion of that
> however would be an undeprecation of "-numa node,mem=...", so that has
> to be taken into account as well).

I completely agree we could do this.  I just think this misses
completely the point of this series, because usability of:

  -object memory-backend-file,...,share=on,id=mem -m ...,memdev=mem

is not much better than the usability of:

  -object memory-backend-file,...,share=on,id=mem -numa node,memdev=mem

-- 
Eduardo



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 2/2] Add -mem-shared option
  2019-11-29 12:01           ` Markus Armbruster
@ 2019-11-29 20:31             ` Eduardo Habkost
  0 siblings, 0 replies; 41+ messages in thread
From: Eduardo Habkost @ 2019-11-29 20:31 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: QEMU, Marc-André Lureau, Stefan Hajnoczi, Igor Mammedov,
	Paolo Bonzini, Richard Henderson

On Fri, Nov 29, 2019 at 01:01:54PM +0100, Markus Armbruster wrote:
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
> > On 29/11/19 11:07, Igor Mammedov wrote:
> >>>> So user who wants something non trivial could override default
> >>>> non-numa behavior with
> >>>>   -object memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on \
> >>>>   -machine memdev=mem
> >>>> or use any other backend that suits theirs needs.  
> >>> That's nice, but not as friendly as a simple -mem-shared.
> >> (I still do not like idea of convenience options but it won't
> >> get onto the way much if implemented as "global property" to memdev,
> >> so I won't object if there is real demand for it)
> >
> > I agree with Igor, we should always think about the generic ("object
> > model") options and only then add convenience option.
> 
> +1

I agree with this.  I just hope we don't forget about the second
part.

-- 
Eduardo



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] RFC: add -mem-shared option
  2019-11-29 20:21     ` Eduardo Habkost
@ 2019-12-01 15:40       ` Marc-André Lureau
  2019-12-01 18:03         ` Paolo Bonzini
  0 siblings, 1 reply; 41+ messages in thread
From: Marc-André Lureau @ 2019-12-01 15:40 UTC (permalink / raw)
  To: Eduardo Habkost; +Cc: Paolo Bonzini, QEMU, Stefan Hajnoczi, Richard Henderson

Hi

On Sat, Nov 30, 2019 at 12:23 AM Eduardo Habkost <ehabkost@redhat.com> wrote:
>
> On Fri, Nov 29, 2019 at 10:31:36AM +0100, Paolo Bonzini wrote:
> > On 28/11/19 17:10, Eduardo Habkost wrote:
> > > On Thu, Nov 28, 2019 at 06:15:16PM +0400, Marc-André Lureau wrote:
> > >> Hi,
> > >>
> > >> Setting up shared memory for vhost-user is a bit complicated from
> > >> command line, as it requires NUMA setup such as: m 4G -object
> > >> memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on -numa
> > >> node,memdev=mem.
> > >>
> > >> Instead, I suggest to add a -mem-shared option for non-numa setups,
> > >> that will make the -mem-path or anonymouse memory shareable.
> > >
> > > Can we make this be a "-m" option?
> > >
> > > Or, even better: can we make "-m" options be automatically
> > > translated to memory-backend-* options somehow?
> > >
> >
> > The original idea was to always support one NUMA node, so that you could
> > do "-numa node,memdev=..." to specify a memory backend with -object.
> > However, this is not possible anymore since
> >
> >     if (!mc->cpu_index_to_instance_props ||
> >         !mc->get_default_cpu_node_id) {
> >         error_setg(errp, "NUMA is not supported by this machine-type");
> >         return;
> >     }
> >
> > has been added to hw/core/numa.c.
> >
> > Therefore, I think instead of -mem-shared we should add a "-m
> > memdev=..." option.  This option:
> >
> > * would be mutually exclusive with both -mem-path
> >
> > * would be handled from allocate_system_memory_nonnuma.
> >
> > * could be mutually exclusive "-numa node", or could just be mutually
> > exclusive with "-numa node,memdev=..." (the logical conclusion of that
> > however would be an undeprecation of "-numa node,mem=...", so that has
> > to be taken into account as well).
>
> I completely agree we could do this.  I just think this misses
> completely the point of this series, because usability of:
>
>   -object memory-backend-file,...,share=on,id=mem -m ...,memdev=mem
>
> is not much better than the usability of:
>
>   -object memory-backend-file,...,share=on,id=mem -numa node,memdev=mem
>

+1
Perhaps when all RAM allocation will occur through memory-backend,
"-mem-shared" could be simply an alias to "-global
memory-backend.shared=on"


-- 
Marc-André Lureau


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 0/2] RFC: add -mem-shared option
  2019-12-01 15:40       ` Marc-André Lureau
@ 2019-12-01 18:03         ` Paolo Bonzini
  0 siblings, 0 replies; 41+ messages in thread
From: Paolo Bonzini @ 2019-12-01 18:03 UTC (permalink / raw)
  To: Marc-André Lureau, Eduardo Habkost
  Cc: QEMU, Stefan Hajnoczi, Richard Henderson

On 01/12/19 16:40, Marc-André Lureau wrote:
>>> The original idea was to always support one NUMA node, so that you could
>>> do "-numa node,memdev=..." to specify a memory backend with -object.
>>> However, this is not possible anymore since
>>>
>>>     if (!mc->cpu_index_to_instance_props ||
>>>         !mc->get_default_cpu_node_id) {
>>>         error_setg(errp, "NUMA is not supported by this machine-type");
>>>         return;
>>>     }
>>>
>>> has been added to hw/core/numa.c.
>>>
>>> Therefore, I think instead of -mem-shared we should add a "-m
>>> memdev=..." option.  This option:
>>>
>>> * would be mutually exclusive with both -mem-path
>>>
>>> * would be handled from allocate_system_memory_nonnuma.
>>>
>>> * could be mutually exclusive "-numa node", or could just be mutually
>>> exclusive with "-numa node,memdev=..." (the logical conclusion of that
>>> however would be an undeprecation of "-numa node,mem=...", so that has
>>> to be taken into account as well).
>> I completely agree we could do this.  I just think this misses
>> completely the point of this series, because usability of:
>>
>>   -object memory-backend-file,...,share=on,id=mem -m ...,memdev=mem
>>
>> is not much better than the usability of:
>>
>>   -object memory-backend-file,...,share=on,id=mem -numa node,memdev=mem
>>
> +1
> Perhaps when all RAM allocation will occur through memory-backend,
> "-mem-shared" could be simply an alias to "-global
> memory-backend.shared=on"

Yes, this is the point.  There are two parts in this series:

(1) allowing use of vhost-user on non-NUMA machines

(2) providing syntactic sugar for it

I have no problem with -mem-shared for (2), but it should just be
syntactic sugar for (1).  It's okay if -mem-shared is a global variable
rather than an alias for -global; the important part is not to add any
feature that is not available from the QOM-style command line options.

Paolo



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 2/2] Add -mem-shared option
  2019-11-29 17:46             ` Paolo Bonzini
@ 2019-12-02  7:39               ` Igor Mammedov
  2019-12-02 21:00                 ` Eduardo Habkost
  0 siblings, 1 reply; 41+ messages in thread
From: Igor Mammedov @ 2019-12-02  7:39 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Thomas Huth, Eduardo Habkost, QEMU, Marc-André Lureau,
	Stefan Hajnoczi, Richard Henderson

On Fri, 29 Nov 2019 18:46:12 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> On 29/11/19 13:16, Igor Mammedov wrote:
> > As for "-m", I'd make it just an alias that translates
> >  -m/mem-path/mem-prealloc  
> 
> I think we should just deprecate -mem-path/-mem-prealloc in 5.0.  CCing
> Thomas as mister deprecation. :)

I'll add that to my series
 
[...]



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 2/2] Add -mem-shared option
  2019-12-02  7:39               ` Igor Mammedov
@ 2019-12-02 21:00                 ` Eduardo Habkost
  2019-12-03  8:56                   ` Thomas Huth
  0 siblings, 1 reply; 41+ messages in thread
From: Eduardo Habkost @ 2019-12-02 21:00 UTC (permalink / raw)
  To: Igor Mammedov
  Cc: Thomas Huth, QEMU, Marc-André Lureau, Stefan Hajnoczi,
	Paolo Bonzini, Richard Henderson

On Mon, Dec 02, 2019 at 08:39:48AM +0100, Igor Mammedov wrote:
> On Fri, 29 Nov 2019 18:46:12 +0100
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> > On 29/11/19 13:16, Igor Mammedov wrote:
> > > As for "-m", I'd make it just an alias that translates
> > >  -m/mem-path/mem-prealloc  
> > 
> > I think we should just deprecate -mem-path/-mem-prealloc in 5.0.  CCing
> > Thomas as mister deprecation. :)
> 
> I'll add that to my series

Considering that the plan is to eventually reimplement those
options as syntactic sugar for memory backend options (hopefully
in less than 2 QEMU releases), what's the point of deprecating
them?

-- 
Eduardo



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 2/2] Add -mem-shared option
  2019-12-02 21:00                 ` Eduardo Habkost
@ 2019-12-03  8:56                   ` Thomas Huth
  2019-12-03 14:43                     ` Igor Mammedov
  2019-12-03 21:34                     ` Eduardo Habkost
  0 siblings, 2 replies; 41+ messages in thread
From: Thomas Huth @ 2019-12-03  8:56 UTC (permalink / raw)
  To: Eduardo Habkost, Igor Mammedov
  Cc: libvir-list, QEMU, Marc-André Lureau, Stefan Hajnoczi,
	Paolo Bonzini, Richard Henderson

On 02/12/2019 22.00, Eduardo Habkost wrote:
> On Mon, Dec 02, 2019 at 08:39:48AM +0100, Igor Mammedov wrote:
>> On Fri, 29 Nov 2019 18:46:12 +0100
>> Paolo Bonzini <pbonzini@redhat.com> wrote:
>>
>>> On 29/11/19 13:16, Igor Mammedov wrote:
>>>> As for "-m", I'd make it just an alias that translates
>>>>  -m/mem-path/mem-prealloc  
>>>
>>> I think we should just deprecate -mem-path/-mem-prealloc in 5.0.  CCing
>>> Thomas as mister deprecation. :)
>>
>> I'll add that to my series
> 
> Considering that the plan is to eventually reimplement those
> options as syntactic sugar for memory backend options (hopefully
> in less than 2 QEMU releases), what's the point of deprecating
> them?

Well, it depends on the "classification" [1] of the parameter...

Let's ask: What's the main purpose of the option?

Is it easier to use than the "full" option, and thus likely to be used
by a lot of people who run QEMU directly from the CLI? In that case it
should stay as "convenience option" and not be deprecated.

Or is the option merely there to give the upper layers like libvirt or
some few users and their scripts some more grace period to adapt their
code, but we all agree that the options are rather ugly and should
finally go away? Then it's rather a "legacy option" and the deprecation
process is the right way to go. Our QEMU interface is still way to
overcrowded, we should try to keep it as clean as possible.

 Thomas


[1] Using the terms from:
    https://www.youtube.com/watch?v=Oscjpkns7tM&t=8m



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 2/2] Add -mem-shared option
  2019-12-03  8:56                   ` Thomas Huth
@ 2019-12-03 14:43                     ` Igor Mammedov
  2019-12-03 21:34                     ` Eduardo Habkost
  1 sibling, 0 replies; 41+ messages in thread
From: Igor Mammedov @ 2019-12-03 14:43 UTC (permalink / raw)
  To: Thomas Huth
  Cc: Eduardo Habkost, libvir-list, QEMU, Marc-André Lureau,
	Stefan Hajnoczi, Paolo Bonzini, Richard Henderson

On Tue, 3 Dec 2019 09:56:15 +0100
Thomas Huth <thuth@redhat.com> wrote:

> On 02/12/2019 22.00, Eduardo Habkost wrote:
> > On Mon, Dec 02, 2019 at 08:39:48AM +0100, Igor Mammedov wrote:  
> >> On Fri, 29 Nov 2019 18:46:12 +0100
> >> Paolo Bonzini <pbonzini@redhat.com> wrote:
> >>  
> >>> On 29/11/19 13:16, Igor Mammedov wrote:  
> >>>> As for "-m", I'd make it just an alias that translates
> >>>>  -m/mem-path/mem-prealloc    
> >>>
> >>> I think we should just deprecate -mem-path/-mem-prealloc in 5.0.  CCing
> >>> Thomas as mister deprecation. :)  
> >>
> >> I'll add that to my series  
> > 
> > Considering that the plan is to eventually reimplement those
> > options as syntactic sugar for memory backend options (hopefully
> > in less than 2 QEMU releases), what's the point of deprecating
> > them?  
> 
> Well, it depends on the "classification" [1] of the parameter...
> 
> Let's ask: What's the main purpose of the option?
> 
> Is it easier to use than the "full" option, and thus likely to be used
> by a lot of people who run QEMU directly from the CLI? In that case it
> should stay as "convenience option" and not be deprecated.
> 
> Or is the option merely there to give the upper layers like libvirt or
> some few users and their scripts some more grace period to adapt their
> code, but we all agree that the options are rather ugly and should
> finally go away? Then it's rather a "legacy option" and the deprecation
> process is the right way to go. Our QEMU interface is still way 
> overcrowded, we should try to keep it as clean as possible.

After switching to memdev for main RAM, users could use relatively
short global options
 -global memory-backend.prealloc|share=on
and
 -global memory-backend-file.mem-path=X|prealloc|share=on

instead of us adding and maintaining slightly shorter
 -mem-shared/-mem-path/-mem-prealloc

>  Thomas
> 
> 
> [1] Using the terms from:
>     https://www.youtube.com/watch?v=Oscjpkns7tM&t=8m



^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 2/2] Add -mem-shared option
  2019-12-03  8:56                   ` Thomas Huth
  2019-12-03 14:43                     ` Igor Mammedov
@ 2019-12-03 21:34                     ` Eduardo Habkost
  1 sibling, 0 replies; 41+ messages in thread
From: Eduardo Habkost @ 2019-12-03 21:34 UTC (permalink / raw)
  To: Thomas Huth
  Cc: libvir-list, QEMU, Marc-André Lureau, Stefan Hajnoczi,
	Paolo Bonzini, Igor Mammedov, Richard Henderson

On Tue, Dec 03, 2019 at 09:56:15AM +0100, Thomas Huth wrote:
> On 02/12/2019 22.00, Eduardo Habkost wrote:
> > On Mon, Dec 02, 2019 at 08:39:48AM +0100, Igor Mammedov wrote:
> >> On Fri, 29 Nov 2019 18:46:12 +0100
> >> Paolo Bonzini <pbonzini@redhat.com> wrote:
> >>
> >>> On 29/11/19 13:16, Igor Mammedov wrote:
> >>>> As for "-m", I'd make it just an alias that translates
> >>>>  -m/mem-path/mem-prealloc  
> >>>
> >>> I think we should just deprecate -mem-path/-mem-prealloc in 5.0.  CCing
> >>> Thomas as mister deprecation. :)
> >>
> >> I'll add that to my series
> > 
> > Considering that the plan is to eventually reimplement those
> > options as syntactic sugar for memory backend options (hopefully
> > in less than 2 QEMU releases), what's the point of deprecating
> > them?
> 
> Well, it depends on the "classification" [1] of the parameter...
> 
> Let's ask: What's the main purpose of the option?
> 
> Is it easier to use than the "full" option, and thus likely to be used
> by a lot of people who run QEMU directly from the CLI? In that case it
> should stay as "convenience option" and not be deprecated.
> 
> Or is the option merely there to give the upper layers like libvirt or
> some few users and their scripts some more grace period to adapt their
> code, but we all agree that the options are rather ugly and should
> finally go away? Then it's rather a "legacy option" and the deprecation
> process is the right way to go. Our QEMU interface is still way to
> overcrowded, we should try to keep it as clean as possible.

That's a good way to describe the questions involved.  To me they
are clearly convenience options.

We could still replace them with new (more consistent and less
ugly) convenience options, though.

> 
>  Thomas
> 
> 
> [1] Using the terms from:
>     https://www.youtube.com/watch?v=Oscjpkns7tM&t=8m

-- 
Eduardo



^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, back to index

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-28 14:15 [PATCH 0/2] RFC: add -mem-shared option Marc-André Lureau
2019-11-28 14:15 ` [PATCH 1/2] memfd: add qemu_memfd_open() Marc-André Lureau
2019-11-28 14:15 ` [PATCH 2/2] Add -mem-shared option Marc-André Lureau
2019-11-28 16:14   ` Eduardo Habkost
2019-11-28 16:28   ` Igor Mammedov
2019-11-28 20:31     ` Marc-André Lureau
2019-11-29 10:07       ` Igor Mammedov
2019-11-29 10:11         ` Paolo Bonzini
2019-11-29 12:01           ` Markus Armbruster
2019-11-29 20:31             ` Eduardo Habkost
2019-11-29 12:16           ` Igor Mammedov
2019-11-29 17:46             ` Paolo Bonzini
2019-12-02  7:39               ` Igor Mammedov
2019-12-02 21:00                 ` Eduardo Habkost
2019-12-03  8:56                   ` Thomas Huth
2019-12-03 14:43                     ` Igor Mammedov
2019-12-03 21:34                     ` Eduardo Habkost
2019-11-28 16:10 ` [PATCH 0/2] RFC: add " Eduardo Habkost
2019-11-29  9:18   ` Igor Mammedov
2019-11-29  9:31   ` Paolo Bonzini
2019-11-29 10:23     ` Igor Mammedov
2019-11-29 11:21       ` Paolo Bonzini
2019-11-29 20:21     ` Eduardo Habkost
2019-12-01 15:40       ` Marc-André Lureau
2019-12-01 18:03         ` Paolo Bonzini
2019-11-28 16:59 ` Dr. David Alan Gilbert
2019-11-29  9:23   ` Igor Mammedov
2019-11-29  4:37 ` no-reply
2019-11-29  5:34 ` no-reply
2019-11-29  7:02 ` Gerd Hoffmann
2019-11-29  7:30   ` Marc-André Lureau
2019-11-29  9:27     ` Daniel P. Berrangé
2019-11-29  9:31       ` Marc-André Lureau
2019-11-29  9:42         ` Daniel P. Berrangé
2019-11-29  9:45           ` Marc-André Lureau
2019-11-29 11:44             ` Gerd Hoffmann
2019-11-29  9:33       ` Paolo Bonzini
2019-11-29  9:39         ` Daniel P. Berrangé
2019-11-29  9:52           ` Paolo Bonzini
2019-11-29 10:13         ` Igor Mammedov
2019-11-29 11:20           ` Paolo Bonzini

QEMU-Devel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/qemu-devel/0 qemu-devel/git/0.git
	git clone --mirror https://lore.kernel.org/qemu-devel/1 qemu-devel/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 qemu-devel qemu-devel/ https://lore.kernel.org/qemu-devel \
		qemu-devel@nongnu.org
	public-inbox-index qemu-devel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.nongnu.qemu-devel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git