All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH] migration: fix crash in when incoming client channel setup fails
@ 2018-06-19 15:17 Daniel P. Berrangé
  2018-06-19 15:45 ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel P. Berrangé @ 2018-06-19 15:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: Dr. David Alan Gilbert, Juan Quintela, Daniel P. Berrangé

The way we determine if we can start the incoming migration was
changed to use migration_has_all_channels() in:

  commit 428d89084c709e568f9cd301c2f6416a54c53d6d
  Author: Juan Quintela <quintela@redhat.com>
  Date:   Mon Jul 24 13:06:25 2017 +0200

    migration: Create migration_has_all_channels

This method in turn calls multifd_recv_all_channels_created()
which is hardcoded to always return 'true' when multifd is
not in use.

This means that if channel initialization fails with normal
migration, it'll never notice and attempt to start the
incoming migration regardless.

This can be seen, for example, if a client connects to a server
requiring TLS, but has an invalid x509 certificate:

qemu-system-x86_64: The certificate hasn't got a known issuer
qemu-system-x86_64: migration/migration.c:386: process_incoming_migration_co: Assertion `mis->from_src_file' failed.

 #0  0x00007fffebd24f2b in raise () at /lib64/libc.so.6
 #1  0x00007fffebd0f561 in abort () at /lib64/libc.so.6
 #2  0x00007fffebd0f431 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
 #3  0x00007fffebd1d692 in  () at /lib64/libc.so.6
 #4  0x0000555555ad027e in process_incoming_migration_co (opaque=<optimized out>) at migration/migration.c:386
 #5  0x0000555555c45e8b in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:116
 #6  0x00007fffebd3a6a0 in __start_context () at /lib64/libc.so.6
 #7  0x0000000000000000 in  ()

To handle the non-multifd case, we check whether mis->from_src_file
is non-NULL.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 migration/migration.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index e1eaa97df4..38ad818b23 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -518,11 +518,12 @@ void migration_ioc_process_incoming(QIOChannel *ioc)
  */
 bool migration_has_all_channels(void)
 {
+    MigrationIncomingState *mis = migration_incoming_get_current();
     bool all_channels;
 
     all_channels = multifd_recv_all_channels_created();
 
-    return all_channels;
+    return all_channels && mis->from_src_file != NULL;
 }
 
 /*
-- 
2.17.0

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [Qemu-devel] [PATCH] migration: fix crash in when incoming client channel setup fails
  2018-06-19 15:17 [Qemu-devel] [PATCH] migration: fix crash in when incoming client channel setup fails Daniel P. Berrangé
@ 2018-06-19 15:45 ` Dr. David Alan Gilbert
  2018-06-19 15:58   ` Daniel P. Berrangé
  0 siblings, 1 reply; 7+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-19 15:45 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: qemu-devel, Juan Quintela

* Daniel P. Berrangé (berrange@redhat.com) wrote:
> The way we determine if we can start the incoming migration was
> changed to use migration_has_all_channels() in:
> 
>   commit 428d89084c709e568f9cd301c2f6416a54c53d6d
>   Author: Juan Quintela <quintela@redhat.com>
>   Date:   Mon Jul 24 13:06:25 2017 +0200
> 
>     migration: Create migration_has_all_channels
> 
> This method in turn calls multifd_recv_all_channels_created()
> which is hardcoded to always return 'true' when multifd is
> not in use.
> 
> This means that if channel initialization fails with normal
> migration, it'll never notice and attempt to start the
> incoming migration regardless.

I'm not sure if that was actually the cause, because older versions
had no check at all in socket_accept_incoming_migration.

> This can be seen, for example, if a client connects to a server
> requiring TLS, but has an invalid x509 certificate:
> 
> qemu-system-x86_64: The certificate hasn't got a known issuer
> qemu-system-x86_64: migration/migration.c:386: process_incoming_migration_co: Assertion `mis->from_src_file' failed.
> 
>  #0  0x00007fffebd24f2b in raise () at /lib64/libc.so.6
>  #1  0x00007fffebd0f561 in abort () at /lib64/libc.so.6
>  #2  0x00007fffebd0f431 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
>  #3  0x00007fffebd1d692 in  () at /lib64/libc.so.6
>  #4  0x0000555555ad027e in process_incoming_migration_co (opaque=<optimized out>) at migration/migration.c:386
>  #5  0x0000555555c45e8b in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:116
>  #6  0x00007fffebd3a6a0 in __start_context () at /lib64/libc.so.6
>  #7  0x0000000000000000 in  ()
> 
> To handle the non-multifd case, we check whether mis->from_src_file
> is non-NULL.

So, what happens with this fix, does the destination exit cleanly, or
stay to accept another connection or what?

Dave

> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  migration/migration.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index e1eaa97df4..38ad818b23 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -518,11 +518,12 @@ void migration_ioc_process_incoming(QIOChannel *ioc)
>   */
>  bool migration_has_all_channels(void)
>  {
> +    MigrationIncomingState *mis = migration_incoming_get_current();
>      bool all_channels;
>  
>      all_channels = multifd_recv_all_channels_created();
>  
> -    return all_channels;
> +    return all_channels && mis->from_src_file != NULL;
>  }
>  
>  /*
> -- 
> 2.17.0
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Qemu-devel] [PATCH] migration: fix crash in when incoming client channel setup fails
  2018-06-19 15:45 ` Dr. David Alan Gilbert
@ 2018-06-19 15:58   ` Daniel P. Berrangé
  2018-06-19 16:00     ` Dr. David Alan Gilbert
                       ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Daniel P. Berrangé @ 2018-06-19 15:58 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, Juan Quintela

On Tue, Jun 19, 2018 at 04:45:05PM +0100, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > The way we determine if we can start the incoming migration was
> > changed to use migration_has_all_channels() in:
> > 
> >   commit 428d89084c709e568f9cd301c2f6416a54c53d6d
> >   Author: Juan Quintela <quintela@redhat.com>
> >   Date:   Mon Jul 24 13:06:25 2017 +0200
> > 
> >     migration: Create migration_has_all_channels
> > 
> > This method in turn calls multifd_recv_all_channels_created()
> > which is hardcoded to always return 'true' when multifd is
> > not in use.
> > 
> > This means that if channel initialization fails with normal
> > migration, it'll never notice and attempt to start the
> > incoming migration regardless.
> 
> I'm not sure if that was actually the cause, because older versions
> had no check at all in socket_accept_incoming_migration.

Hmm, actually your write - it was a complicated series of refactorings
for multifd, and I think I mis-identified. I'll do a proper bisect so
we can have an accurate record.

> 
> > This can be seen, for example, if a client connects to a server
> > requiring TLS, but has an invalid x509 certificate:
> > 
> > qemu-system-x86_64: The certificate hasn't got a known issuer
> > qemu-system-x86_64: migration/migration.c:386: process_incoming_migration_co: Assertion `mis->from_src_file' failed.
> > 
> >  #0  0x00007fffebd24f2b in raise () at /lib64/libc.so.6
> >  #1  0x00007fffebd0f561 in abort () at /lib64/libc.so.6
> >  #2  0x00007fffebd0f431 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
> >  #3  0x00007fffebd1d692 in  () at /lib64/libc.so.6
> >  #4  0x0000555555ad027e in process_incoming_migration_co (opaque=<optimized out>) at migration/migration.c:386
> >  #5  0x0000555555c45e8b in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:116
> >  #6  0x00007fffebd3a6a0 in __start_context () at /lib64/libc.so.6
> >  #7  0x0000000000000000 in  ()
> > 
> > To handle the non-multifd case, we check whether mis->from_src_file
> > is non-NULL.
> 
> So, what happens with this fix, does the destination exit cleanly, or
> stay to accept another connection or what?

It stays around and can accept another connection, which is actually
quite desirable, as it limits impact of a malicious client from DOS'ing
the genuine client by racing to connect first.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Qemu-devel] [PATCH] migration: fix crash in when incoming client channel setup fails
  2018-06-19 15:58   ` Daniel P. Berrangé
@ 2018-06-19 16:00     ` Dr. David Alan Gilbert
  2018-06-19 16:02       ` Daniel P. Berrangé
  2018-06-19 16:28     ` Daniel P. Berrangé
  2018-06-19 16:31     ` Eric Blake
  2 siblings, 1 reply; 7+ messages in thread
From: Dr. David Alan Gilbert @ 2018-06-19 16:00 UTC (permalink / raw)
  To: Daniel P. Berrangé; +Cc: qemu-devel, Juan Quintela

* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Tue, Jun 19, 2018 at 04:45:05PM +0100, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > The way we determine if we can start the incoming migration was
> > > changed to use migration_has_all_channels() in:
> > > 
> > >   commit 428d89084c709e568f9cd301c2f6416a54c53d6d
> > >   Author: Juan Quintela <quintela@redhat.com>
> > >   Date:   Mon Jul 24 13:06:25 2017 +0200
> > > 
> > >     migration: Create migration_has_all_channels
> > > 
> > > This method in turn calls multifd_recv_all_channels_created()
> > > which is hardcoded to always return 'true' when multifd is
> > > not in use.
> > > 
> > > This means that if channel initialization fails with normal
> > > migration, it'll never notice and attempt to start the
> > > incoming migration regardless.
> > 
> > I'm not sure if that was actually the cause, because older versions
> > had no check at all in socket_accept_incoming_migration.
> 
> Hmm, actually your write - it was a complicated series of refactorings
> for multifd, and I think I mis-identified. I'll do a proper bisect so
> we can have an accurate record.
> 
> > 
> > > This can be seen, for example, if a client connects to a server
> > > requiring TLS, but has an invalid x509 certificate:
> > > 
> > > qemu-system-x86_64: The certificate hasn't got a known issuer
> > > qemu-system-x86_64: migration/migration.c:386: process_incoming_migration_co: Assertion `mis->from_src_file' failed.
> > > 
> > >  #0  0x00007fffebd24f2b in raise () at /lib64/libc.so.6
> > >  #1  0x00007fffebd0f561 in abort () at /lib64/libc.so.6
> > >  #2  0x00007fffebd0f431 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
> > >  #3  0x00007fffebd1d692 in  () at /lib64/libc.so.6
> > >  #4  0x0000555555ad027e in process_incoming_migration_co (opaque=<optimized out>) at migration/migration.c:386
> > >  #5  0x0000555555c45e8b in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:116
> > >  #6  0x00007fffebd3a6a0 in __start_context () at /lib64/libc.so.6
> > >  #7  0x0000000000000000 in  ()
> > > 
> > > To handle the non-multifd case, we check whether mis->from_src_file
> > > is non-NULL.
> > 
> > So, what happens with this fix, does the destination exit cleanly, or
> > stay to accept another connection or what?
> 
> It stays around and can accept another connection, which is actually
> quite desirable, as it limits impact of a malicious client from DOS'ing
> the genuine client by racing to connect first.

OK, that's fine, although I suspect in most other cases QEMU exits on
the destination with an error (e.g. if you aren't using TLS and just
connect to the port and squirt garbage down it, I think it'll exit
telling you that it's not a migration header).


Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Qemu-devel] [PATCH] migration: fix crash in when incoming client channel setup fails
  2018-06-19 16:00     ` Dr. David Alan Gilbert
@ 2018-06-19 16:02       ` Daniel P. Berrangé
  0 siblings, 0 replies; 7+ messages in thread
From: Daniel P. Berrangé @ 2018-06-19 16:02 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, Juan Quintela

On Tue, Jun 19, 2018 at 05:00:34PM +0100, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > On Tue, Jun 19, 2018 at 04:45:05PM +0100, Dr. David Alan Gilbert wrote:
> > > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > > The way we determine if we can start the incoming migration was
> > > > changed to use migration_has_all_channels() in:
> > > > 
> > > >   commit 428d89084c709e568f9cd301c2f6416a54c53d6d
> > > >   Author: Juan Quintela <quintela@redhat.com>
> > > >   Date:   Mon Jul 24 13:06:25 2017 +0200
> > > > 
> > > >     migration: Create migration_has_all_channels
> > > > 
> > > > This method in turn calls multifd_recv_all_channels_created()
> > > > which is hardcoded to always return 'true' when multifd is
> > > > not in use.
> > > > 
> > > > This means that if channel initialization fails with normal
> > > > migration, it'll never notice and attempt to start the
> > > > incoming migration regardless.
> > > 
> > > I'm not sure if that was actually the cause, because older versions
> > > had no check at all in socket_accept_incoming_migration.
> > 
> > Hmm, actually your write - it was a complicated series of refactorings
> > for multifd, and I think I mis-identified. I'll do a proper bisect so
> > we can have an accurate record.
> > 
> > > 
> > > > This can be seen, for example, if a client connects to a server
> > > > requiring TLS, but has an invalid x509 certificate:
> > > > 
> > > > qemu-system-x86_64: The certificate hasn't got a known issuer
> > > > qemu-system-x86_64: migration/migration.c:386: process_incoming_migration_co: Assertion `mis->from_src_file' failed.
> > > > 
> > > >  #0  0x00007fffebd24f2b in raise () at /lib64/libc.so.6
> > > >  #1  0x00007fffebd0f561 in abort () at /lib64/libc.so.6
> > > >  #2  0x00007fffebd0f431 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
> > > >  #3  0x00007fffebd1d692 in  () at /lib64/libc.so.6
> > > >  #4  0x0000555555ad027e in process_incoming_migration_co (opaque=<optimized out>) at migration/migration.c:386
> > > >  #5  0x0000555555c45e8b in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:116
> > > >  #6  0x00007fffebd3a6a0 in __start_context () at /lib64/libc.so.6
> > > >  #7  0x0000000000000000 in  ()
> > > > 
> > > > To handle the non-multifd case, we check whether mis->from_src_file
> > > > is non-NULL.
> > > 
> > > So, what happens with this fix, does the destination exit cleanly, or
> > > stay to accept another connection or what?
> > 
> > It stays around and can accept another connection, which is actually
> > quite desirable, as it limits impact of a malicious client from DOS'ing
> > the genuine client by racing to connect first.
> 
> OK, that's fine, although I suspect in most other cases QEMU exits on
> the destination with an error (e.g. if you aren't using TLS and just
> connect to the port and squirt garbage down it, I think it'll exit
> telling you that it's not a migration header).

That's ok as non-TLS has zero security anyway, so improving resilience
only really matters for TLS setups, prior to authorizing the client.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Qemu-devel] [PATCH] migration: fix crash in when incoming client channel setup fails
  2018-06-19 15:58   ` Daniel P. Berrangé
  2018-06-19 16:00     ` Dr. David Alan Gilbert
@ 2018-06-19 16:28     ` Daniel P. Berrangé
  2018-06-19 16:31     ` Eric Blake
  2 siblings, 0 replies; 7+ messages in thread
From: Daniel P. Berrangé @ 2018-06-19 16:28 UTC (permalink / raw)
  To: Dr. David Alan Gilbert; +Cc: qemu-devel, Juan Quintela

On Tue, Jun 19, 2018 at 04:58:59PM +0100, Daniel P. Berrangé wrote:
> On Tue, Jun 19, 2018 at 04:45:05PM +0100, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrangé (berrange@redhat.com) wrote:
> > > The way we determine if we can start the incoming migration was
> > > changed to use migration_has_all_channels() in:
> > > 
> > >   commit 428d89084c709e568f9cd301c2f6416a54c53d6d
> > >   Author: Juan Quintela <quintela@redhat.com>
> > >   Date:   Mon Jul 24 13:06:25 2017 +0200
> > > 
> > >     migration: Create migration_has_all_channels
> > > 
> > > This method in turn calls multifd_recv_all_channels_created()
> > > which is hardcoded to always return 'true' when multifd is
> > > not in use.
> > > 
> > > This means that if channel initialization fails with normal
> > > migration, it'll never notice and attempt to start the
> > > incoming migration regardless.
> > 
> > I'm not sure if that was actually the cause, because older versions
> > had no check at all in socket_accept_incoming_migration.
> 
> Hmm, actually your write - it was a complicated series of refactorings
> for multifd, and I think I mis-identified. I'll do a proper bisect so
> we can have an accurate record.

Ok, this commit above introduces the latent bug, and then it is
activated by

  commit 36c2f8be2c4eb0003ac77a14910842b7ddd7337e
  Author: Juan Quintela <quintela@redhat.com>
  Date:   Wed Mar 7 08:40:52 2018 +0100

    migration: Delay start of migration main routines
    

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Qemu-devel] [PATCH] migration: fix crash in when incoming client channel setup fails
  2018-06-19 15:58   ` Daniel P. Berrangé
  2018-06-19 16:00     ` Dr. David Alan Gilbert
  2018-06-19 16:28     ` Daniel P. Berrangé
@ 2018-06-19 16:31     ` Eric Blake
  2 siblings, 0 replies; 7+ messages in thread
From: Eric Blake @ 2018-06-19 16:31 UTC (permalink / raw)
  To: Daniel P. Berrangé, Dr. David Alan Gilbert; +Cc: qemu-devel, Juan Quintela

On 06/19/2018 10:58 AM, Daniel P. Berrangé wrote:

>> So, what happens with this fix, does the destination exit cleanly, or
>> stay to accept another connection or what?
> 
> It stays around and can accept another connection, which is actually
> quite desirable, as it limits impact of a malicious client from DOS'ing
> the genuine client by racing to connect first.

Compare to NBD which had a CVE created for the DOS aspect of a malicious 
client connecting first; see commit 0c9390d9, CVE-2017-9524.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2018-06-19 16:31 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-19 15:17 [Qemu-devel] [PATCH] migration: fix crash in when incoming client channel setup fails Daniel P. Berrangé
2018-06-19 15:45 ` Dr. David Alan Gilbert
2018-06-19 15:58   ` Daniel P. Berrangé
2018-06-19 16:00     ` Dr. David Alan Gilbert
2018-06-19 16:02       ` Daniel P. Berrangé
2018-06-19 16:28     ` Daniel P. Berrangé
2018-06-19 16:31     ` Eric Blake

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.