From: Peter Maydell <peter.maydell@linaro.org>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>,
	QEMU Developers <qemu-devel@nongnu.org>,
	Juan Quintela <quintela@redhat.com>
Subject: Re: [Qemu-devel] [PULL 42/57] Page request: Consume pages off the post-copy queue
Date: Thu, 12 Nov 2015 12:57:28 +0000
Message-ID: <CAFEAcA80SQ1doqnvkev7WxvFyzgdsu3f4=tKhOTaq5s71Y18rg@mail.gmail.com>
In-Reply-To: <20151112122318.GF2754@work-vm>

On 12 November 2015 at 12:23, Dr. David Alan Gilbert
<dgilbert@redhat.com> wrote:
> * Peter Maydell (peter.maydell@linaro.org) wrote:
>> On 12 November 2015 at 12:04, Dr. David Alan Gilbert
>> <dgilbert@redhat.com> wrote:
>> > * Peter Maydell (peter.maydell@linaro.org) wrote:
>> >> On 10 November 2015 at 14:25, Juan Quintela <quintela@redhat.com> wrote:
>> >> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>> >> >
>> >> > When transmitting RAM pages, consume pages that have been queued by
>> >> > MIG_RPCOMM_REQPAGE commands and send them ahead of normal page scanning.
>> >> >
>> >> > Note:
>> >> >   a) After a queued page the linear walk carries on from after the
>> >> > unqueued page; there is a reasonable chance that the destination
>> >> > was about to ask for other closeby pages anyway.
>> >> >
>> >> >   b) We have to be careful of any assumptions that the page walking
>> >> > code makes, in particular it does some short cuts on its first linear
>> >> > walk that break as soon as we do a queued page.
>> >> >
>> >> >   c) We have to be careful to not break up host-page size chunks, since
>> >> > this makes it harder to place the pages on the destination.
>> >> >
>> >> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> >> > Reviewed-by: Juan Quintela <quintela@redhat.com>
>> >> > Signed-off-by: Juan Quintela <quintela@redhat.com>
>> >>
>> >> I've just discovered that this is causing 'make check' failures on
>> >> my OSX host (unfortunately something in my setup is causing
>> >> 'make check' failures to not always cause a build failure, so I
>> >> didn't notice earlier):
>> >
>> > It's only failing on OSX? Every time or only sometimes?
>>
>> Only OSX, and always. I think OSX is pickier about mutexes really
>> needing to be initialized before use.
>
> OK, at least an 'always' should be easier to debug.
>
>> > If you can find a way to get a backtrace off that qemu_mutex_lock case
>> > that would be great; I'd assume the later errors are the fall out from that.
>>
>> I'll have a look after lunch, but it's usually painful to get a
>> backtrace out of this kind of qtest, because it's clearly starting
>> a whole pile of QEMUs and there's no way I know of to say "only
>> run a few of these tests, not the whole huge pile".
>
> You could add an abort/assert into util/qemu-thread-posix.c qemu_mutex_lock
> in the error path.

abort/assert doesn't print a backtrace.

I added some OSX backtrace-gathering/printing functions to the error path,
and got this:

0   qemu-system-x86_64                  0x000000010c66d203 qemu_mutex_lock + 83
1   qemu-system-x86_64                  0x000000010c2ac7af unqueue_page + 47
2   qemu-system-x86_64                  0x000000010c2ac386 get_queued_page + 54
3   qemu-system-x86_64                  0x000000010c2ac135 ram_find_and_save_block + 165
4   qemu-system-x86_64                  0x000000010c2ab5a2 ram_save_iterate + 130
5   qemu-system-x86_64                  0x000000010c2afa2e qemu_savevm_state_iterate + 302
6   qemu-system-x86_64                  0x000000010c53acbb migration_thread + 571
7   libsystem_pthread.dylib             0x00007fff9146c05a _pthread_body + 131
8   libsystem_pthread.dylib             0x00007fff9146bfd7 _pthread_body + 0
9   libsystem_pthread.dylib             0x00007fff914693ed thread_start + 13
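
For readers who want to reproduce this kind of trace: a minimal sketch of
printing a backtrace from a lock wrapper's error path, assuming the
backtrace() and backtrace_symbols_fd() calls from <execinfo.h> (available
on OS X and glibc). The helper names dump_backtrace and checked_mutex_lock
are hypothetical; this is not the code that was actually added to QEMU for
the run above.

```c
#include <assert.h>
#include <execinfo.h>   /* backtrace(), backtrace_symbols_fd() */
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define MAX_FRAMES 64

/* Hypothetical helper: dump the current call stack to stderr.
 * Returns the number of frames captured. */
static int dump_backtrace(void)
{
    void *frames[MAX_FRAMES];
    int n = backtrace(frames, MAX_FRAMES);

    assert(n >= 0 && n <= MAX_FRAMES);
    /* Writes straight to the file descriptor instead of returning
     * allocated strings, which suits an error path. */
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
    return n;
}

/* Hypothetical stand-in for a lock wrapper: on failure, print a
 * backtrace and the error string. Returns 0 or the pthread error code. */
static int checked_mutex_lock(pthread_mutex_t *mutex)
{
    int err = pthread_mutex_lock(mutex);

    if (err) {
        dump_backtrace();
        fprintf(stderr, "checked_mutex_lock: %s\n", strerror(err));
    }
    return err;
}
```

On a properly initialized mutex checked_mutex_lock() returns 0 and prints
nothing; the backtrace only appears on the failure path.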


>
> Could you also add:
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 9bd2ce7..85e5766 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -93,6 +93,7 @@ MigrationState *migrate_get_current(void)
>      };
>
>      if (!once) {
> +        fprintf(stderr,"migrate_get_current do init of current_migration %d\n", getpid());
>          qemu_mutex_init(&current_migration.src_page_req_mutex);
>          once = true;
>      }
> diff --git a/migration/ram.c b/migration/ram.c
> index 4266687..72b46f2 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -1036,6 +1036,7 @@ static RAMBlock *unqueue_page(MigrationState *ms, ram_addr_t *offset,
>  {
>      RAMBlock *block = NULL;
>
> +    fprintf(stderr,"unqueue_page %d\n", getpid());
>      qemu_mutex_lock(&ms->src_page_req_mutex);
>      if (!QSIMPLEQ_EMPTY(&ms->src_page_requests)) {
>          struct MigrationSrcPageRequest *entry =
>
>
> and make sure that the init happens before the first unqueue (you'll get
> loads of calls to unqueue).

With that change, plus the backtracing:

/x86_64/ahci/flush/retry: OK
/x86_64/ahci/flush/migrate: migrate_get_current do init of current_migration 60427
migrate_get_current do init of current_migration 60428
unqueue_page 60427
0   qemu-system-x86_64                  0x0000000101a751c3 qemu_mutex_lock + 83
1   qemu-system-x86_64                  0x00000001016b4749 unqueue_page + 89
2   qemu-system-x86_64                  0x00000001016b42f6 get_queued_page + 54
3   qemu-system-x86_64                  0x00000001016b40a5 ram_find_and_save_block + 165
4   qemu-system-x86_64                  0x00000001016b3512 ram_save_iterate + 130
5   qemu-system-x86_64                  0x00000001016b79be qemu_savevm_state_iterate + 302
6   qemu-system-x86_64                  0x0000000101942c7b migration_thread + 571
7   libsystem_pthread.dylib             0x00007fff9146c05a _pthread_body + 131
8   libsystem_pthread.dylib             0x00007fff9146bfd7 _pthread_body + 0
9   libsystem_pthread.dylib             0x00007fff914693ed thread_start + 13
qemu: qemu_mutex_lock: Invalid argument
qemu-system-x86_64:Broken pipe
 Not a migration stream
qemu-system-x86_64: load of migration failed: Invalid argument
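
In this output the init line for pid 60427 is printed before its first
unqueue_page line, yet the lock still fails. Rather than relying on the
order of stderr lines, the same init-before-lock check could be made in
code. A minimal sketch, assuming a hypothetical TracedMutex wrapper that
keeps a flag next to the mutex; none of these names exist in QEMU.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical wrapper: a mutex plus a flag recording that init ran. */
typedef struct {
    pthread_mutex_t lock;
    bool initialized;
} TracedMutex;

/* Initialize the mutex, record success, and log the calling pid. */
static int traced_mutex_init(TracedMutex *m, const char *who)
{
    int err;

    assert(m != NULL);
    err = pthread_mutex_init(&m->lock, NULL);
    m->initialized = (err == 0);
    fprintf(stderr, "%s: init %d\n", who, (int)getpid());
    return err;
}

/* Returns -1 without touching the mutex if no init has been recorded,
 * otherwise the result of pthread_mutex_lock(). */
static int traced_mutex_lock(TracedMutex *m, const char *who)
{
    assert(m != NULL);
    if (!m->initialized) {
        fprintf(stderr, "%s: lock before init %d\n", who, (int)getpid());
        return -1;
    }
    return pthread_mutex_lock(&m->lock);
}
```

If the flag is clear at lock time, the structure was either never
initialized or was reset after initialization; if the flag is set and the
lock still fails, the cause lies elsewhere.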

thanks
-- PMM
