Date: Wed, 16 Jul 2014 14:31:53 +0200
From: Paolo Bonzini <pbonzini@redhat.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: aarcange@redhat.com, yamahata@private.email.ne.jp,
 lilei@linux.vnet.ibm.com, qemu-devel@nongnu.org, quintela@redhat.com
Subject: Re: [Qemu-devel] [PATCH 07/46] Return path: Open a return path
 on QEMUFile for sockets
Message-ID: <53C670B9.7030008@redhat.com>
In-Reply-To: <20140716115233.GH2514@work-vm>

On 16/07/2014 13:52, Dr. David Alan Gilbert wrote:
> * Paolo Bonzini (pbonzini@redhat.com) wrote:
>> On 16/07/2014 11:37, Dr. David Alan Gilbert wrote:
>>>>>
>>>>> +
>>>>> +    /* If it's already open, return it */
>>>>> +    if (qfs->file->return_path) {
>>>>> +        return qfs->file->return_path;
>>>>
>>>> Wouldn't this leave a dangling file descriptor if you call
>>>> socket_dup_return_path twice, and then close the original QEMUFile?
>>>
>>> Hmm - how?
>>
>> The problem is that there is no reference count on QEMUFile, so if
>> you do
>>
>>     f1 = open_return_path(f0);
>>     f2 = open_return_path(f0);
>>     /* now f1 == f2 */
>>     qemu_fclose(f1);
>>     /* now f2 is dangling */
>
> I think from the way I'm using it, I can remove the optimisation, but
> I do need to check.
>
> I'm not too sure what your worry is about 'f2' in this case; I guess
> the caller needs to know that it should only close the return path
> once - is that your worry?

Yes.  The API is not well encapsulated; a "random" caller of
open_return_path does not know (and cannot know) whether it should
close the returned file or not.
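(A minimal standalone sketch of the reference-counting idea described
above; RPFile, open_return_path and rp_close here are illustrative
stand-ins, not the real QEMUFile API:

    #include <stdio.h>
    #include <stdlib.h>

    /* Toy stand-in for QEMUFile; only the refcounting is shown. */
    typedef struct RPFile RPFile;
    struct RPFile {
        int refs;               /* references held on this return path */
        RPFile *return_path;    /* lazily created, shared by callers */
    };

    static RPFile *open_return_path(RPFile *f)
    {
        if (!f->return_path) {
            f->return_path = calloc(1, sizeof(RPFile));
        }
        f->return_path->refs++;    /* every caller takes a reference */
        return f->return_path;
    }

    static void rp_close(RPFile *rp)
    {
        /* Only the last close releases the object; a full version
         * would also clear the owner's return_path pointer. */
        if (--rp->refs == 0) {
            free(rp);
        }
    }

    int main(void)
    {
        RPFile f0 = { 0 };
        RPFile *f1 = open_return_path(&f0);
        RPFile *f2 = open_return_path(&f0);  /* f1 == f2, refs == 2 */
        rp_close(f1);                        /* f2 is still valid */
        printf("refs left: %d\n", f2->refs); /* prints 1 */
        rp_close(f2);                        /* now actually freed */
        return 0;
    }

With this shape every caller pairs one open with one close, and no
caller needs to know how many other users of the return path exist.)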
> I'm more nervous about dropping that one, because the current scheme
> does provide a clean way of finding the forward path if you've got the
> reverse (although I don't think I make use of it).

If it isn't used, why keep it?

>>> Source side
>>>     Forward path - written by migration thread
>>>        : It's OK for this to be blocking, but we end up with it
>>>        being non-blocking, and modify the socket code to emulate
>>>        blocking.
>>
>> This likely has a performance impact though.  The first migration
>> thread code drop from Juan already improved throughput a lot, even if
>> it kept the iothread all the time and only converted from nonblocking
>> writes to blocking.
>
> Can you give some more reasoning as to why you think this will hit the
> performance so much; I thought the output buffers were quite big
> anyway.

I don't really know, it's

>>>     Return path - opened by main thread, read by fd_handler on main
>>>     thread
>>>        : Must be non-blocking so as not to block the main thread
>>>        while waiting for a partially sent command.
>>
>> Why can't you handle this in the migration thread (or a new postcopy
>> thread on the source side)?  Then it can stay blocking.
>
> Handling it within the migration thread would make it much more
> complicated (which would be bad since it's already complex enough);

Ok.  I'm not sure why it is more complicated, since migration is
essentially two-phase - one phase where the source drives it and one
where the source just waits for requests - but I'll trust you on
this. :)

>>> Destination side
>>>     Forward path - read by main thread
>>
>> This must be nonblocking so that the monitor keeps responding.
>
> Interesting, I suspect none of the code in there is set up for that at
> the moment, so how does that work during migration at the moment?

It sure is. :)  On the destination side, migration is done in a
coroutine (see process_incoming_migration), so it's all transparent.
Only socket_get_buffer has to know about this:

    len = qemu_recv(s->fd, buf, size, 0);
    if (len != -1) {
        break;
    }
    if (socket_error() == EAGAIN) {
        yield_until_fd_readable(s->fd);
    } else if (socket_error() != EINTR) {
        break;
    }

If the socket is put in blocking mode, recv will never return EAGAIN,
so this code only runs when the socket is nonblocking.

> Actually, I see I missed something here; this should be:
>
> Destination side
>     Forward path - read by main thread, and listener thread (see the
>     separate mail that described that listener thread)
>
> and that means that once postcopy is going (and the listener thread
> started) it can't block the monitor.

Ok, so the listener thread can do socket_set_block(qemu_get_fd(file))
once it gets its hands on the QEMUFile.

>>>     Return path - opened by main thread, written by main thread AND
>>>     postcopy thread (protected by rp_mutex)
>>
>> When does the main thread need to write?
>
> Not much; the only things the main thread currently responds to are
> the ReqAck (ping-like) requests; those are turning out to be very
> useful during debug.  I've also got the ability for the destination to
> send a migration result back to the source, which seems useful to be
> able to 'fail' early.

Why can't this be done in the listener thread?  (Thus transforming it
into a more general postcopy migration thread; later we could even
change incoming migration from a coroutine to a thread.)

>> If it doesn't need that, you can just switch to blocking when you
>> process the listen command (i.e. when the postcopy thread starts).
>
> Why don't I just do it anyway? Prior to postcopy starting we're in the
> same situation as we're in with precopy today, so we can already get
> main-thread blocking.

See above for the explanation.

Paolo
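(A footnote on the rp_mutex point above: a minimal sketch of
serializing the two return-path writers, assuming pthreads and a plain
socket fd; rp_send_command and rp_fd are illustrative names, not taken
from the patch series:

    #include <pthread.h>
    #include <unistd.h>

    static pthread_mutex_t rp_mutex = PTHREAD_MUTEX_INITIALIZER;
    static int rp_fd;   /* return-path socket, assumed already open */

    /* Called from both the main thread and the postcopy/listener
     * thread; the mutex is held across the whole command so partial
     * writes from the two writers can never interleave on the wire. */
    static void rp_send_command(const void *buf, size_t len)
    {
        const char *p = buf;

        pthread_mutex_lock(&rp_mutex);
        while (len > 0) {
            ssize_t n = write(rp_fd, p, len);
            if (n < 0) {
                break;   /* a real version would retry EINTR/EAGAIN */
            }
            p += n;
            len -= (size_t)n;
        }
        pthread_mutex_unlock(&rp_mutex);
    }

This is also where the blocking-versus-nonblocking choice shows up: on
a blocking fd the loop usually finishes in one write(), while on a
nonblocking fd a partially sent command keeps the lock held while the
writer waits for the socket to drain.)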