On 2/5/19 11:07 AM, Vladimir Sementsov-Ogievskiy wrote:

>> Or, put another way, we KNOW we have (corner) cases where a mis-aligned
>> image can currently cause the server to return BLOCK_STATUS replies that
>> aren't aligned to the advertised minimum block size.  Attempting to
>> read the last sector of an image then causes the client to see the
>> misaligned reply and complain, which we are treating as fatal.
> 
> Do we have fixes for it?

Not yet - it's still on my queue of things to fix after I get libvirt
incremental backup APIs in.  Might make 4.0, might not (but not the end
of the world; it's been known-broken since 3.0, so it's not a new
regression).

> 
>> But why
>> not instead just fail that particular read, but still attempt a
>> reconnect, in order to attempt further reads elsewhere in the image that
>> do not trip up the server's misaligned reply?
>>
> 
> Hmm, for these cases, if we consider these errors non-fatal, we don't
> even need a reconnect.

Well, it all depends on whether the client is still in sync with the
server. If either side has disobeyed the spec and sent too many or too few
bytes compared to what the other side expects, you'll have magic number
mismatches, where you really DO need a reconnect to get back in sync.

> 
> If we want to consider protocol errors to be recoverable, we need to reconnect
> only on a wrong magic and maybe on some kind of inconsistent variable data lengths.
> 
> And it may need an additional option, like --strict=false

An option on how hard to try may be reasonable.
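
To make the distinction concrete, here is a rough sketch of how a client
might pick a recovery action. It is illustrative only and is not taken
from the qemu code: classify_reply(), the Action enum, and the strict
parameter (standing in for the proposed --strict option) are hypothetical
names; only the two reply magic values come from the NBD protocol.

```python
import enum

# Reply magic numbers as defined by the NBD protocol.
NBD_SIMPLE_REPLY_MAGIC = 0x67446698
NBD_STRUCTURED_REPLY_MAGIC = 0x668E33EF


class Action(enum.Enum):
    """Hypothetical recovery actions for a client."""
    OK = "ok"
    FAIL_REQUEST = "fail this request only"
    RECONNECT = "reconnect to resynchronize"
    FATAL = "give up on the connection"


def classify_reply(magic, extent_lengths, min_block, strict=True):
    """Pick a recovery action for one BLOCK_STATUS reply (sketch only).

    magic          -- magic number read from the reply header
    extent_lengths -- extent lengths reported in the reply
    min_block      -- minimum block size advertised by the server
    strict         -- hypothetical knob: treat any violation as fatal
    """
    # A known magic means both sides still agree on message boundaries.
    in_sync = magic in (NBD_SIMPLE_REPLY_MAGIC, NBD_STRUCTURED_REPLY_MAGIC)
    # Every extent must be a multiple of the advertised minimum block size.
    aligned = all(length % min_block == 0 for length in extent_lengths)

    if in_sync and aligned:
        return Action.OK
    if strict:
        return Action.FATAL
    # Still in sync: only this request is bad. Out of sync: reconnect.
    return Action.FAIL_REQUEST if in_sync else Action.RECONNECT
```

With strict=False, a misaligned extent on an otherwise well-framed reply
fails just that request, while an unexpected magic forces a reconnect.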

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org