From mboxrd@z Thu Jan 1 00:00:00 1970
From: Keir Fraser
Subject: Re: [PATCH] Require that xenstored writes to a domain complete in a single chunk
Date: Mon, 26 Feb 2007 17:20:34 +0000
References: <871wkcyjig.fsf@apfelstrudel.hh.sledj.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
In-Reply-To: <871wkcyjig.fsf@apfelstrudel.hh.sledj.net>
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: David Edmondson, xen-devel@lists.xensource.com
List-Id: xen-devel@lists.xenproject.org

On 26/2/07 16:24, "David Edmondson" wrote:

> If xenstored is part-way through writing a header+payload into the
> buffer shared with a guest domain when the guest domain decides to
> suspend, the buffer is corrupted, as xenstored doesn't know that it
> has a partial write to complete when the domain revives. The domain
> is expecting proper completion of the partial header+payload and is
> disappointed.
>
> The attached patch modifies xenstored such that it checks for
> sufficient space for header+payload before making any changes to the
> shared buffer.
>
> It is against 3.0.4-1, but the code in unstable looks the same.

This seems dubious. There's no reason we might not have payloads bigger
than the ring size (which is only 1kB). The right fix would be in the
guest, which should already be stopping any transactions or commands
across save/restore. Does this problem occur when xenstored sends an
asynchronous watch-fired message? Probably the packet-reading thread
should be interrupted and put to sleep before suspending.

For older guest compatibility perhaps we can take a variant of your
patch that only waits for enough space if the entire message fits in
the ring in one go. This would be 'best-effort' at compatibility while
not precluding use of larger messages in general.

 -- Keir
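
For illustration only, the 'best-effort' variant described above could be
sketched roughly as follows. This is not the patch under discussion and
not xenstored code: it assumes a 1024-byte ring (the "only 1kB" figure
above) with free-running 32-bit producer/consumer indices, and the names
RING_SIZE, ring_free_space and should_wait_for_space are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: ring capacity in bytes, taken from the "only 1kB" remark. */
#define RING_SIZE 1024u

/*
 * Assumption: cons and prod are free-running counters that are only
 * reduced modulo RING_SIZE when indexing the buffer, so the unsigned
 * difference prod - cons is the number of bytes currently queued,
 * even after the counters wrap.
 */
static size_t ring_free_space(uint32_t cons, uint32_t prod)
{
    return RING_SIZE - (size_t)(uint32_t)(prod - cons);
}

/*
 * Decide whether the writer should leave the ring untouched for now.
 *
 * - A message larger than the whole ring can never be written in one
 *   go, so it is not held back: it goes out in chunks as before.
 * - A message that could fit is held back until there is room for all
 *   of it (header + payload), so a suspend cannot observe half of it.
 */
static bool should_wait_for_space(uint32_t cons, uint32_t prod, size_t msg_len)
{
    if (msg_len > RING_SIZE)
        return false;
    return ring_free_space(cons, prod) < msg_len;
}
```

Here msg_len is meant to be the header plus payload length. Oversized
messages keep today's chunked behaviour, which is why this is only
best-effort for older guests.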