From: Kirill Smelkov
Subject: Re: [RESEND, PATCH v2] fuse: Don't drop NOTIFY_REPLY if we promised it
To: Miklos Szeredi
Cc: Miklos Szeredi, fuse-devel, Han-Wen Nienhuys, Jakob Unterwurzacher, stable
Date: Thu, 07 Mar 2019 09:34:28 +0000
Message-Id: <20190307093421.GA4620@deco.navytux.spb.ru>
References: <20190219094147.32734-1-kirr@nexedi.com> <20190227200155.GA14682@deco.navytux.spb.ru> <20190227203903.GA2798@deco.navytux.spb.ru> <20190228114757.GA2796@deco.navytux.spb.ru>
In-Reply-To: <20190228114757.GA2796@deco.navytux.spb.ru>
X-Mailing-List: linux-fsdevel@vger.kernel.org

On Thu, Feb 28, 2019 at 02:47:57PM +0300, Kirill Smelkov wrote:
> On Thu, Feb 28, 2019 at 09:10:15AM +0100, Miklos Szeredi wrote:
> > On Wed, Feb 27, 2019 at 9:39 PM Kirill Smelkov wrote:
> >
> > > I more or less agree with this statement.
> > > However, can we please make the
> > > breakage explicitly visible with an error instead of exhibiting it
> > > via harder-to-debug hangs/deadlocks? For example sys_read < max_write
> > > -> error instead of getting stuck. And if notify_retrieve requests a
> > > buffer larger than max_write -> error or cut to max_write, but don't
> > > return OK when we know we will never send what was requested to the
> > > filesystem even if it uses max_write sized reads. What is the point of
> > > breaking in a hard-to-diagnose way when we can make the breakage show
> > > itself explicitly? Would a patch for such behaviour be accepted?
> >
> > Sure, if it only adds a couple of lines. Adding more than say ten
> > lines for such a non-bug fix is definitely excessive.
>
> Ok, thanks. Please consider applying the following patch. (It's a bit of a
> pity to hear the problem is not considered to be a bug, but anyway).
>
> I will also send the second patch as another mail, since I could not
> make `git am --scissors` apply several patches extracted from one
> mail successfully.

Ping. Miklos, is there anything wrong with this patch and its second
counterpart?

Thanks in advance for feedback,
Kirill

> ---- 8< ----
> From: Kirill Smelkov
> Date: Thu, 28 Feb 2019 13:06:18 +0300
> Subject: [PATCH 1/2] fuse: retrieve: cap requested size to negotiated
>  max_write
> MIME-Version: 1.0
> Content-Type: text/plain; charset=utf-8
> Content-Transfer-Encoding: 8bit
>
> The FUSE filesystem server and the kernel client negotiate, during the
> initialization phase, the maximum write size the client will ever issue.
> Correspondingly the filesystem server then queues sys_read calls to read
> requests with buffer capacity large enough to carry the request header
> + max_write bytes. A filesystem server is free to set its max_write
> anywhere in the range [1·page, fc->max_pages·page].
> In particular go-fuse[2] sets max_write to 64K by default, whereas the
> default fc->max_pages corresponds to 128K. Libfuse also allows users to
> configure max_write, but by default presets it to the possible maximum.
>
> If max_write is < fc->max_pages·page, and in the NOTIFY_RETRIEVE handler we
> allow retrieving more than max_write bytes, the corresponding prepared
> NOTIFY_REPLY will be thrown away by fuse_dev_do_read, because the
> filesystem server, in full correspondence with the server/client contract,
> will only be queuing sys_read with ~max_write buffer capacity, and
> fuse_dev_do_read throws away requests that cannot fit into the server
> request buffer. In turn the filesystem server could get stuck waiting
> indefinitely for NOTIFY_REPLY, since the NOTIFY_RETRIEVE handler returned
> OK, which clients understand to mean that NOTIFY_REPLY was queued and will
> be sent back.
>
> -> Cap requested size to negotiated max_write to avoid the problem.
> This aligns with the way the NOTIFY_RETRIEVE handler works, which already
> unconditionally caps requested retrieve size to fuse_conn->max_pages.
> This way it should not hurt NOTIFY_RETRIEVE semantics if we return less
> data than was originally requested.
>
> Please see [1] for context where the problem of a stuck filesystem was hit
> for real, how the situation was traced, and for a more involved patch that
> did not make it into the tree.
>
> [1] https://marc.info/?l=linux-fsdevel&m=155057023600853&w=2
> [2] https://github.com/hanwen/go-fuse
>
> Signed-off-by: Kirill Smelkov
> Cc: Han-Wen Nienhuys
> Cc: Jakob Unterwurzacher
> Cc: # v2.6.36+
> ---
>  fs/fuse/dev.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
> index 8a63e52785e9..38e94bc43053 100644
> --- a/fs/fuse/dev.c
> +++ b/fs/fuse/dev.c
> @@ -1749,7 +1749,7 @@ static int fuse_retrieve(struct fuse_conn *fc, struct inode *inode,
>  	offset = outarg->offset & ~PAGE_MASK;
>  	file_size = i_size_read(inode);
>
> -	num = outarg->size;
> +	num = min(outarg->size, fc->max_write);
>  	if (outarg->offset > file_size)
>  		num = 0;
>  	else if (outarg->offset + num > file_size)
> --
> 2.21.0.352.gf09ad66450
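
For readers following the diff, the size computation after the patch can be
modelled in userspace roughly as below. This is only a sketch under stated
assumptions: retrieve_len() and its parameter names are hypothetical, not
kernel API, and the final clamp to the remaining file size is assumed, since
the diff context is cut off right after the `else if` line.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of how fuse_retrieve() picks the number of
 * bytes to send back. Not kernel code; all names here are illustrative.
 */
uint64_t retrieve_len(uint64_t offset, uint32_t size,
		      uint32_t max_write, uint64_t file_size)
{
	/* The cap added by the patch: never promise more than max_write. */
	uint64_t num = size < max_write ? size : max_write;

	if (offset > file_size)
		num = 0;			/* request starts past EOF */
	else if (offset + num > file_size)
		num = file_size - offset;	/* assumed: clamp to bytes left */

	/* The reply payload now fits a max_write-sized server buffer. */
	assert(num <= max_write);
	return num;
}
```

With max_write at 64K, a 128K retrieve request at offset 0 of a 1M file
yields 64K in this model, so the reply can fit a server buffer sized for the
request header + max_write.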