From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kirill Smelkov
Subject: Re: [RESEND, PATCH v2] fuse: Don't drop NOTIFY_REPLY if we promised it
To: Miklos Szeredi
Cc: Miklos Szeredi, fuse-devel, Han-Wen Nienhuys, Jakob Unterwurzacher, stable
Message-Id: <20190227200155.GA14682@deco.navytux.spb.ru>
References: <20190219094147.32734-1-kirr@nexedi.com>
Date: Wed, 27 Feb 2019 20:02:04 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Sender: linux-fsdevel-owner@vger.kernel.org
X-Mailing-List: linux-fsdevel@vger.kernel.org

Miklos,

first of all, thanks for the feedback.

On Tue, Feb 26, 2019 at 04:14:22PM +0100, Miklos Szeredi wrote:
> On Tue, Feb 19, 2019 at 10:42 AM Kirill Smelkov wrote:
> >
> > A successful call to NOTIFY_RETRIEVE by filesystem carries promise from
> > the kernel to send back NOTIFY_REPLY message. However if the filesystem
> > is not reading requests with fuse_conn->max_pages capacity,
>
> That's a violation of the contract by the fuse server, not the kernel.

Do you mean that even if the filesystem server declares via
init_out.max_write that it accepts, e.g., only 32K writes, it still has
to issue sys_read with a 128K buffer (= the hardcoded
fuse_conn->max_pages before Linux 4.20, and the default since Linux
4.20)?

Also, I could not find the FUSE contract specified anywhere, so I used
the message of the commit that added NOTIFY_RETRIEVE support to infer
its semantics:

    commit 2d45ba381a74a743eeaa2b06c7c5c0d2bf73ba1a
    Author: Miklos Szeredi
    Date:   Mon Jul 12 14:41:40 2010 +0200

        fuse: add retrieve request

        Userspace filesystem can request data to be retrieved from the
        inode's mapping. This request is synchronous and the retrieved
        data is queued as a new request. If the write to the fuse device
        returns an error then the retrieve request was not completed and
        a reply will not be sent.

        Only present pages are returned in the retrieve reply. Retrieving
        stops when it finds a non-present page and only data prior to
        that is returned.

        This request doesn't change the dirty state of pages.

        Signed-off-by: Miklos Szeredi

which, even if not explicitly, gives the impression that if
NOTIFY_RETRIEVE was queued successfully, the reply will come.

Also: if it is a violation of the contract by the filesystem server,
shouldn't the kernel return ESOMETHING for the violating sys_read
instead of making that read wait indefinitely?

In summary: instead of silently getting clients stuck, I still suggest
that the reply to NOTIFY_RETRIEVE be delivered if it can be, and that
EINVAL be returned for /dev/fuse sys_read calls that violate the
server/kernel contract.

> > fuse_dev_do_read might see that the "request is too large" and decide to
> > "reply with an error and restart the read". "Reply with an error" has
> > underlying assumption that there is a "requester thread" that is waiting
> > for request completion, which is true for most requests, but is not true
> > for NOTIFY_REPLY: NOTIFY_RETRIEVE handler completes with OK status right
> > after it could successfully queue NOTIFY_REPLY message without waiting
> > for NOTIFY_REPLY completion. This leads to situation when filesystem
> > requested to retrieve inode data with NOTIFY_RETRIEVE, got err=OK for
> > that notification request, but NOTIFY_REPLY is not coming back.
> >
> > More, since there is no "requester thread" to handle the error, the
> > situation shows itself as /sys/fs/fuse/connections/X/waiting=1 _and_
> > /dev/fuse read(s) queued. Which is misleading since NOTIFY_REPLY request
> > was removed from pending queue and abandoned.
>
> Now I don't understand how that would happen. If the request is
> abandoned, its refcount should go down to zero and the num_waiting
> count decremented accordingly.

You are right - it was my mistake. I misinterpreted waiting=1 as a
request that had not yet been transferred to the filesystem server, but
in my test it turned out to have been transferred already and to be
sitting on the client "processing" list, waiting for the corresponding
reply (which was not coming because the filesystem in turn was stuck
waiting for NOTIFY_REPLY to come from the kernel):

    root@(none):~/src/wendelin/wendelin.core/wcfs# cat /sys/fs/fuse/connections/25/waiting
    1
    root@(none):~/src/wendelin/wendelin.core/wcfs# cat /sys/fs/fuse/connections/25/queue
    #waiting: 1
    (0)Interrupt:
    (0)Forget:
    (0)Request:
    (0)Processing.IO:
    (1)Processing.P:
            #0: .52 R15 i5

So the part of the commit message that discussed X/waiting=1 is wrong
and should be dropped - thanks for catching this.

Still, the main point is what the semantics of NOTIFY_RETRIEVE vs
NOTIFY_REPLY vs INIT.max_write should be, and that it is better to
always send the retrieve data if the client promised it, and also to
explicitly indicate with an error if the filesystem server is violating
the FUSE server/client contract.

Thanks,
Kirill

P.S. I attach the draft patch for /sys/fs/fuse/connections/X/queue in
case someone is interested.

---- 8< ----
diff --git a/fs/fuse/control.c b/fs/fuse/control.c
index fe80bea4ad89..f4e22f5436e2 100644
--- a/fs/fuse/control.c
+++ b/fs/fuse/control.c
@@ -63,6 +63,160 @@ static ssize_t fuse_conn_waiting_read(struct file *file, char __user *buf,
 	return simple_read_from_buffer(buf, len, ppos, tmp, size);
 }
 
+/* fuse_conn_iqueue_print prints input queue into provided buf.
+ *
+ * buf can be NULL in which case only the length of would-be printed text is
+ * returned and nothing is actually printed.
+ *
+ * must be called with fc->iq->waitq locked. */
+static size_t fuse_conn_iqueue_print(char *buf, size_t size, struct fuse_conn *fc)
+{
+	struct fuse_iqueue *fiq = &fc->iq;
+	struct fuse_req *req;
+	struct fuse_forget_link *freq;
+	size_t nreq;
+
+	size_t __n, __total = 0;
+#define emitf(FORMAT, ...) do {				\
+	__n = snprintf(buf, size, FORMAT, __VA_ARGS__);	\
+	__total += __n;					\
+	if (buf) {					\
+		size -= __n;				\
+		buf += __n;				\
+	}						\
+} while (0)
+
+
+	if (!buf)
+		size = 0;
+
+	// XXX temp
+	emitf("#waiting: %d\n", atomic_read(&fc->num_waiting));
+
+	/* interrupts */
+	nreq = 0;
+	list_for_each_entry(req, &fiq->interrupts, list) {
+		nreq++;
+	}
+
+	emitf("(%lu)Interrupt:\n", nreq);
+
+	nreq = 0;
+	list_for_each_entry(req, &fiq->interrupts, list) {
+		emitf("\t#%lu: R%d\n", nreq, req->in.h.opcode);
+		nreq++;
+	}
+
+	/* forgets */
+	nreq = 0;
+	for (freq = fiq->forget_list_head.next; freq != NULL; freq = freq->next) {
+		nreq++;
+	}
+
+	emitf("(%lu)Forget:\n", nreq);
+
+	nreq = 0;
+	for(freq = fiq->forget_list_head.next; freq != NULL; freq = freq->next) {
+		emitf("\t#%lu: FORGET i%llu -%llu\n", nreq,
+				freq->forget_one.nodeid, freq->forget_one.nlookup);
+		nreq++;
+	}
+
+	/* all other requests */
+	nreq = 0;
+	list_for_each_entry(req, &fiq->pending, list) {
+		nreq++;
+	}
+
+	emitf("(%lu)Request:\n", nreq);
+
+	nreq = 0;
+	list_for_each_entry(req, &fiq->pending, list) {
+		emitf("\t#%lu: R%d\n", nreq, req->in.h.opcode);
+		nreq++;
+	}
+
+	/* processing */
+	// XXX temp? XXX locking
+	{
+		int i;
+		struct fuse_dev *fud;
+		list_for_each_entry(fud, &fc->devices, entry) {
+			struct fuse_pqueue *fpq = &fud->pq;
+
+			nreq = 0;
+			list_for_each_entry(req, &fpq->io, list) {
+				nreq++;
+			}
+			emitf("(%lu)Processing.IO:\n", nreq);
+
+			// XXX print IO elements
+
+			nreq = 0;
+			for (i = 0; i < FUSE_PQ_HASH_SIZE; i++) {
+				list_for_each_entry(req, &fpq->processing[i], list) {
+					nreq++;
+				}
+			}
+			emitf("(%lu)Processing.P:\n", nreq);
+
+			nreq = 0;
+			for (i = 0; i < FUSE_PQ_HASH_SIZE; i++) {
+				list_for_each_entry(req, &fpq->processing[i], list) {
+					struct fuse_in_header *h = &req->in.h;
+					emitf("\t#%lu: .%lld R%d i%llu\n", nreq, h->unique, h->opcode, h->nodeid);
+					nreq++;
+				}
+			}
+		}
+	}
+
+	return __total;
+#undef emitf
+}
+
+static ssize_t fuse_conn_iqueue_read(struct file *file, char __user *buf,
+				     size_t len, loff_t *ppos)
+{
+	char *qdump;
+
+	if (!*ppos) {
+		struct fuse_conn *fc = fuse_ctl_file_conn_get(file);
+		struct fuse_iqueue *fiq;
+		size_t n;
+		char *qdump2;
+		if (!fc)
+			return 0;
+
+		fiq = &fc->iq;
+		spin_lock(&fiq->waitq.lock);
+		n = fuse_conn_iqueue_print(NULL, 0, fc);
+		n += 1; /* trailing 0 */
+
+		qdump = kmalloc(n, GFP_ATOMIC);
+		if (qdump) {
+			fuse_conn_iqueue_print(qdump, n, fc);
+		}
+
+		spin_unlock(&fiq->waitq.lock);
+		fuse_conn_put(fc);
+
+		if (!qdump) {
+			return -ENOMEM;
+		}
+
+		/* release atomic memory, since it is scarce resource */
+		qdump2 = kstrdup(qdump, GFP_KERNEL);
+		kfree(qdump);
+
+		file->private_data = (void *)qdump2;
+		// TODO release qdump on file release
+	}
+
+	qdump = (char *)file->private_data;
+	return simple_read_from_buffer(buf, len, ppos, qdump, strlen(qdump));
+}
+
 static ssize_t fuse_conn_limit_read(struct file *file, char __user *buf,
 				    size_t len, loff_t *ppos, unsigned val)
 {
@@ -202,6 +356,12 @@ static const struct file_operations fuse_ctl_waiting_ops = {
 	.llseek = no_llseek,
 };
 
+static const struct file_operations fuse_ctl_queue_ops = {
+	.open = nonseekable_open,
+	.read = fuse_conn_iqueue_read,
+	.llseek = no_llseek,
+};
+
 static const struct file_operations fuse_conn_max_background_ops = {
 	.open = nonseekable_open,
 	.read = fuse_conn_max_background_read,
@@ -278,6 +438,8 @@ int fuse_ctl_add_conn(struct fuse_conn *fc)
 
 	if (!fuse_ctl_add_dentry(parent, fc, "waiting", S_IFREG | 0400, 1,
 				 NULL, &fuse_ctl_waiting_ops) ||
+	    !fuse_ctl_add_dentry(parent, fc, "queue", S_IFREG | 0400, 1,
+				 NULL, &fuse_ctl_queue_ops) ||
 	    !fuse_ctl_add_dentry(parent, fc, "abort", S_IFREG | 0200, 1,
 				 NULL, &fuse_ctl_abort_ops) ||
 	    !fuse_ctl_add_dentry(parent, fc, "max_background", S_IFREG | 0600,
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 0920c0c032a0..7efef59caaa9 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -41,7 +41,7 @@
 #define FUSE_NAME_MAX 1024
 
 /** Number of dentries for each connection in the control filesystem */
-#define FUSE_CTL_NUM_DENTRIES 5
+#define FUSE_CTL_NUM_DENTRIES 6
 
 /** Number of page pointers embedded in fuse_req */
 #define FUSE_REQ_INLINE_PAGES 1