* handling NFSERR_JUKEBOX @ 2001-07-10 0:23 Bob Smart 2001-07-13 14:06 ` Pete Wyckoff 0 siblings, 1 reply; 10+ messages in thread From: Bob Smart @ 2001-07-10 0:23 UTC (permalink / raw) To: trond.myklebust; +Cc: linux-kernel, jeroen, smart The latest version of unicos has started to return NFSERR_JUKEBOX when it has to retrieve a file from tape. I found the following relevant discussion on the netbsd list: http://mail-index.netbsd.org/tech-kern/1999/03/16/0002.html To save you looking the key point is: >Not sure. Is there a way that the server can say, "I got your request, >but I'm too busy now, try again in a little bit." ?? Isn't this what NFSERR_JUKEBOX is for? AFAIK, the protocol goes something like this: Client sends a request. Server starts loading tape/optical disk/whatever. Client resends request. Server notices that this is a repeat of an earlier request which is already in the "slow queue", and replies NFSERR_JUKEBOX (= "be patient, I'll send the response eventually"). Client shuts up and waits. Server completes request and sends response to client. It seems that what the linux client is doing is returning error 528 to the user program (cp is giving this error message). From linux/errno.h: #define EJUKEBOX 528 /* Request initiated, but will not complete before timeout */ This is wrong because the nfs file system is hard mounted in my case - there is no timeout. While it would be nice to do a perfect solution, it looks like a quick fix is to just ignore NFSERR_JUKEBOX from the server. Bob Smart ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: handling NFSERR_JUKEBOX 2001-07-10 0:23 handling NFSERR_JUKEBOX Bob Smart @ 2001-07-13 14:06 ` Pete Wyckoff 2001-11-06 5:21 ` Red Hat needs this patch (was Re: handling NFSERR_JUKEBOX) Bob Smart 0 siblings, 1 reply; 10+ messages in thread From: Pete Wyckoff @ 2001-07-13 14:06 UTC (permalink / raw) To: Bob Smart; +Cc: trond.myklebust, linux-kernel, jeroen, smart smart@hpc.CSIRO.AU said: > http://mail-index.netbsd.org/tech-kern/1999/03/16/0002.html > To save you looking the key point is: > > >Not sure. Is there a way that the server can say, "I got your request, > >but I'm too busy now, try again in a little bit." ?? > > Isn't this what NFSERR_JUKEBOX is for? > > AFAIK, the protocol goes something like this: > > Client sends a request. > Server starts loading tape/optical disk/whatever. > Client resends request. > Server notices that this is a repeat of an earlier request which is already > in the "slow queue", and replies NFSERR_JUKEBOX (= "be patient, I'll send > the response eventually"). > Client shuts up and waits. > Server completes request and sends response to client. > > It seems that what the linux client is doing is returning error 528 > to the user program (cp is giving this error message). From > linux/errno.h: > > #define EJUKEBOX 528 /* Request initiated, but will not complete > before timeout */ > > This is wrong because the nfs file system is hard mounted in my case > - there is no timeout. > > While it would be nice to do a perfect solution, it looks like > a quick fix is to just ignore NFSERR_JUKEBOX from the server. The comment is incorrect. Take a look at the RFC for NFSv3. There is no duplicate request sent from the client for a JUKEBOX-supporting server. The first request is satisfied by the return of EJUKEBOX, after which it is up to the client to re-request the file. Ignoring it is not an option because the request is terminated; a retry does not make sense. A new request must be initiated by the client later. Take a look at the nfs sourceforge mailing list. There's a long thread about this in the context of a Solaris HSM server. I wrote a patch to try to do the right thing on a linux client, but it is not perfect for many reasons: http://devrandom.net/lists/archives/2001/6/NFS/0135.html -- Pete ^ permalink raw reply [flat|nested] 10+ messages in thread
* Red Hat needs this patch (was Re: handling NFSERR_JUKEBOX) 2001-07-13 14:06 ` Pete Wyckoff @ 2001-11-06 5:21 ` Bob Smart 2001-11-06 9:05 ` Trond Myklebust 0 siblings, 1 reply; 10+ messages in thread From: Bob Smart @ 2001-11-06 5:21 UTC (permalink / raw) To: Pete Wyckoff; +Cc: trond.myklebust, linux-kernel, alan > Take a look at the nfs sourceforge mailing list. There's a long thread > about this in the context of a Solaris HSM server. I wrote a patch to > try to do the right thing on a linux client, but it is not perfect for > many reasons: http://devrandom.net/lists/archives/2001/6/NFS/0135.html Thanks again for this patch. I've appended our current version to save others the trouble of minor modifications. I've just tested it against 2.4.13-ac7. Your patch may not be perfect but it is a million times better than nothing. We have had no problems with it. Linux has a long tradition of utilizing imperfect software till something better comes along. This patch is ESSENTIAL to make Linux useable in large enterprises, because they increasingly commonly utilize hirearchically managed storage. I strongly urge Alan/Linus/Macello to include this patch in the stable kernel tree (but marked experimental). Or at least can the kernel maintainers at Red Hat and other distributions put it in their offerings: surely you see that you need this for your big customers. I understand that some people think the problem should be addressed in glibc (+ all other places that do open system calls). Well really this is just a small part of a much wider problem. I've included a rant below. The bottom line is that default "wait as long as necessary" behaviour should be provided by the kernel, but there needs to be an optional mechanism that lets the user know when things will take longer than expected. Cheers, Bob ----------------------------------------------------------------------------- Handling long latency events ---------------------------- The linux kernel potentially returns the error EJUKEBOX from some operations. This error means: "This operation might well succeed eventually, and I've started the task [e.g. get file from tape] that will cause it to later succeed. However since it will take a long time I'm returning this error so that you can: inform the user; wait a while; then retry the op." Now that NFS servers are starting to use NFSERR_JUKEBOX returns, (which get turned into EJUKEBOX if you don't have the Wyckoff patch) this is starting to happen and we find that linux user mode software doesn't handle it. Getting a file from tape is just one of many circumstances where operations sometimes take an unexpectedly long time. Other cases typcially revolve around network operations where remote servers are not responding for one reason or another. As use of network services increases this problem is getting worse. We need to be able to handle these situations in 2 different ways. If it is a batch or background operation then most of the time we just want it to wait. Typical example: a telnet server shouldn't drop a terminal session just because the client has been out of touch for an hour because of network problems. On the other hand if the software is interacting with the user and suddenly there are delays, then we want to give the user useful feedback and also allow the user to kill the operation. A typical example is a telnet client [and the (old?) windows behaviour of locking up is particularly annoying]. It seems to me that the right unixy solution to this problem is through signals. The EJUKEBOX idea is regretable. However it has happened and we need to handle it. I strongly urge that: - default behaviour is for the kernel to wait and never return EJUKEBOX. - however user programs that want to handle EJUKEBOX should be able to put themselves into a mode where EJUKEBOX is returned. This approach has the advantage of minimizing impact on existing user mode software. If you accept my argument then Pete Wyckoff's patch is a step in the right direction. ----------------------------------------------------------------------------- diff -r -u --exclude='*.orig' SAVE/linux-2.4.13-0.3/Documentation/Configure.help linux-2.4.13-0.3+jukebox/Documentation/Configure.help --- SAVE/linux-2.4.13-0.3/Documentation/Configure.help Tue Oct 30 09:19:01 2001 +++ linux-2.4.13-0.3+jukebox/Documentation/Configure.help Sun Nov 4 14:03:59 2001 @@ -14094,6 +14094,23 @@ If unsure, say N. +Auto-retry jukebox error requests (EXPERIMENTAL) +CONFIG_NFS_V3_JUKEBOX_RETRY + Say Y here if you want your NFS client to automatically retry requests + to which the server responded with EJUKEBOX (528). Some NFS servers + export filesystems which are hierarchical, in that some files may take + much longer than others to read or write. These files may be migrated + to a slow tape storage system which serves as the bulk capacity for the + file system; the fast disk becomes essentially a cache. + + If you try to read a file which has been migrated to tape, the server + may return the error EJUKEBOX and begin the process which brings the + file from tape to disk. This option enables the kernel to retry + the access until the file is ready, rather than passing the error to + userspace applications, which is the default. + + If unsure, say N. + Root file system on NFS CONFIG_ROOT_NFS If you want your Linux box to mount its whole root file system (the diff -r -u --exclude='*.orig' SAVE/linux-2.4.13-0.3/fs/Config.in linux-2.4.13-0.3+jukebox/fs/Config.in --- SAVE/linux-2.4.13-0.3/fs/Config.in Tue Oct 30 09:18:42 2001 +++ linux-2.4.13-0.3+jukebox/fs/Config.in Sun Nov 4 14:03:59 2001 @@ -95,6 +95,7 @@ dep_tristate 'InterMezzo file system support (experimental, replicating fs)' CONFIG_INTERMEZZO_FS $CONFIG_INET $CONFIG_EXPERIMENTAL dep_tristate 'NFS file system support' CONFIG_NFS_FS $CONFIG_INET dep_mbool ' Provide NFSv3 client support' CONFIG_NFS_V3 $CONFIG_NFS_FS + dep_mbool ' Auto-retry jukebox error requests' CONFIG_NFS_V3_JUKEBOX_RETRY $CONFIG_NFS_V3 dep_bool ' Root file system on NFS' CONFIG_ROOT_NFS $CONFIG_NFS_FS $CONFIG_IP_PNP dep_tristate 'NFS server support' CONFIG_NFSD $CONFIG_INET diff -r -u --exclude='*.orig' SAVE/linux-2.4.13-0.3/fs/nfs/file.c linux-2.4.13-0.3+jukebox/fs/nfs/file.c --- SAVE/linux-2.4.13-0.3/fs/nfs/file.c Mon Sep 24 02:48:01 2001 +++ linux-2.4.13-0.3+jukebox/fs/nfs/file.c Sun Nov 4 15:17:14 2001 @@ -192,8 +192,19 @@ if (!inode) return 0; +#ifdef CONFIG_NFS_V3_JUKEBOX_RETRY + /* perhaps nfsv3 server returned jukebox, go try again */ + /* if (PageError(page) && inode->u.nfs_i.async_error == -EJUKEBOX) { + result = nfs_readpage_sync(inode->u.nfs_i.file, inode, page); */ + if (PageError(page) && NFS_I(inode)->async_error == -EJUKEBOX) { + result = nfs_readpage_sync(NFS_I(inode)->file, inode, page); + } else +#endif + { rpages = NFS_SERVER(inode)->rpages; result = nfs_pagein_inode(inode, index, rpages); + } + if (result < 0) return result; return 0; diff -r -u --exclude='*.orig' SAVE/linux-2.4.13-0.3/fs/nfs/nfs3proc.c linux-2.4.13-0.3+jukebox/fs/nfs/nfs3proc.c --- SAVE/linux-2.4.13-0.3/fs/nfs/nfs3proc.c Tue Oct 2 06:45:37 2001 +++ linux-2.4.13-0.3+jukebox/fs/nfs/nfs3proc.c Sun Nov 4 16:28:09 2001 @@ -4,6 +4,8 @@ * Client-side NFSv3 procedures stubs. * * Copyright (C) 1997, Olaf Kirch + * + * 21 Jun 01 - Retry NFSv3 EJUKEBOX errors by Pete Wyckoff */ #include <linux/mm.h> @@ -56,10 +58,25 @@ struct nfs3_sattrargs arg = { NFS_FH(inode), sattr, 0, 0 }; int status; + for (;;) { dprintk("NFS call setattr\n"); fattr->valid = 0; - status = rpc_call(NFS_CLIENT(inode), NFS3PROC_SETATTR, &arg, fattr, 0); - dprintk("NFS reply setattr\n"); + status = rpc_call(NFS_CLIENT(inode), NFS3PROC_SETATTR, &arg, + fattr, 0); + dprintk("NFS reply setattr %d\n", status); +#ifdef CONFIG_NFS_V3_JUKEBOX_RETRY + if (status == -EJUKEBOX) { + DECLARE_WAIT_QUEUE_HEAD(jukebox_wq); + tty_write_message(current->tty, + "NFSv3: write waiting for file unmigration\r\n"); + if (interruptible_sleep_on_timeout(&jukebox_wq, + NFS_JUKEBOX_RETRY_TIMEO) == 0) + continue; /* but not if interrupted */ + } +#endif + break; + } + return status; } @@ -147,10 +164,24 @@ struct rpc_message msg = { NFS3PROC_READ, &arg, &res, cred }; int status; + for (;;) { dprintk("NFS call read %d @ %Ld\n", count, (long long)offset); fattr->valid = 0; status = rpc_call_sync(NFS_CLIENT(inode), &msg, flags); dprintk("NFS reply read: %d\n", status); +#ifdef CONFIG_NFS_V3_JUKEBOX_RETRY + if (status == -EJUKEBOX) { + DECLARE_WAIT_QUEUE_HEAD(jukebox_wq); + tty_write_message(current->tty, + "NFSv3: read waiting for file unmigration\r\n"); + if (interruptible_sleep_on_timeout(&jukebox_wq, + NFS_JUKEBOX_RETRY_TIMEO) == 0) + continue; /* but not if interrupted */ + } +#endif + break; + } + *eofp = res.eof; return status; } @@ -177,7 +208,7 @@ status = rpc_call_sync(NFS_CLIENT(inode), &msg, rpcflags); - dprintk("NFS reply read: %d\n", status); + dprintk("NFS reply write: %d\n", status); return status < 0? status : res.count; } diff -r -u --exclude='*.orig' SAVE/linux-2.4.13-0.3/fs/nfs/read.c linux-2.4.13-0.3+jukebox/fs/nfs/read.c --- SAVE/linux-2.4.13-0.3/fs/nfs/read.c Tue Oct 30 09:18:42 2001 +++ linux-2.4.13-0.3+jukebox/fs/nfs/read.c Sun Nov 4 15:28:38 2001 @@ -81,7 +81,7 @@ /* * Read a page synchronously. */ -static int +int nfs_readpage_sync(struct file *file, struct inode *inode, struct page *page) { struct rpc_cred *cred = NULL; @@ -449,11 +449,27 @@ if (task->tk_status >= 0 && count >= 0) { SetPageUptodate(page); count -= PAGE_CACHE_SIZE; - } else + flush_dcache_page(page); + kunmap(page); + UnlockPage(page); + } else { SetPageError(page); flush_dcache_page(page); kunmap(page); +#ifdef CONFIG_NFS_V3_JUKEBOX_RETRY + /* do not unlock page, let nfs_sync_page retry */ + if (task->tk_status == -EJUKEBOX) { + NFS_I(inode)->async_error = task->tk_status; + if (waitqueue_active(&page->wait)) + wake_up(&page->wait); + else /* nobody waiting, just unlock */ UnlockPage(page); + } else +#endif + { + UnlockPage(page); + } + } dprintk("NFS: read (%x/%Ld %d@%Ld)\n", req->wb_inode->i_dev, @@ -504,8 +520,13 @@ goto out_error; error = -1; - if (!PageError(page) && NFS_SERVER(inode)->rsize >= PAGE_CACHE_SIZE) + if (!PageError(page) && NFS_SERVER(inode)->rsize >= PAGE_CACHE_SIZE) { +#ifdef CONFIG_NFS_V3_JUKEBOX_RETRY + NFS_I(inode)->file = file; + NFS_I(inode)->async_error = 0; +#endif error = nfs_readpage_async(file, inode, page); + } if (error >= 0) goto out; diff -r -u --exclude='*.orig' SAVE/linux-2.4.13-0.3/include/linux/nfs_fs.h linux-2.4.13-0.3+jukebox/include/linux/nfs_fs.h --- SAVE/linux-2.4.13-0.3/include/linux/nfs_fs.h Tue Oct 30 10:04:16 2001 +++ linux-2.4.13-0.3+jukebox/include/linux/nfs_fs.h Sun Nov 4 14:09:34 2001 @@ -104,6 +104,9 @@ /* Inode Flags */ #define NFS_USE_READDIRPLUS(inode) ((NFS_FLAGS(inode) & NFS_INO_ADVISE_RDPLUS) ? 1 : 0) +/* Jukebox retry interval */ +#define NFS_JUKEBOX_RETRY_TIMEO (10 * HZ) + /* * These are the default flags for swap requests */ @@ -258,6 +261,7 @@ extern int nfs_readpage(struct file *, struct page *); extern int nfs_pagein_inode(struct inode *, unsigned long, unsigned int); extern int nfs_pagein_timeout(struct inode *); +extern int nfs_readpage_sync(struct file *, struct inode *, struct page *); /* * linux/fs/mount_clnt.c diff -r -u --exclude='*.orig' SAVE/linux-2.4.13-0.3/include/linux/nfs_fs_i.h linux-2.4.13-0.3+jukebox/include/linux/nfs_fs_i.h --- SAVE/linux-2.4.13-0.3/include/linux/nfs_fs_i.h Tue Oct 30 10:03:46 2001 +++ linux-2.4.13-0.3+jukebox/include/linux/nfs_fs_i.h Sun Nov 4 14:09:25 2001 @@ -75,6 +75,11 @@ /* Credentials for shared mmap */ struct rpc_cred *mm_cred; + +#ifdef CONFIG_NFS_V3_JUKEBOX_RETRY + struct file *file; + int async_error; +#endif }; /* ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Red Hat needs this patch (was Re: handling NFSERR_JUKEBOX) 2001-11-06 5:21 ` Red Hat needs this patch (was Re: handling NFSERR_JUKEBOX) Bob Smart @ 2001-11-06 9:05 ` Trond Myklebust 2001-11-06 9:10 ` Marcelo Tosatti ` (2 more replies) 0 siblings, 3 replies; 10+ messages in thread From: Trond Myklebust @ 2001-11-06 9:05 UTC (permalink / raw) To: Bob Smart; +Cc: Pete Wyckoff, linux-kernel, alan >>>>> " " == Bob Smart <smart@hpc.CSIRO.AU> writes: >> Take a look at the nfs sourceforge mailing list. There's a >> long thread about this in the context of a Solaris HSM server. >> I wrote a patch to try to do the right thing on a linux client, >> but it is not perfect for many reasons: >> http://devrandom.net/lists/archives/2001/6/NFS/0135.html > Thanks again for this patch. I've appended our current version > to save others the trouble of minor modifications. I've just > tested it against 2.4.13-ac7. > Your patch may not be perfect but it is a million times better > than nothing. We have had no problems with it. Linux has a long > tradition of utilizing imperfect software till something better > comes along. > This patch is ESSENTIAL to make Linux useable in large > enterprises, because they increasingly commonly utilize > hirearchically managed storage. I strongly urge > Alan/Linus/Macello to include this patch in the stable kernel > tree (but marked experimental). Or at least can the kernel > maintainers at Red Hat and other distributions put it in their > offerings: surely you see that you need this for your big > customers. The patch may appear to work, but it fails to understand the nature of the problem. The problem with NFSERR_JUKEBOX is that is can appear in more or less *any* NFSv3 RPC call. That includes everything from lookups to writes. If all you do is intercept read and setattr, then you certainly haven't achieved 'enterprise quality'. Cheers, Trond ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Red Hat needs this patch (was Re: handling NFSERR_JUKEBOX) 2001-11-06 9:05 ` Trond Myklebust @ 2001-11-06 9:10 ` Marcelo Tosatti 2001-11-06 10:36 ` Bob Smart 2001-11-06 10:53 ` Red Hat needs this patch (was Re: handling NFSERR_JUKEBOX) Trond Myklebust 2 siblings, 0 replies; 10+ messages in thread From: Marcelo Tosatti @ 2001-11-06 9:10 UTC (permalink / raw) To: Trond Myklebust; +Cc: Bob Smart, Pete Wyckoff, linux-kernel, alan On 6 Nov 2001, Trond Myklebust wrote: > >>>>> " " == Bob Smart <smart@hpc.CSIRO.AU> writes: > > >> Take a look at the nfs sourceforge mailing list. There's a > >> long thread about this in the context of a Solaris HSM server. > >> I wrote a patch to try to do the right thing on a linux client, > >> but it is not perfect for many reasons: > >> http://devrandom.net/lists/archives/2001/6/NFS/0135.html > > > Thanks again for this patch. I've appended our current version > > to save others the trouble of minor modifications. I've just > > tested it against 2.4.13-ac7. > > > Your patch may not be perfect but it is a million times better > > than nothing. We have had no problems with it. Linux has a long > > tradition of utilizing imperfect software till something better > > comes along. > > > This patch is ESSENTIAL to make Linux useable in large > > enterprises, because they increasingly commonly utilize > > hirearchically managed storage. I strongly urge > > Alan/Linus/Macello to include this patch in the stable kernel > > tree (but marked experimental). Or at least can the kernel > > maintainers at Red Hat and other distributions put it in their > > offerings: surely you see that you need this for your big > > customers. > > The patch may appear to work, but it fails to understand the nature of > the problem. The problem with NFSERR_JUKEBOX is that is can appear in > more or less *any* NFSv3 RPC call. That includes everything from > lookups to writes. > > If all you do is intercept read and setattr, then you certainly > haven't achieved 'enterprise quality'. Trond, Should'nt we make Linux retry on NFSERR_JUKEBOX _by default_ ? Is there any reason why we already don't do that already except "nobody coded it yet" ? ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Red Hat needs this patch (was Re: handling NFSERR_JUKEBOX) 2001-11-06 9:05 ` Trond Myklebust 2001-11-06 9:10 ` Marcelo Tosatti @ 2001-11-06 10:36 ` Bob Smart 2001-11-06 13:54 ` Trond Myklebust 2001-11-06 10:53 ` Red Hat needs this patch (was Re: handling NFSERR_JUKEBOX) Trond Myklebust 2 siblings, 1 reply; 10+ messages in thread From: Bob Smart @ 2001-11-06 10:36 UTC (permalink / raw) To: linux-kernel > If all you do is intercept read and setattr, then you certainly > haven't achieved 'enterprise quality'. I don't have documentation on when Cray's NFSv3 server will return NFSERR_JUKEBOX, but since inodes aren't migrated to tape I don't think it is likely on other than read and non-truncate write with no prior read. Our applications are unlikely to do the latter. Maybe there is a subset of servers for which the patch is OK, and we could make that clear in the help for the config option. I'll ask Cray for clarification about their server. Bob ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Red Hat needs this patch (was Re: handling NFSERR_JUKEBOX) 2001-11-06 10:36 ` Bob Smart @ 2001-11-06 13:54 ` Trond Myklebust 2001-11-06 23:41 ` handling NFSERR_JUKEBOX Trond Myklebust 0 siblings, 1 reply; 10+ messages in thread From: Trond Myklebust @ 2001-11-06 13:54 UTC (permalink / raw) To: Bob Smart; +Cc: linux-kernel >>>>> " " == Bob Smart <smart@hpc.CSIRO.AU> writes: > Maybe there is a subset of servers for which the patch is OK, > and we could make that clear in the help for the config > option. I'll ask Cray for clarification about their server. I'm still not convinced this is a good idea, but if you are going to do things inside the NFS client, why don't you instead write a wrapper function around rpc_call_sync() for fs/nfs/nfs3proc.c. Something like static int nfs_rpc_wrapper(struct rpc_clnt *clnt, struct rpc_message *msg, int flags) { sigset_t oldset; int res; rpc_clnt_sigmask(clnt, &oldset); do { res = rpc_call_sync(clnt, msg, flags); if (result != -EJUKEBOX) break; set_current_state(TASK_INTERRUPTIBLE); schedule_timeout(NFS_JUKEBOX_RETRY_TIME); res = -ERESTARTSYS; } while (!signalled()); rpc_clnt_sigunmask(clnt, &oldset); return res; } and then use a couple of '#define's to wrap rpc_call() and rpc_call_sync() in nfs3_proc_*(). That will take care of all those synchronous calls in one fell swoop. You'll still need to take care of asynchronous reads, writes and unlink, but those are easy. Just do something like if (task->tk_error == -EJUKEBOX) { rpc_delay(task, NFS_JUKEBOX_RETRY_TIMEO); rpc_restart_call(task); return; } in nfs_readpage_result(), nfs_writeback_done(), nfs_commit_done(), and nfs_async_unlink_done()... Cheers, Trond ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: handling NFSERR_JUKEBOX 2001-11-06 13:54 ` Trond Myklebust @ 2001-11-06 23:41 ` Trond Myklebust 2001-11-08 16:20 ` Pete Wyckoff 0 siblings, 1 reply; 10+ messages in thread From: Trond Myklebust @ 2001-11-06 23:41 UTC (permalink / raw) To: Bob Smart; +Cc: linux-kernel, nfs >>>>> " " == Trond Myklebust <trond.myklebust@fys.uio.no> writes: > I'm still not convinced this is a good idea, but if you are > going to do things inside the NFS client, why don't you instead > write a wrapper function around rpc_call_sync() for > fs/nfs/nfs3proc.c. Something like ...and here's the full patch that implements it... Cheers, Trond diff -u --recursive --new-file linux-2.4.14/fs/nfs/nfs3proc.c linux-2.4.14-jukebox/fs/nfs/nfs3proc.c --- linux-2.4.14/fs/nfs/nfs3proc.c Mon Oct 1 22:45:37 2001 +++ linux-2.4.14-jukebox/fs/nfs/nfs3proc.c Wed Nov 7 00:25:19 2001 @@ -17,6 +17,37 @@ #define NFSDBG_FACILITY NFSDBG_PROC +/* A wrapper to handle the EJUKEBOX error message */ +static int +nfs3_rpc_wrapper(struct rpc_clnt *clnt, struct rpc_message *msg, int flags) +{ + sigset_t oldset; + int res; + rpc_clnt_sigmask(clnt, &oldset); + do { + res = rpc_call_sync(clnt, msg, flags); + if (res != -EJUKEBOX) + break; + set_current_state(TASK_INTERRUPTIBLE); + schedule_timeout(NFS_JUKEBOX_RETRY_TIME); + res = -ERESTARTSYS; + } while (!signalled()); + rpc_clnt_sigunmask(clnt, &oldset); + return res; +} + +static inline int +nfs3_rpc_call_wrapper(struct rpc_clnt *clnt, u32 proc, void *argp, void *resp, int flags) +{ + struct rpc_message msg = { proc, argp, resp, NULL }; + return nfs3_rpc_wrapper(clnt, &msg, flags); +} + +#define rpc_call(clnt, proc, argp, resp, flags) \ + nfs3_rpc_call_wrapper(clnt, proc, argp, resp, flags) +#define rpc_call_sync(clnt, msg, flags) \ + nfs3_rpc_wrapper(clnt, msg, flags) + /* * Bare-bones access to getattr: this is for nfs_read_super. */ diff -u --recursive --new-file linux-2.4.14/fs/nfs/read.c linux-2.4.14-jukebox/fs/nfs/read.c --- linux-2.4.14/fs/nfs/read.c Thu Oct 11 17:12:52 2001 +++ linux-2.4.14-jukebox/fs/nfs/read.c Wed Nov 7 00:25:19 2001 @@ -437,6 +437,9 @@ dprintk("NFS: %4d nfs_readpage_result, (status %d)\n", task->tk_pid, task->tk_status); + if (nfs_async_handle_jukebox(task)) + return; + nfs_refresh_inode(inode, &data->fattr); while (!list_empty(&data->pages)) { struct nfs_page *req = nfs_list_entry(data->pages.next); diff -u --recursive --new-file linux-2.4.14/fs/nfs/unlink.c linux-2.4.14-jukebox/fs/nfs/unlink.c --- linux-2.4.14/fs/nfs/unlink.c Thu Aug 16 18:39:37 2001 +++ linux-2.4.14-jukebox/fs/nfs/unlink.c Wed Nov 7 00:25:19 2001 @@ -123,6 +123,8 @@ struct dentry *dir = data->dir; struct inode *dir_i; + if (nfs_async_handle_jukebox(task)) + return; if (!dir) return; dir_i = dir->d_inode; diff -u --recursive --new-file linux-2.4.14/fs/nfs/write.c linux-2.4.14-jukebox/fs/nfs/write.c --- linux-2.4.14/fs/nfs/write.c Thu Oct 11 17:12:52 2001 +++ linux-2.4.14-jukebox/fs/nfs/write.c Wed Nov 7 00:25:19 2001 @@ -1229,6 +1229,9 @@ dprintk("NFS: %4d nfs_writeback_done (status %d)\n", task->tk_pid, task->tk_status); + if (nfs_async_handle_jukebox(task)) + return; + /* We can't handle that yet but we check for it nevertheless */ if (resp->count < argp->count && task->tk_status >= 0) { static unsigned long complain; @@ -1422,6 +1425,9 @@ dprintk("NFS: %4d nfs_commit_done (status %d)\n", task->tk_pid, task->tk_status); + if (nfs_async_handle_jukebox(task)) + return; + nfs_write_attributes(inode, resp->fattr); while (!list_empty(&data->pages)) { req = nfs_list_entry(data->pages.next); diff -u --recursive --new-file linux-2.4.14/include/linux/nfs_fs.h linux-2.4.14-jukebox/include/linux/nfs_fs.h --- linux-2.4.14/include/linux/nfs_fs.h Wed Oct 24 07:00:07 2001 +++ linux-2.4.14-jukebox/include/linux/nfs_fs.h Wed Nov 7 00:25:19 2001 @@ -16,6 +16,7 @@ #include <linux/sunrpc/debug.h> #include <linux/sunrpc/auth.h> +#include <linux/sunrpc/clnt.h> #include <linux/nfs.h> #include <linux/nfs2.h> @@ -326,6 +327,29 @@ __retval; \ }) +#ifdef CONFIG_NFS_V3 + +#define NFS_JUKEBOX_RETRY_TIME (5 * HZ) +static inline int +nfs_async_handle_jukebox(struct rpc_task *task) +{ + if (task->tk_status != -EJUKEBOX) + return 0; + task->tk_status = 0; + rpc_restart_call(task); + rpc_delay(task, NFS_JUKEBOX_RETRY_TIME); + return 1; +} + +#else + +static inline int +nfs_async_handle_jukebox(struct rpc_task *task) +{ + return 0; +} +#endif /* CONFIG_NFS_V3 */ + #endif /* __KERNEL__ */ /* ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: handling NFSERR_JUKEBOX 2001-11-06 23:41 ` handling NFSERR_JUKEBOX Trond Myklebust @ 2001-11-08 16:20 ` Pete Wyckoff 0 siblings, 0 replies; 10+ messages in thread From: Pete Wyckoff @ 2001-11-08 16:20 UTC (permalink / raw) To: Trond Myklebust; +Cc: Bob Smart, linux-kernel, nfs trond.myklebust@fys.uio.no said: > > I'm still not convinced this is a good idea, but if you are > > going to do things inside the NFS client, why don't you instead > > write a wrapper function around rpc_call_sync() for > > fs/nfs/nfs3proc.c. Something like > > ...and here's the full patch that implements it... Works like a charm. Glad you allowed yourself to be badgered into writing the proper code to handle EJUKEBOX. I'll destroy my icky version. Can you push it forward to Linus for general inclusion? -- Pete ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: Red Hat needs this patch (was Re: handling NFSERR_JUKEBOX) 2001-11-06 9:05 ` Trond Myklebust 2001-11-06 9:10 ` Marcelo Tosatti 2001-11-06 10:36 ` Bob Smart @ 2001-11-06 10:53 ` Trond Myklebust 2 siblings, 0 replies; 10+ messages in thread From: Trond Myklebust @ 2001-11-06 10:53 UTC (permalink / raw) To: Marcelo Tosatti; +Cc: Bob Smart, Pete Wyckoff, linux-kernel, alan >>>>> " " == Marcelo Tosatti <marcelo@conectiva.com.br> writes: > Trond, > Should'nt we make Linux retry on NFSERR_JUKEBOX _by default_ ? Normally we probably should. The thing that worries me is the fact that we may be holding the VFS semaphores for long periods of time. In particular that will greatly slow down lookups... This is the reason why I'd prefer a lot of these things to be done in the VFS layer or above. Cheers, Trond ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2001-11-08 16:20 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2001-07-10 0:23 handling NFSERR_JUKEBOX Bob Smart 2001-07-13 14:06 ` Pete Wyckoff 2001-11-06 5:21 ` Red Hat needs this patch (was Re: handling NFSERR_JUKEBOX) Bob Smart 2001-11-06 9:05 ` Trond Myklebust 2001-11-06 9:10 ` Marcelo Tosatti 2001-11-06 10:36 ` Bob Smart 2001-11-06 13:54 ` Trond Myklebust 2001-11-06 23:41 ` handling NFSERR_JUKEBOX Trond Myklebust 2001-11-08 16:20 ` Pete Wyckoff 2001-11-06 10:53 ` Red Hat needs this patch (was Re: handling NFSERR_JUKEBOX) Trond Myklebust
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).