* nfsd changes for 2.6.37 @ 2010-10-26 16:45 J. Bruce Fields 2010-10-26 17:22 ` J. Bruce Fields 2010-10-26 20:18 ` Arnd Bergmann 0 siblings, 2 replies; 62+ messages in thread From: J. Bruce Fields @ 2010-10-26 16:45 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-nfs, linux-kernel Please pull the following nfsd updates from the for-2.6.37 branch at: git://linux-nfs.org/~bfields/linux.git for-2.6.37 Neil Brown modifed the way we wait for upcalls to solve a performance problem and a long-standing NFSv4 correctness problem. Pavel Emelyanov started the job of making the NFS client and server container-aware. (Note that for the sake of keeping his work in one tree, Trond and I agreed to merge it through my tree, though it includes client as well as server changes.) Thanks to Chuck Lever for help with review. Thanks also to Tom Tucker for some rdma fixes (and review help). I'm continuing work to bring the NFSv4.1 server close enough to the spec that we could turn it on by default without risking future interoperability problems. We're not quite there yet. A moment's silence for the spkm3 code, please. It's been lying unused for years now, so we decided to finally put it out of its misery. We're also deprecating (not removing quite yet) the old nfsd control interfaces. I have a few last-minute changes still under testing that I may try to sneak in later in the week, but given the short merge window I thought it would make more sense to send the bulk of the changes now. --b. Andy Adamson (1): nfsd: remove duplicate NFS4_STATEID_SIZE declaration Andy Shevchenko (1): sunrpc/cache: don't use custom hex_to_bin() converter Benny Halevy (1): nfsd4: adjust buflen for encoded attrs bitmap based on actual bitmap length Bryan Schumaker (1): lockd: Mostly remove BKL from the server Chuck Lever (2): SUNRPC: Use conventional switch statement when reclassifying sockets SUNRPC: Properly initialize sock_xprt.srcaddr in all cases J. Bruce Fields (39): svcrpc: minor cache cleanup svcrpc: cache deferral cleanup Merge remote branch 'trond/bugfixes' into for-2.6.37 nfsd4: fix hang on fast-booting nfs servers nfsd: fix /proc/net/rpc/nfsd.export/content display nfsd4: remove spkm3 nfsd4: minor variable renaming (cb -> conn) nfsd4: combine nfs4_rpc_args and nfsd4_cb_sequence nfsd4: rename nfs4_rpc_args->nfsd4_cb_args nfsd4: generic callback code nfsd4: use generic callback code in null case nfsd4: remove separate cb_args struct nfsd4: Move callback setup to callback queue nfsd4: fix alloc_init_session BUILD_BUG_ON() nfsd4: fix alloc_init_session return type nfsd4: clean up session allocation nfsd4: keep per-session list of connections nfsd: provide callbacks on svc_xprt deletion nfsd4: use callbacks on svc_xprt_deletion nfsd4: refactor connection allocation nfsd4: add new connections to session nfsd4: return expired on unfound stateid's nfsd4: expire clients more promptly nfsd4: don't cache seq_misordered replies nfsd4: move callback setup into session init code nfsd4: use client pointer to backchannel session nfsd4: make backchannel sequence number per-session nfsd4: confirm only on succesful create_session nfsd4: track backchannel connections nfsd4: callback program number is per-session nfsd4: separate callback change and callback probe nfsd4: delay session removal till free_client nfsd4: move minorversion to client nfsd4: only require krb5 principal for NFSv4.0 callbacks nfsd4: fix connection allocation in sequence() svcrpc: never clear XPT_BUSY on dead xprt svcrpc: assume svc_delete_xprt() called only once svcrpc: no need for XPT_DEAD check in svc_xprt_enqueue svcrpc: svc_tcp_sendto XPT_DEAD check is redundant NeilBrown (14): sunrpc: extract some common sunrpc_cache code from nfsd sunrpc: use seconds since boot in expiry cache sunrpc/cache: allow threads to block while waiting for cache update. sunrpc: close connection when a request is irretrievably lost. nfsd: disable deferral for NFSv4 nfsd/idmap: drop special request deferal in favour of improved default. svcauth_gss: replace a trivial 'switch' with an 'if' sunrpc/cache: change deferred-request hash table to use hlist. sunrpc/cache: fix recent breakage of cache_clean_deferred nfsd: formally deprecate legacy nfsd syscall interface nfsd: allow deprecated interface to be compiled out. sunrpc: fix race in new cache_wait code. sunrpc: Simplify cache_defer_req and related functions. sunrpc/cache: centralise handling of size limit on deferred list. Pavel Emelyanov (37): nfsd: Export get_task_comm for nfsd sunrpc: Pass the ip_map_parse's cd to lower calls sunrpc: Make xprt auth cache release work with the xprt sunrpc: Pass xprt to cached get/put routines sunrpc: Add net to pure API calls sunrpc: Add routines that allow registering per-net caches sunrpc: Tag svc_xprt with net sunrpc: The per-net skeleton sunrpc: Make the /proc/net/rpc appear in net namespaces sunrpc: Make the ip_map_cache be per-net sunrpc: Factor out rpc_xprt allocation sunrpc: Factor out rpc_xprt freeing sunrpc: Add net argument to svc_create_xprt sunrpc: Pull net argument downto svc_create_socket sunrpc: Add net to rpc_create_args sunrpc: Add net to xprt_create sunrpc: Tag rpc_xprt with net net: Export __sock_create sunrpc: Create sockets in net namespaces sunrpc: Use helper to set v4 mapped addr in ip_map_parse sunrpc: Remove unused sock arg from xs_get_srcport sunrpc: Remove unused sock arg from xs_next_srcport sunrpc: Get xprt pointer once in xs_tcp_setup_socket sunrpc: Remove duplicate xprt/transport arguments from calls sunrpc: Factor out udp sockets creation sunrpc: Factor out v4 sockets creation sunrpc: Factor out v6 sockets creation sunrpc: Call xs_create_sockX directly from setup_socket sunrpc: Merge the xs_bind code sunrpc: Merge xs_create_sock code sunrpc: Pass family to setup_socket calls sunrpc: Remove TCP worker wrappers sunrpc: Remove UDP worker wrappers sunrpc: Remove useless if (task == NULL) from xprt_reserve_xprt sunrpc: Don't return NULL from rpcb_create sunrpc: Remove dead "else" branch from bc xprt creation sunrpc: Turn list_for_each-s into the ..._entry-s Stephen Rothwell (1): sunrpc: fix up rpcauth_remove_module section mismatch Tejun Heo (1): sunrpc/xprtrdma: clean up workqueue usage Tom Tucker (2): svcrdma: Change DMA mapping logic to avoid the page_address kernel API svcrdma: Cleanup DMA unmapping in error paths. Documentation/feature-removal-schedule.txt | 10 + fs/Makefile | 5 +- fs/compat.c | 2 +- fs/lockd/host.c | 1 + fs/lockd/mon.c | 1 + fs/lockd/svc.c | 2 +- fs/lockd/svc4proc.c | 2 - fs/lockd/svclock.c | 31 ++- fs/lockd/svcproc.c | 2 - fs/nfs/callback.c | 4 +- fs/nfs/client.c | 1 + fs/nfs/dns_resolve.c | 6 +- fs/nfs/mount_clnt.c | 2 + fs/nfsd/Kconfig | 12 + fs/nfsd/export.c | 73 +++-- fs/nfsd/nfs4callback.c | 245 ++++++++------ fs/nfsd/nfs4idmap.c | 105 +------ fs/nfsd/nfs4proc.c | 7 +- fs/nfsd/nfs4state.c | 491 ++++++++++++++++++---------- fs/nfsd/nfs4xdr.c | 18 +- fs/nfsd/nfsctl.c | 26 ++- fs/nfsd/nfsd.h | 2 +- fs/nfsd/nfssvc.c | 5 +- fs/nfsd/state.h | 52 ++- include/linux/net.h | 2 + include/linux/nfs4.h | 3 + include/linux/sunrpc/auth.h | 4 +- include/linux/sunrpc/cache.h | 37 ++- include/linux/sunrpc/clnt.h | 1 + include/linux/sunrpc/gss_spkm3.h | 55 --- include/linux/sunrpc/stats.h | 23 +- include/linux/sunrpc/svc_xprt.h | 32 ++- include/linux/sunrpc/svcauth.h | 17 +- include/linux/sunrpc/xprt.h | 4 + net/socket.c | 3 +- net/sunrpc/Kconfig | 19 - net/sunrpc/auth.c | 2 +- net/sunrpc/auth_generic.c | 2 +- net/sunrpc/auth_gss/Makefile | 5 - net/sunrpc/auth_gss/gss_spkm3_mech.c | 247 -------------- net/sunrpc/auth_gss/gss_spkm3_seal.c | 186 ----------- net/sunrpc/auth_gss/gss_spkm3_token.c | 267 --------------- net/sunrpc/auth_gss/gss_spkm3_unseal.c | 127 ------- net/sunrpc/auth_gss/svcauth_gss.c | 51 ++-- net/sunrpc/cache.c | 288 ++++++++++++----- net/sunrpc/clnt.c | 1 + net/sunrpc/netns.h | 19 + net/sunrpc/rpcb_clnt.c | 4 +- net/sunrpc/stats.c | 43 ++- net/sunrpc/sunrpc_syms.c | 58 +++- net/sunrpc/svc.c | 3 + net/sunrpc/svc_xprt.c | 59 +++-- net/sunrpc/svcauth_unix.c | 194 ++++++++--- net/sunrpc/svcsock.c | 27 +- net/sunrpc/xprt.c | 39 ++- net/sunrpc/xprtrdma/svc_rdma.c | 11 +- net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 19 +- net/sunrpc/xprtrdma/svc_rdma_sendto.c | 82 ++++-- net/sunrpc/xprtrdma/svc_rdma_transport.c | 49 ++-- net/sunrpc/xprtrdma/transport.c | 25 +- net/sunrpc/xprtsock.c | 358 +++++++------------- 61 files changed, 1534 insertions(+), 1937 deletions(-) delete mode 100644 include/linux/sunrpc/gss_spkm3.h delete mode 100644 net/sunrpc/auth_gss/gss_spkm3_mech.c delete mode 100644 net/sunrpc/auth_gss/gss_spkm3_seal.c delete mode 100644 net/sunrpc/auth_gss/gss_spkm3_token.c delete mode 100644 net/sunrpc/auth_gss/gss_spkm3_unseal.c create mode 100644 net/sunrpc/netns.h ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-26 16:45 nfsd changes for 2.6.37 J. Bruce Fields @ 2010-10-26 17:22 ` J. Bruce Fields 2010-10-26 17:39 ` Linus Torvalds 2010-10-26 20:18 ` Arnd Bergmann 1 sibling, 1 reply; 62+ messages in thread From: J. Bruce Fields @ 2010-10-26 17:22 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-nfs, linux-kernel On Tue, Oct 26, 2010 at 12:45:50PM -0400, J. Bruce Fields wrote: > Please pull the following nfsd updates from the for-2.6.37 branch at: > > git://linux-nfs.org/~bfields/linux.git for-2.6.37 By the way, apologies, I see there's a conflict with upstream--looks obvious (conflicting appends to Documentation/feature-removal-schedule.txt), but I'm happy to fix it up and add a merge commit on that branch if it saves you time. --b. > > Neil Brown modifed the way we wait for upcalls to solve a performance > problem and a long-standing NFSv4 correctness problem. > > Pavel Emelyanov started the job of making the NFS client and server > container-aware. (Note that for the sake of keeping his work in one > tree, Trond and I agreed to merge it through my tree, though it includes > client as well as server changes.) Thanks to Chuck Lever for help with > review. > > Thanks also to Tom Tucker for some rdma fixes (and review help). > > I'm continuing work to bring the NFSv4.1 server close enough to the spec > that we could turn it on by default without risking future > interoperability problems. We're not quite there yet. > > A moment's silence for the spkm3 code, please. It's been lying unused > for years now, so we decided to finally put it out of its misery. > > We're also deprecating (not removing quite yet) the old nfsd control > interfaces. > > I have a few last-minute changes still under testing that I may try to > sneak in later in the week, but given the short merge window I thought > it would make more sense to send the bulk of the changes now. > > --b. > > Andy Adamson (1): > nfsd: remove duplicate NFS4_STATEID_SIZE declaration > > Andy Shevchenko (1): > sunrpc/cache: don't use custom hex_to_bin() converter > > Benny Halevy (1): > nfsd4: adjust buflen for encoded attrs bitmap based on actual bitmap length > > Bryan Schumaker (1): > lockd: Mostly remove BKL from the server > > Chuck Lever (2): > SUNRPC: Use conventional switch statement when reclassifying sockets > SUNRPC: Properly initialize sock_xprt.srcaddr in all cases > > J. Bruce Fields (39): > svcrpc: minor cache cleanup > svcrpc: cache deferral cleanup > Merge remote branch 'trond/bugfixes' into for-2.6.37 > nfsd4: fix hang on fast-booting nfs servers > nfsd: fix /proc/net/rpc/nfsd.export/content display > nfsd4: remove spkm3 > nfsd4: minor variable renaming (cb -> conn) > nfsd4: combine nfs4_rpc_args and nfsd4_cb_sequence > nfsd4: rename nfs4_rpc_args->nfsd4_cb_args > nfsd4: generic callback code > nfsd4: use generic callback code in null case > nfsd4: remove separate cb_args struct > nfsd4: Move callback setup to callback queue > nfsd4: fix alloc_init_session BUILD_BUG_ON() > nfsd4: fix alloc_init_session return type > nfsd4: clean up session allocation > nfsd4: keep per-session list of connections > nfsd: provide callbacks on svc_xprt deletion > nfsd4: use callbacks on svc_xprt_deletion > nfsd4: refactor connection allocation > nfsd4: add new connections to session > nfsd4: return expired on unfound stateid's > nfsd4: expire clients more promptly > nfsd4: don't cache seq_misordered replies > nfsd4: move callback setup into session init code > nfsd4: use client pointer to backchannel session > nfsd4: make backchannel sequence number per-session > nfsd4: confirm only on succesful create_session > nfsd4: track backchannel connections > nfsd4: callback program number is per-session > nfsd4: separate callback change and callback probe > nfsd4: delay session removal till free_client > nfsd4: move minorversion to client > nfsd4: only require krb5 principal for NFSv4.0 callbacks > nfsd4: fix connection allocation in sequence() > svcrpc: never clear XPT_BUSY on dead xprt > svcrpc: assume svc_delete_xprt() called only once > svcrpc: no need for XPT_DEAD check in svc_xprt_enqueue > svcrpc: svc_tcp_sendto XPT_DEAD check is redundant > > NeilBrown (14): > sunrpc: extract some common sunrpc_cache code from nfsd > sunrpc: use seconds since boot in expiry cache > sunrpc/cache: allow threads to block while waiting for cache update. > sunrpc: close connection when a request is irretrievably lost. > nfsd: disable deferral for NFSv4 > nfsd/idmap: drop special request deferal in favour of improved default. > svcauth_gss: replace a trivial 'switch' with an 'if' > sunrpc/cache: change deferred-request hash table to use hlist. > sunrpc/cache: fix recent breakage of cache_clean_deferred > nfsd: formally deprecate legacy nfsd syscall interface > nfsd: allow deprecated interface to be compiled out. > sunrpc: fix race in new cache_wait code. > sunrpc: Simplify cache_defer_req and related functions. > sunrpc/cache: centralise handling of size limit on deferred list. > > Pavel Emelyanov (37): > nfsd: Export get_task_comm for nfsd > sunrpc: Pass the ip_map_parse's cd to lower calls > sunrpc: Make xprt auth cache release work with the xprt > sunrpc: Pass xprt to cached get/put routines > sunrpc: Add net to pure API calls > sunrpc: Add routines that allow registering per-net caches > sunrpc: Tag svc_xprt with net > sunrpc: The per-net skeleton > sunrpc: Make the /proc/net/rpc appear in net namespaces > sunrpc: Make the ip_map_cache be per-net > sunrpc: Factor out rpc_xprt allocation > sunrpc: Factor out rpc_xprt freeing > sunrpc: Add net argument to svc_create_xprt > sunrpc: Pull net argument downto svc_create_socket > sunrpc: Add net to rpc_create_args > sunrpc: Add net to xprt_create > sunrpc: Tag rpc_xprt with net > net: Export __sock_create > sunrpc: Create sockets in net namespaces > sunrpc: Use helper to set v4 mapped addr in ip_map_parse > sunrpc: Remove unused sock arg from xs_get_srcport > sunrpc: Remove unused sock arg from xs_next_srcport > sunrpc: Get xprt pointer once in xs_tcp_setup_socket > sunrpc: Remove duplicate xprt/transport arguments from calls > sunrpc: Factor out udp sockets creation > sunrpc: Factor out v4 sockets creation > sunrpc: Factor out v6 sockets creation > sunrpc: Call xs_create_sockX directly from setup_socket > sunrpc: Merge the xs_bind code > sunrpc: Merge xs_create_sock code > sunrpc: Pass family to setup_socket calls > sunrpc: Remove TCP worker wrappers > sunrpc: Remove UDP worker wrappers > sunrpc: Remove useless if (task == NULL) from xprt_reserve_xprt > sunrpc: Don't return NULL from rpcb_create > sunrpc: Remove dead "else" branch from bc xprt creation > sunrpc: Turn list_for_each-s into the ..._entry-s > > Stephen Rothwell (1): > sunrpc: fix up rpcauth_remove_module section mismatch > > Tejun Heo (1): > sunrpc/xprtrdma: clean up workqueue usage > > Tom Tucker (2): > svcrdma: Change DMA mapping logic to avoid the page_address kernel API > svcrdma: Cleanup DMA unmapping in error paths. > > Documentation/feature-removal-schedule.txt | 10 + > fs/Makefile | 5 +- > fs/compat.c | 2 +- > fs/lockd/host.c | 1 + > fs/lockd/mon.c | 1 + > fs/lockd/svc.c | 2 +- > fs/lockd/svc4proc.c | 2 - > fs/lockd/svclock.c | 31 ++- > fs/lockd/svcproc.c | 2 - > fs/nfs/callback.c | 4 +- > fs/nfs/client.c | 1 + > fs/nfs/dns_resolve.c | 6 +- > fs/nfs/mount_clnt.c | 2 + > fs/nfsd/Kconfig | 12 + > fs/nfsd/export.c | 73 +++-- > fs/nfsd/nfs4callback.c | 245 ++++++++------ > fs/nfsd/nfs4idmap.c | 105 +------ > fs/nfsd/nfs4proc.c | 7 +- > fs/nfsd/nfs4state.c | 491 ++++++++++++++++++---------- > fs/nfsd/nfs4xdr.c | 18 +- > fs/nfsd/nfsctl.c | 26 ++- > fs/nfsd/nfsd.h | 2 +- > fs/nfsd/nfssvc.c | 5 +- > fs/nfsd/state.h | 52 ++- > include/linux/net.h | 2 + > include/linux/nfs4.h | 3 + > include/linux/sunrpc/auth.h | 4 +- > include/linux/sunrpc/cache.h | 37 ++- > include/linux/sunrpc/clnt.h | 1 + > include/linux/sunrpc/gss_spkm3.h | 55 --- > include/linux/sunrpc/stats.h | 23 +- > include/linux/sunrpc/svc_xprt.h | 32 ++- > include/linux/sunrpc/svcauth.h | 17 +- > include/linux/sunrpc/xprt.h | 4 + > net/socket.c | 3 +- > net/sunrpc/Kconfig | 19 - > net/sunrpc/auth.c | 2 +- > net/sunrpc/auth_generic.c | 2 +- > net/sunrpc/auth_gss/Makefile | 5 - > net/sunrpc/auth_gss/gss_spkm3_mech.c | 247 -------------- > net/sunrpc/auth_gss/gss_spkm3_seal.c | 186 ----------- > net/sunrpc/auth_gss/gss_spkm3_token.c | 267 --------------- > net/sunrpc/auth_gss/gss_spkm3_unseal.c | 127 ------- > net/sunrpc/auth_gss/svcauth_gss.c | 51 ++-- > net/sunrpc/cache.c | 288 ++++++++++++----- > net/sunrpc/clnt.c | 1 + > net/sunrpc/netns.h | 19 + > net/sunrpc/rpcb_clnt.c | 4 +- > net/sunrpc/stats.c | 43 ++- > net/sunrpc/sunrpc_syms.c | 58 +++- > net/sunrpc/svc.c | 3 + > net/sunrpc/svc_xprt.c | 59 +++-- > net/sunrpc/svcauth_unix.c | 194 ++++++++--- > net/sunrpc/svcsock.c | 27 +- > net/sunrpc/xprt.c | 39 ++- > net/sunrpc/xprtrdma/svc_rdma.c | 11 +- > net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 19 +- > net/sunrpc/xprtrdma/svc_rdma_sendto.c | 82 ++++-- > net/sunrpc/xprtrdma/svc_rdma_transport.c | 49 ++-- > net/sunrpc/xprtrdma/transport.c | 25 +- > net/sunrpc/xprtsock.c | 358 +++++++------------- > 61 files changed, 1534 insertions(+), 1937 deletions(-) > delete mode 100644 include/linux/sunrpc/gss_spkm3.h > delete mode 100644 net/sunrpc/auth_gss/gss_spkm3_mech.c > delete mode 100644 net/sunrpc/auth_gss/gss_spkm3_seal.c > delete mode 100644 net/sunrpc/auth_gss/gss_spkm3_token.c > delete mode 100644 net/sunrpc/auth_gss/gss_spkm3_unseal.c > create mode 100644 net/sunrpc/netns.h > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-26 17:22 ` J. Bruce Fields @ 2010-10-26 17:39 ` Linus Torvalds 2010-10-26 17:46 ` J. Bruce Fields 0 siblings, 1 reply; 62+ messages in thread From: Linus Torvalds @ 2010-10-26 17:39 UTC (permalink / raw) To: J. Bruce Fields; +Cc: linux-nfs, linux-kernel On Tue, Oct 26, 2010 at 10:22 AM, J. Bruce Fields <bfields@fieldses.org> wrote: > On Tue, Oct 26, 2010 at 12:45:50PM -0400, J. Bruce Fields wrote: >> Please pull the following nfsd updates from the for-2.6.37 branch at: >> >> git://linux-nfs.org/~bfields/linux.git for-2.6.37 > > By the way, apologies, I see there's a conflict with upstream--looks > obvious (conflicting appends to > Documentation/feature-removal-schedule.txt), but I'm happy to fix it up > and add a merge commit on that branch if it saves you time. No, as mentioned elsewhere in other similar threads, I actually prefer people _not_ to pre-merge. It results in significantly uglier history (especially since I tend to merge a lot during the merge window, so your pre-merge will not obviate my later merge, because my tree will have moved forward), since criss-crossing merges really end up making the gitk graph rather harder to read. But to me personally, the "uglier history" is actually the secondary concern - I much prefer seeing the conflicts rather than have them hidden from me by a pre-merge. Now, for something like feature-removal-schedule.txt, the conflicts are usually not interesting from a development angle, but on the other hand they are also so trivial that they don't inconvenience me and I'd rather just do them myself and avoid the history pollution of extra merges. But then when the conflicts get more interesting and touching actual code, I end up spending more time at them, but I _still_ want to see them, because they inform me where different people ended up stepping on each others code. And those conflicts are interesting to me as a top-level maintainer, because it either shows code organizational problems, or it shows where people really were touching the same code for good reasons. And regardless of the reason for the conflict, I like being aware of it, because merges like that are often indicative of "this might cause issues". If the merge conflict ends up being so involved that I can't resolve it, I'll push back an email saying so. It happens, but not very often I'm happy to say. Our source organization is good enough that we seldom have really complicated merges. One thing that some maintainers do is to do a private pre-merge just to see if there are problems (and because it can give a more accurate diffstat in the presense of criss-cross merges or duplicate changes) and then if that merge is complex, expose both an unmerged branch and a "here, if you have issues, this is how I resolved the merge" branch. What I usually end up doing in that case is actually to do the merge myself anyway (because of all the issues above - I just want to know what is going on), but then also compare my end result with the maintainer pre-merge result, just to verify. But even that kind of "both non-merged and pre-merged" pull requests should be reserved just for cases where there really was a somewhat involved conflict. For trivial conflicts like a feature removal doc clash, don't even bother. I do several trivial merges a day, it's normal and it doesn't really take me any appreciable extra time. Git command line of the day: git log --oneline v2.6.36.. --merges --author=Torvalds --grep=onflict to see at least a partial list of the trivial conflicts I've done in just this merge window. It's 3-4 per day, it's perfectly normal. Linus ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 @ 2010-10-26 17:46 ` J. Bruce Fields 0 siblings, 0 replies; 62+ messages in thread From: J. Bruce Fields @ 2010-10-26 17:46 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-nfs, linux-kernel On Tue, Oct 26, 2010 at 10:39:41AM -0700, Linus Torvalds wrote: > But even that kind of "both non-merged and pre-merged" pull requests > should be reserved just for cases where there really was a somewhat > involved conflict. For trivial conflicts like a feature removal doc > clash, don't even bother. I do several trivial merges a day, it's > normal and it doesn't really take me any appreciable extra time. Git > command line of the day: > > git log --oneline v2.6.36.. --merges --author=Torvalds --grep=onflict > > to see at least a partial list of the trivial conflicts I've done in > just this merge window. It's 3-4 per day, it's perfectly normal. Makes sense, thanks! --b. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 @ 2010-10-26 17:46 ` J. Bruce Fields 0 siblings, 0 replies; 62+ messages in thread From: J. Bruce Fields @ 2010-10-26 17:46 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-nfs, linux-kernel On Tue, Oct 26, 2010 at 10:39:41AM -0700, Linus Torvalds wrote: > But even that kind of "both non-merged and pre-merged" pull requests > should be reserved just for cases where there really was a somewhat > involved conflict. For trivial conflicts like a feature removal doc > clash, don't even bother. I do several trivial merges a day, it's > normal and it doesn't really take me any appreciable extra time. Git > command line of the day: > > git log --oneline v2.6.36.. --merges --author=Torvalds --grep=onflict > > to see at least a partial list of the trivial conflicts I've done in > just this merge window. It's 3-4 per day, it's perfectly normal. Makes sense, thanks! --b. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-26 16:45 nfsd changes for 2.6.37 J. Bruce Fields 2010-10-26 17:22 ` J. Bruce Fields @ 2010-10-26 20:18 ` Arnd Bergmann 2010-10-26 20:35 ` Bryan Schumaker 1 sibling, 1 reply; 62+ messages in thread From: Arnd Bergmann @ 2010-10-26 20:18 UTC (permalink / raw) To: J. Bruce Fields; +Cc: Linus Torvalds, linux-nfs, linux-kernel, Bryan Schumaker On Tuesday 26 October 2010 18:45:50 J. Bruce Fields wrote: > Bryan Schumaker (1): > lockd: Mostly remove BKL from the server Could you explain the "mostly" part of this commit? The commit message only says "This patch removes all but one call to lock_kernel() from the server." This one call is what keeps us from removing the BKL from fs/locks.c because I can't tell if you still suspect that lockd needs to lock against posix file locks or if there was a different reason for leaving it in. I can't think of anything else that this might be locking against because everything that might interact with lockd now does not use the BKL any more and lockd is single-threaded by definition. My guess is that the only thing that really needs to lock_flocks() in lockd are the nlm_file_inuse and nlm_traverse_locks functions because they traverse the inode->i_flock list. All the exported functions from fs/lock.c take care of locking in their own way (possibly not lease_get_time, as I just discovered, but that was never called under the BKL...). Arnd ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-26 20:18 ` Arnd Bergmann @ 2010-10-26 20:35 ` Bryan Schumaker 2010-10-26 20:55 ` Arnd Bergmann 0 siblings, 1 reply; 62+ messages in thread From: Bryan Schumaker @ 2010-10-26 20:35 UTC (permalink / raw) To: Arnd Bergmann; +Cc: J. Bruce Fields, Linus Torvalds, linux-nfs, linux-kernel Hi I left the one call because I was unable to figure out what was being protected with the BKL in that section of the code. I figured I would leave it for the maintainers, since they know more about the code than I do. Bryan On 10/26/2010 04:18 PM, Arnd Bergmann wrote: > On Tuesday 26 October 2010 18:45:50 J. Bruce Fields wrote: >> Bryan Schumaker (1): >> lockd: Mostly remove BKL from the server > > Could you explain the "mostly" part of this commit? > > The commit message only says "This patch removes all > but one call to lock_kernel() from the server." This one > call is what keeps us from removing the BKL from fs/locks.c > because I can't tell if you still suspect that lockd > needs to lock against posix file locks or if there was > a different reason for leaving it in. > > I can't think of anything else that this might be locking > against because everything that might interact with lockd > now does not use the BKL any more and lockd is > single-threaded by definition. > > My guess is that the only thing that really needs to > lock_flocks() in lockd are the nlm_file_inuse and > nlm_traverse_locks functions because they traverse > the inode->i_flock list. All the exported functions > from fs/lock.c take care of locking in their own way > (possibly not lease_get_time, as I just discovered, > but that was never called under the BKL...). > > Arnd > -- > To unsubscribe from this list: send the line "unsubscribe linux-nfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-26 20:35 ` Bryan Schumaker @ 2010-10-26 20:55 ` Arnd Bergmann 2010-10-26 21:02 ` Linus Torvalds 0 siblings, 1 reply; 62+ messages in thread From: Arnd Bergmann @ 2010-10-26 20:55 UTC (permalink / raw) To: Bryan Schumaker; +Cc: J. Bruce Fields, Linus Torvalds, linux-nfs, linux-kernel On Tuesday 26 October 2010 22:35:50 Bryan Schumaker wrote: > Hi > > I left the one call because I was unable to figure out what > was being protected with the BKL in that section of the code. > I figured I would leave it for the maintainers, since they > know more about the code than I do. My guess is that we need something like the patch below. This moves the lock_flocks() call into the places where it is needed and can be safely converted into a spinlock because that code does not sleep. Finally, we can build fs/locks.c, nfs and nfsd with CONFIG_BKL disabled. *not tested* Signed-off-by: Arnd Bergmann <arnd@arndb.de> --- Kconfig | 1 - lockd/svc.c | 11 ----------- lockd/svcsubs.c | 9 ++++++++- locks.c | 5 +++-- nfs/Kconfig | 1 - nfsd/Kconfig | 1 - 6 files changed, 11 insertions(+), 17 deletions(-) diff --git a/fs/Kconfig b/fs/Kconfig index 65781de..3d18530 100644 --- a/fs/Kconfig +++ b/fs/Kconfig @@ -50,7 +50,6 @@ endif # BLOCK config FILE_LOCKING bool "Enable POSIX file locking API" if EMBEDDED default y - select BKL # while lockd still uses it. help This option enables standard file locking support, required for filesystems like NFS and for the flock() system diff --git a/fs/lockd/svc.c b/fs/lockd/svc.c index b13aabc..abfff9d 100644 --- a/fs/lockd/svc.c +++ b/fs/lockd/svc.c @@ -22,7 +22,6 @@ #include <linux/in.h> #include <linux/uio.h> #include <linux/smp.h> -#include <linux/smp_lock.h> #include <linux/mutex.h> #include <linux/kthread.h> #include <linux/freezer.h> @@ -130,15 +129,6 @@ lockd(void *vrqstp) dprintk("NFS locking service started (ver " LOCKD_VERSION ").\n"); - /* - * FIXME: it would be nice if lockd didn't spend its entire life - * running under the BKL. At the very least, it would be good to - * have someone clarify what it's intended to protect here. I've - * seen some handwavy posts about posix locking needing to be - * done under the BKL, but it's far from clear. - */ - lock_kernel(); - if (!nlm_timeout) nlm_timeout = LOCKD_DFLT_TIMEO; nlmsvc_timeout = nlm_timeout * HZ; @@ -195,7 +185,6 @@ lockd(void *vrqstp) if (nlmsvc_ops) nlmsvc_invalidate_all(); nlm_shutdown_hosts(); - unlock_kernel(); return 0; } diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c index d0ef94c..1ca0679 100644 --- a/fs/lockd/svcsubs.c +++ b/fs/lockd/svcsubs.c @@ -170,6 +170,7 @@ nlm_traverse_locks(struct nlm_host *host, struct nlm_file *file, again: file->f_locks = 0; + lock_flocks(); /* protects i_flock list */ for (fl = inode->i_flock; fl; fl = fl->fl_next) { if (fl->fl_lmops != &nlmsvc_lock_operations) continue; @@ -181,6 +182,7 @@ again: if (match(lockhost, host)) { struct file_lock lock = *fl; + unlock_flocks(); lock.fl_type = F_UNLCK; lock.fl_start = 0; lock.fl_end = OFFSET_MAX; @@ -192,6 +194,7 @@ again: goto again; } } + unlock_flocks(); return 0; } @@ -226,10 +229,14 @@ nlm_file_inuse(struct nlm_file *file) if (file->f_count || !list_empty(&file->f_blocks) || file->f_shares) return 1; + lock_flocks(); for (fl = inode->i_flock; fl; fl = fl->fl_next) { - if (fl->fl_lmops == &nlmsvc_lock_operations) + if (fl->fl_lmops == &nlmsvc_lock_operations) { + unlock_flocks(); return 1; + } } + unlock_flocks(); file->f_locks = 0; return 0; } diff --git a/fs/locks.c b/fs/locks.c index 8b2b6ad..a417901 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -142,6 +142,7 @@ int lease_break_time = 45; static LIST_HEAD(file_lock_list); static LIST_HEAD(blocked_list); +static DEFINE_SPINLOCK(file_lock_lock); /* * Protects the two list heads above, plus the inode->i_flock list @@ -149,13 +150,13 @@ static LIST_HEAD(blocked_list); */ void lock_flocks(void) { - lock_kernel(); + spin_lock(&file_lock_lock); } EXPORT_SYMBOL_GPL(lock_flocks); void unlock_flocks(void) { - unlock_kernel(); + spin_unlock(&file_lock_lock); } EXPORT_SYMBOL_GPL(unlock_flocks); diff --git a/fs/nfs/Kconfig b/fs/nfs/Kconfig index fd66765..ba30665 100644 --- a/fs/nfs/Kconfig +++ b/fs/nfs/Kconfig @@ -1,7 +1,6 @@ config NFS_FS tristate "NFS client support" depends on INET && FILE_LOCKING - depends on BKL # fix as soon as lockd is done select LOCKD select SUNRPC select NFS_ACL_SUPPORT if NFS_V3_ACL diff --git a/fs/nfsd/Kconfig b/fs/nfsd/Kconfig index 31a78fc..18b3e89 100644 --- a/fs/nfsd/Kconfig +++ b/fs/nfsd/Kconfig @@ -2,7 +2,6 @@ config NFSD tristate "NFS server support" depends on INET depends on FILE_LOCKING - depends on BKL # fix as soon as lockd is done select LOCKD select SUNRPC select EXPORTFS ^ permalink raw reply related [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-26 20:55 ` Arnd Bergmann @ 2010-10-26 21:02 ` Linus Torvalds 2010-10-26 21:24 ` J. Bruce Fields 0 siblings, 1 reply; 62+ messages in thread From: Linus Torvalds @ 2010-10-26 21:02 UTC (permalink / raw) To: Arnd Bergmann; +Cc: Bryan Schumaker, J. Bruce Fields, linux-nfs, linux-kernel On Tue, Oct 26, 2010 at 1:55 PM, Arnd Bergmann <arnd@arndb.de> wrote: > > Finally, we can build fs/locks.c, nfs and nfsd with CONFIG_BKL > disabled. > > *not tested* Ok, please make this so, so that we finally can effectively get rid of the BKL in this release (I don't really care about the random old drivers that may be "depends on BKL"). But that obviously requires at least some minimal testing by somebody who is using NFS heavily, including locking. I assume that the connectathon tests include file locking? And I presume that the nfs developers run those at least semi-regularly and could do at least that kind of minimal testing? Linus ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-26 21:02 ` Linus Torvalds @ 2010-10-26 21:24 ` J. Bruce Fields 2010-10-26 21:37 ` Linus Torvalds 0 siblings, 1 reply; 62+ messages in thread From: J. Bruce Fields @ 2010-10-26 21:24 UTC (permalink / raw) To: Linus Torvalds; +Cc: Arnd Bergmann, Bryan Schumaker, linux-nfs, linux-kernel On Tue, Oct 26, 2010 at 02:02:05PM -0700, Linus Torvalds wrote: > On Tue, Oct 26, 2010 at 1:55 PM, Arnd Bergmann <arnd@arndb.de> wrote: > > > > Finally, we can build fs/locks.c, nfs and nfsd with CONFIG_BKL > > disabled. > > > > *not tested* > > Ok, please make this so, so that we finally can effectively get rid of > the BKL in this release (I don't really care about the random old > drivers that may be "depends on BKL"). > > But that obviously requires at least some minimal testing by somebody > who is using NFS heavily, including locking. I assume that the > connectathon tests include file locking? And I presume that the nfs > developers run those at least semi-regularly and could do at least > that kind of minimal testing? I did a couple connectathon runs just now with no obvious ill effects except for some sleep-within-spinlock warnings in the lease code. Yeah, connecthon does include lock tests, though they're just basic correctness tests. --b. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-26 21:24 ` J. Bruce Fields @ 2010-10-26 21:37 ` Linus Torvalds 2010-10-26 21:44 ` J. Bruce Fields 0 siblings, 1 reply; 62+ messages in thread From: Linus Torvalds @ 2010-10-26 21:37 UTC (permalink / raw) To: J. Bruce Fields; +Cc: Arnd Bergmann, Bryan Schumaker, linux-nfs, linux-kernel On Tue, Oct 26, 2010 at 2:24 PM, J. Bruce Fields <bfields@fieldses.org> wrote: > > I did a couple connectathon runs just now with no obvious ill effects > except for some sleep-within-spinlock warnings in the lease code. Hmm. Those sleep-within-spinlock warnings are very likely serious bugs. Can you quote the whole warning with stack trace? Linus ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-26 21:37 ` Linus Torvalds @ 2010-10-26 21:44 ` J. Bruce Fields 2010-10-26 22:11 ` J. Bruce Fields 0 siblings, 1 reply; 62+ messages in thread From: J. Bruce Fields @ 2010-10-26 21:44 UTC (permalink / raw) To: Linus Torvalds; +Cc: Arnd Bergmann, Bryan Schumaker, linux-nfs, linux-kernel On Tue, Oct 26, 2010 at 02:37:26PM -0700, Linus Torvalds wrote: > On Tue, Oct 26, 2010 at 2:24 PM, J. Bruce Fields <bfields@fieldses.org> wrote: > > > > I did a couple connectathon runs just now with no obvious ill effects > > except for some sleep-within-spinlock warnings in the lease code. > > Hmm. Those sleep-within-spinlock warnings are very likely serious > bugs. Yeah, didn't mean to belittle them. > Can you quote the whole warning with stack trace? It's just obvious allocations in setlease: BUG: sleeping function called from invalid context at mm/slab.c:3101 in_atomic(): 1, irqs_disabled(): 0, pid: 4345, name: lease_tests 1 lock held by lease_tests/4345: #0: (file_lock_lock){+.+.+.}, at: [<ffffffff81128be5>] lock_flocks+0x15/0x20 Pid: 4345, comm: lease_tests Not tainted 2.6.36-05858-gbd5e20b #1028 Call Trace: [<ffffffff8103141d>] __might_sleep+0x10d/0x140 [<ffffffff810e3ad3>] kmem_cache_alloc+0x1f3/0x230 [<ffffffff8112a4d2>] generic_setlease+0x112/0x2c0 [<ffffffff8112a6b5>] __vfs_setlease+0x35/0x40 [<ffffffff8112acfe>] fcntl_setlease+0xce/0x180 [<ffffffff810f7c2e>] sys_fcntl+0x2fe/0x630 [<ffffffff81961999>] ? trace_hardirqs_on_thunk+0x3a/0x3f [<ffffffff81002658>] system_call_fastpath+0x16/0x1b I'm testing a patch. --b. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-26 21:44 ` J. Bruce Fields @ 2010-10-26 22:11 ` J. Bruce Fields 2010-10-26 22:41 ` J. Bruce Fields 2010-10-27 7:21 ` Arnd Bergmann 0 siblings, 2 replies; 62+ messages in thread From: J. Bruce Fields @ 2010-10-26 22:11 UTC (permalink / raw) To: Linus Torvalds; +Cc: Arnd Bergmann, Bryan Schumaker, linux-nfs, linux-kernel On Tue, Oct 26, 2010 at 05:44:41PM -0400, J. Bruce Fields wrote: > On Tue, Oct 26, 2010 at 02:37:26PM -0700, Linus Torvalds wrote: > > On Tue, Oct 26, 2010 at 2:24 PM, J. Bruce Fields <bfields@fieldses.org> wrote: > > > > > > I did a couple connectathon runs just now with no obvious ill effects > > > except for some sleep-within-spinlock warnings in the lease code. > > > > Hmm. Those sleep-within-spinlock warnings are very likely serious > > bugs. > > Yeah, didn't mean to belittle them. > > > Can you quote the whole warning with stack trace? > > It's just obvious allocations in setlease: > > BUG: sleeping function called from invalid context at mm/slab.c:3101 > in_atomic(): 1, irqs_disabled(): 0, pid: 4345, name: lease_tests > 1 lock held by lease_tests/4345: > #0: (file_lock_lock){+.+.+.}, at: [<ffffffff81128be5>] lock_flocks+0x15/0x20 > Pid: 4345, comm: lease_tests Not tainted 2.6.36-05858-gbd5e20b #1028 > Call Trace: > [<ffffffff8103141d>] __might_sleep+0x10d/0x140 > [<ffffffff810e3ad3>] kmem_cache_alloc+0x1f3/0x230 > [<ffffffff8112a4d2>] generic_setlease+0x112/0x2c0 > [<ffffffff8112a6b5>] __vfs_setlease+0x35/0x40 > [<ffffffff8112acfe>] fcntl_setlease+0xce/0x180 > [<ffffffff810f7c2e>] sys_fcntl+0x2fe/0x630 > [<ffffffff81961999>] ? trace_hardirqs_on_thunk+0x3a/0x3f > [<ffffffff81002658>] system_call_fastpath+0x16/0x1b > > I'm testing a patch. This works for me. I'm not saying it's correct, but it does at least pass my dumb tests without complaining. --b. diff --git a/fs/locks.c b/fs/locks.c index 02b6e0e..db3afa0 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -1379,7 +1379,9 @@ int generic_setlease(struct file *filp, long arg, struct file_lock **flp) if (error) return error; + lock_flocks(); time_out_leases(inode); + unlock_flocks(); BUG_ON(!(*flp)->fl_lmops->fl_break); @@ -1400,6 +1402,7 @@ int generic_setlease(struct file *filp, long arg, struct file_lock **flp) goto out; } + lock_flocks(); /* * At this point, we know that if there is an exclusive * lease on this file, then we hold it on this filp @@ -1427,28 +1430,31 @@ int generic_setlease(struct file *filp, long arg, struct file_lock **flp) error = -EAGAIN; if ((arg == F_RDLCK && (wrlease_count > 0)) || (arg == F_WRLCK && ((rdlease_count + wrlease_count) > 0))) - goto out; + goto out_unlock; if (my_before != NULL) { *flp = *my_before; error = lease->fl_lmops->fl_change(my_before, arg); - goto out; + goto out_unlock; } error = 0; if (arg == F_UNLCK) - goto out; + goto out_unlock; error = -EINVAL; if (!leases_enable) - goto out; + goto out_unlock; locks_copy_lock(new_fl, lease); locks_insert_lock(before, new_fl); *flp = new_fl; + unlock_flocks(); return 0; +out_unlock: + unlock_flocks(); out: if (new_fl != NULL) locks_free_lock(new_fl); @@ -1495,9 +1501,7 @@ int vfs_setlease(struct file *filp, long arg, struct file_lock **lease) { int error; - lock_flocks(); error = __vfs_setlease(filp, arg, lease); - unlock_flocks(); return error; } @@ -1524,8 +1528,6 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg) if (error) return error; - lock_flocks(); - error = __vfs_setlease(filp, arg, &flp); if (error || arg == F_UNLCK) goto out_unlock; @@ -1541,7 +1543,6 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg) error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0); out_unlock: - unlock_flocks(); return error; } ^ permalink raw reply related [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-26 22:11 ` J. Bruce Fields @ 2010-10-26 22:41 ` J. Bruce Fields 2010-10-27 7:21 ` Arnd Bergmann 1 sibling, 0 replies; 62+ messages in thread From: J. Bruce Fields @ 2010-10-26 22:41 UTC (permalink / raw) To: Linus Torvalds; +Cc: Arnd Bergmann, Bryan Schumaker, linux-nfs, linux-kernel On Tue, Oct 26, 2010 at 06:11:56PM -0400, J. Bruce Fields wrote: > On Tue, Oct 26, 2010 at 05:44:41PM -0400, J. Bruce Fields wrote: > > On Tue, Oct 26, 2010 at 02:37:26PM -0700, Linus Torvalds wrote: > > > On Tue, Oct 26, 2010 at 2:24 PM, J. Bruce Fields <bfields@fieldses.org> wrote: > > > > > > > > I did a couple connectathon runs just now with no obvious ill effects > > > > except for some sleep-within-spinlock warnings in the lease code. > > > > > > Hmm. Those sleep-within-spinlock warnings are very likely serious > > > bugs. > > > > Yeah, didn't mean to belittle them. > > > > > Can you quote the whole warning with stack trace? > > > > It's just obvious allocations in setlease: > > > > BUG: sleeping function called from invalid context at mm/slab.c:3101 > > in_atomic(): 1, irqs_disabled(): 0, pid: 4345, name: lease_tests > > 1 lock held by lease_tests/4345: > > #0: (file_lock_lock){+.+.+.}, at: [<ffffffff81128be5>] lock_flocks+0x15/0x20 > > Pid: 4345, comm: lease_tests Not tainted 2.6.36-05858-gbd5e20b #1028 > > Call Trace: > > [<ffffffff8103141d>] __might_sleep+0x10d/0x140 > > [<ffffffff810e3ad3>] kmem_cache_alloc+0x1f3/0x230 > > [<ffffffff8112a4d2>] generic_setlease+0x112/0x2c0 > > [<ffffffff8112a6b5>] __vfs_setlease+0x35/0x40 > > [<ffffffff8112acfe>] fcntl_setlease+0xce/0x180 > > [<ffffffff810f7c2e>] sys_fcntl+0x2fe/0x630 > > [<ffffffff81961999>] ? trace_hardirqs_on_thunk+0x3a/0x3f > > [<ffffffff81002658>] system_call_fastpath+0x16/0x1b > > > > I'm testing a patch. > > This works for me. > > I'm not saying it's correct, but it does at least pass my dumb tests > without complaining. I can't think of any more missing locking, though I did notice this on a quick look. --b. commit fc42117585672abd3cbf247dd311869233d1606a Author: J. Bruce Fields <bfields@redhat.com> Date: Tue Oct 26 18:25:30 2010 -0400 fix nlmsvc_notify_blocked locking Signed-off-by: J. Bruce Fields <bfields@redhat.com> diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c index 6f1ef00..c462d34 100644 --- a/fs/lockd/svclock.c +++ b/fs/lockd/svclock.c @@ -700,14 +700,16 @@ nlmsvc_notify_blocked(struct file_lock *fl) struct nlm_block *block; dprintk("lockd: VFS unblock notification for block %p\n", fl); + spin_lock(&nlm_blocked_lock); list_for_each_entry(block, &nlm_blocked, b_list) { if (nlm_compare_locks(&block->b_call->a_args.lock.fl, fl)) { - nlmsvc_insert_block(block, 0); + nlmsvc_insert_block_locked(block, 0); + spin_unlock(&nlm_blocked_lock); svc_wake_up(block->b_daemon); return; } } - + spin_unlock(&nlm_blocked_lock); printk(KERN_WARNING "lockd: notification for unknown block!\n"); } ^ permalink raw reply related [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-26 22:11 ` J. Bruce Fields 2010-10-26 22:41 ` J. Bruce Fields @ 2010-10-27 7:21 ` Arnd Bergmann 2010-10-27 8:39 ` Christoph Hellwig 1 sibling, 1 reply; 62+ messages in thread From: Arnd Bergmann @ 2010-10-27 7:21 UTC (permalink / raw) To: J. Bruce Fields; +Cc: Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel On Wednesday 27 October 2010 00:11:56 J. Bruce Fields wrote: > > BUG: sleeping function called from invalid context at mm/slab.c:3101 > > in_atomic(): 1, irqs_disabled(): 0, pid: 4345, name: lease_tests > > 1 lock held by lease_tests/4345: > > #0: (file_lock_lock){+.+.+.}, at: [<ffffffff81128be5>] lock_flocks+0x15/0x20 > > Pid: 4345, comm: lease_tests Not tainted 2.6.36-05858-gbd5e20b #1028 > > Call Trace: > > [<ffffffff8103141d>] __might_sleep+0x10d/0x140 > > [<ffffffff810e3ad3>] kmem_cache_alloc+0x1f3/0x230 > > [<ffffffff8112a4d2>] generic_setlease+0x112/0x2c0 > > [<ffffffff8112a6b5>] __vfs_setlease+0x35/0x40 > > [<ffffffff8112acfe>] fcntl_setlease+0xce/0x180 > > [<ffffffff810f7c2e>] sys_fcntl+0x2fe/0x630 > > [<ffffffff81961999>] ? trace_hardirqs_on_thunk+0x3a/0x3f > > [<ffffffff81002658>] system_call_fastpath+0x16/0x1b > > > > I'm testing a patch. > Thanks for the report! > @@ -1524,8 +1528,6 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg) > if (error) > return error; > > - lock_flocks(); > - > error = __vfs_setlease(filp, arg, &flp); > if (error || arg == F_UNLCK) > goto out_unlock; > @@ -1541,7 +1543,6 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg) > > error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0); > out_unlock: > - unlock_flocks(); > return error; > } If you don't hold lock_flocks throughout fcntl_setlease, the flp variable points to a flock that may get modified by another thread and you call time_out_leases() without holding lock_flocks, which it requires. The two alternatives I can see are to either use GFP_ATOMIC or to take the lock inside of generic_setlease and drop it outside. Neither of the two sounds particularly appealing. Arnd ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-27 7:21 ` Arnd Bergmann @ 2010-10-27 8:39 ` Christoph Hellwig 2010-10-27 13:39 ` J. Bruce Fields 0 siblings, 1 reply; 62+ messages in thread From: Christoph Hellwig @ 2010-10-27 8:39 UTC (permalink / raw) To: Arnd Bergmann Cc: J. Bruce Fields, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel > If you don't hold lock_flocks throughout fcntl_setlease, the flp variable > points to a flock that may get modified by another thread and you call > time_out_leases() without holding lock_flocks, which it requires. > > The two alternatives I can see are to either use GFP_ATOMIC or to > take the lock inside of generic_setlease and drop it outside. > Neither of the two sounds particularly appealing. Do locks_alloc_lock and initialization of the heap struct file_lock in the caller. This also avoids an entirely useless copy of the lock structure. free the passed in structure if we are modifying an existing lock structure. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-27 8:39 ` Christoph Hellwig @ 2010-10-27 13:39 ` J. Bruce Fields 2010-10-27 13:46 ` Arnd Bergmann 0 siblings, 1 reply; 62+ messages in thread From: J. Bruce Fields @ 2010-10-27 13:39 UTC (permalink / raw) To: Christoph Hellwig Cc: Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel Arnd Bergmann wrote: > > If you don't hold lock_flocks throughout fcntl_setlease, the flp variable > > points to a flock that may get modified by another thread and you call > > time_out_leases() without holding lock_flocks, which it requires. Whoops, thanks for catching that. On Wed, Oct 27, 2010 at 04:39:24AM -0400, Christoph Hellwig wrote: > Do locks_alloc_lock and initialization of the heap struct file_lock > in the caller. This also avoids an entirely useless copy of the > lock structure. free the passed in structure if we are modifying > an existing lock structure. That sounds good; I'll give it a try. --b. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-27 13:39 ` J. Bruce Fields @ 2010-10-27 13:46 ` Arnd Bergmann 2010-10-27 14:55 ` J. Bruce Fields 0 siblings, 1 reply; 62+ messages in thread From: Arnd Bergmann @ 2010-10-27 13:46 UTC (permalink / raw) To: J. Bruce Fields Cc: Christoph Hellwig, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel On Wednesday 27 October 2010, J. Bruce Fields wrote: > On Wed, Oct 27, 2010 at 04:39:24AM -0400, Christoph Hellwig wrote: > > Do locks_alloc_lock and initialization of the heap struct file_lock > > in the caller. This also avoids an entirely useless copy of the > > lock structure. free the passed in structure if we are modifying > > an existing lock structure. > > That sounds good; I'll give it a try. I've already started, this is what I've come up with. I don't have a good way to test it though, and don't know as much as you about this code. Feel free to use it as a starting point if you like. It does look correct to me, but I'm not very confident in this. Signed-off-by: Arnd Bergmann <arnd@arndb.de> --- diff --git a/fs/locks.c b/fs/locks.c index 8b2b6ad..c5ec771 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -162,10 +162,11 @@ EXPORT_SYMBOL_GPL(unlock_flocks); static struct kmem_cache *filelock_cache __read_mostly; /* Allocate an empty lock structure. */ -static struct file_lock *locks_alloc_lock(void) +struct file_lock *locks_alloc_lock(void) { return kmem_cache_alloc(filelock_cache, GFP_KERNEL); } +EXPORT_SYMBOL_GPL(locks_alloc_lock); void locks_release_private(struct file_lock *fl) { @@ -1365,7 +1366,6 @@ int fcntl_getlease(struct file *filp) int generic_setlease(struct file *filp, long arg, struct file_lock **flp) { struct file_lock *fl, **before, **my_before = NULL, *lease; - struct file_lock *new_fl = NULL; struct dentry *dentry = filp->f_path.dentry; struct inode *inode = dentry->d_inode; int error, rdlease_count = 0, wrlease_count = 0; @@ -1385,11 +1385,6 @@ int generic_setlease(struct file *filp, long arg, struct file_lock **flp) lease = *flp; if (arg != F_UNLCK) { - error = -ENOMEM; - new_fl = locks_alloc_lock(); - if (new_fl == NULL) - goto out; - error = -EAGAIN; if ((arg == F_RDLCK) && (atomic_read(&inode->i_writecount) > 0)) goto out; @@ -1434,23 +1429,19 @@ int generic_setlease(struct file *filp, long arg, struct file_lock **flp) goto out; } - error = 0; if (arg == F_UNLCK) - goto out; + goto out_unlck; error = -EINVAL; if (!leases_enable) goto out; - locks_copy_lock(new_fl, lease); - locks_insert_lock(before, new_fl); - - *flp = new_fl; + locks_insert_lock(before, lease); +out_unlck: return 0; out: - if (new_fl != NULL) - locks_free_lock(new_fl); + locks_free_lock(lease); return error; } EXPORT_SYMBOL(generic_setlease); @@ -1514,26 +1505,31 @@ EXPORT_SYMBOL_GPL(vfs_setlease); */ int fcntl_setlease(unsigned int fd, struct file *filp, long arg) { - struct file_lock fl, *flp = &fl; + struct file_lock *fl; struct inode *inode = filp->f_path.dentry->d_inode; int error; - locks_init_lock(&fl); - error = lease_init(filp, arg, &fl); - if (error) + fl = locks_alloc_lock(); + if (!fl) + return -ENOMEM; + + locks_init_lock(fl); + error = lease_init(filp, arg, fl); + if (error) { + locks_free_lock(fl); return error; + } lock_flocks(); - - error = __vfs_setlease(filp, arg, &flp); + error = __vfs_setlease(filp, arg, &fl); if (error || arg == F_UNLCK) goto out_unlock; - error = fasync_helper(fd, filp, 1, &flp->fl_fasync); + error = fasync_helper(fd, filp, 1, &fl->fl_fasync); if (error < 0) { /* remove lease just inserted by setlease */ - flp->fl_type = F_UNLCK | F_INPROGRESS; - flp->fl_break_time = jiffies - 10; + fl->fl_type = F_UNLCK | F_INPROGRESS; + fl->fl_break_time = jiffies - 10; time_out_leases(inode); goto out_unlock; } diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index 9019e8e..56347e0 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -2614,7 +2614,7 @@ nfs4_open_delegation(struct svc_fh *fh, struct nfsd4_open *open, struct nfs4_sta struct nfs4_delegation *dp; struct nfs4_stateowner *sop = stp->st_stateowner; int cb_up = atomic_read(&sop->so_client->cl_cb_set); - struct file_lock fl, *flp = &fl; + struct file_lock *fl; int status, flag = 0; flag = NFS4_OPEN_DELEGATE_NONE; @@ -2648,20 +2648,24 @@ nfs4_open_delegation(struct svc_fh *fh, struct nfsd4_open *open, struct nfs4_sta flag = NFS4_OPEN_DELEGATE_NONE; goto out; } - locks_init_lock(&fl); - fl.fl_lmops = &nfsd_lease_mng_ops; - fl.fl_flags = FL_LEASE; - fl.fl_type = flag == NFS4_OPEN_DELEGATE_READ? F_RDLCK: F_WRLCK; - fl.fl_end = OFFSET_MAX; - fl.fl_owner = (fl_owner_t)dp; - fl.fl_file = find_readable_file(stp->st_file); - BUG_ON(!fl.fl_file); - fl.fl_pid = current->tgid; + status = -ENOMEM; + fl = locks_alloc_lock(); + if (!fl) + goto out; + locks_init_lock(fl); + fl->fl_lmops = &nfsd_lease_mng_ops; + fl->fl_flags = FL_LEASE; + fl->fl_type = flag == NFS4_OPEN_DELEGATE_READ? F_RDLCK: F_WRLCK; + fl->fl_end = OFFSET_MAX; + fl->fl_owner = (fl_owner_t)dp; + fl->fl_file = find_readable_file(stp->st_file); + BUG_ON(!fl->fl_file); + fl->fl_pid = current->tgid; /* vfs_setlease checks to see if delegation should be handed out. * the lock_manager callbacks fl_mylease and fl_change are used */ - if ((status = vfs_setlease(fl.fl_file, fl.fl_type, &flp))) { + if ((status = vfs_setlease(fl->fl_file, fl->fl_type, &fl))) { dprintk("NFSD: setlease failed [%d], no delegation\n", status); unhash_delegation(dp); flag = NFS4_OPEN_DELEGATE_NONE; ^ permalink raw reply related [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-27 13:46 ` Arnd Bergmann @ 2010-10-27 14:55 ` J. Bruce Fields 2010-10-27 14:59 ` Christoph Hellwig 0 siblings, 1 reply; 62+ messages in thread From: J. Bruce Fields @ 2010-10-27 14:55 UTC (permalink / raw) To: Arnd Bergmann Cc: Christoph Hellwig, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel On Wed, Oct 27, 2010 at 03:46:08PM +0200, Arnd Bergmann wrote: > On Wednesday 27 October 2010, J. Bruce Fields wrote: > > On Wed, Oct 27, 2010 at 04:39:24AM -0400, Christoph Hellwig wrote: > > > Do locks_alloc_lock and initialization of the heap struct file_lock > > > in the caller. This also avoids an entirely useless copy of the > > > lock structure. free the passed in structure if we are modifying > > > an existing lock structure. > > > > That sounds good; I'll give it a try. > > I've already started, this is what I've come up with. I don't have > a good way to test it though, and don't know as much as you about this > code. Feel free to use it as a starting point if you like. > It does look correct to me, but I'm not very confident in this. Hm, two problems: - We introduce the possibility of fcntl(fd, F_SETLEASE, F_UNLCK) failing with ENOMEM. - fasync_helper(.,.,1,.) sleeps. Argh. --b. > > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > --- > > diff --git a/fs/locks.c b/fs/locks.c > index 8b2b6ad..c5ec771 100644 > --- a/fs/locks.c > +++ b/fs/locks.c > @@ -162,10 +162,11 @@ EXPORT_SYMBOL_GPL(unlock_flocks); > static struct kmem_cache *filelock_cache __read_mostly; > > /* Allocate an empty lock structure. */ > -static struct file_lock *locks_alloc_lock(void) > +struct file_lock *locks_alloc_lock(void) > { > return kmem_cache_alloc(filelock_cache, GFP_KERNEL); > } > +EXPORT_SYMBOL_GPL(locks_alloc_lock); > > void locks_release_private(struct file_lock *fl) > { > @@ -1365,7 +1366,6 @@ int fcntl_getlease(struct file *filp) > int generic_setlease(struct file *filp, long arg, struct file_lock **flp) > { > struct file_lock *fl, **before, **my_before = NULL, *lease; > - struct file_lock *new_fl = NULL; > struct dentry *dentry = filp->f_path.dentry; > struct inode *inode = dentry->d_inode; > int error, rdlease_count = 0, wrlease_count = 0; > @@ -1385,11 +1385,6 @@ int generic_setlease(struct file *filp, long arg, struct file_lock **flp) > lease = *flp; > > if (arg != F_UNLCK) { > - error = -ENOMEM; > - new_fl = locks_alloc_lock(); > - if (new_fl == NULL) > - goto out; > - > error = -EAGAIN; > if ((arg == F_RDLCK) && (atomic_read(&inode->i_writecount) > 0)) > goto out; > @@ -1434,23 +1429,19 @@ int generic_setlease(struct file *filp, long arg, struct file_lock **flp) > goto out; > } > > - error = 0; > if (arg == F_UNLCK) > - goto out; > + goto out_unlck; > > error = -EINVAL; > if (!leases_enable) > goto out; > > - locks_copy_lock(new_fl, lease); > - locks_insert_lock(before, new_fl); > - > - *flp = new_fl; > + locks_insert_lock(before, lease); > +out_unlck: > return 0; > > out: > - if (new_fl != NULL) > - locks_free_lock(new_fl); > + locks_free_lock(lease); > return error; > } > EXPORT_SYMBOL(generic_setlease); > @@ -1514,26 +1505,31 @@ EXPORT_SYMBOL_GPL(vfs_setlease); > */ > int fcntl_setlease(unsigned int fd, struct file *filp, long arg) > { > - struct file_lock fl, *flp = &fl; > + struct file_lock *fl; > struct inode *inode = filp->f_path.dentry->d_inode; > int error; > > - locks_init_lock(&fl); > - error = lease_init(filp, arg, &fl); > - if (error) > + fl = locks_alloc_lock(); > + if (!fl) > + return -ENOMEM; > + > + locks_init_lock(fl); > + error = lease_init(filp, arg, fl); > + if (error) { > + locks_free_lock(fl); > return error; > + } > > lock_flocks(); > - > - error = __vfs_setlease(filp, arg, &flp); > + error = __vfs_setlease(filp, arg, &fl); > if (error || arg == F_UNLCK) > goto out_unlock; > > - error = fasync_helper(fd, filp, 1, &flp->fl_fasync); > + error = fasync_helper(fd, filp, 1, &fl->fl_fasync); > if (error < 0) { > /* remove lease just inserted by setlease */ > - flp->fl_type = F_UNLCK | F_INPROGRESS; > - flp->fl_break_time = jiffies - 10; > + fl->fl_type = F_UNLCK | F_INPROGRESS; > + fl->fl_break_time = jiffies - 10; > time_out_leases(inode); > goto out_unlock; > } > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c > index 9019e8e..56347e0 100644 > --- a/fs/nfsd/nfs4state.c > +++ b/fs/nfsd/nfs4state.c > @@ -2614,7 +2614,7 @@ nfs4_open_delegation(struct svc_fh *fh, struct nfsd4_open *open, struct nfs4_sta > struct nfs4_delegation *dp; > struct nfs4_stateowner *sop = stp->st_stateowner; > int cb_up = atomic_read(&sop->so_client->cl_cb_set); > - struct file_lock fl, *flp = &fl; > + struct file_lock *fl; > int status, flag = 0; > > flag = NFS4_OPEN_DELEGATE_NONE; > @@ -2648,20 +2648,24 @@ nfs4_open_delegation(struct svc_fh *fh, struct nfsd4_open *open, struct nfs4_sta > flag = NFS4_OPEN_DELEGATE_NONE; > goto out; > } > - locks_init_lock(&fl); > - fl.fl_lmops = &nfsd_lease_mng_ops; > - fl.fl_flags = FL_LEASE; > - fl.fl_type = flag == NFS4_OPEN_DELEGATE_READ? F_RDLCK: F_WRLCK; > - fl.fl_end = OFFSET_MAX; > - fl.fl_owner = (fl_owner_t)dp; > - fl.fl_file = find_readable_file(stp->st_file); > - BUG_ON(!fl.fl_file); > - fl.fl_pid = current->tgid; > + status = -ENOMEM; > + fl = locks_alloc_lock(); > + if (!fl) > + goto out; > + locks_init_lock(fl); > + fl->fl_lmops = &nfsd_lease_mng_ops; > + fl->fl_flags = FL_LEASE; > + fl->fl_type = flag == NFS4_OPEN_DELEGATE_READ? F_RDLCK: F_WRLCK; > + fl->fl_end = OFFSET_MAX; > + fl->fl_owner = (fl_owner_t)dp; > + fl->fl_file = find_readable_file(stp->st_file); > + BUG_ON(!fl->fl_file); > + fl->fl_pid = current->tgid; > > /* vfs_setlease checks to see if delegation should be handed out. > * the lock_manager callbacks fl_mylease and fl_change are used > */ > - if ((status = vfs_setlease(fl.fl_file, fl.fl_type, &flp))) { > + if ((status = vfs_setlease(fl->fl_file, fl->fl_type, &fl))) { > dprintk("NFSD: setlease failed [%d], no delegation\n", status); > unhash_delegation(dp); > flag = NFS4_OPEN_DELEGATE_NONE; ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-27 14:55 ` J. Bruce Fields @ 2010-10-27 14:59 ` Christoph Hellwig 2010-10-27 15:16 ` J. Bruce Fields ` (2 more replies) 0 siblings, 3 replies; 62+ messages in thread From: Christoph Hellwig @ 2010-10-27 14:59 UTC (permalink / raw) To: J. Bruce Fields Cc: Arnd Bergmann, Christoph Hellwig, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel On Wed, Oct 27, 2010 at 10:55:39AM -0400, J. Bruce Fields wrote: > Hm, two problems: > - We introduce the possibility of fcntl(fd, F_SETLEASE, F_UNLCK) > failing with ENOMEM. splitt ->setlease into ->add_least and ->delete_lease. No need to pass in a structure for the later. No need to return one either. > - fasync_helper(.,.,1,.) sleeps. Argh. That's not new.. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-27 14:59 ` Christoph Hellwig @ 2010-10-27 15:16 ` J. Bruce Fields 2010-10-27 15:19 ` Christoph Hellwig 2010-10-27 15:23 ` Arnd Bergmann 2010-10-30 21:25 ` J. Bruce Fields 2 siblings, 1 reply; 62+ messages in thread From: J. Bruce Fields @ 2010-10-27 15:16 UTC (permalink / raw) To: Christoph Hellwig Cc: Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel On Wed, Oct 27, 2010 at 10:59:29AM -0400, Christoph Hellwig wrote: > On Wed, Oct 27, 2010 at 10:55:39AM -0400, J. Bruce Fields wrote: > > Hm, two problems: > > - We introduce the possibility of fcntl(fd, F_SETLEASE, F_UNLCK) > > failing with ENOMEM. > > splitt ->setlease into ->add_least and ->delete_lease. No need to pass > in a structure for the later. No need to return one either. Sounds fine to me. > > > - fasync_helper(.,.,1,.) sleeps. Argh. > > That's not new.. So we could do unlock_flocks(); error = fasync_helper(fd, filp, 1, &fl->fl_fasync); lock_flocks(); and say, hey, we didn't introduce any new bug there. But.... I don't know, maybe add a version of fasync_add_entry() that takes a preallocated fasync_struct?? --b. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-27 15:16 ` J. Bruce Fields @ 2010-10-27 15:19 ` Christoph Hellwig 0 siblings, 0 replies; 62+ messages in thread From: Christoph Hellwig @ 2010-10-27 15:19 UTC (permalink / raw) To: J. Bruce Fields Cc: Christoph Hellwig, Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel On Wed, Oct 27, 2010 at 11:16:46AM -0400, J. Bruce Fields wrote: > On Wed, Oct 27, 2010 at 10:59:29AM -0400, Christoph Hellwig wrote: > > On Wed, Oct 27, 2010 at 10:55:39AM -0400, J. Bruce Fields wrote: > > > Hm, two problems: > > > - We introduce the possibility of fcntl(fd, F_SETLEASE, F_UNLCK) > > > failing with ENOMEM. > > > > splitt ->setlease into ->add_least and ->delete_lease. No need to pass > > in a structure for the later. No need to return one either. > > Sounds fine to me. > > > > > > - fasync_helper(.,.,1,.) sleeps. Argh. > > > > That's not new.. > > So we could do > > unlock_flocks(); > error = fasync_helper(fd, filp, 1, &fl->fl_fasync); > lock_flocks(); > > and say, hey, we didn't introduce any new bug there. But.... > > I don't know, maybe add a version of fasync_add_entry() that takes a > preallocated fasync_struct?? Or just convert the lock to a sleeping mutex. Now that we have adaptive spinning the horrible behaviour that Willy saw years ago might not be that bad any more. That'll need some benchmarking, though. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-27 14:59 ` Christoph Hellwig 2010-10-27 15:16 ` J. Bruce Fields @ 2010-10-27 15:23 ` Arnd Bergmann 2010-10-27 15:28 ` J. Bruce Fields ` (2 more replies) 2010-10-30 21:25 ` J. Bruce Fields 2 siblings, 3 replies; 62+ messages in thread From: Arnd Bergmann @ 2010-10-27 15:23 UTC (permalink / raw) To: Christoph Hellwig Cc: J. Bruce Fields, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel On Wednesday 27 October 2010, Christoph Hellwig wrote: > On Wed, Oct 27, 2010 at 10:55:39AM -0400, J. Bruce Fields wrote: > > Hm, two problems: > > - We introduce the possibility of fcntl(fd, F_SETLEASE, F_UNLCK) > > failing with ENOMEM. > > splitt ->setlease into ->add_least and ->delete_lease. No need to pass > in a structure for the later. No need to return one either. That sounds like a good way to get rid of a lot of special cases, too. > > - fasync_helper(.,.,1,.) sleeps. Argh. > > That's not new.. It comes back to the original problem with Bruce's patch though: fcntl_setlease wants to atomically add a lease or fail. If fasync_helper fails, we want to remove the lease that was just added before anyone can see it. At the very least we need to keep the flock from getting freed in another thread while we call fasync_helper without the lock. locks_delete_lock is also called with lock_flocks held and calls fasync_helper... Arnd ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-27 15:23 ` Arnd Bergmann @ 2010-10-27 15:28 ` J. Bruce Fields 2010-10-27 15:31 ` Christoph Hellwig 2010-10-27 16:12 ` Linus Torvalds 2 siblings, 0 replies; 62+ messages in thread From: J. Bruce Fields @ 2010-10-27 15:28 UTC (permalink / raw) To: Arnd Bergmann Cc: Christoph Hellwig, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel On Wed, Oct 27, 2010 at 05:23:59PM +0200, Arnd Bergmann wrote: > On Wednesday 27 October 2010, Christoph Hellwig wrote: > > On Wed, Oct 27, 2010 at 10:55:39AM -0400, J. Bruce Fields wrote: > > > Hm, two problems: > > > - We introduce the possibility of fcntl(fd, F_SETLEASE, F_UNLCK) > > > failing with ENOMEM. > > > > splitt ->setlease into ->add_least and ->delete_lease. No need to pass > > in a structure for the later. No need to return one either. > > That sounds like a good way to get rid of a lot of special cases, too. > > > > - fasync_helper(.,.,1,.) sleeps. Argh. > > > > That's not new.. > > It comes back to the original problem with Bruce's patch though: > fcntl_setlease wants to atomically add a lease or fail. If > fasync_helper fails, we want to remove the lease that was > just added before anyone can see it. At the very least we need > to keep the flock from getting freed in another thread while > we call fasync_helper without the lock. > > locks_delete_lock is also called with lock_flocks held and calls > fasync_helper... Yeah, but just the fasync_remove_entry() case. It would really seem nicer to me if people called fasync_{add,remove}_entry() instead of having this silly fasync_helper() and making the reader remember what the third argument means. --b. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-27 15:23 ` Arnd Bergmann 2010-10-27 15:28 ` J. Bruce Fields @ 2010-10-27 15:31 ` Christoph Hellwig 2010-10-27 16:12 ` Linus Torvalds 2 siblings, 0 replies; 62+ messages in thread From: Christoph Hellwig @ 2010-10-27 15:31 UTC (permalink / raw) To: Arnd Bergmann Cc: Christoph Hellwig, J. Bruce Fields, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel On Wed, Oct 27, 2010 at 05:23:59PM +0200, Arnd Bergmann wrote: > It comes back to the original problem with Bruce's patch though: > fcntl_setlease wants to atomically add a lease or fail. If > fasync_helper fails, we want to remove the lease that was > just added before anyone can see it. At the very least we need > to keep the flock from getting freed in another thread while > we call fasync_helper without the lock. a variant of fasync_add_entry that takes a preallocated fasync_struct would indeed take care of that. > locks_delete_lock is also called with lock_flocks held and calls > fasync_helper... but with on = 0, which calls into fasync_remove_entry, which doesn't sleep. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-27 15:23 ` Arnd Bergmann 2010-10-27 15:28 ` J. Bruce Fields 2010-10-27 15:31 ` Christoph Hellwig @ 2010-10-27 16:12 ` Linus Torvalds 2010-10-27 16:46 ` J. Bruce Fields 2 siblings, 1 reply; 62+ messages in thread From: Linus Torvalds @ 2010-10-27 16:12 UTC (permalink / raw) To: Arnd Bergmann Cc: Christoph Hellwig, J. Bruce Fields, Bryan Schumaker, linux-nfs, linux-kernel [-- Attachment #1: Type: text/plain, Size: 870 bytes --] On Wed, Oct 27, 2010 at 8:23 AM, Arnd Bergmann <arnd@arndb.de> wrote: > > locks_delete_lock is also called with lock_flocks held and calls > fasync_helper... We don't really have to use fasync_helper. In fact, the whole interface is pretty broken for something like file locking, which isn't actually "fasync()". That whole "on/off as an argument" is just crazy. It would be _trivial_ to expose a version of fasync_helper() that takes a pre-allocated fasync_struct for add, and that has separate helper functions for the add/delete case so that you don't have the pointless crazy arguments (for "delete" the "fd" argument is useless, and I do hate "modal" functions that take what they should do as a flag). Then fcntl_setlease() would trivially just allocate the dang thing before. Something like the attached (UNTESTED!) perhaps? Linus [-- Attachment #2: patch.diff --] [-- Type: text/x-patch, Size: 5305 bytes --] fs/fcntl.c | 66 +++++++++++++++++++++++++++++++++++++++------------ fs/locks.c | 17 ++++++++++++- include/linux/fs.h | 5 ++++ 3 files changed, 71 insertions(+), 17 deletions(-) diff --git a/fs/fcntl.c b/fs/fcntl.c index f8cc34f..dcdbc6f 100644 --- a/fs/fcntl.c +++ b/fs/fcntl.c @@ -640,7 +640,7 @@ static void fasync_free_rcu(struct rcu_head *head) * match the state "is the filp on a fasync list". * */ -static int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp) +int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp) { struct fasync_struct *fa, **fp; int result = 0; @@ -666,21 +666,28 @@ static int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp) return result; } +struct fasync_struct *fasync_alloc(void) +{ + return kmem_cache_alloc(fasync_cache, GFP_KERNEL); +} + /* - * Add a fasync entry. Return negative on error, positive if - * added, and zero if did nothing but change an existing one. - * - * NOTE! It is very important that the FASYNC flag always - * match the state "is the filp on a fasync list". + * NOTE! This can be used only for unused fasync entries: + * entries that actually got inserted on the fasync list + * need to be released by rcu - see fasync_remove_entry. */ -static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fapp) +void fasync_free(struct fasync_struct *new) { - struct fasync_struct *new, *fa, **fp; - int result = 0; + kmem_cache_free(fasync_cache, new); +} - new = kmem_cache_alloc(fasync_cache, GFP_KERNEL); - if (!new) - return -ENOMEM; +/* + * Insert a new entry into the fasync list. Return the pointer to the + * old one if we didn't use the new one. + */ +struct fasync_struct *fasync_insert_entry(int fd, struct file *filp, struct fasync_struct **fapp, struct fasync_struct *new) +{ + struct fasync_struct *fa, **fp; spin_lock(&filp->f_lock); spin_lock(&fasync_lock); @@ -691,8 +698,6 @@ static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fa spin_lock_irq(&fa->fa_lock); fa->fa_fd = fd; spin_unlock_irq(&fa->fa_lock); - - kmem_cache_free(fasync_cache, new); goto out; } @@ -702,13 +707,42 @@ static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fa new->fa_fd = fd; new->fa_next = *fapp; rcu_assign_pointer(*fapp, new); - result = 1; filp->f_flags |= FASYNC; out: spin_unlock(&fasync_lock); spin_unlock(&filp->f_lock); - return result; + return fa; +} + +/* + * Add a fasync entry. Return negative on error, positive if + * added, and zero if did nothing but change an existing one. + * + * NOTE! It is very important that the FASYNC flag always + * match the state "is the filp on a fasync list". + */ +static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fapp) +{ + struct fasync_struct *new; + + new = fasync_alloc(); + if (!new) + return -ENOMEM; + + /* + * fasync_insert_entry() returns the old (update) entry if + * it existed. + * + * So free the (unused) new entry and return 0 to let the + * caller know that we didn't add any new fasync entries. + */ + if (fasync_insert_entry(fd, filp, fapp, new)) { + fasync_free(new); + return 0; + } + + return 1; } /* diff --git a/fs/locks.c b/fs/locks.c index 4de3a26..9ff3f66 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -1515,6 +1515,7 @@ EXPORT_SYMBOL_GPL(vfs_setlease); int fcntl_setlease(unsigned int fd, struct file *filp, long arg) { struct file_lock fl, *flp = &fl; + struct fasync_struct *new; struct inode *inode = filp->f_path.dentry->d_inode; int error; @@ -1523,13 +1524,25 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg) if (error) return error; + new = fasync_alloc(); + if (!new) + return -ENOMEM; + lock_flocks(); error = __vfs_setlease(filp, arg, &flp); if (error || arg == F_UNLCK) goto out_unlock; - error = fasync_helper(fd, filp, 1, &flp->fl_fasync); + /* + * fasync_insert_entry() returns the old entry if any. + * If there was no old entry, then it used 'new' and + * inserted it into the fasync list. Clear new so that + * we don't release it here. + */ + if (!fasync_insert_entry(fd, filp, &flp->fl_fasync, new)) + new = NULL; + if (error < 0) { /* remove lease just inserted by setlease */ flp->fl_type = F_UNLCK | F_INPROGRESS; @@ -1541,6 +1554,8 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg) error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0); out_unlock: unlock_flocks(); + if (new) + fasync_free(new); return error; } diff --git a/include/linux/fs.h b/include/linux/fs.h index 240eb1d..d487772 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1310,6 +1310,11 @@ struct fasync_struct { /* SMP safe fasync helpers: */ extern int fasync_helper(int, struct file *, int, struct fasync_struct **); +extern struct fasync_struct *fasync_insert_entry(int, struct file *, struct fasync_struct **, struct fasync_struct *); +extern int fasync_remove_entry(struct file *, struct fasync_struct **); +extern struct fasync_struct *fasync_alloc(void); +extern void fasync_free(struct fasync_struct *); + /* can be called from interrupts */ extern void kill_fasync(struct fasync_struct **, int, int); ^ permalink raw reply related [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 @ 2010-10-27 16:46 ` J. Bruce Fields 0 siblings, 0 replies; 62+ messages in thread From: J. Bruce Fields @ 2010-10-27 16:46 UTC (permalink / raw) To: Linus Torvalds Cc: Arnd Bergmann, Christoph Hellwig, Bryan Schumaker, linux-nfs, linux-kernel On Wed, Oct 27, 2010 at 09:12:06AM -0700, Linus Torvalds wrote: > On Wed, Oct 27, 2010 at 8:23 AM, Arnd Bergmann <arnd@arndb.de> wrote: > > > > locks_delete_lock is also called with lock_flocks held and calls > > fasync_helper... > > We don't really have to use fasync_helper. > > In fact, the whole interface is pretty broken for something like file > locking, which isn't actually "fasync()". That whole "on/off as an > argument" is just crazy. It would be _trivial_ to expose a version of > fasync_helper() that takes a pre-allocated fasync_struct for add, and > that has separate helper functions for the add/delete case so that you > don't have the pointless crazy arguments (for "delete" the "fd" > argument is useless, and I do hate "modal" functions that take what > they should do as a flag). > > Then fcntl_setlease() would trivially just allocate the dang thing before. > > Something like the attached (UNTESTED!) perhaps? Makes sense to me. Testing.... --b. > > Linus > fs/fcntl.c | 66 +++++++++++++++++++++++++++++++++++++++------------ > fs/locks.c | 17 ++++++++++++- > include/linux/fs.h | 5 ++++ > 3 files changed, 71 insertions(+), 17 deletions(-) > > diff --git a/fs/fcntl.c b/fs/fcntl.c > index f8cc34f..dcdbc6f 100644 > --- a/fs/fcntl.c > +++ b/fs/fcntl.c > @@ -640,7 +640,7 @@ static void fasync_free_rcu(struct rcu_head *head) > * match the state "is the filp on a fasync list". > * > */ > -static int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp) > +int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp) > { > struct fasync_struct *fa, **fp; > int result = 0; > @@ -666,21 +666,28 @@ static int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp) > return result; > } > > +struct fasync_struct *fasync_alloc(void) > +{ > + return kmem_cache_alloc(fasync_cache, GFP_KERNEL); > +} > + > /* > - * Add a fasync entry. Return negative on error, positive if > - * added, and zero if did nothing but change an existing one. > - * > - * NOTE! It is very important that the FASYNC flag always > - * match the state "is the filp on a fasync list". > + * NOTE! This can be used only for unused fasync entries: > + * entries that actually got inserted on the fasync list > + * need to be released by rcu - see fasync_remove_entry. > */ > -static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fapp) > +void fasync_free(struct fasync_struct *new) > { > - struct fasync_struct *new, *fa, **fp; > - int result = 0; > + kmem_cache_free(fasync_cache, new); > +} > > - new = kmem_cache_alloc(fasync_cache, GFP_KERNEL); > - if (!new) > - return -ENOMEM; > +/* > + * Insert a new entry into the fasync list. Return the pointer to the > + * old one if we didn't use the new one. > + */ > +struct fasync_struct *fasync_insert_entry(int fd, struct file *filp, struct fasync_struct **fapp, struct fasync_struct *new) > +{ > + struct fasync_struct *fa, **fp; > > spin_lock(&filp->f_lock); > spin_lock(&fasync_lock); > @@ -691,8 +698,6 @@ static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fa > spin_lock_irq(&fa->fa_lock); > fa->fa_fd = fd; > spin_unlock_irq(&fa->fa_lock); > - > - kmem_cache_free(fasync_cache, new); > goto out; > } > > @@ -702,13 +707,42 @@ static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fa > new->fa_fd = fd; > new->fa_next = *fapp; > rcu_assign_pointer(*fapp, new); > - result = 1; > filp->f_flags |= FASYNC; > > out: > spin_unlock(&fasync_lock); > spin_unlock(&filp->f_lock); > - return result; > + return fa; > +} > + > +/* > + * Add a fasync entry. Return negative on error, positive if > + * added, and zero if did nothing but change an existing one. > + * > + * NOTE! It is very important that the FASYNC flag always > + * match the state "is the filp on a fasync list". > + */ > +static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fapp) > +{ > + struct fasync_struct *new; > + > + new = fasync_alloc(); > + if (!new) > + return -ENOMEM; > + > + /* > + * fasync_insert_entry() returns the old (update) entry if > + * it existed. > + * > + * So free the (unused) new entry and return 0 to let the > + * caller know that we didn't add any new fasync entries. > + */ > + if (fasync_insert_entry(fd, filp, fapp, new)) { > + fasync_free(new); > + return 0; > + } > + > + return 1; > } > > /* > diff --git a/fs/locks.c b/fs/locks.c > index 4de3a26..9ff3f66 100644 > --- a/fs/locks.c > +++ b/fs/locks.c > @@ -1515,6 +1515,7 @@ EXPORT_SYMBOL_GPL(vfs_setlease); > int fcntl_setlease(unsigned int fd, struct file *filp, long arg) > { > struct file_lock fl, *flp = &fl; > + struct fasync_struct *new; > struct inode *inode = filp->f_path.dentry->d_inode; > int error; > > @@ -1523,13 +1524,25 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg) > if (error) > return error; > > + new = fasync_alloc(); > + if (!new) > + return -ENOMEM; > + > lock_flocks(); > > error = __vfs_setlease(filp, arg, &flp); > if (error || arg == F_UNLCK) > goto out_unlock; > > - error = fasync_helper(fd, filp, 1, &flp->fl_fasync); > + /* > + * fasync_insert_entry() returns the old entry if any. > + * If there was no old entry, then it used 'new' and > + * inserted it into the fasync list. Clear new so that > + * we don't release it here. > + */ > + if (!fasync_insert_entry(fd, filp, &flp->fl_fasync, new)) > + new = NULL; > + > if (error < 0) { > /* remove lease just inserted by setlease */ > flp->fl_type = F_UNLCK | F_INPROGRESS; > @@ -1541,6 +1554,8 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg) > error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0); > out_unlock: > unlock_flocks(); > + if (new) > + fasync_free(new); > return error; > } > > diff --git a/include/linux/fs.h b/include/linux/fs.h > index 240eb1d..d487772 100644 > --- a/include/linux/fs.h > +++ b/include/linux/fs.h > @@ -1310,6 +1310,11 @@ struct fasync_struct { > > /* SMP safe fasync helpers: */ > extern int fasync_helper(int, struct file *, int, struct fasync_struct **); > +extern struct fasync_struct *fasync_insert_entry(int, struct file *, struct fasync_struct **, struct fasync_struct *); > +extern int fasync_remove_entry(struct file *, struct fasync_struct **); > +extern struct fasync_struct *fasync_alloc(void); > +extern void fasync_free(struct fasync_struct *); > + > /* can be called from interrupts */ > extern void kill_fasync(struct fasync_struct **, int, int); > ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 @ 2010-10-27 16:46 ` J. Bruce Fields 0 siblings, 0 replies; 62+ messages in thread From: J. Bruce Fields @ 2010-10-27 16:46 UTC (permalink / raw) To: Linus Torvalds Cc: Arnd Bergmann, Christoph Hellwig, Bryan Schumaker, linux-nfs, linux-kernel On Wed, Oct 27, 2010 at 09:12:06AM -0700, Linus Torvalds wrote: > On Wed, Oct 27, 2010 at 8:23 AM, Arnd Bergmann <arnd@arndb.de> wrote: > > > > locks_delete_lock is also called with lock_flocks held and calls > > fasync_helper... > > We don't really have to use fasync_helper. > > In fact, the whole interface is pretty broken for something like file > locking, which isn't actually "fasync()". That whole "on/off as an > argument" is just crazy. It would be _trivial_ to expose a version of > fasync_helper() that takes a pre-allocated fasync_struct for add, and > that has separate helper functions for the add/delete case so that you > don't have the pointless crazy arguments (for "delete" the "fd" > argument is useless, and I do hate "modal" functions that take what > they should do as a flag). > > Then fcntl_setlease() would trivially just allocate the dang thing before. > > Something like the attached (UNTESTED!) perhaps? Makes sense to me. Testing.... --b. > > Linus > fs/fcntl.c | 66 +++++++++++++++++++++++++++++++++++++++------------ > fs/locks.c | 17 ++++++++++++- > include/linux/fs.h | 5 ++++ > 3 files changed, 71 insertions(+), 17 deletions(-) > > diff --git a/fs/fcntl.c b/fs/fcntl.c > index f8cc34f..dcdbc6f 100644 > --- a/fs/fcntl.c > +++ b/fs/fcntl.c > @@ -640,7 +640,7 @@ static void fasync_free_rcu(struct rcu_head *head) > * match the state "is the filp on a fasync list". > * > */ > -static int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp) > +int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp) > { > struct fasync_struct *fa, **fp; > int result = 0; > @@ -666,21 +666,28 @@ static int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp) > return result; > } > > +struct fasync_struct *fasync_alloc(void) > +{ > + return kmem_cache_alloc(fasync_cache, GFP_KERNEL); > +} > + > /* > - * Add a fasync entry. Return negative on error, positive if > - * added, and zero if did nothing but change an existing one. > - * > - * NOTE! It is very important that the FASYNC flag always > - * match the state "is the filp on a fasync list". > + * NOTE! This can be used only for unused fasync entries: > + * entries that actually got inserted on the fasync list > + * need to be released by rcu - see fasync_remove_entry. > */ > -static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fapp) > +void fasync_free(struct fasync_struct *new) > { > - struct fasync_struct *new, *fa, **fp; > - int result = 0; > + kmem_cache_free(fasync_cache, new); > +} > > - new = kmem_cache_alloc(fasync_cache, GFP_KERNEL); > - if (!new) > - return -ENOMEM; > +/* > + * Insert a new entry into the fasync list. Return the pointer to the > + * old one if we didn't use the new one. > + */ > +struct fasync_struct *fasync_insert_entry(int fd, struct file *filp, struct fasync_struct **fapp, struct fasync_struct *new) > +{ > + struct fasync_struct *fa, **fp; > > spin_lock(&filp->f_lock); > spin_lock(&fasync_lock); > @@ -691,8 +698,6 @@ static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fa > spin_lock_irq(&fa->fa_lock); > fa->fa_fd = fd; > spin_unlock_irq(&fa->fa_lock); > - > - kmem_cache_free(fasync_cache, new); > goto out; > } > > @@ -702,13 +707,42 @@ static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fa > new->fa_fd = fd; > new->fa_next = *fapp; > rcu_assign_pointer(*fapp, new); > - result = 1; > filp->f_flags |= FASYNC; > > out: > spin_unlock(&fasync_lock); > spin_unlock(&filp->f_lock); > - return result; > + return fa; > +} > + > +/* > + * Add a fasync entry. Return negative on error, positive if > + * added, and zero if did nothing but change an existing one. > + * > + * NOTE! It is very important that the FASYNC flag always > + * match the state "is the filp on a fasync list". > + */ > +static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fapp) > +{ > + struct fasync_struct *new; > + > + new = fasync_alloc(); > + if (!new) > + return -ENOMEM; > + > + /* > + * fasync_insert_entry() returns the old (update) entry if > + * it existed. > + * > + * So free the (unused) new entry and return 0 to let the > + * caller know that we didn't add any new fasync entries. > + */ > + if (fasync_insert_entry(fd, filp, fapp, new)) { > + fasync_free(new); > + return 0; > + } > + > + return 1; > } > > /* > diff --git a/fs/locks.c b/fs/locks.c > index 4de3a26..9ff3f66 100644 > --- a/fs/locks.c > +++ b/fs/locks.c > @@ -1515,6 +1515,7 @@ EXPORT_SYMBOL_GPL(vfs_setlease); > int fcntl_setlease(unsigned int fd, struct file *filp, long arg) > { > struct file_lock fl, *flp = &fl; > + struct fasync_struct *new; > struct inode *inode = filp->f_path.dentry->d_inode; > int error; > > @@ -1523,13 +1524,25 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg) > if (error) > return error; > > + new = fasync_alloc(); > + if (!new) > + return -ENOMEM; > + > lock_flocks(); > > error = __vfs_setlease(filp, arg, &flp); > if (error || arg == F_UNLCK) > goto out_unlock; > > - error = fasync_helper(fd, filp, 1, &flp->fl_fasync); > + /* > + * fasync_insert_entry() returns the old entry if any. > + * If there was no old entry, then it used 'new' and > + * inserted it into the fasync list. Clear new so that > + * we don't release it here. > + */ > + if (!fasync_insert_entry(fd, filp, &flp->fl_fasync, new)) > + new = NULL; > + > if (error < 0) { > /* remove lease just inserted by setlease */ > flp->fl_type = F_UNLCK | F_INPROGRESS; > @@ -1541,6 +1554,8 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg) > error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0); > out_unlock: > unlock_flocks(); > + if (new) > + fasync_free(new); > return error; > } > > diff --git a/include/linux/fs.h b/include/linux/fs.h > index 240eb1d..d487772 100644 > --- a/include/linux/fs.h > +++ b/include/linux/fs.h > @@ -1310,6 +1310,11 @@ struct fasync_struct { > > /* SMP safe fasync helpers: */ > extern int fasync_helper(int, struct file *, int, struct fasync_struct **); > +extern struct fasync_struct *fasync_insert_entry(int, struct file *, struct fasync_struct **, struct fasync_struct *); > +extern int fasync_remove_entry(struct file *, struct fasync_struct **); > +extern struct fasync_struct *fasync_alloc(void); > +extern void fasync_free(struct fasync_struct *); > + > /* can be called from interrupts */ > extern void kill_fasync(struct fasync_struct **, int, int); > ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-27 16:46 ` J. Bruce Fields (?) @ 2010-10-27 17:32 ` Linus Torvalds 2010-10-27 17:40 ` J. Bruce Fields -1 siblings, 1 reply; 62+ messages in thread From: Linus Torvalds @ 2010-10-27 17:32 UTC (permalink / raw) To: J. Bruce Fields Cc: Arnd Bergmann, Christoph Hellwig, Bryan Schumaker, linux-nfs, linux-kernel On Wed, Oct 27, 2010 at 9:46 AM, J. Bruce Fields <bfields@fieldses.org> wrote: > On Wed, Oct 27, 2010 at 09:12:06AM -0700, Linus Torvalds wrote: >> >> Something like the attached (UNTESTED!) perhaps? > > Makes sense to me. Testing.... So I found a buglet in the patch: the NOTE! It is very important that the FASYNC flag always match the state "is the filp on a fasync list". comment should be moved to be associated with "fasync_insert_entry()" rather than "fasync_add_entry()", since it's the insert-entry thing that does the actual FASYNC flag handling. But that incorrect comment placement shouldn't affect testing, obviously ;) Btw, who is going to collect these things assuming it passes testing? Arnd? You? I'll happily sign off on the fasync patch (with the comment movement) assuming it tests out ok, but there's all the other patches too that have been passed around. I really do want to get this into the merge window, because it would be a big shame if we couldn't effectively get rid of the BKL now just because of these kinds of smallish final details, so I'm just checking who wants to step up to the plate to collect it all together and make sure I have it? Linus ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-27 17:32 ` Linus Torvalds @ 2010-10-27 17:40 ` J. Bruce Fields 2010-10-27 18:20 ` Arnd Bergmann 0 siblings, 1 reply; 62+ messages in thread From: J. Bruce Fields @ 2010-10-27 17:40 UTC (permalink / raw) To: Linus Torvalds Cc: Arnd Bergmann, Christoph Hellwig, Bryan Schumaker, linux-nfs, linux-kernel On Wed, Oct 27, 2010 at 10:32:20AM -0700, Linus Torvalds wrote: > On Wed, Oct 27, 2010 at 9:46 AM, J. Bruce Fields <bfields@fieldses.org> wrote: > > On Wed, Oct 27, 2010 at 09:12:06AM -0700, Linus Torvalds wrote: > >> > >> Something like the attached (UNTESTED!) perhaps? > > > > Makes sense to me. Testing.... > > So I found a buglet in the patch: the > > NOTE! It is very important that the FASYNC flag always > match the state "is the filp on a fasync list". > > comment should be moved to be associated with "fasync_insert_entry()" > rather than "fasync_add_entry()", since it's the insert-entry thing > that does the actual FASYNC flag handling. > > But that incorrect comment placement shouldn't affect testing, obviously ;) > > Btw, who is going to collect these things assuming it passes testing? > Arnd? You? I'll happily sign off on the fasync patch (with the comment > movement) assuming it tests out ok, but there's all the other patches > too that have been passed around. I really do want to get this into > the merge window, because it would be a big shame if we couldn't > effectively get rid of the BKL now just because of these kinds of > smallish final details, so I'm just checking who wants to step up to > the plate to collect it all together and make sure I have it? Doesn't matter to me. What I've currently got (Arnd's patches with some minor fixes from me, and your patch, all in a poorly changelogged jumble) is in git://linux-nfs.org/~bfields/linux-topics.git TMP-BKL-TESTING You wouldn't want to merge that as is, but it does seem to work. --b. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-27 17:40 ` J. Bruce Fields @ 2010-10-27 18:20 ` Arnd Bergmann 2010-10-27 18:42 ` Linus Torvalds 0 siblings, 1 reply; 62+ messages in thread From: Arnd Bergmann @ 2010-10-27 18:20 UTC (permalink / raw) To: J. Bruce Fields Cc: Linus Torvalds, Christoph Hellwig, Bryan Schumaker, linux-nfs, linux-kernel On Wednesday 27 October 2010 19:40:16 J. Bruce Fields wrote: > Doesn't matter to me. What I've currently got (Arnd's patches with some > minor fixes from me, and your patch, all in a poorly changelogged > jumble) is in > > git://linux-nfs.org/~bfields/linux-topics.git TMP-BKL-TESTING > > You wouldn't want to merge that as is, but it does seem to work. Ok, then let's use your tree with proper changelogs. That should be the easiest way to make sure that the code you test is the one that gets in. Arnd ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-27 18:20 ` Arnd Bergmann @ 2010-10-27 18:42 ` Linus Torvalds 2010-10-27 18:43 ` Linus Torvalds 0 siblings, 1 reply; 62+ messages in thread From: Linus Torvalds @ 2010-10-27 18:42 UTC (permalink / raw) To: Arnd Bergmann Cc: J. Bruce Fields, Christoph Hellwig, Bryan Schumaker, linux-nfs, linux-kernel [-- Attachment #1: Type: text/plain, Size: 669 bytes --] On Wed, Oct 27, 2010 at 11:20 AM, Arnd Bergmann <arnd@arndb.de> wrote: > > Ok, then let's use your tree with proper changelogs. That should be the > easiest way to make sure that the code you test is the one that gets in. Half of them are yours, and the one that is mine needs the comment fixup. Can you organize it all for me to pull? Here's my patch again, with the comment move and a suggested commit message (and my sign-off and Bruce's tested-by: the only thing that changed in the patch was literally some comment placement so the tested-by should still be perfectly valid). Feel free to edit the message/patch to your hearts content. Thanks, Linus [-- Attachment #2: patch.diff --] [-- Type: text/x-patch, Size: 6020 bytes --] From: Linus Torvalds <torvalds@linux-foundation.org> Subject: fasync: re-organize fasync entry insertion to allow it under a spinlock You currently cannot use "fasync_helper()" in an atomic environment to insert a new fasync entry, because it will need to allocate the new "struct fasync_struct". Yet fcntl_setlease() wants to call this under lock_flocks(), which is in the process of being converted from the BKL to a spinlock. In order to fix this, this abstracts out the actual fasync list insertion and the fasync allocations into functions of their own, and teaches fs/locks.c to pre-allocate the fasync_struct entry. That way the actual list insertion can happen while holding the required spinlock. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Bruce Fields <bfields@fieldses.org> --- fs/fcntl.c | 62 ++++++++++++++++++++++++++++++++++++++++----------- fs/locks.c | 17 +++++++++++++- include/linux/fs.h | 5 ++++ 3 files changed, 69 insertions(+), 15 deletions(-) diff --git a/fs/fcntl.c b/fs/fcntl.c index f8cc34f..ecc8b39 100644 --- a/fs/fcntl.c +++ b/fs/fcntl.c @@ -640,7 +640,7 @@ static void fasync_free_rcu(struct rcu_head *head) * match the state "is the filp on a fasync list". * */ -static int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp) +int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp) { struct fasync_struct *fa, **fp; int result = 0; @@ -666,21 +666,31 @@ static int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp) return result; } +struct fasync_struct *fasync_alloc(void) +{ + return kmem_cache_alloc(fasync_cache, GFP_KERNEL); +} + /* - * Add a fasync entry. Return negative on error, positive if - * added, and zero if did nothing but change an existing one. + * NOTE! This can be used only for unused fasync entries: + * entries that actually got inserted on the fasync list + * need to be released by rcu - see fasync_remove_entry. + */ +void fasync_free(struct fasync_struct *new) +{ + kmem_cache_free(fasync_cache, new); +} + +/* + * Insert a new entry into the fasync list. Return the pointer to the + * old one if we didn't use the new one. * * NOTE! It is very important that the FASYNC flag always * match the state "is the filp on a fasync list". */ -static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fapp) +struct fasync_struct *fasync_insert_entry(int fd, struct file *filp, struct fasync_struct **fapp, struct fasync_struct *new) { - struct fasync_struct *new, *fa, **fp; - int result = 0; - - new = kmem_cache_alloc(fasync_cache, GFP_KERNEL); - if (!new) - return -ENOMEM; + struct fasync_struct *fa, **fp; spin_lock(&filp->f_lock); spin_lock(&fasync_lock); @@ -691,8 +701,6 @@ static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fa spin_lock_irq(&fa->fa_lock); fa->fa_fd = fd; spin_unlock_irq(&fa->fa_lock); - - kmem_cache_free(fasync_cache, new); goto out; } @@ -702,13 +710,39 @@ static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fa new->fa_fd = fd; new->fa_next = *fapp; rcu_assign_pointer(*fapp, new); - result = 1; filp->f_flags |= FASYNC; out: spin_unlock(&fasync_lock); spin_unlock(&filp->f_lock); - return result; + return fa; +} + +/* + * Add a fasync entry. Return negative on error, positive if + * added, and zero if did nothing but change an existing one. + */ +static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fapp) +{ + struct fasync_struct *new; + + new = fasync_alloc(); + if (!new) + return -ENOMEM; + + /* + * fasync_insert_entry() returns the old (update) entry if + * it existed. + * + * So free the (unused) new entry and return 0 to let the + * caller know that we didn't add any new fasync entries. + */ + if (fasync_insert_entry(fd, filp, fapp, new)) { + fasync_free(new); + return 0; + } + + return 1; } /* diff --git a/fs/locks.c b/fs/locks.c index 4de3a26..9ff3f66 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -1515,6 +1515,7 @@ EXPORT_SYMBOL_GPL(vfs_setlease); int fcntl_setlease(unsigned int fd, struct file *filp, long arg) { struct file_lock fl, *flp = &fl; + struct fasync_struct *new; struct inode *inode = filp->f_path.dentry->d_inode; int error; @@ -1523,13 +1524,25 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg) if (error) return error; + new = fasync_alloc(); + if (!new) + return -ENOMEM; + lock_flocks(); error = __vfs_setlease(filp, arg, &flp); if (error || arg == F_UNLCK) goto out_unlock; - error = fasync_helper(fd, filp, 1, &flp->fl_fasync); + /* + * fasync_insert_entry() returns the old entry if any. + * If there was no old entry, then it used 'new' and + * inserted it into the fasync list. Clear new so that + * we don't release it here. + */ + if (!fasync_insert_entry(fd, filp, &flp->fl_fasync, new)) + new = NULL; + if (error < 0) { /* remove lease just inserted by setlease */ flp->fl_type = F_UNLCK | F_INPROGRESS; @@ -1541,6 +1554,8 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg) error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0); out_unlock: unlock_flocks(); + if (new) + fasync_free(new); return error; } diff --git a/include/linux/fs.h b/include/linux/fs.h index 240eb1d..d487772 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1310,6 +1310,11 @@ struct fasync_struct { /* SMP safe fasync helpers: */ extern int fasync_helper(int, struct file *, int, struct fasync_struct **); +extern struct fasync_struct *fasync_insert_entry(int, struct file *, struct fasync_struct **, struct fasync_struct *); +extern int fasync_remove_entry(struct file *, struct fasync_struct **); +extern struct fasync_struct *fasync_alloc(void); +extern void fasync_free(struct fasync_struct *); + /* can be called from interrupts */ extern void kill_fasync(struct fasync_struct **, int, int); ^ permalink raw reply related [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-27 18:42 ` Linus Torvalds @ 2010-10-27 18:43 ` Linus Torvalds 2010-10-27 19:48 ` Arnd Bergmann 0 siblings, 1 reply; 62+ messages in thread From: Linus Torvalds @ 2010-10-27 18:43 UTC (permalink / raw) To: Arnd Bergmann Cc: J. Bruce Fields, Christoph Hellwig, Bryan Schumaker, linux-nfs, linux-kernel On Wed, Oct 27, 2010 at 11:42 AM, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > Feel free to edit the message/patch to your hearts content. Oh, and if you want me to just commit this part, I can do so. It doesn't make much sense without the other parts to actually make it useful, though, so it probably makes more sense to come with them. Linus ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-27 18:43 ` Linus Torvalds @ 2010-10-27 19:48 ` Arnd Bergmann 2010-10-27 20:01 ` J. Bruce Fields 0 siblings, 1 reply; 62+ messages in thread From: Arnd Bergmann @ 2010-10-27 19:48 UTC (permalink / raw) To: Linus Torvalds Cc: J. Bruce Fields, Christoph Hellwig, Bryan Schumaker, linux-nfs, linux-kernel On Wednesday 27 October 2010 20:43:59 Linus Torvalds wrote: > On Wed, Oct 27, 2010 at 11:42 AM, Linus Torvalds > <torvalds@linux-foundation.org> wrote: > > > > Feel free to edit the message/patch to your hearts content. > > Oh, and if you want me to just commit this part, I can do so. It > doesn't make much sense without the other parts to actually make it > useful, though, so it probably makes more sense to come with them. Once Bruce is happy with the test results, you can pull it from git+ssh://master.kernel.org/pub/scm/linux/kernel/git/arnd/bkl.git flock I've rewritten all the changelogs to make more sense as a series, and split out the bits that turn lock_flocks into a spinlock to come last, so we get a bisectable series. The contents are left unchanged. Arnd --- commit b3426739cc8f7c7dd127ca8dad5e25195930cac1 Author: Arnd Bergmann <arnd@arndb.de> Date: Wed Oct 27 21:39:58 2010 +0200 locks: turn lock_flocks into a spinlock Nothing depends on lock_flocks using the BKL any more, so we can do the switch over to a private spinlock. Signed-off-by: Arnd Bergmann <arnd@arndb.de> fs/Kconfig | 1 - fs/locks.c | 5 +++-- 2 files changed, 3 insertions(+), 3 deletions(-) commit 55d3ff97c5c0b3dce39f705e2b1fe85818891822 Author: Linus Torvalds <torvalds@linux-foundation.org> Date: Wed Oct 27 12:38:12 2010 -0400 locks: avoid fasync allocation under lock_flocks This splits fasync_helper into four functions: fasync_alloc, fasync_free, fasync_insert_entry and fasync_remove_entry, in order to allow the lease handling to call them directly instead of going through fasync_helper. The fasync_helper interface really doesn't make sense outside of ->fasync file operations, which use the same calling conventions of passing flags in and out as the helper. After the change, fcntl_setlease can simply allocate the new fasync_struct outside of lock_flocks, which is required to turn that into a spinlock. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [bfields@redhat.com: rebase on top of my changes to Arnd's patch] Signed-off-by: J. Bruce Fields <bfields@redhat.com> [arnd: rewrite changelog text] Signed-off-by: Arnd Bergmann <arnd@arndb.de> fs/fcntl.c | 66 +++++++++++++++++++++++++++++++++++++++------------ fs/locks.c | 18 +++++++++++++- include/linux/fs.h | 5 ++++ 3 files changed, 72 insertions(+), 17 deletions(-) commit c5b1f0d92c36851aca09ac6c7c0c4f9690ac14f3 Author: Arnd Bergmann <arnd@arndb.de> Date: Wed Oct 27 15:46:08 2010 +0200 locks/nfsd: allocate file lock outside of spinlock As suggested by Christoph Hellwig, this moves allocation of new file locks out of generic_setlease into the callers, nfs4_open_delegation and fcntl_setlease in order to allow GFP_KERNEL allocations when lock_flocks has become a spinlock. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: J. Bruce Fields <bfields@redhat.com> fs/locks.c | 36 ++++++++++++------------------------ fs/nfsd/nfs4state.c | 26 +++++++++++++++----------- include/linux/fs.h | 1 + 3 files changed, 28 insertions(+), 35 deletions(-) commit a282a1fa6b23bd21ba0b86e53ed2a316b001836f Author: J. Bruce Fields <bfields@redhat.com> Date: Tue Oct 26 18:25:30 2010 -0400 lockd: fix nlmsvc_notify_blocked locking nlmsvc_notify_blocked walks the nlm_blocked list, which requires nlm_blocked_lock. Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> fs/lockd/svclock.c | 6 ++++-- 1 files changed, 4 insertions(+), 2 deletions(-) commit 763641d81202834e9d64de2019d1edec12868f4f Author: Arnd Bergmann <arnd@arndb.de> Date: Tue Oct 26 22:55:40 2010 +0200 lockd: push lock_flocks down lockd should use lock_flocks() instead of lock_kernel() to lock against posix locks accessing the i_flock list. This is a prerequisite to turning lock_flocks into a spinlock. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: J. Bruce Fields <bfields@redhat.com> fs/lockd/svc.c | 11 ----------- fs/lockd/svcsubs.c | 9 ++++++++- fs/nfs/Kconfig | 1 - fs/nfsd/Kconfig | 1 - 4 files changed, 8 insertions(+), 14 deletions(-) ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-27 19:48 ` Arnd Bergmann @ 2010-10-27 20:01 ` J. Bruce Fields 2010-10-27 20:20 ` Arnd Bergmann 0 siblings, 1 reply; 62+ messages in thread From: J. Bruce Fields @ 2010-10-27 20:01 UTC (permalink / raw) To: Arnd Bergmann Cc: Linus Torvalds, Christoph Hellwig, Bryan Schumaker, linux-nfs, linux-kernel On Wed, Oct 27, 2010 at 09:48:47PM +0200, Arnd Bergmann wrote: > On Wednesday 27 October 2010 20:43:59 Linus Torvalds wrote: > > On Wed, Oct 27, 2010 at 11:42 AM, Linus Torvalds > > <torvalds@linux-foundation.org> wrote: > > > > > > Feel free to edit the message/patch to your hearts content. > > > > Oh, and if you want me to just commit this part, I can do so. It > > doesn't make much sense without the other parts to actually make it > > useful, though, so it probably makes more sense to come with them. > > Once Bruce is happy with the test results, you can pull it from > > git+ssh://master.kernel.org/pub/scm/linux/kernel/git/arnd/bkl.git flock Not the useful url for us hoi polloi out here. OK, looking at git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl.git flock.... > I've rewritten all the changelogs to make more sense as a series, > and split out the bits that turn lock_flocks into a spinlock to > come last, so we get a bisectable series. > > The contents are left unchanged. Looks fine to me! I haven't retested that, but I can see that the result is identical to what I tested. (You're missing Linus's comment fix, though, unless I'm confused.) --b. > > Arnd > --- > > commit b3426739cc8f7c7dd127ca8dad5e25195930cac1 > Author: Arnd Bergmann <arnd@arndb.de> > Date: Wed Oct 27 21:39:58 2010 +0200 > > locks: turn lock_flocks into a spinlock > > Nothing depends on lock_flocks using the BKL > any more, so we can do the switch over to > a private spinlock. > > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > > fs/Kconfig | 1 - > fs/locks.c | 5 +++-- > 2 files changed, 3 insertions(+), 3 deletions(-) > > commit 55d3ff97c5c0b3dce39f705e2b1fe85818891822 > Author: Linus Torvalds <torvalds@linux-foundation.org> > Date: Wed Oct 27 12:38:12 2010 -0400 > > locks: avoid fasync allocation under lock_flocks > > This splits fasync_helper into four functions: fasync_alloc, > fasync_free, fasync_insert_entry and fasync_remove_entry, > in order to allow the lease handling to call them directly > instead of going through fasync_helper. > > The fasync_helper interface really doesn't make sense outside > of ->fasync file operations, which use the same calling > conventions of passing flags in and out as the helper. > > After the change, fcntl_setlease can simply allocate the > new fasync_struct outside of lock_flocks, which is required > to turn that into a spinlock. > > Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> > [bfields@redhat.com: rebase on top of my changes to Arnd's patch] > Signed-off-by: J. Bruce Fields <bfields@redhat.com> > [arnd: rewrite changelog text] > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > > fs/fcntl.c | 66 +++++++++++++++++++++++++++++++++++++++------------ > fs/locks.c | 18 +++++++++++++- > include/linux/fs.h | 5 ++++ > 3 files changed, 72 insertions(+), 17 deletions(-) > > commit c5b1f0d92c36851aca09ac6c7c0c4f9690ac14f3 > Author: Arnd Bergmann <arnd@arndb.de> > Date: Wed Oct 27 15:46:08 2010 +0200 > > locks/nfsd: allocate file lock outside of spinlock > > As suggested by Christoph Hellwig, this moves allocation > of new file locks out of generic_setlease into the > callers, nfs4_open_delegation and fcntl_setlease in order > to allow GFP_KERNEL allocations when lock_flocks has > become a spinlock. > > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > Acked-by: J. Bruce Fields <bfields@redhat.com> > > fs/locks.c | 36 ++++++++++++------------------------ > fs/nfsd/nfs4state.c | 26 +++++++++++++++----------- > include/linux/fs.h | 1 + > 3 files changed, 28 insertions(+), 35 deletions(-) > > commit a282a1fa6b23bd21ba0b86e53ed2a316b001836f > Author: J. Bruce Fields <bfields@redhat.com> > Date: Tue Oct 26 18:25:30 2010 -0400 > > lockd: fix nlmsvc_notify_blocked locking > > nlmsvc_notify_blocked walks the nlm_blocked list, > which requires nlm_blocked_lock. > > Signed-off-by: J. Bruce Fields <bfields@redhat.com> > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > > fs/lockd/svclock.c | 6 ++++-- > 1 files changed, 4 insertions(+), 2 deletions(-) > > commit 763641d81202834e9d64de2019d1edec12868f4f > Author: Arnd Bergmann <arnd@arndb.de> > Date: Tue Oct 26 22:55:40 2010 +0200 > > lockd: push lock_flocks down > > lockd should use lock_flocks() instead of lock_kernel() > to lock against posix locks accessing the i_flock list. > > This is a prerequisite to turning lock_flocks into a > spinlock. > > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > Acked-by: J. Bruce Fields <bfields@redhat.com> > > fs/lockd/svc.c | 11 ----------- > fs/lockd/svcsubs.c | 9 ++++++++- > fs/nfs/Kconfig | 1 - > fs/nfsd/Kconfig | 1 - > 4 files changed, 8 insertions(+), 14 deletions(-) ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-27 20:01 ` J. Bruce Fields @ 2010-10-27 20:20 ` Arnd Bergmann 2010-10-27 20:24 ` J. Bruce Fields 0 siblings, 1 reply; 62+ messages in thread From: Arnd Bergmann @ 2010-10-27 20:20 UTC (permalink / raw) To: J. Bruce Fields Cc: Linus Torvalds, Christoph Hellwig, Bryan Schumaker, linux-nfs, linux-kernel On Wednesday 27 October 2010 22:01:45 J. Bruce Fields wrote: > (You're missing Linus's comment fix, though, unless I'm confused.) Right, I didn't look into the attachment, so I hadn't noticed the new changelog. I've updated the changelog to Linus' version, as it's more verbose than mine. I also checked that the version in your tree is actually the right one, so I left the code untouched. Arnd ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-27 20:20 ` Arnd Bergmann @ 2010-10-27 20:24 ` J. Bruce Fields 0 siblings, 0 replies; 62+ messages in thread From: J. Bruce Fields @ 2010-10-27 20:24 UTC (permalink / raw) To: Arnd Bergmann Cc: Linus Torvalds, Christoph Hellwig, Bryan Schumaker, linux-nfs, linux-kernel On Wed, Oct 27, 2010 at 10:20:20PM +0200, Arnd Bergmann wrote: > On Wednesday 27 October 2010 22:01:45 J. Bruce Fields wrote: > > (You're missing Linus's comment fix, though, unless I'm confused.) > > Right, I didn't look into the attachment, so I hadn't noticed > the new changelog. I've updated the changelog to Linus' version, > as it's more verbose than mine. He fixed a code comment too, though. No big deal. > I also checked that the version in your tree is actually the > right one, so I left the code untouched. Yup, looks good. --b. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-27 14:59 ` Christoph Hellwig 2010-10-27 15:16 ` J. Bruce Fields 2010-10-27 15:23 ` Arnd Bergmann @ 2010-10-30 21:25 ` J. Bruce Fields 2010-10-30 21:31 ` [PATCH 1/4] locks: prevent ENOMEM on lease unlock J. Bruce Fields ` (6 more replies) 2 siblings, 7 replies; 62+ messages in thread From: J. Bruce Fields @ 2010-10-30 21:25 UTC (permalink / raw) To: Christoph Hellwig Cc: Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel I noticed a couple problems with the lease patches. Also: On Wed, Oct 27, 2010 at 10:59:29AM -0400, Christoph Hellwig wrote: > On Wed, Oct 27, 2010 at 10:55:39AM -0400, J. Bruce Fields wrote: > > Hm, two problems: > > - We introduce the possibility of fcntl(fd, F_SETLEASE, F_UNLCK) > > failing with ENOMEM. > > splitt ->setlease into ->add_least and ->delete_lease. No need to pass > in a structure for the later. No need to return one either. We still need to do this part. I just did the minimum required to fix the problem--I figure splitting out a separate ->delete_lease method (which I agree would be cleaner) could wait till the next merge window. Patches follow, tested on top of the merge of Arnd's patches; if they look OK they can go in, or I can commit them on top of a few pending nfsd fixes and send another pull request. It'd be nice if we get this fixed by -rc1, but not the end of the world if we don't. --b. ^ permalink raw reply [flat|nested] 62+ messages in thread
* [PATCH 1/4] locks: prevent ENOMEM on lease unlock 2010-10-30 21:25 ` J. Bruce Fields @ 2010-10-30 21:31 ` J. Bruce Fields 2010-10-30 21:31 ` [PATCH 2/4] locks: fix leaks on setlease errors J. Bruce Fields ` (5 subsequent siblings) 6 siblings, 0 replies; 62+ messages in thread From: J. Bruce Fields @ 2010-10-30 21:31 UTC (permalink / raw) To: Linus Torvalds Cc: Christoph Hellwig, Arnd Bergmann, Bryan Schumaker, linux-nfs, linux-kernel, J. Bruce Fields Removing a lock shouldn't require any allocations; a failure due to ENOMEM leaves the caller with a choice between retrying or giving up and leaking an unused lease. Next we should split the other lease calls into add and delete cases. I wanted to start with just the bugfix. Signed-off-by: J. Bruce Fields <bfields@redhat.com> --- fs/locks.c | 43 ++++++++++++++++++++++++++++++------------- 1 files changed, 30 insertions(+), 13 deletions(-) diff --git a/fs/locks.c b/fs/locks.c index 50ec159..06c7773 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -1441,7 +1441,8 @@ int generic_setlease(struct file *filp, long arg, struct file_lock **flp) return 0; out: - locks_free_lock(lease); + if (arg != F_UNLCK) + locks_free_lock(lease); return error; } EXPORT_SYMBOL(generic_setlease); @@ -1493,17 +1494,16 @@ int vfs_setlease(struct file *filp, long arg, struct file_lock **lease) } EXPORT_SYMBOL_GPL(vfs_setlease); -/** - * fcntl_setlease - sets a lease on an open file - * @fd: open file descriptor - * @filp: file pointer - * @arg: type of lease to obtain - * - * Call this fcntl to establish a lease on the file. - * Note that you also need to call %F_SETSIG to - * receive a signal when the lease is broken. - */ -int fcntl_setlease(unsigned int fd, struct file *filp, long arg) +static int do_fcntl_delete_lease(struct file *filp) +{ + struct file_lock fl, *flp = &fl; + + lease_init(filp, F_UNLCK, flp); + + return vfs_setlease(filp, F_UNLCK, &flp); +} + +static int do_fcntl_add_lease(unsigned int fd, struct file *filp, long arg) { struct file_lock *fl; struct fasync_struct *new; @@ -1521,7 +1521,7 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg) } lock_flocks(); error = __vfs_setlease(filp, arg, &fl); - if (error || arg == F_UNLCK) + if (error) goto out_unlock; /* @@ -1550,6 +1550,23 @@ out_unlock: } /** + * fcntl_setlease - sets a lease on an open file + * @fd: open file descriptor + * @filp: file pointer + * @arg: type of lease to obtain + * + * Call this fcntl to establish a lease on the file. + * Note that you also need to call %F_SETSIG to + * receive a signal when the lease is broken. + */ +int fcntl_setlease(unsigned int fd, struct file *filp, long arg) +{ + if (arg == F_UNLCK) + return do_fcntl_delete_lease(filp); + return do_fcntl_add_lease(fd, filp, arg); +} + +/** * flock_lock_file_wait - Apply a FLOCK-style lock to a file * @filp: The file to apply the lock to * @fl: The lock to be applied -- 1.7.1 ^ permalink raw reply related [flat|nested] 62+ messages in thread
* [PATCH 2/4] locks: fix leaks on setlease errors 2010-10-30 21:25 ` J. Bruce Fields 2010-10-30 21:31 ` [PATCH 1/4] locks: prevent ENOMEM on lease unlock J. Bruce Fields @ 2010-10-30 21:31 ` J. Bruce Fields 2010-10-31 11:10 ` Christoph Hellwig 2010-10-30 21:31 ` [PATCH 3/4] locks: fix setlease methods to free passed-in lock J. Bruce Fields ` (4 subsequent siblings) 6 siblings, 1 reply; 62+ messages in thread From: J. Bruce Fields @ 2010-10-30 21:31 UTC (permalink / raw) To: Linus Torvalds Cc: Christoph Hellwig, Arnd Bergmann, Bryan Schumaker, linux-nfs, linux-kernel, J. Bruce Fields We're depending on setlease to free the passed-in lease on failure. Signed-off-by: J. Bruce Fields <bfields@redhat.com> --- fs/locks.c | 12 +++++++----- 1 files changed, 7 insertions(+), 5 deletions(-) diff --git a/fs/locks.c b/fs/locks.c index 06c7773..63fbc41 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -1371,20 +1371,22 @@ int generic_setlease(struct file *filp, long arg, struct file_lock **flp) struct inode *inode = dentry->d_inode; int error, rdlease_count = 0, wrlease_count = 0; + lease = *flp; + + error = -EACCES; if ((current_fsuid() != inode->i_uid) && !capable(CAP_LEASE)) - return -EACCES; + goto out; + error = -EINVAL; if (!S_ISREG(inode->i_mode)) - return -EINVAL; + goto out; error = security_file_lock(filp, arg); if (error) - return error; + goto out; time_out_leases(inode); BUG_ON(!(*flp)->fl_lmops->fl_break); - lease = *flp; - if (arg != F_UNLCK) { error = -EAGAIN; if ((arg == F_RDLCK) && (atomic_read(&inode->i_writecount) > 0)) -- 1.7.1 ^ permalink raw reply related [flat|nested] 62+ messages in thread
* Re: [PATCH 2/4] locks: fix leaks on setlease errors 2010-10-30 21:31 ` [PATCH 2/4] locks: fix leaks on setlease errors J. Bruce Fields @ 2010-10-31 11:10 ` Christoph Hellwig 2010-11-01 17:24 ` J. Bruce Fields 0 siblings, 1 reply; 62+ messages in thread From: Christoph Hellwig @ 2010-10-31 11:10 UTC (permalink / raw) To: J. Bruce Fields Cc: Linus Torvalds, Christoph Hellwig, Arnd Bergmann, Bryan Schumaker, linux-nfs, linux-kernel On Sat, Oct 30, 2010 at 05:31:14PM -0400, J. Bruce Fields wrote: > We're depending on setlease to free the passed-in lease on failure. But we would be much better to just free it in the caller. I'ts much more natural - caller allocates, caller frees, and it's also simpler. I'll send a patch to do so shortly, together with sorting out the remaining nfs4d lock_manager_operations abuses. I think we're set with that for 2.6.37, the setlease split can wait once we've sorted out the lock freeing issue. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [PATCH 2/4] locks: fix leaks on setlease errors 2010-10-31 11:10 ` Christoph Hellwig @ 2010-11-01 17:24 ` J. Bruce Fields 2010-11-01 17:41 ` Christoph Hellwig 0 siblings, 1 reply; 62+ messages in thread From: J. Bruce Fields @ 2010-11-01 17:24 UTC (permalink / raw) To: Christoph Hellwig Cc: J. Bruce Fields, Linus Torvalds, Arnd Bergmann, Bryan Schumaker, linux-nfs, linux-kernel On Sun, Oct 31, 2010 at 07:10:57AM -0400, Christoph Hellwig wrote: > On Sat, Oct 30, 2010 at 05:31:14PM -0400, J. Bruce Fields wrote: > > We're depending on setlease to free the passed-in lease on failure. > > But we would be much better to just free it in the caller. I'ts much > more natural - caller allocates, caller frees, and it's also simpler. > > I'll send a patch to do so shortly, together with sorting out the > remaining nfs4d lock_manager_operations abuses. I think we're set with > that for 2.6.37, the setlease split can wait once we've sorted out the > lock freeing issue. I also have patches that get rid of fl_release_private, fl_mylease, and (almost done) fl_change. Unless you've a better suggestion I'll probably send them out for review and then queue them up with other nfsd changes for 2.6.38. --b. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [PATCH 2/4] locks: fix leaks on setlease errors 2010-11-01 17:24 ` J. Bruce Fields @ 2010-11-01 17:41 ` Christoph Hellwig 2010-11-01 18:34 ` J. Bruce Fields 0 siblings, 1 reply; 62+ messages in thread From: Christoph Hellwig @ 2010-11-01 17:41 UTC (permalink / raw) To: J. Bruce Fields Cc: Christoph Hellwig, J. Bruce Fields, Linus Torvalds, Arnd Bergmann, Bryan Schumaker, linux-nfs, linux-kernel On Mon, Nov 01, 2010 at 01:24:40PM -0400, J. Bruce Fields wrote: > I also have patches that get rid of fl_release_private, fl_mylease, and > (almost done) fl_change. > > Unless you've a better suggestion I'll probably send them out for review > and then queue them up with other nfsd changes for 2.6.38. Sounds good. I was also wondering if we can get rid of ->setlease entirely. The file_lock_operations should be enough abstraction to reject the leases in theory, but I need to look into it a bit more. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [PATCH 2/4] locks: fix leaks on setlease errors 2010-11-01 17:41 ` Christoph Hellwig @ 2010-11-01 18:34 ` J. Bruce Fields 0 siblings, 0 replies; 62+ messages in thread From: J. Bruce Fields @ 2010-11-01 18:34 UTC (permalink / raw) To: Christoph Hellwig Cc: J. Bruce Fields, Linus Torvalds, Arnd Bergmann, Bryan Schumaker, linux-nfs, linux-kernel On Mon, Nov 01, 2010 at 01:41:22PM -0400, Christoph Hellwig wrote: > On Mon, Nov 01, 2010 at 01:24:40PM -0400, J. Bruce Fields wrote: > > I also have patches that get rid of fl_release_private, fl_mylease, and > > (almost done) fl_change. > > > > Unless you've a better suggestion I'll probably send them out for review > > and then queue them up with other nfsd changes for 2.6.38. > > Sounds good. I was also wondering if we can get rid of ->setlease > entirely. The file_lock_operations should be enough abstraction to > reject the leases in theory, but I need to look into it a bit more. Note that cifs has a real lease operation now. No idea how it works or if it's really correct. --b. ^ permalink raw reply [flat|nested] 62+ messages in thread
* [PATCH 3/4] locks: fix setlease methods to free passed-in lock 2010-10-30 21:25 ` J. Bruce Fields 2010-10-30 21:31 ` [PATCH 1/4] locks: prevent ENOMEM on lease unlock J. Bruce Fields 2010-10-30 21:31 ` [PATCH 2/4] locks: fix leaks on setlease errors J. Bruce Fields @ 2010-10-30 21:31 ` J. Bruce Fields 2010-10-30 21:31 ` [PATCH 4/4] nfsd4: initialize delegation pointer to lease J. Bruce Fields ` (3 subsequent siblings) 6 siblings, 0 replies; 62+ messages in thread From: J. Bruce Fields @ 2010-10-30 21:31 UTC (permalink / raw) To: Linus Torvalds Cc: Christoph Hellwig, Arnd Bergmann, Bryan Schumaker, linux-nfs, linux-kernel, J. Bruce Fields, Steven Whitehouse, Steve French, Trond Myklebust We modified setlease to require the caller to allocate the new lease in the case of creating a new lease, but forgot to fix up the filesystem methods. Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Steve French <sfrench@samba.org> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> --- fs/cifs/cifsfs.c | 5 ++++- fs/gfs2/file.c | 2 ++ fs/locks.c | 3 ++- fs/nfs/file.c | 3 ++- include/linux/fs.h | 1 + 5 files changed, 11 insertions(+), 3 deletions(-) diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c index 75c4eaa..54745b6 100644 --- a/fs/cifs/cifsfs.c +++ b/fs/cifs/cifsfs.c @@ -625,8 +625,11 @@ static int cifs_setlease(struct file *file, long arg, struct file_lock **lease) knows that the file won't be changed on the server by anyone else */ return generic_setlease(file, arg, lease); - else + else { + if (arg != F_UNLCK) + locks_free_lock(*lease); return -EAGAIN; + } } struct file_system_type cifs_fs_type = { diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c index aa99647..ac943c1 100644 --- a/fs/gfs2/file.c +++ b/fs/gfs2/file.c @@ -629,6 +629,8 @@ static ssize_t gfs2_file_aio_write(struct kiocb *iocb, const struct iovec *iov, static int gfs2_setlease(struct file *file, long arg, struct file_lock **fl) { + if (arg != F_UNLCK) + locks_free_lock(*fl); return -EINVAL; } diff --git a/fs/locks.c b/fs/locks.c index 63fbc41..5b526a9 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -186,7 +186,7 @@ void locks_release_private(struct file_lock *fl) EXPORT_SYMBOL_GPL(locks_release_private); /* Free a lock which is not in use. */ -static void locks_free_lock(struct file_lock *fl) +void locks_free_lock(struct file_lock *fl) { BUG_ON(waitqueue_active(&fl->fl_wait)); BUG_ON(!list_empty(&fl->fl_block)); @@ -195,6 +195,7 @@ static void locks_free_lock(struct file_lock *fl) locks_release_private(fl); kmem_cache_free(filelock_cache, fl); } +EXPORT_SYMBOL(locks_free_lock); void locks_init_lock(struct file_lock *fl) { diff --git a/fs/nfs/file.c b/fs/nfs/file.c index e756075..1e524fb 100644 --- a/fs/nfs/file.c +++ b/fs/nfs/file.c @@ -884,6 +884,7 @@ static int nfs_setlease(struct file *file, long arg, struct file_lock **fl) dprintk("NFS: setlease(%s/%s, arg=%ld)\n", file->f_path.dentry->d_parent->d_name.name, file->f_path.dentry->d_name.name, arg); - + if (arg != F_UNLCK) + locks_free_lock(*fl); return -EINVAL; } diff --git a/include/linux/fs.h b/include/linux/fs.h index 7b7b507..1eb2939 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1129,6 +1129,7 @@ extern int fcntl_setlease(unsigned int fd, struct file *filp, long arg); extern int fcntl_getlease(struct file *filp); /* fs/locks.c */ +void locks_free_lock(struct file_lock *fl); extern void locks_init_lock(struct file_lock *); extern struct file_lock * locks_alloc_lock(void); extern void locks_copy_lock(struct file_lock *, struct file_lock *); -- 1.7.1 ^ permalink raw reply related [flat|nested] 62+ messages in thread
* [PATCH 4/4] nfsd4: initialize delegation pointer to lease 2010-10-30 21:25 ` J. Bruce Fields ` (2 preceding siblings ...) 2010-10-30 21:31 ` [PATCH 3/4] locks: fix setlease methods to free passed-in lock J. Bruce Fields @ 2010-10-30 21:31 ` J. Bruce Fields 2010-10-31 2:04 ` Christoph Hellwig 2010-10-30 21:40 ` nfsd changes for 2.6.37 Arnd Bergmann ` (2 subsequent siblings) 6 siblings, 1 reply; 62+ messages in thread From: J. Bruce Fields @ 2010-10-30 21:31 UTC (permalink / raw) To: Linus Torvalds Cc: Christoph Hellwig, Arnd Bergmann, Bryan Schumaker, linux-nfs, linux-kernel, J. Bruce Fields The NFSv4 server was initializing the dp->dl_flock pointer by the somewhat ridiculous method of a locks_copy_lock callback. Now that setlease uses the passed-in lock instead of doing a copy, dl_flock no longer gets set, resulting in the lock leaking on delegation release, and later possible hangs (among other problems). So, initialize dl_flock and get rid of the callback. Signed-off-by: J. Bruce Fields <bfields@redhat.com> --- fs/nfsd/nfs4state.c | 19 ++----------------- 1 files changed, 2 insertions(+), 17 deletions(-) diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c index 56347e0..b7f818b 100644 --- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -2310,22 +2310,6 @@ void nfsd_release_deleg_cb(struct file_lock *fl) } /* - * Set the delegation file_lock back pointer. - * - * Called from setlease() with lock_kernel() held. - */ -static -void nfsd_copy_lock_deleg_cb(struct file_lock *new, struct file_lock *fl) -{ - struct nfs4_delegation *dp = (struct nfs4_delegation *)new->fl_owner; - - dprintk("NFSD: nfsd_copy_lock_deleg_cb: new fl %p dp %p\n", new, dp); - if (!dp) - return; - dp->dl_flock = new; -} - -/* * Called from setlease() with lock_kernel() held */ static @@ -2355,7 +2339,6 @@ int nfsd_change_deleg_cb(struct file_lock **onlist, int arg) static const struct lock_manager_operations nfsd_lease_mng_ops = { .fl_break = nfsd_break_deleg_cb, .fl_release_private = nfsd_release_deleg_cb, - .fl_copy_lock = nfsd_copy_lock_deleg_cb, .fl_mylease = nfsd_same_client_deleg_cb, .fl_change = nfsd_change_deleg_cb, }; @@ -2661,12 +2644,14 @@ nfs4_open_delegation(struct svc_fh *fh, struct nfsd4_open *open, struct nfs4_sta fl->fl_file = find_readable_file(stp->st_file); BUG_ON(!fl->fl_file); fl->fl_pid = current->tgid; + dp->dl_flock = fl; /* vfs_setlease checks to see if delegation should be handed out. * the lock_manager callbacks fl_mylease and fl_change are used */ if ((status = vfs_setlease(fl->fl_file, fl->fl_type, &fl))) { dprintk("NFSD: setlease failed [%d], no delegation\n", status); + dp->dl_flock = NULL; unhash_delegation(dp); flag = NFS4_OPEN_DELEGATE_NONE; goto out; -- 1.7.1 ^ permalink raw reply related [flat|nested] 62+ messages in thread
* Re: [PATCH 4/4] nfsd4: initialize delegation pointer to lease 2010-10-30 21:31 ` [PATCH 4/4] nfsd4: initialize delegation pointer to lease J. Bruce Fields @ 2010-10-31 2:04 ` Christoph Hellwig 2010-10-31 3:04 ` J. Bruce Fields 0 siblings, 1 reply; 62+ messages in thread From: Christoph Hellwig @ 2010-10-31 2:04 UTC (permalink / raw) To: J. Bruce Fields Cc: Linus Torvalds, Christoph Hellwig, Arnd Bergmann, Bryan Schumaker, linux-nfs, linux-kernel On Sat, Oct 30, 2010 at 05:31:16PM -0400, J. Bruce Fields wrote: > The NFSv4 server was initializing the dp->dl_flock pointer by the > somewhat ridiculous method of a locks_copy_lock callback. > > Now that setlease uses the passed-in lock instead of doing a copy, > dl_flock no longer gets set, resulting in the lock leaking on delegation > release, and later possible hangs (among other problems). > > So, initialize dl_flock and get rid of the callback. >From what I can see this was the only instance of lock_manager_operations.fl_copy_lock. Please kill it while you're at it. Also lock_manager_operations.fl_release_private has only exact one instance in nfs4d which is part of the same abuse scheme. Please also get rid of it. I recently noticed this while updating Documentation/filesystems/Locking for the grand new BKL-less world. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [PATCH 4/4] nfsd4: initialize delegation pointer to lease 2010-10-31 2:04 ` Christoph Hellwig @ 2010-10-31 3:04 ` J. Bruce Fields 0 siblings, 0 replies; 62+ messages in thread From: J. Bruce Fields @ 2010-10-31 3:04 UTC (permalink / raw) To: Christoph Hellwig Cc: J. Bruce Fields, Linus Torvalds, Arnd Bergmann, Bryan Schumaker, linux-nfs, linux-kernel On Sat, Oct 30, 2010 at 10:04:31PM -0400, Christoph Hellwig wrote: > On Sat, Oct 30, 2010 at 05:31:16PM -0400, J. Bruce Fields wrote: > > The NFSv4 server was initializing the dp->dl_flock pointer by the > > somewhat ridiculous method of a locks_copy_lock callback. > > > > Now that setlease uses the passed-in lock instead of doing a copy, > > dl_flock no longer gets set, resulting in the lock leaking on delegation > > release, and later possible hangs (among other problems). > > > > So, initialize dl_flock and get rid of the callback. > > >From what I can see this was the only instance of > lock_manager_operations.fl_copy_lock. Please kill it while you're at > it. > > Also lock_manager_operations.fl_release_private has only exact one > instance in nfs4d which is part of the same abuse scheme. Please also > get rid of it. I recently noticed this while updating > Documentation/filesystems/Locking for the grand new BKL-less world. Yeah, I wanted to maximize chances of getting the minimal fixes into -rc1, but I've got patches for that queued up too. I was planning on waiting for the next merge window, but I'm happy enough to send them in now. --b. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-30 21:25 ` J. Bruce Fields ` (3 preceding siblings ...) 2010-10-30 21:31 ` [PATCH 4/4] nfsd4: initialize delegation pointer to lease J. Bruce Fields @ 2010-10-30 21:40 ` Arnd Bergmann 2010-10-31 2:07 ` Christoph Hellwig 2010-10-31 12:34 ` Christoph Hellwig 6 siblings, 0 replies; 62+ messages in thread From: Arnd Bergmann @ 2010-10-30 21:40 UTC (permalink / raw) To: J. Bruce Fields Cc: Christoph Hellwig, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel On Saturday 30 October 2010, J. Bruce Fields wrote: > We still need to do this part. I just did the minimum required to fix > the problem--I figure splitting out a separate ->delete_lease method > (which I agree would be cleaner) could wait till the next merge window. > > Patches follow, tested on top of the merge of Arnd's patches; if they > look OK they can go in, or I can commit them on top of a few pending > nfsd fixes and send another pull request. All four patches look good AFAICT. Thanks for the follow-up! Arnd ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-30 21:25 ` J. Bruce Fields ` (4 preceding siblings ...) 2010-10-30 21:40 ` nfsd changes for 2.6.37 Arnd Bergmann @ 2010-10-31 2:07 ` Christoph Hellwig 2010-10-31 3:05 ` J. Bruce Fields 2010-10-31 12:34 ` Christoph Hellwig 6 siblings, 1 reply; 62+ messages in thread From: Christoph Hellwig @ 2010-10-31 2:07 UTC (permalink / raw) To: J. Bruce Fields Cc: Christoph Hellwig, Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel On Sat, Oct 30, 2010 at 05:25:00PM -0400, J. Bruce Fields wrote: > We still need to do this part. I just did the minimum required to fix > the problem--I figure splitting out a separate ->delete_lease method > (which I agree would be cleaner) could wait till the next merge window. I'd say please do it now. The pre-allocated file_lock changes make the semantics inside ->setlease really nasty right now. I think it's a dumb idea to leave it that way even for a single merge window. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-31 2:07 ` Christoph Hellwig @ 2010-10-31 3:05 ` J. Bruce Fields 0 siblings, 0 replies; 62+ messages in thread From: J. Bruce Fields @ 2010-10-31 3:05 UTC (permalink / raw) To: Christoph Hellwig Cc: Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel On Sat, Oct 30, 2010 at 10:07:08PM -0400, Christoph Hellwig wrote: > On Sat, Oct 30, 2010 at 05:25:00PM -0400, J. Bruce Fields wrote: > > We still need to do this part. I just did the minimum required to fix > > the problem--I figure splitting out a separate ->delete_lease method > > (which I agree would be cleaner) could wait till the next merge window. > > I'd say please do it now. The pre-allocated file_lock changes make > the semantics inside ->setlease really nasty right now. I think it's a > dumb idea to leave it that way even for a single merge window. OK, will do. --b. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-30 21:25 ` J. Bruce Fields ` (5 preceding siblings ...) 2010-10-31 2:07 ` Christoph Hellwig @ 2010-10-31 12:34 ` Christoph Hellwig 2010-10-31 12:35 ` [PATCH 1/2] locks: let the caller free file_lock on ->setlease failure Christoph Hellwig ` (2 more replies) 6 siblings, 3 replies; 62+ messages in thread From: Christoph Hellwig @ 2010-10-31 12:34 UTC (permalink / raw) To: J. Bruce Fields Cc: Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel And two more fixups on top of the set sent by Bruce that already are in Linus' tree. ^ permalink raw reply [flat|nested] 62+ messages in thread
* [PATCH 1/2] locks: let the caller free file_lock on ->setlease failure 2010-10-31 12:34 ` Christoph Hellwig @ 2010-10-31 12:35 ` Christoph Hellwig 2010-11-03 20:41 ` J. Bruce Fields 2010-10-31 12:35 ` [PATCH 2/2] locks: remove fl_copy_lock lock_manager operation Christoph Hellwig 2010-11-01 15:02 ` nfsd changes for 2.6.37 J. Bruce Fields 2 siblings, 1 reply; 62+ messages in thread From: Christoph Hellwig @ 2010-10-31 12:35 UTC (permalink / raw) To: J. Bruce Fields Cc: Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel The caller allocated it, the caller should free it. The only issue so far is that we could change the flp pointer even on an error return if the fl_change callback failed. But we can simply move the flp assignment after the fl_change invocation, as the callers don't care about the flp return value if the setlease call failed. Signed-off-by: Christoph Hellwig <hch@lst.de> Index: linux-2.6/fs/cifs/cifsfs.c =================================================================== --- linux-2.6.orig/fs/cifs/cifsfs.c 2010-10-31 07:10:07.636004223 -0400 +++ linux-2.6/fs/cifs/cifsfs.c 2010-10-31 07:10:10.922004154 -0400 @@ -625,11 +625,8 @@ static int cifs_setlease(struct file *fi knows that the file won't be changed on the server by anyone else */ return generic_setlease(file, arg, lease); - else { - if (arg != F_UNLCK) - locks_free_lock(*lease); + else return -EAGAIN; - } } struct file_system_type cifs_fs_type = { Index: linux-2.6/fs/gfs2/file.c =================================================================== --- linux-2.6.orig/fs/gfs2/file.c 2010-10-31 07:10:07.643004363 -0400 +++ linux-2.6/fs/gfs2/file.c 2010-10-31 07:10:10.923003665 -0400 @@ -629,8 +629,6 @@ static ssize_t gfs2_file_aio_write(struc static int gfs2_setlease(struct file *file, long arg, struct file_lock **fl) { - if (arg != F_UNLCK) - locks_free_lock(*fl); return -EINVAL; } Index: linux-2.6/fs/locks.c =================================================================== --- linux-2.6.orig/fs/locks.c 2010-10-31 07:10:07.649004084 -0400 +++ linux-2.6/fs/locks.c 2010-10-31 07:34:10.102255587 -0400 @@ -1428,8 +1425,9 @@ int generic_setlease(struct file *filp, goto out; if (my_before != NULL) { - *flp = *my_before; error = lease->fl_lmops->fl_change(my_before, arg); + if (!error) + *flp = *my_before; goto out; } @@ -1444,8 +1442,6 @@ int generic_setlease(struct file *filp, return 0; out: - if (arg != F_UNLCK) - locks_free_lock(lease); return error; } EXPORT_SYMBOL(generic_setlease); @@ -1524,8 +1520,11 @@ static int do_fcntl_add_lease(unsigned i } lock_flocks(); error = __vfs_setlease(filp, arg, &fl); - if (error) - goto out_unlock; + if (error) { + unlock_flocks(); + locks_free_lock(fl); + goto out_free_fasync; + } /* * fasync_insert_entry() returns the old entry if any. @@ -1541,12 +1540,12 @@ static int do_fcntl_add_lease(unsigned i fl->fl_type = F_UNLCK | F_INPROGRESS; fl->fl_break_time = jiffies - 10; time_out_leases(inode); - goto out_unlock; + } else { + error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0); } - - error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0); -out_unlock: unlock_flocks(); + +out_free_fasync: if (new) fasync_free(new); return error; Index: linux-2.6/fs/nfs/file.c =================================================================== --- linux-2.6.orig/fs/nfs/file.c 2010-10-31 07:10:07.658003804 -0400 +++ linux-2.6/fs/nfs/file.c 2010-10-31 07:10:10.936003734 -0400 @@ -884,7 +884,5 @@ static int nfs_setlease(struct file *fil dprintk("NFS: setlease(%s/%s, arg=%ld)\n", file->f_path.dentry->d_parent->d_name.name, file->f_path.dentry->d_name.name, arg); - if (arg != F_UNLCK) - locks_free_lock(*fl); return -EINVAL; } Index: linux-2.6/fs/nfsd/nfs4state.c =================================================================== --- linux-2.6.orig/fs/nfsd/nfs4state.c 2010-10-31 07:10:07.666004084 -0400 +++ linux-2.6/fs/nfsd/nfs4state.c 2010-10-31 07:32:56.906254608 -0400 @@ -2652,6 +2652,7 @@ nfs4_open_delegation(struct svc_fh *fh, if ((status = vfs_setlease(fl->fl_file, fl->fl_type, &fl))) { dprintk("NFSD: setlease failed [%d], no delegation\n", status); dp->dl_flock = NULL; + locks_free_lock(fl); unhash_delegation(dp); flag = NFS4_OPEN_DELEGATE_NONE; goto out; ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [PATCH 1/2] locks: let the caller free file_lock on ->setlease failure 2010-10-31 12:35 ` [PATCH 1/2] locks: let the caller free file_lock on ->setlease failure Christoph Hellwig @ 2010-11-03 20:41 ` J. Bruce Fields 2010-11-04 1:40 ` J. Bruce Fields 0 siblings, 1 reply; 62+ messages in thread From: J. Bruce Fields @ 2010-11-03 20:41 UTC (permalink / raw) To: Christoph Hellwig Cc: Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel On Sun, Oct 31, 2010 at 08:35:10AM -0400, Christoph Hellwig wrote: > The caller allocated it, the caller should free it. The only issue so > far is that we could change the flp pointer even on an error return if > the fl_change callback failed. But we can simply move the flp assignment > after the fl_change invocation, as the callers don't care about the > flp return value if the setlease call failed. > > Signed-off-by: Christoph Hellwig <hch@lst.de> > > Index: linux-2.6/fs/cifs/cifsfs.c > =================================================================== > --- linux-2.6.orig/fs/cifs/cifsfs.c 2010-10-31 07:10:07.636004223 -0400 > +++ linux-2.6/fs/cifs/cifsfs.c 2010-10-31 07:10:10.922004154 -0400 > @@ -625,11 +625,8 @@ static int cifs_setlease(struct file *fi > knows that the file won't be changed on the server > by anyone else */ > return generic_setlease(file, arg, lease); > - else { > - if (arg != F_UNLCK) > - locks_free_lock(*lease); > + else > return -EAGAIN; > - } > } > > struct file_system_type cifs_fs_type = { > Index: linux-2.6/fs/gfs2/file.c > =================================================================== > --- linux-2.6.orig/fs/gfs2/file.c 2010-10-31 07:10:07.643004363 -0400 > +++ linux-2.6/fs/gfs2/file.c 2010-10-31 07:10:10.923003665 -0400 > @@ -629,8 +629,6 @@ static ssize_t gfs2_file_aio_write(struc > > static int gfs2_setlease(struct file *file, long arg, struct file_lock **fl) > { > - if (arg != F_UNLCK) > - locks_free_lock(*fl); > return -EINVAL; > } > > Index: linux-2.6/fs/locks.c > =================================================================== > --- linux-2.6.orig/fs/locks.c 2010-10-31 07:10:07.649004084 -0400 > +++ linux-2.6/fs/locks.c 2010-10-31 07:34:10.102255587 -0400 > @@ -1428,8 +1425,9 @@ int generic_setlease(struct file *filp, > goto out; > > if (my_before != NULL) { > - *flp = *my_before; > error = lease->fl_lmops->fl_change(my_before, arg); > + if (!error) > + *flp = *my_before; Argh, missed this: we're leaking the passed-in lease in this case. --b. > goto out; > } > > @@ -1444,8 +1442,6 @@ int generic_setlease(struct file *filp, > return 0; > > out: > - if (arg != F_UNLCK) > - locks_free_lock(lease); > return error; > } > EXPORT_SYMBOL(generic_setlease); > @@ -1524,8 +1520,11 @@ static int do_fcntl_add_lease(unsigned i > } > lock_flocks(); > error = __vfs_setlease(filp, arg, &fl); > - if (error) > - goto out_unlock; > + if (error) { > + unlock_flocks(); > + locks_free_lock(fl); > + goto out_free_fasync; > + } > > /* > * fasync_insert_entry() returns the old entry if any. > @@ -1541,12 +1540,12 @@ static int do_fcntl_add_lease(unsigned i > fl->fl_type = F_UNLCK | F_INPROGRESS; > fl->fl_break_time = jiffies - 10; > time_out_leases(inode); > - goto out_unlock; > + } else { > + error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0); > } > - > - error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0); > -out_unlock: > unlock_flocks(); > + > +out_free_fasync: > if (new) > fasync_free(new); > return error; > Index: linux-2.6/fs/nfs/file.c > =================================================================== > --- linux-2.6.orig/fs/nfs/file.c 2010-10-31 07:10:07.658003804 -0400 > +++ linux-2.6/fs/nfs/file.c 2010-10-31 07:10:10.936003734 -0400 > @@ -884,7 +884,5 @@ static int nfs_setlease(struct file *fil > dprintk("NFS: setlease(%s/%s, arg=%ld)\n", > file->f_path.dentry->d_parent->d_name.name, > file->f_path.dentry->d_name.name, arg); > - if (arg != F_UNLCK) > - locks_free_lock(*fl); > return -EINVAL; > } > Index: linux-2.6/fs/nfsd/nfs4state.c > =================================================================== > --- linux-2.6.orig/fs/nfsd/nfs4state.c 2010-10-31 07:10:07.666004084 -0400 > +++ linux-2.6/fs/nfsd/nfs4state.c 2010-10-31 07:32:56.906254608 -0400 > @@ -2652,6 +2652,7 @@ nfs4_open_delegation(struct svc_fh *fh, > if ((status = vfs_setlease(fl->fl_file, fl->fl_type, &fl))) { > dprintk("NFSD: setlease failed [%d], no delegation\n", status); > dp->dl_flock = NULL; > + locks_free_lock(fl); > unhash_delegation(dp); > flag = NFS4_OPEN_DELEGATE_NONE; > goto out; ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [PATCH 1/2] locks: let the caller free file_lock on ->setlease failure 2010-11-03 20:41 ` J. Bruce Fields @ 2010-11-04 1:40 ` J. Bruce Fields 2010-11-04 1:41 ` J. Bruce Fields 2010-11-06 19:03 ` Christoph Hellwig 0 siblings, 2 replies; 62+ messages in thread From: J. Bruce Fields @ 2010-11-04 1:40 UTC (permalink / raw) To: Christoph Hellwig Cc: Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel On Wed, Nov 03, 2010 at 04:41:48PM -0400, J. Bruce Fields wrote: > On Sun, Oct 31, 2010 at 08:35:10AM -0400, Christoph Hellwig wrote: > > Index: linux-2.6/fs/locks.c > > =================================================================== > > --- linux-2.6.orig/fs/locks.c 2010-10-31 07:10:07.649004084 -0400 > > +++ linux-2.6/fs/locks.c 2010-10-31 07:34:10.102255587 -0400 > > @@ -1428,8 +1425,9 @@ int generic_setlease(struct file *filp, > > goto out; > > > > if (my_before != NULL) { > > - *flp = *my_before; > > error = lease->fl_lmops->fl_change(my_before, arg); > > + if (!error) > > + *flp = *my_before; > > Argh, missed this: we're leaking the passed-in lease in this case. We could do something like this. The irritating thing is that the only lease user I understand is the nfsd code, and it doesn't want this lease-merging behavior; the only reason that fl_change is there is so it can just turn this case into an error every time. And I have no idea what the requirements are of any other users: do leases behave like this on purpose, or was it just an arbitrary choice, and does anyone depend on it now? In the end maybe it would be better just to leave leases as they are and define a new lock type for nfsd. We'd probably have to do that eventually anyway, and it'd save me trying to guess what the lease semantics are supposed to be.... --b. >From f5c6d51ae638af213d8bda31504f4b2287b8a801 Mon Sep 17 00:00:00 2001 From: J. Bruce Fields <bfields@redhat.com> Date: Wed, 3 Nov 2010 16:49:44 -0400 Subject: [PATCH 1/2] locks: fix leak on merging leases We must also free the passed-in lease in the case it wasn't used because an existing lease was upgrade/downgraded or already existed. Note the nfsd caller doesn't care because it's fl_change callback returns an error in those cases. Signed-off-by: J. Bruce Fields <bfields@redhat.com> --- fs/locks.c | 7 +++++-- 1 files changed, 5 insertions(+), 2 deletions(-) diff --git a/fs/locks.c b/fs/locks.c index 65765cb..61c22f7 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -1504,7 +1504,7 @@ static int do_fcntl_delete_lease(struct file *filp) static int do_fcntl_add_lease(unsigned int fd, struct file *filp, long arg) { - struct file_lock *fl; + struct file_lock *fl, *ret; struct fasync_struct *new; struct inode *inode = filp->f_path.dentry->d_inode; int error; @@ -1518,6 +1518,7 @@ static int do_fcntl_add_lease(unsigned int fd, struct file *filp, long arg) locks_free_lock(fl); return -ENOMEM; } + ret = fl; lock_flocks(); error = __vfs_setlease(filp, arg, &fl); if (error) { @@ -1525,6 +1526,8 @@ static int do_fcntl_add_lease(unsigned int fd, struct file *filp, long arg) locks_free_lock(fl); goto out_free_fasync; } + if (ret != fl) + locks_free_lock(fl); /* * fasync_insert_entry() returns the old entry if any. @@ -1532,7 +1535,7 @@ static int do_fcntl_add_lease(unsigned int fd, struct file *filp, long arg) * inserted it into the fasync list. Clear new so that * we don't release it here. */ - if (!fasync_insert_entry(fd, filp, &fl->fl_fasync, new)) + if (!fasync_insert_entry(fd, filp, &ret->fl_fasync, new)) new = NULL; if (error < 0) { -- 1.7.1 ^ permalink raw reply related [flat|nested] 62+ messages in thread
* Re: [PATCH 1/2] locks: let the caller free file_lock on ->setlease failure 2010-11-04 1:40 ` J. Bruce Fields @ 2010-11-04 1:41 ` J. Bruce Fields 2010-11-06 19:03 ` Christoph Hellwig 2010-11-06 19:03 ` Christoph Hellwig 1 sibling, 1 reply; 62+ messages in thread From: J. Bruce Fields @ 2010-11-04 1:41 UTC (permalink / raw) To: Christoph Hellwig Cc: Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel On Wed, Nov 03, 2010 at 09:40:24PM -0400, J. Bruce Fields wrote: > On Wed, Nov 03, 2010 at 04:41:48PM -0400, J. Bruce Fields wrote: > > On Sun, Oct 31, 2010 at 08:35:10AM -0400, Christoph Hellwig wrote: > > > Index: linux-2.6/fs/locks.c > > > =================================================================== > > > --- linux-2.6.orig/fs/locks.c 2010-10-31 07:10:07.649004084 -0400 > > > +++ linux-2.6/fs/locks.c 2010-10-31 07:34:10.102255587 -0400 > > > @@ -1428,8 +1425,9 @@ int generic_setlease(struct file *filp, > > > goto out; > > > > > > if (my_before != NULL) { > > > - *flp = *my_before; > > > error = lease->fl_lmops->fl_change(my_before, arg); > > > + if (!error) > > > + *flp = *my_before; > > > > Argh, missed this: we're leaking the passed-in lease in this case. > > We could do something like this. And while I was here I noticed: --b. >From 072be1f975c6f839661cfc4f6c45ccb944417eea Mon Sep 17 00:00:00 2001 From: J. Bruce Fields <bfields@redhat.com> Date: Wed, 3 Nov 2010 18:09:18 -0400 Subject: [PATCH 2/2] locks: remove dead lease error-handling code A minor oversight from f7347ce4ee7c65415f84be915c018473e7076f31, "fasync: re-organize fasync entry insertion to allow it under a spinlock": this cleanup-on-error was only needed to handle -ENOMEM. Now that we're preallocating it's unneeded. Signed-off-by: J. Bruce Fields <bfields@redhat.com> --- fs/locks.c | 12 ++---------- 1 files changed, 2 insertions(+), 10 deletions(-) diff --git a/fs/locks.c b/fs/locks.c index 61c22f7..0e62dd3 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -1506,7 +1506,6 @@ static int do_fcntl_add_lease(unsigned int fd, struct file *filp, long arg) { struct file_lock *fl, *ret; struct fasync_struct *new; - struct inode *inode = filp->f_path.dentry->d_inode; int error; fl = lease_alloc(filp, arg); @@ -1520,7 +1519,7 @@ static int do_fcntl_add_lease(unsigned int fd, struct file *filp, long arg) } ret = fl; lock_flocks(); - error = __vfs_setlease(filp, arg, &fl); + error = __vfs_setlease(filp, arg, &ret); if (error) { unlock_flocks(); locks_free_lock(fl); @@ -1538,14 +1537,7 @@ static int do_fcntl_add_lease(unsigned int fd, struct file *filp, long arg) if (!fasync_insert_entry(fd, filp, &ret->fl_fasync, new)) new = NULL; - if (error < 0) { - /* remove lease just inserted by setlease */ - fl->fl_type = F_UNLCK | F_INPROGRESS; - fl->fl_break_time = jiffies - 10; - time_out_leases(inode); - } else { - error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0); - } + error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0); unlock_flocks(); out_free_fasync: -- 1.7.1 ^ permalink raw reply related [flat|nested] 62+ messages in thread
* Re: [PATCH 1/2] locks: let the caller free file_lock on ->setlease failure 2010-11-04 1:41 ` J. Bruce Fields @ 2010-11-06 19:03 ` Christoph Hellwig 0 siblings, 0 replies; 62+ messages in thread From: Christoph Hellwig @ 2010-11-06 19:03 UTC (permalink / raw) To: J. Bruce Fields Cc: Christoph Hellwig, Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel > A minor oversight from f7347ce4ee7c65415f84be915c018473e7076f31, > "fasync: re-organize fasync entry insertion to allow it under a > spinlock": this cleanup-on-error was only needed to handle -ENOMEM. Now > that we're preallocating it's unneeded. Looks good to me, too. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [PATCH 1/2] locks: let the caller free file_lock on ->setlease failure 2010-11-04 1:40 ` J. Bruce Fields 2010-11-04 1:41 ` J. Bruce Fields @ 2010-11-06 19:03 ` Christoph Hellwig 2010-11-08 16:10 ` J. Bruce Fields 1 sibling, 1 reply; 62+ messages in thread From: Christoph Hellwig @ 2010-11-06 19:03 UTC (permalink / raw) To: J. Bruce Fields Cc: Christoph Hellwig, Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel, willy, sfr On Wed, Nov 03, 2010 at 09:40:24PM -0400, J. Bruce Fields wrote: > The irritating thing is that the only lease user I understand is the > nfsd code, and it doesn't want this lease-merging behavior; the only > reason that fl_change is there is so it can just turn this case into an > error every time. Yes. > And I have no idea what the requirements are of any other users: do > leases behave like this on purpose, or was it just an arbitrary choice, > and does anyone depend on it now? Adding Willy and Stephen to the Cc list as they wrote the code. > In the end maybe it would be better just to leave leases as they are and > define a new lock type for nfsd. > > We'd probably have to do that eventually anyway, and it'd save me trying > to guess what the lease semantics are supposed to be.... I'd rather see both leases and the nfs4 delegations detangled from the locks.c code. It's far too much of a mess already anyway. > Subject: [PATCH 1/2] locks: fix leak on merging leases > > We must also free the passed-in lease in the case it wasn't used because > an existing lease was upgrade/downgraded or already existed. > > Note the nfsd caller doesn't care because it's fl_change callback > returns an error in those cases. The patch looks good to me. Care to feed it to Linus? ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: [PATCH 1/2] locks: let the caller free file_lock on ->setlease failure 2010-11-06 19:03 ` Christoph Hellwig @ 2010-11-08 16:10 ` J. Bruce Fields 0 siblings, 0 replies; 62+ messages in thread From: J. Bruce Fields @ 2010-11-08 16:10 UTC (permalink / raw) To: Christoph Hellwig Cc: Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel, willy, sfr On Sat, Nov 06, 2010 at 03:03:32PM -0400, Christoph Hellwig wrote: > On Wed, Nov 03, 2010 at 09:40:24PM -0400, J. Bruce Fields wrote: > > The irritating thing is that the only lease user I understand is the > > nfsd code, and it doesn't want this lease-merging behavior; the only > > reason that fl_change is there is so it can just turn this case into an > > error every time. > > Yes. > > > And I have no idea what the requirements are of any other users: do > > leases behave like this on purpose, or was it just an arbitrary choice, > > and does anyone depend on it now? > > Adding Willy and Stephen to the Cc list as they wrote the code. > > > In the end maybe it would be better just to leave leases as they are and > > define a new lock type for nfsd. > > > > We'd probably have to do that eventually anyway, and it'd save me trying > > to guess what the lease semantics are supposed to be.... > > I'd rather see both leases and the nfs4 delegations detangled from the > locks.c code. What are you thinking of? > It's far too much of a mess already anyway. > > > Subject: [PATCH 1/2] locks: fix leak on merging leases > > > > We must also free the passed-in lease in the case it wasn't used because > > an existing lease was upgrade/downgraded or already existed. > > > > Note the nfsd caller doesn't care because it's fl_change callback > > returns an error in those cases. > > The patch looks good to me. Care to feed it to Linus? Yep, will do. --b. ^ permalink raw reply [flat|nested] 62+ messages in thread
* [PATCH 2/2] locks: remove fl_copy_lock lock_manager operation 2010-10-31 12:34 ` Christoph Hellwig 2010-10-31 12:35 ` [PATCH 1/2] locks: let the caller free file_lock on ->setlease failure Christoph Hellwig @ 2010-10-31 12:35 ` Christoph Hellwig 2010-11-01 15:02 ` nfsd changes for 2.6.37 J. Bruce Fields 2 siblings, 0 replies; 62+ messages in thread From: Christoph Hellwig @ 2010-10-31 12:35 UTC (permalink / raw) To: J. Bruce Fields Cc: Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel This one was only used for a nasty hack in nfsd, which has recently been removed. Signed-off-by: Christoph Hellwig <hch@lst.de> Index: linux-2.6/fs/locks.c =================================================================== --- linux-2.6.orig/fs/locks.c 2010-10-31 07:10:07.649004084 -0400 +++ linux-2.6/fs/locks.c 2010-10-31 07:34:10.102255587 -0400 @@ -235,11 +235,8 @@ static void locks_copy_private(struct fi fl->fl_ops->fl_copy_lock(new, fl); new->fl_ops = fl->fl_ops; } - if (fl->fl_lmops) { - if (fl->fl_lmops->fl_copy_lock) - fl->fl_lmops->fl_copy_lock(new, fl); + if (fl->fl_lmops) new->fl_lmops = fl->fl_lmops; - } } /* Index: linux-2.6/Documentation/filesystems/Locking =================================================================== --- linux-2.6.orig/Documentation/filesystems/Locking 2010-10-31 07:34:13.269012536 -0400 +++ linux-2.6/Documentation/filesystems/Locking 2010-10-31 07:34:20.567292713 -0400 @@ -322,7 +322,6 @@ fl_release_private: yes yes prototypes: int (*fl_compare_owner)(struct file_lock *, struct file_lock *); void (*fl_notify)(struct file_lock *); /* unblock callback */ - void (*fl_copy_lock)(struct file_lock *, struct file_lock *); void (*fl_release_private)(struct file_lock *); void (*fl_break)(struct file_lock *); /* break_lease callback */ @@ -330,7 +329,6 @@ locking rules: BKL may block fl_compare_owner: yes no fl_notify: yes no -fl_copy_lock: yes no fl_release_private: yes yes fl_break: yes no Index: linux-2.6/include/linux/fs.h =================================================================== --- linux-2.6.orig/include/linux/fs.h 2010-10-31 07:33:13.131262640 -0400 +++ linux-2.6/include/linux/fs.h 2010-10-31 07:33:25.959005833 -0400 @@ -1056,7 +1056,6 @@ struct lock_manager_operations { int (*fl_compare_owner)(struct file_lock *, struct file_lock *); void (*fl_notify)(struct file_lock *); /* unblock callback */ int (*fl_grant)(struct file_lock *, struct file_lock *, int); - void (*fl_copy_lock)(struct file_lock *, struct file_lock *); void (*fl_release_private)(struct file_lock *); void (*fl_break)(struct file_lock *); int (*fl_mylease)(struct file_lock *, struct file_lock *); ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-10-31 12:34 ` Christoph Hellwig 2010-10-31 12:35 ` [PATCH 1/2] locks: let the caller free file_lock on ->setlease failure Christoph Hellwig 2010-10-31 12:35 ` [PATCH 2/2] locks: remove fl_copy_lock lock_manager operation Christoph Hellwig @ 2010-11-01 15:02 ` J. Bruce Fields 2010-11-06 19:04 ` Christoph Hellwig 2 siblings, 1 reply; 62+ messages in thread From: J. Bruce Fields @ 2010-11-01 15:02 UTC (permalink / raw) To: Christoph Hellwig Cc: Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel On Sun, Oct 31, 2010 at 08:34:20AM -0400, Christoph Hellwig wrote: > And two more fixups on top of the set sent by Bruce that already are > in Linus' tree. Both look good to me, thanks! --b. ^ permalink raw reply [flat|nested] 62+ messages in thread
* Re: nfsd changes for 2.6.37 2010-11-01 15:02 ` nfsd changes for 2.6.37 J. Bruce Fields @ 2010-11-06 19:04 ` Christoph Hellwig 0 siblings, 0 replies; 62+ messages in thread From: Christoph Hellwig @ 2010-11-06 19:04 UTC (permalink / raw) To: J. Bruce Fields Cc: Christoph Hellwig, Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel Btw, the nfsd delegation code still has a few comments referring to the BKL. This should probably be cleaned up: fs/nfsd/nfs4state.c: * Called from break_lease() with lock_kernel() held. fs/nfsd/nfs4state.c: /* only place dl_time is set. protected by lock_kernel*/ fs/nfsd/nfs4state.c: * Called by locks_free_lock() with lock_kernel() held. fs/nfsd/nfs4state.c: * Called from setlease() with lock_kernel() held ^ permalink raw reply [flat|nested] 62+ messages in thread
end of thread, other threads:[~2010-11-08 16:11 UTC | newest] Thread overview: 62+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2010-10-26 16:45 nfsd changes for 2.6.37 J. Bruce Fields 2010-10-26 17:22 ` J. Bruce Fields 2010-10-26 17:39 ` Linus Torvalds 2010-10-26 17:46 ` J. Bruce Fields 2010-10-26 17:46 ` J. Bruce Fields 2010-10-26 20:18 ` Arnd Bergmann 2010-10-26 20:35 ` Bryan Schumaker 2010-10-26 20:55 ` Arnd Bergmann 2010-10-26 21:02 ` Linus Torvalds 2010-10-26 21:24 ` J. Bruce Fields 2010-10-26 21:37 ` Linus Torvalds 2010-10-26 21:44 ` J. Bruce Fields 2010-10-26 22:11 ` J. Bruce Fields 2010-10-26 22:41 ` J. Bruce Fields 2010-10-27 7:21 ` Arnd Bergmann 2010-10-27 8:39 ` Christoph Hellwig 2010-10-27 13:39 ` J. Bruce Fields 2010-10-27 13:46 ` Arnd Bergmann 2010-10-27 14:55 ` J. Bruce Fields 2010-10-27 14:59 ` Christoph Hellwig 2010-10-27 15:16 ` J. Bruce Fields 2010-10-27 15:19 ` Christoph Hellwig 2010-10-27 15:23 ` Arnd Bergmann 2010-10-27 15:28 ` J. Bruce Fields 2010-10-27 15:31 ` Christoph Hellwig 2010-10-27 16:12 ` Linus Torvalds 2010-10-27 16:46 ` J. Bruce Fields 2010-10-27 16:46 ` J. Bruce Fields 2010-10-27 17:32 ` Linus Torvalds 2010-10-27 17:40 ` J. Bruce Fields 2010-10-27 18:20 ` Arnd Bergmann 2010-10-27 18:42 ` Linus Torvalds 2010-10-27 18:43 ` Linus Torvalds 2010-10-27 19:48 ` Arnd Bergmann 2010-10-27 20:01 ` J. Bruce Fields 2010-10-27 20:20 ` Arnd Bergmann 2010-10-27 20:24 ` J. Bruce Fields 2010-10-30 21:25 ` J. Bruce Fields 2010-10-30 21:31 ` [PATCH 1/4] locks: prevent ENOMEM on lease unlock J. Bruce Fields 2010-10-30 21:31 ` [PATCH 2/4] locks: fix leaks on setlease errors J. Bruce Fields 2010-10-31 11:10 ` Christoph Hellwig 2010-11-01 17:24 ` J. Bruce Fields 2010-11-01 17:41 ` Christoph Hellwig 2010-11-01 18:34 ` J. Bruce Fields 2010-10-30 21:31 ` [PATCH 3/4] locks: fix setlease methods to free passed-in lock J. Bruce Fields 2010-10-30 21:31 ` [PATCH 4/4] nfsd4: initialize delegation pointer to lease J. Bruce Fields 2010-10-31 2:04 ` Christoph Hellwig 2010-10-31 3:04 ` J. Bruce Fields 2010-10-30 21:40 ` nfsd changes for 2.6.37 Arnd Bergmann 2010-10-31 2:07 ` Christoph Hellwig 2010-10-31 3:05 ` J. Bruce Fields 2010-10-31 12:34 ` Christoph Hellwig 2010-10-31 12:35 ` [PATCH 1/2] locks: let the caller free file_lock on ->setlease failure Christoph Hellwig 2010-11-03 20:41 ` J. Bruce Fields 2010-11-04 1:40 ` J. Bruce Fields 2010-11-04 1:41 ` J. Bruce Fields 2010-11-06 19:03 ` Christoph Hellwig 2010-11-06 19:03 ` Christoph Hellwig 2010-11-08 16:10 ` J. Bruce Fields 2010-10-31 12:35 ` [PATCH 2/2] locks: remove fl_copy_lock lock_manager operation Christoph Hellwig 2010-11-01 15:02 ` nfsd changes for 2.6.37 J. Bruce Fields 2010-11-06 19:04 ` Christoph Hellwig
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.