All of lore.kernel.org
 help / color / mirror / Atom feed
* nfsd changes for 2.6.37
@ 2010-10-26 16:45 J. Bruce Fields
  2010-10-26 17:22 ` J. Bruce Fields
  2010-10-26 20:18 ` Arnd Bergmann
  0 siblings, 2 replies; 62+ messages in thread
From: J. Bruce Fields @ 2010-10-26 16:45 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-nfs, linux-kernel

Please pull the following nfsd updates from the for-2.6.37 branch at:

	git://linux-nfs.org/~bfields/linux.git for-2.6.37

Neil Brown modifed the way we wait for upcalls to solve a performance
problem and a long-standing NFSv4 correctness problem.

Pavel Emelyanov started the job of making the NFS client and server
container-aware.  (Note that for the sake of keeping his work in one
tree, Trond and I agreed to merge it through my tree, though it includes
client as well as server changes.)  Thanks to Chuck Lever for help with
review.

Thanks also to Tom Tucker for some rdma fixes (and review help).

I'm continuing work to bring the NFSv4.1 server close enough to the spec
that we could turn it on by default without risking future
interoperability problems.  We're not quite there yet.

A moment's silence for the spkm3 code, please.  It's been lying unused
for years now, so we decided to finally put it out of its misery.

We're also deprecating (not removing quite yet) the old nfsd control
interfaces.

I have a few last-minute changes still under testing that I may try to
sneak in later in the week, but given the short merge window I thought
it would make more sense to send the bulk of the changes now.

--b.

Andy Adamson (1):
      nfsd: remove duplicate NFS4_STATEID_SIZE declaration

Andy Shevchenko (1):
      sunrpc/cache: don't use custom hex_to_bin() converter

Benny Halevy (1):
      nfsd4: adjust buflen for encoded attrs bitmap based on actual bitmap length

Bryan Schumaker (1):
      lockd: Mostly remove BKL from the server

Chuck Lever (2):
      SUNRPC: Use conventional switch statement when reclassifying sockets
      SUNRPC: Properly initialize sock_xprt.srcaddr in all cases

J. Bruce Fields (39):
      svcrpc: minor cache cleanup
      svcrpc: cache deferral cleanup
      Merge remote branch 'trond/bugfixes' into for-2.6.37
      nfsd4: fix hang on fast-booting nfs servers
      nfsd: fix /proc/net/rpc/nfsd.export/content display
      nfsd4: remove spkm3
      nfsd4: minor variable renaming (cb -> conn)
      nfsd4: combine nfs4_rpc_args and nfsd4_cb_sequence
      nfsd4: rename nfs4_rpc_args->nfsd4_cb_args
      nfsd4: generic callback code
      nfsd4: use generic callback code in null case
      nfsd4: remove separate cb_args struct
      nfsd4: Move callback setup to callback queue
      nfsd4: fix alloc_init_session BUILD_BUG_ON()
      nfsd4: fix alloc_init_session return type
      nfsd4: clean up session allocation
      nfsd4: keep per-session list of connections
      nfsd: provide callbacks on svc_xprt deletion
      nfsd4: use callbacks on svc_xprt_deletion
      nfsd4: refactor connection allocation
      nfsd4: add new connections to session
      nfsd4: return expired on unfound stateid's
      nfsd4: expire clients more promptly
      nfsd4: don't cache seq_misordered replies
      nfsd4: move callback setup into session init code
      nfsd4: use client pointer to backchannel session
      nfsd4: make backchannel sequence number per-session
      nfsd4: confirm only on succesful create_session
      nfsd4: track backchannel connections
      nfsd4: callback program number is per-session
      nfsd4: separate callback change and callback probe
      nfsd4: delay session removal till free_client
      nfsd4: move minorversion to client
      nfsd4: only require krb5 principal for NFSv4.0 callbacks
      nfsd4: fix connection allocation in sequence()
      svcrpc: never clear XPT_BUSY on dead xprt
      svcrpc: assume svc_delete_xprt() called only once
      svcrpc: no need for XPT_DEAD check in svc_xprt_enqueue
      svcrpc: svc_tcp_sendto XPT_DEAD check is redundant

NeilBrown (14):
      sunrpc: extract some common sunrpc_cache code from nfsd
      sunrpc: use seconds since boot in expiry cache
      sunrpc/cache: allow threads to block while waiting for cache update.
      sunrpc: close connection when a request is irretrievably lost.
      nfsd: disable deferral for NFSv4
      nfsd/idmap: drop special request deferal in favour of improved default.
      svcauth_gss: replace a trivial 'switch' with an 'if'
      sunrpc/cache: change deferred-request hash table to use hlist.
      sunrpc/cache: fix recent breakage of cache_clean_deferred
      nfsd: formally deprecate legacy nfsd syscall interface
      nfsd: allow deprecated interface to be compiled out.
      sunrpc: fix race in new cache_wait code.
      sunrpc: Simplify cache_defer_req and related functions.
      sunrpc/cache: centralise handling of size limit on deferred list.

Pavel Emelyanov (37):
      nfsd: Export get_task_comm for nfsd
      sunrpc: Pass the ip_map_parse's cd to lower calls
      sunrpc: Make xprt auth cache release work with the xprt
      sunrpc: Pass xprt to cached get/put routines
      sunrpc: Add net to pure API calls
      sunrpc: Add routines that allow registering per-net caches
      sunrpc: Tag svc_xprt with net
      sunrpc: The per-net skeleton
      sunrpc: Make the /proc/net/rpc appear in net namespaces
      sunrpc: Make the ip_map_cache be per-net
      sunrpc: Factor out rpc_xprt allocation
      sunrpc: Factor out rpc_xprt freeing
      sunrpc: Add net argument to svc_create_xprt
      sunrpc: Pull net argument downto svc_create_socket
      sunrpc: Add net to rpc_create_args
      sunrpc: Add net to xprt_create
      sunrpc: Tag rpc_xprt with net
      net: Export __sock_create
      sunrpc: Create sockets in net namespaces
      sunrpc: Use helper to set v4 mapped addr in ip_map_parse
      sunrpc: Remove unused sock arg from xs_get_srcport
      sunrpc: Remove unused sock arg from xs_next_srcport
      sunrpc: Get xprt pointer once in xs_tcp_setup_socket
      sunrpc: Remove duplicate xprt/transport arguments from calls
      sunrpc: Factor out udp sockets creation
      sunrpc: Factor out v4 sockets creation
      sunrpc: Factor out v6 sockets creation
      sunrpc: Call xs_create_sockX directly from setup_socket
      sunrpc: Merge the xs_bind code
      sunrpc: Merge xs_create_sock code
      sunrpc: Pass family to setup_socket calls
      sunrpc: Remove TCP worker wrappers
      sunrpc: Remove UDP worker wrappers
      sunrpc: Remove useless if (task == NULL) from xprt_reserve_xprt
      sunrpc: Don't return NULL from rpcb_create
      sunrpc: Remove dead "else" branch from bc xprt creation
      sunrpc: Turn list_for_each-s into the ..._entry-s

Stephen Rothwell (1):
      sunrpc: fix up rpcauth_remove_module section mismatch

Tejun Heo (1):
      sunrpc/xprtrdma: clean up workqueue usage

Tom Tucker (2):
      svcrdma: Change DMA mapping logic to avoid the page_address kernel API
      svcrdma: Cleanup DMA unmapping in error paths.

 Documentation/feature-removal-schedule.txt |   10 +
 fs/Makefile                                |    5 +-
 fs/compat.c                                |    2 +-
 fs/lockd/host.c                            |    1 +
 fs/lockd/mon.c                             |    1 +
 fs/lockd/svc.c                             |    2 +-
 fs/lockd/svc4proc.c                        |    2 -
 fs/lockd/svclock.c                         |   31 ++-
 fs/lockd/svcproc.c                         |    2 -
 fs/nfs/callback.c                          |    4 +-
 fs/nfs/client.c                            |    1 +
 fs/nfs/dns_resolve.c                       |    6 +-
 fs/nfs/mount_clnt.c                        |    2 +
 fs/nfsd/Kconfig                            |   12 +
 fs/nfsd/export.c                           |   73 +++--
 fs/nfsd/nfs4callback.c                     |  245 ++++++++------
 fs/nfsd/nfs4idmap.c                        |  105 +------
 fs/nfsd/nfs4proc.c                         |    7 +-
 fs/nfsd/nfs4state.c                        |  491 ++++++++++++++++++----------
 fs/nfsd/nfs4xdr.c                          |   18 +-
 fs/nfsd/nfsctl.c                           |   26 ++-
 fs/nfsd/nfsd.h                             |    2 +-
 fs/nfsd/nfssvc.c                           |    5 +-
 fs/nfsd/state.h                            |   52 ++-
 include/linux/net.h                        |    2 +
 include/linux/nfs4.h                       |    3 +
 include/linux/sunrpc/auth.h                |    4 +-
 include/linux/sunrpc/cache.h               |   37 ++-
 include/linux/sunrpc/clnt.h                |    1 +
 include/linux/sunrpc/gss_spkm3.h           |   55 ---
 include/linux/sunrpc/stats.h               |   23 +-
 include/linux/sunrpc/svc_xprt.h            |   32 ++-
 include/linux/sunrpc/svcauth.h             |   17 +-
 include/linux/sunrpc/xprt.h                |    4 +
 net/socket.c                               |    3 +-
 net/sunrpc/Kconfig                         |   19 -
 net/sunrpc/auth.c                          |    2 +-
 net/sunrpc/auth_generic.c                  |    2 +-
 net/sunrpc/auth_gss/Makefile               |    5 -
 net/sunrpc/auth_gss/gss_spkm3_mech.c       |  247 --------------
 net/sunrpc/auth_gss/gss_spkm3_seal.c       |  186 -----------
 net/sunrpc/auth_gss/gss_spkm3_token.c      |  267 ---------------
 net/sunrpc/auth_gss/gss_spkm3_unseal.c     |  127 -------
 net/sunrpc/auth_gss/svcauth_gss.c          |   51 ++--
 net/sunrpc/cache.c                         |  288 ++++++++++++-----
 net/sunrpc/clnt.c                          |    1 +
 net/sunrpc/netns.h                         |   19 +
 net/sunrpc/rpcb_clnt.c                     |    4 +-
 net/sunrpc/stats.c                         |   43 ++-
 net/sunrpc/sunrpc_syms.c                   |   58 +++-
 net/sunrpc/svc.c                           |    3 +
 net/sunrpc/svc_xprt.c                      |   59 +++--
 net/sunrpc/svcauth_unix.c                  |  194 ++++++++---
 net/sunrpc/svcsock.c                       |   27 +-
 net/sunrpc/xprt.c                          |   39 ++-
 net/sunrpc/xprtrdma/svc_rdma.c             |   11 +-
 net/sunrpc/xprtrdma/svc_rdma_recvfrom.c    |   19 +-
 net/sunrpc/xprtrdma/svc_rdma_sendto.c      |   82 ++++--
 net/sunrpc/xprtrdma/svc_rdma_transport.c   |   49 ++--
 net/sunrpc/xprtrdma/transport.c            |   25 +-
 net/sunrpc/xprtsock.c                      |  358 +++++++-------------
 61 files changed, 1534 insertions(+), 1937 deletions(-)
 delete mode 100644 include/linux/sunrpc/gss_spkm3.h
 delete mode 100644 net/sunrpc/auth_gss/gss_spkm3_mech.c
 delete mode 100644 net/sunrpc/auth_gss/gss_spkm3_seal.c
 delete mode 100644 net/sunrpc/auth_gss/gss_spkm3_token.c
 delete mode 100644 net/sunrpc/auth_gss/gss_spkm3_unseal.c
 create mode 100644 net/sunrpc/netns.h

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-26 16:45 nfsd changes for 2.6.37 J. Bruce Fields
@ 2010-10-26 17:22 ` J. Bruce Fields
  2010-10-26 17:39   ` Linus Torvalds
  2010-10-26 20:18 ` Arnd Bergmann
  1 sibling, 1 reply; 62+ messages in thread
From: J. Bruce Fields @ 2010-10-26 17:22 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-nfs, linux-kernel

On Tue, Oct 26, 2010 at 12:45:50PM -0400, J. Bruce Fields wrote:
> Please pull the following nfsd updates from the for-2.6.37 branch at:
> 
> 	git://linux-nfs.org/~bfields/linux.git for-2.6.37

By the way, apologies, I see there's a conflict with upstream--looks
obvious (conflicting appends to
Documentation/feature-removal-schedule.txt), but I'm happy to fix it up
and add a merge commit on that branch if it saves you time.

--b.

> 
> Neil Brown modifed the way we wait for upcalls to solve a performance
> problem and a long-standing NFSv4 correctness problem.
> 
> Pavel Emelyanov started the job of making the NFS client and server
> container-aware.  (Note that for the sake of keeping his work in one
> tree, Trond and I agreed to merge it through my tree, though it includes
> client as well as server changes.)  Thanks to Chuck Lever for help with
> review.
> 
> Thanks also to Tom Tucker for some rdma fixes (and review help).
> 
> I'm continuing work to bring the NFSv4.1 server close enough to the spec
> that we could turn it on by default without risking future
> interoperability problems.  We're not quite there yet.
> 
> A moment's silence for the spkm3 code, please.  It's been lying unused
> for years now, so we decided to finally put it out of its misery.
> 
> We're also deprecating (not removing quite yet) the old nfsd control
> interfaces.
> 
> I have a few last-minute changes still under testing that I may try to
> sneak in later in the week, but given the short merge window I thought
> it would make more sense to send the bulk of the changes now.
> 
> --b.
> 
> Andy Adamson (1):
>       nfsd: remove duplicate NFS4_STATEID_SIZE declaration
> 
> Andy Shevchenko (1):
>       sunrpc/cache: don't use custom hex_to_bin() converter
> 
> Benny Halevy (1):
>       nfsd4: adjust buflen for encoded attrs bitmap based on actual bitmap length
> 
> Bryan Schumaker (1):
>       lockd: Mostly remove BKL from the server
> 
> Chuck Lever (2):
>       SUNRPC: Use conventional switch statement when reclassifying sockets
>       SUNRPC: Properly initialize sock_xprt.srcaddr in all cases
> 
> J. Bruce Fields (39):
>       svcrpc: minor cache cleanup
>       svcrpc: cache deferral cleanup
>       Merge remote branch 'trond/bugfixes' into for-2.6.37
>       nfsd4: fix hang on fast-booting nfs servers
>       nfsd: fix /proc/net/rpc/nfsd.export/content display
>       nfsd4: remove spkm3
>       nfsd4: minor variable renaming (cb -> conn)
>       nfsd4: combine nfs4_rpc_args and nfsd4_cb_sequence
>       nfsd4: rename nfs4_rpc_args->nfsd4_cb_args
>       nfsd4: generic callback code
>       nfsd4: use generic callback code in null case
>       nfsd4: remove separate cb_args struct
>       nfsd4: Move callback setup to callback queue
>       nfsd4: fix alloc_init_session BUILD_BUG_ON()
>       nfsd4: fix alloc_init_session return type
>       nfsd4: clean up session allocation
>       nfsd4: keep per-session list of connections
>       nfsd: provide callbacks on svc_xprt deletion
>       nfsd4: use callbacks on svc_xprt_deletion
>       nfsd4: refactor connection allocation
>       nfsd4: add new connections to session
>       nfsd4: return expired on unfound stateid's
>       nfsd4: expire clients more promptly
>       nfsd4: don't cache seq_misordered replies
>       nfsd4: move callback setup into session init code
>       nfsd4: use client pointer to backchannel session
>       nfsd4: make backchannel sequence number per-session
>       nfsd4: confirm only on succesful create_session
>       nfsd4: track backchannel connections
>       nfsd4: callback program number is per-session
>       nfsd4: separate callback change and callback probe
>       nfsd4: delay session removal till free_client
>       nfsd4: move minorversion to client
>       nfsd4: only require krb5 principal for NFSv4.0 callbacks
>       nfsd4: fix connection allocation in sequence()
>       svcrpc: never clear XPT_BUSY on dead xprt
>       svcrpc: assume svc_delete_xprt() called only once
>       svcrpc: no need for XPT_DEAD check in svc_xprt_enqueue
>       svcrpc: svc_tcp_sendto XPT_DEAD check is redundant
> 
> NeilBrown (14):
>       sunrpc: extract some common sunrpc_cache code from nfsd
>       sunrpc: use seconds since boot in expiry cache
>       sunrpc/cache: allow threads to block while waiting for cache update.
>       sunrpc: close connection when a request is irretrievably lost.
>       nfsd: disable deferral for NFSv4
>       nfsd/idmap: drop special request deferal in favour of improved default.
>       svcauth_gss: replace a trivial 'switch' with an 'if'
>       sunrpc/cache: change deferred-request hash table to use hlist.
>       sunrpc/cache: fix recent breakage of cache_clean_deferred
>       nfsd: formally deprecate legacy nfsd syscall interface
>       nfsd: allow deprecated interface to be compiled out.
>       sunrpc: fix race in new cache_wait code.
>       sunrpc: Simplify cache_defer_req and related functions.
>       sunrpc/cache: centralise handling of size limit on deferred list.
> 
> Pavel Emelyanov (37):
>       nfsd: Export get_task_comm for nfsd
>       sunrpc: Pass the ip_map_parse's cd to lower calls
>       sunrpc: Make xprt auth cache release work with the xprt
>       sunrpc: Pass xprt to cached get/put routines
>       sunrpc: Add net to pure API calls
>       sunrpc: Add routines that allow registering per-net caches
>       sunrpc: Tag svc_xprt with net
>       sunrpc: The per-net skeleton
>       sunrpc: Make the /proc/net/rpc appear in net namespaces
>       sunrpc: Make the ip_map_cache be per-net
>       sunrpc: Factor out rpc_xprt allocation
>       sunrpc: Factor out rpc_xprt freeing
>       sunrpc: Add net argument to svc_create_xprt
>       sunrpc: Pull net argument downto svc_create_socket
>       sunrpc: Add net to rpc_create_args
>       sunrpc: Add net to xprt_create
>       sunrpc: Tag rpc_xprt with net
>       net: Export __sock_create
>       sunrpc: Create sockets in net namespaces
>       sunrpc: Use helper to set v4 mapped addr in ip_map_parse
>       sunrpc: Remove unused sock arg from xs_get_srcport
>       sunrpc: Remove unused sock arg from xs_next_srcport
>       sunrpc: Get xprt pointer once in xs_tcp_setup_socket
>       sunrpc: Remove duplicate xprt/transport arguments from calls
>       sunrpc: Factor out udp sockets creation
>       sunrpc: Factor out v4 sockets creation
>       sunrpc: Factor out v6 sockets creation
>       sunrpc: Call xs_create_sockX directly from setup_socket
>       sunrpc: Merge the xs_bind code
>       sunrpc: Merge xs_create_sock code
>       sunrpc: Pass family to setup_socket calls
>       sunrpc: Remove TCP worker wrappers
>       sunrpc: Remove UDP worker wrappers
>       sunrpc: Remove useless if (task == NULL) from xprt_reserve_xprt
>       sunrpc: Don't return NULL from rpcb_create
>       sunrpc: Remove dead "else" branch from bc xprt creation
>       sunrpc: Turn list_for_each-s into the ..._entry-s
> 
> Stephen Rothwell (1):
>       sunrpc: fix up rpcauth_remove_module section mismatch
> 
> Tejun Heo (1):
>       sunrpc/xprtrdma: clean up workqueue usage
> 
> Tom Tucker (2):
>       svcrdma: Change DMA mapping logic to avoid the page_address kernel API
>       svcrdma: Cleanup DMA unmapping in error paths.
> 
>  Documentation/feature-removal-schedule.txt |   10 +
>  fs/Makefile                                |    5 +-
>  fs/compat.c                                |    2 +-
>  fs/lockd/host.c                            |    1 +
>  fs/lockd/mon.c                             |    1 +
>  fs/lockd/svc.c                             |    2 +-
>  fs/lockd/svc4proc.c                        |    2 -
>  fs/lockd/svclock.c                         |   31 ++-
>  fs/lockd/svcproc.c                         |    2 -
>  fs/nfs/callback.c                          |    4 +-
>  fs/nfs/client.c                            |    1 +
>  fs/nfs/dns_resolve.c                       |    6 +-
>  fs/nfs/mount_clnt.c                        |    2 +
>  fs/nfsd/Kconfig                            |   12 +
>  fs/nfsd/export.c                           |   73 +++--
>  fs/nfsd/nfs4callback.c                     |  245 ++++++++------
>  fs/nfsd/nfs4idmap.c                        |  105 +------
>  fs/nfsd/nfs4proc.c                         |    7 +-
>  fs/nfsd/nfs4state.c                        |  491 ++++++++++++++++++----------
>  fs/nfsd/nfs4xdr.c                          |   18 +-
>  fs/nfsd/nfsctl.c                           |   26 ++-
>  fs/nfsd/nfsd.h                             |    2 +-
>  fs/nfsd/nfssvc.c                           |    5 +-
>  fs/nfsd/state.h                            |   52 ++-
>  include/linux/net.h                        |    2 +
>  include/linux/nfs4.h                       |    3 +
>  include/linux/sunrpc/auth.h                |    4 +-
>  include/linux/sunrpc/cache.h               |   37 ++-
>  include/linux/sunrpc/clnt.h                |    1 +
>  include/linux/sunrpc/gss_spkm3.h           |   55 ---
>  include/linux/sunrpc/stats.h               |   23 +-
>  include/linux/sunrpc/svc_xprt.h            |   32 ++-
>  include/linux/sunrpc/svcauth.h             |   17 +-
>  include/linux/sunrpc/xprt.h                |    4 +
>  net/socket.c                               |    3 +-
>  net/sunrpc/Kconfig                         |   19 -
>  net/sunrpc/auth.c                          |    2 +-
>  net/sunrpc/auth_generic.c                  |    2 +-
>  net/sunrpc/auth_gss/Makefile               |    5 -
>  net/sunrpc/auth_gss/gss_spkm3_mech.c       |  247 --------------
>  net/sunrpc/auth_gss/gss_spkm3_seal.c       |  186 -----------
>  net/sunrpc/auth_gss/gss_spkm3_token.c      |  267 ---------------
>  net/sunrpc/auth_gss/gss_spkm3_unseal.c     |  127 -------
>  net/sunrpc/auth_gss/svcauth_gss.c          |   51 ++--
>  net/sunrpc/cache.c                         |  288 ++++++++++++-----
>  net/sunrpc/clnt.c                          |    1 +
>  net/sunrpc/netns.h                         |   19 +
>  net/sunrpc/rpcb_clnt.c                     |    4 +-
>  net/sunrpc/stats.c                         |   43 ++-
>  net/sunrpc/sunrpc_syms.c                   |   58 +++-
>  net/sunrpc/svc.c                           |    3 +
>  net/sunrpc/svc_xprt.c                      |   59 +++--
>  net/sunrpc/svcauth_unix.c                  |  194 ++++++++---
>  net/sunrpc/svcsock.c                       |   27 +-
>  net/sunrpc/xprt.c                          |   39 ++-
>  net/sunrpc/xprtrdma/svc_rdma.c             |   11 +-
>  net/sunrpc/xprtrdma/svc_rdma_recvfrom.c    |   19 +-
>  net/sunrpc/xprtrdma/svc_rdma_sendto.c      |   82 ++++--
>  net/sunrpc/xprtrdma/svc_rdma_transport.c   |   49 ++--
>  net/sunrpc/xprtrdma/transport.c            |   25 +-
>  net/sunrpc/xprtsock.c                      |  358 +++++++-------------
>  61 files changed, 1534 insertions(+), 1937 deletions(-)
>  delete mode 100644 include/linux/sunrpc/gss_spkm3.h
>  delete mode 100644 net/sunrpc/auth_gss/gss_spkm3_mech.c
>  delete mode 100644 net/sunrpc/auth_gss/gss_spkm3_seal.c
>  delete mode 100644 net/sunrpc/auth_gss/gss_spkm3_token.c
>  delete mode 100644 net/sunrpc/auth_gss/gss_spkm3_unseal.c
>  create mode 100644 net/sunrpc/netns.h
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-26 17:22 ` J. Bruce Fields
@ 2010-10-26 17:39   ` Linus Torvalds
  2010-10-26 17:46       ` J. Bruce Fields
  0 siblings, 1 reply; 62+ messages in thread
From: Linus Torvalds @ 2010-10-26 17:39 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: linux-nfs, linux-kernel

On Tue, Oct 26, 2010 at 10:22 AM, J. Bruce Fields <bfields@fieldses.org> wrote:
> On Tue, Oct 26, 2010 at 12:45:50PM -0400, J. Bruce Fields wrote:
>> Please pull the following nfsd updates from the for-2.6.37 branch at:
>>
>>       git://linux-nfs.org/~bfields/linux.git for-2.6.37
>
> By the way, apologies, I see there's a conflict with upstream--looks
> obvious (conflicting appends to
> Documentation/feature-removal-schedule.txt), but I'm happy to fix it up
> and add a merge commit on that branch if it saves you time.

No, as mentioned elsewhere in other similar threads, I actually prefer
people _not_ to pre-merge. It results in significantly uglier history
(especially since I tend to merge a lot during the merge window, so
your pre-merge will not obviate my later merge, because my tree will
have moved forward), since criss-crossing merges really end up making
the gitk graph rather harder to read.

But to me personally, the "uglier history" is actually the secondary
concern - I much prefer seeing the conflicts rather than have them
hidden from me by a pre-merge. Now, for something like
feature-removal-schedule.txt, the conflicts are usually not
interesting from a development angle, but on the other hand they are
also so trivial that they don't inconvenience me and I'd rather just
do them myself and avoid the history pollution of extra merges. But
then when the conflicts get more interesting and touching actual code,
I end up spending more time at them, but I _still_ want to see them,
because they inform me where different people ended up stepping on
each others code.

And those conflicts are interesting to me as a top-level maintainer,
because it either shows code organizational problems, or it shows
where people really were touching the same code for good reasons. And
regardless of the reason for the conflict, I like being aware of it,
because merges like that are often indicative of "this might cause
issues".

If the merge conflict ends up being so involved that I can't resolve
it, I'll push back an email saying so. It happens, but not very often
I'm happy to say. Our source organization is good enough that we
seldom have really complicated merges.

One thing that some maintainers do is to do a private pre-merge just
to see if there are problems (and because it can give a more accurate
diffstat in the presense of criss-cross merges or duplicate changes)
and then if that merge is complex, expose both an unmerged branch and
a "here, if you have issues, this is how I resolved the merge" branch.
What I usually end up doing in that case is actually to do the merge
myself anyway (because of all the issues above - I just want to know
what is going on), but then also compare my end result with the
maintainer pre-merge result, just to verify.

But even that kind of "both non-merged and pre-merged" pull requests
should be reserved just for cases where there really was a somewhat
involved conflict. For trivial conflicts like a feature removal doc
clash, don't even bother.  I do several trivial merges a day, it's
normal and it doesn't really take me any appreciable extra time. Git
command line of the day:

  git log --oneline v2.6.36.. --merges --author=Torvalds --grep=onflict

to see at least a partial list of the trivial conflicts I've done in
just this merge window. It's 3-4 per day, it's perfectly normal.

                                Linus

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
@ 2010-10-26 17:46       ` J. Bruce Fields
  0 siblings, 0 replies; 62+ messages in thread
From: J. Bruce Fields @ 2010-10-26 17:46 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-nfs, linux-kernel

On Tue, Oct 26, 2010 at 10:39:41AM -0700, Linus Torvalds wrote:
> But even that kind of "both non-merged and pre-merged" pull requests
> should be reserved just for cases where there really was a somewhat
> involved conflict. For trivial conflicts like a feature removal doc
> clash, don't even bother.  I do several trivial merges a day, it's
> normal and it doesn't really take me any appreciable extra time. Git
> command line of the day:
> 
>   git log --oneline v2.6.36.. --merges --author=Torvalds --grep=onflict
> 
> to see at least a partial list of the trivial conflicts I've done in
> just this merge window. It's 3-4 per day, it's perfectly normal.

Makes sense, thanks!

--b.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
@ 2010-10-26 17:46       ` J. Bruce Fields
  0 siblings, 0 replies; 62+ messages in thread
From: J. Bruce Fields @ 2010-10-26 17:46 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-nfs, linux-kernel

On Tue, Oct 26, 2010 at 10:39:41AM -0700, Linus Torvalds wrote:
> But even that kind of "both non-merged and pre-merged" pull requests
> should be reserved just for cases where there really was a somewhat
> involved conflict. For trivial conflicts like a feature removal doc
> clash, don't even bother.  I do several trivial merges a day, it's
> normal and it doesn't really take me any appreciable extra time. Git
> command line of the day:
> 
>   git log --oneline v2.6.36.. --merges --author=Torvalds --grep=onflict
> 
> to see at least a partial list of the trivial conflicts I've done in
> just this merge window. It's 3-4 per day, it's perfectly normal.

Makes sense, thanks!

--b.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-26 16:45 nfsd changes for 2.6.37 J. Bruce Fields
  2010-10-26 17:22 ` J. Bruce Fields
@ 2010-10-26 20:18 ` Arnd Bergmann
  2010-10-26 20:35   ` Bryan Schumaker
  1 sibling, 1 reply; 62+ messages in thread
From: Arnd Bergmann @ 2010-10-26 20:18 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Linus Torvalds, linux-nfs, linux-kernel, Bryan Schumaker

On Tuesday 26 October 2010 18:45:50 J. Bruce Fields wrote:
> Bryan Schumaker (1):
>       lockd: Mostly remove BKL from the server

Could you explain the "mostly" part of this commit?

The commit message only says "This patch removes all
but one call to lock_kernel() from the server." This one
call is what keeps us from removing the BKL from fs/locks.c
because I can't tell if you still suspect that lockd
needs to lock against posix file locks or if there was
a different reason for leaving it in.

I can't think of anything else that this might be locking
against because everything that might interact with lockd
now does not use the BKL any more and lockd is
single-threaded by definition.

My guess is that the only thing that really needs to
lock_flocks() in lockd are the nlm_file_inuse and
nlm_traverse_locks functions because they traverse
the inode->i_flock list. All the exported functions
from fs/lock.c take care of locking in their own way
(possibly not lease_get_time, as I just discovered,
but that was never called under the BKL...).

	Arnd

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-26 20:18 ` Arnd Bergmann
@ 2010-10-26 20:35   ` Bryan Schumaker
  2010-10-26 20:55     ` Arnd Bergmann
  0 siblings, 1 reply; 62+ messages in thread
From: Bryan Schumaker @ 2010-10-26 20:35 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: J. Bruce Fields, Linus Torvalds, linux-nfs, linux-kernel

Hi

I left the one call because I was unable to figure out what was being protected with the BKL in that section of the code.  I figured I would leave it for the maintainers, since they know more about the code than I do.

Bryan

On 10/26/2010 04:18 PM, Arnd Bergmann wrote:
> On Tuesday 26 October 2010 18:45:50 J. Bruce Fields wrote:
>> Bryan Schumaker (1):
>>       lockd: Mostly remove BKL from the server
> 
> Could you explain the "mostly" part of this commit?
> 
> The commit message only says "This patch removes all
> but one call to lock_kernel() from the server." This one
> call is what keeps us from removing the BKL from fs/locks.c
> because I can't tell if you still suspect that lockd
> needs to lock against posix file locks or if there was
> a different reason for leaving it in.
> 
> I can't think of anything else that this might be locking
> against because everything that might interact with lockd
> now does not use the BKL any more and lockd is
> single-threaded by definition.
> 
> My guess is that the only thing that really needs to
> lock_flocks() in lockd are the nlm_file_inuse and
> nlm_traverse_locks functions because they traverse
> the inode->i_flock list. All the exported functions
> from fs/lock.c take care of locking in their own way
> (possibly not lease_get_time, as I just discovered,
> but that was never called under the BKL...).
> 
> 	Arnd
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-26 20:35   ` Bryan Schumaker
@ 2010-10-26 20:55     ` Arnd Bergmann
  2010-10-26 21:02       ` Linus Torvalds
  0 siblings, 1 reply; 62+ messages in thread
From: Arnd Bergmann @ 2010-10-26 20:55 UTC (permalink / raw)
  To: Bryan Schumaker; +Cc: J. Bruce Fields, Linus Torvalds, linux-nfs, linux-kernel

On Tuesday 26 October 2010 22:35:50 Bryan Schumaker wrote:
> Hi
> 
> I left the one call because I was unable to figure out what
> was being protected with the BKL in that section of the code.
> I figured I would leave it for the maintainers, since they
> know more about the code than I do.

My guess is that we need something like the patch below.

This moves the lock_flocks() call into the places where it
is needed and can be safely converted into a spinlock because
that code does not sleep.

Finally, we can build fs/locks.c, nfs and nfsd with CONFIG_BKL
disabled.

*not tested*

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 Kconfig         |    1 -
 lockd/svc.c     |   11 -----------
 lockd/svcsubs.c |    9 ++++++++-
 locks.c         |    5 +++--
 nfs/Kconfig     |    1 -
 nfsd/Kconfig    |    1 -
 6 files changed, 11 insertions(+), 17 deletions(-)

diff --git a/fs/Kconfig b/fs/Kconfig
index 65781de..3d18530 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -50,7 +50,6 @@ endif # BLOCK
 config FILE_LOCKING
 	bool "Enable POSIX file locking API" if EMBEDDED
 	default y
-	select BKL # while lockd still uses it.
 	help
 	  This option enables standard file locking support, required
           for filesystems like NFS and for the flock() system
diff --git a/fs/lockd/svc.c b/fs/lockd/svc.c
index b13aabc..abfff9d 100644
--- a/fs/lockd/svc.c
+++ b/fs/lockd/svc.c
@@ -22,7 +22,6 @@
 #include <linux/in.h>
 #include <linux/uio.h>
 #include <linux/smp.h>
-#include <linux/smp_lock.h>
 #include <linux/mutex.h>
 #include <linux/kthread.h>
 #include <linux/freezer.h>
@@ -130,15 +129,6 @@ lockd(void *vrqstp)
 
 	dprintk("NFS locking service started (ver " LOCKD_VERSION ").\n");
 
-	/*
-	 * FIXME: it would be nice if lockd didn't spend its entire life
-	 * running under the BKL. At the very least, it would be good to
-	 * have someone clarify what it's intended to protect here. I've
-	 * seen some handwavy posts about posix locking needing to be
-	 * done under the BKL, but it's far from clear.
-	 */
-	lock_kernel();
-
 	if (!nlm_timeout)
 		nlm_timeout = LOCKD_DFLT_TIMEO;
 	nlmsvc_timeout = nlm_timeout * HZ;
@@ -195,7 +185,6 @@ lockd(void *vrqstp)
 	if (nlmsvc_ops)
 		nlmsvc_invalidate_all();
 	nlm_shutdown_hosts();
-	unlock_kernel();
 	return 0;
 }
 
diff --git a/fs/lockd/svcsubs.c b/fs/lockd/svcsubs.c
index d0ef94c..1ca0679 100644
--- a/fs/lockd/svcsubs.c
+++ b/fs/lockd/svcsubs.c
@@ -170,6 +170,7 @@ nlm_traverse_locks(struct nlm_host *host, struct nlm_file *file,
 
 again:
 	file->f_locks = 0;
+	lock_flocks(); /* protects i_flock list */
 	for (fl = inode->i_flock; fl; fl = fl->fl_next) {
 		if (fl->fl_lmops != &nlmsvc_lock_operations)
 			continue;
@@ -181,6 +182,7 @@ again:
 		if (match(lockhost, host)) {
 			struct file_lock lock = *fl;
 
+			unlock_flocks();
 			lock.fl_type  = F_UNLCK;
 			lock.fl_start = 0;
 			lock.fl_end   = OFFSET_MAX;
@@ -192,6 +194,7 @@ again:
 			goto again;
 		}
 	}
+	unlock_flocks();
 
 	return 0;
 }
@@ -226,10 +229,14 @@ nlm_file_inuse(struct nlm_file *file)
 	if (file->f_count || !list_empty(&file->f_blocks) || file->f_shares)
 		return 1;
 
+	lock_flocks();
 	for (fl = inode->i_flock; fl; fl = fl->fl_next) {
-		if (fl->fl_lmops == &nlmsvc_lock_operations)
+		if (fl->fl_lmops == &nlmsvc_lock_operations) {
+			unlock_flocks();
 			return 1;
+		}
 	}
+	unlock_flocks();
 	file->f_locks = 0;
 	return 0;
 }
diff --git a/fs/locks.c b/fs/locks.c
index 8b2b6ad..a417901 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -142,6 +142,7 @@ int lease_break_time = 45;
 
 static LIST_HEAD(file_lock_list);
 static LIST_HEAD(blocked_list);
+static DEFINE_SPINLOCK(file_lock_lock);
 
 /*
  * Protects the two list heads above, plus the inode->i_flock list
@@ -149,13 +150,13 @@ static LIST_HEAD(blocked_list);
  */
 void lock_flocks(void)
 {
-	lock_kernel();
+	spin_lock(&file_lock_lock);
 }
 EXPORT_SYMBOL_GPL(lock_flocks);
 
 void unlock_flocks(void)
 {
-	unlock_kernel();
+	spin_unlock(&file_lock_lock);
 }
 EXPORT_SYMBOL_GPL(unlock_flocks);
 
diff --git a/fs/nfs/Kconfig b/fs/nfs/Kconfig
index fd66765..ba30665 100644
--- a/fs/nfs/Kconfig
+++ b/fs/nfs/Kconfig
@@ -1,7 +1,6 @@
 config NFS_FS
 	tristate "NFS client support"
 	depends on INET && FILE_LOCKING
-	depends on BKL # fix as soon as lockd is done
 	select LOCKD
 	select SUNRPC
 	select NFS_ACL_SUPPORT if NFS_V3_ACL
diff --git a/fs/nfsd/Kconfig b/fs/nfsd/Kconfig
index 31a78fc..18b3e89 100644
--- a/fs/nfsd/Kconfig
+++ b/fs/nfsd/Kconfig
@@ -2,7 +2,6 @@ config NFSD
 	tristate "NFS server support"
 	depends on INET
 	depends on FILE_LOCKING
-	depends on BKL # fix as soon as lockd is done
 	select LOCKD
 	select SUNRPC
 	select EXPORTFS

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-26 20:55     ` Arnd Bergmann
@ 2010-10-26 21:02       ` Linus Torvalds
  2010-10-26 21:24         ` J. Bruce Fields
  0 siblings, 1 reply; 62+ messages in thread
From: Linus Torvalds @ 2010-10-26 21:02 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: Bryan Schumaker, J. Bruce Fields, linux-nfs, linux-kernel

On Tue, Oct 26, 2010 at 1:55 PM, Arnd Bergmann <arnd@arndb.de> wrote:
>
> Finally, we can build fs/locks.c, nfs and nfsd with CONFIG_BKL
> disabled.
>
> *not tested*

Ok, please make this so, so that we finally can effectively get rid of
the BKL in this release (I don't really care about the random old
drivers that may be "depends on BKL").

But that obviously requires at least some minimal testing by somebody
who is using NFS heavily, including locking. I assume that the
connectathon tests include file locking? And I presume that the nfs
developers run those at least semi-regularly and could do at least
that kind of minimal testing?

                        Linus

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-26 21:02       ` Linus Torvalds
@ 2010-10-26 21:24         ` J. Bruce Fields
  2010-10-26 21:37           ` Linus Torvalds
  0 siblings, 1 reply; 62+ messages in thread
From: J. Bruce Fields @ 2010-10-26 21:24 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Arnd Bergmann, Bryan Schumaker, linux-nfs, linux-kernel

On Tue, Oct 26, 2010 at 02:02:05PM -0700, Linus Torvalds wrote:
> On Tue, Oct 26, 2010 at 1:55 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> >
> > Finally, we can build fs/locks.c, nfs and nfsd with CONFIG_BKL
> > disabled.
> >
> > *not tested*
> 
> Ok, please make this so, so that we finally can effectively get rid of
> the BKL in this release (I don't really care about the random old
> drivers that may be "depends on BKL").
> 
> But that obviously requires at least some minimal testing by somebody
> who is using NFS heavily, including locking. I assume that the
> connectathon tests include file locking? And I presume that the nfs
> developers run those at least semi-regularly and could do at least
> that kind of minimal testing?

I did a couple connectathon runs just now with no obvious ill effects
except for some sleep-within-spinlock warnings in the lease code.

Yeah, connecthon does include lock tests, though they're just basic
correctness tests.

--b.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-26 21:24         ` J. Bruce Fields
@ 2010-10-26 21:37           ` Linus Torvalds
  2010-10-26 21:44             ` J. Bruce Fields
  0 siblings, 1 reply; 62+ messages in thread
From: Linus Torvalds @ 2010-10-26 21:37 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Arnd Bergmann, Bryan Schumaker, linux-nfs, linux-kernel

On Tue, Oct 26, 2010 at 2:24 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
>
> I did a couple connectathon runs just now with no obvious ill effects
> except for some sleep-within-spinlock warnings in the lease code.

Hmm. Those sleep-within-spinlock warnings are very likely serious
bugs. Can you quote the whole warning with stack trace?

                           Linus

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-26 21:37           ` Linus Torvalds
@ 2010-10-26 21:44             ` J. Bruce Fields
  2010-10-26 22:11               ` J. Bruce Fields
  0 siblings, 1 reply; 62+ messages in thread
From: J. Bruce Fields @ 2010-10-26 21:44 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Arnd Bergmann, Bryan Schumaker, linux-nfs, linux-kernel

On Tue, Oct 26, 2010 at 02:37:26PM -0700, Linus Torvalds wrote:
> On Tue, Oct 26, 2010 at 2:24 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> >
> > I did a couple connectathon runs just now with no obvious ill effects
> > except for some sleep-within-spinlock warnings in the lease code.
> 
> Hmm. Those sleep-within-spinlock warnings are very likely serious
> bugs.

Yeah, didn't mean to belittle them.

> Can you quote the whole warning with stack trace?

It's just obvious allocations in setlease:

BUG: sleeping function called from invalid context at mm/slab.c:3101
in_atomic(): 1, irqs_disabled(): 0, pid: 4345, name: lease_tests
1 lock held by lease_tests/4345:
 #0:  (file_lock_lock){+.+.+.}, at: [<ffffffff81128be5>] lock_flocks+0x15/0x20
Pid: 4345, comm: lease_tests Not tainted 2.6.36-05858-gbd5e20b #1028
Call Trace:
 [<ffffffff8103141d>] __might_sleep+0x10d/0x140
 [<ffffffff810e3ad3>] kmem_cache_alloc+0x1f3/0x230
 [<ffffffff8112a4d2>] generic_setlease+0x112/0x2c0
 [<ffffffff8112a6b5>] __vfs_setlease+0x35/0x40
 [<ffffffff8112acfe>] fcntl_setlease+0xce/0x180
 [<ffffffff810f7c2e>] sys_fcntl+0x2fe/0x630
 [<ffffffff81961999>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff81002658>] system_call_fastpath+0x16/0x1b

I'm testing a patch.

--b.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-26 21:44             ` J. Bruce Fields
@ 2010-10-26 22:11               ` J. Bruce Fields
  2010-10-26 22:41                 ` J. Bruce Fields
  2010-10-27  7:21                 ` Arnd Bergmann
  0 siblings, 2 replies; 62+ messages in thread
From: J. Bruce Fields @ 2010-10-26 22:11 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Arnd Bergmann, Bryan Schumaker, linux-nfs, linux-kernel

On Tue, Oct 26, 2010 at 05:44:41PM -0400, J. Bruce Fields wrote:
> On Tue, Oct 26, 2010 at 02:37:26PM -0700, Linus Torvalds wrote:
> > On Tue, Oct 26, 2010 at 2:24 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> > >
> > > I did a couple connectathon runs just now with no obvious ill effects
> > > except for some sleep-within-spinlock warnings in the lease code.
> > 
> > Hmm. Those sleep-within-spinlock warnings are very likely serious
> > bugs.
> 
> Yeah, didn't mean to belittle them.
> 
> > Can you quote the whole warning with stack trace?
> 
> It's just obvious allocations in setlease:
> 
> BUG: sleeping function called from invalid context at mm/slab.c:3101
> in_atomic(): 1, irqs_disabled(): 0, pid: 4345, name: lease_tests
> 1 lock held by lease_tests/4345:
>  #0:  (file_lock_lock){+.+.+.}, at: [<ffffffff81128be5>] lock_flocks+0x15/0x20
> Pid: 4345, comm: lease_tests Not tainted 2.6.36-05858-gbd5e20b #1028
> Call Trace:
>  [<ffffffff8103141d>] __might_sleep+0x10d/0x140
>  [<ffffffff810e3ad3>] kmem_cache_alloc+0x1f3/0x230
>  [<ffffffff8112a4d2>] generic_setlease+0x112/0x2c0
>  [<ffffffff8112a6b5>] __vfs_setlease+0x35/0x40
>  [<ffffffff8112acfe>] fcntl_setlease+0xce/0x180
>  [<ffffffff810f7c2e>] sys_fcntl+0x2fe/0x630
>  [<ffffffff81961999>] ? trace_hardirqs_on_thunk+0x3a/0x3f
>  [<ffffffff81002658>] system_call_fastpath+0x16/0x1b
> 
> I'm testing a patch.

This works for me.

I'm not saying it's correct, but it does at least pass my dumb tests
without complaining.

--b.

diff --git a/fs/locks.c b/fs/locks.c
index 02b6e0e..db3afa0 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1379,7 +1379,9 @@ int generic_setlease(struct file *filp, long arg, struct file_lock **flp)
 	if (error)
 		return error;
 
+	lock_flocks();
 	time_out_leases(inode);
+	unlock_flocks();
 
 	BUG_ON(!(*flp)->fl_lmops->fl_break);
 
@@ -1400,6 +1402,7 @@ int generic_setlease(struct file *filp, long arg, struct file_lock **flp)
 			goto out;
 	}
 
+	lock_flocks();
 	/*
 	 * At this point, we know that if there is an exclusive
 	 * lease on this file, then we hold it on this filp
@@ -1427,28 +1430,31 @@ int generic_setlease(struct file *filp, long arg, struct file_lock **flp)
 	error = -EAGAIN;
 	if ((arg == F_RDLCK && (wrlease_count > 0)) ||
 	    (arg == F_WRLCK && ((rdlease_count + wrlease_count) > 0)))
-		goto out;
+		goto out_unlock;
 
 	if (my_before != NULL) {
 		*flp = *my_before;
 		error = lease->fl_lmops->fl_change(my_before, arg);
-		goto out;
+		goto out_unlock;
 	}
 
 	error = 0;
 	if (arg == F_UNLCK)
-		goto out;
+		goto out_unlock;
 
 	error = -EINVAL;
 	if (!leases_enable)
-		goto out;
+		goto out_unlock;
 
 	locks_copy_lock(new_fl, lease);
 	locks_insert_lock(before, new_fl);
 
 	*flp = new_fl;
+	unlock_flocks();
 	return 0;
 
+out_unlock:
+	unlock_flocks();
 out:
 	if (new_fl != NULL)
 		locks_free_lock(new_fl);
@@ -1495,9 +1501,7 @@ int vfs_setlease(struct file *filp, long arg, struct file_lock **lease)
 {
 	int error;
 
-	lock_flocks();
 	error = __vfs_setlease(filp, arg, lease);
-	unlock_flocks();
 
 	return error;
 }
@@ -1524,8 +1528,6 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg)
 	if (error)
 		return error;
 
-	lock_flocks();
-
 	error = __vfs_setlease(filp, arg, &flp);
 	if (error || arg == F_UNLCK)
 		goto out_unlock;
@@ -1541,7 +1543,6 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg)
 
 	error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0);
 out_unlock:
-	unlock_flocks();
 	return error;
 }
 

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-26 22:11               ` J. Bruce Fields
@ 2010-10-26 22:41                 ` J. Bruce Fields
  2010-10-27  7:21                 ` Arnd Bergmann
  1 sibling, 0 replies; 62+ messages in thread
From: J. Bruce Fields @ 2010-10-26 22:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Arnd Bergmann, Bryan Schumaker, linux-nfs, linux-kernel

On Tue, Oct 26, 2010 at 06:11:56PM -0400, J. Bruce Fields wrote:
> On Tue, Oct 26, 2010 at 05:44:41PM -0400, J. Bruce Fields wrote:
> > On Tue, Oct 26, 2010 at 02:37:26PM -0700, Linus Torvalds wrote:
> > > On Tue, Oct 26, 2010 at 2:24 PM, J. Bruce Fields <bfields@fieldses.org> wrote:
> > > >
> > > > I did a couple connectathon runs just now with no obvious ill effects
> > > > except for some sleep-within-spinlock warnings in the lease code.
> > > 
> > > Hmm. Those sleep-within-spinlock warnings are very likely serious
> > > bugs.
> > 
> > Yeah, didn't mean to belittle them.
> > 
> > > Can you quote the whole warning with stack trace?
> > 
> > It's just obvious allocations in setlease:
> > 
> > BUG: sleeping function called from invalid context at mm/slab.c:3101
> > in_atomic(): 1, irqs_disabled(): 0, pid: 4345, name: lease_tests
> > 1 lock held by lease_tests/4345:
> >  #0:  (file_lock_lock){+.+.+.}, at: [<ffffffff81128be5>] lock_flocks+0x15/0x20
> > Pid: 4345, comm: lease_tests Not tainted 2.6.36-05858-gbd5e20b #1028
> > Call Trace:
> >  [<ffffffff8103141d>] __might_sleep+0x10d/0x140
> >  [<ffffffff810e3ad3>] kmem_cache_alloc+0x1f3/0x230
> >  [<ffffffff8112a4d2>] generic_setlease+0x112/0x2c0
> >  [<ffffffff8112a6b5>] __vfs_setlease+0x35/0x40
> >  [<ffffffff8112acfe>] fcntl_setlease+0xce/0x180
> >  [<ffffffff810f7c2e>] sys_fcntl+0x2fe/0x630
> >  [<ffffffff81961999>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> >  [<ffffffff81002658>] system_call_fastpath+0x16/0x1b
> > 
> > I'm testing a patch.
> 
> This works for me.
> 
> I'm not saying it's correct, but it does at least pass my dumb tests
> without complaining.

I can't think of any more missing locking, though I did notice this on a
quick look.

--b.

commit fc42117585672abd3cbf247dd311869233d1606a
Author: J. Bruce Fields <bfields@redhat.com>
Date:   Tue Oct 26 18:25:30 2010 -0400

    fix nlmsvc_notify_blocked locking
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>

diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
index 6f1ef00..c462d34 100644
--- a/fs/lockd/svclock.c
+++ b/fs/lockd/svclock.c
@@ -700,14 +700,16 @@ nlmsvc_notify_blocked(struct file_lock *fl)
 	struct nlm_block	*block;
 
 	dprintk("lockd: VFS unblock notification for block %p\n", fl);
+	spin_lock(&nlm_blocked_lock);
 	list_for_each_entry(block, &nlm_blocked, b_list) {
 		if (nlm_compare_locks(&block->b_call->a_args.lock.fl, fl)) {
-			nlmsvc_insert_block(block, 0);
+			nlmsvc_insert_block_locked(block, 0);
+			spin_unlock(&nlm_blocked_lock);
 			svc_wake_up(block->b_daemon);
 			return;
 		}
 	}
-
+	spin_unlock(&nlm_blocked_lock);
 	printk(KERN_WARNING "lockd: notification for unknown block!\n");
 }
 

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-26 22:11               ` J. Bruce Fields
  2010-10-26 22:41                 ` J. Bruce Fields
@ 2010-10-27  7:21                 ` Arnd Bergmann
  2010-10-27  8:39                   ` Christoph Hellwig
  1 sibling, 1 reply; 62+ messages in thread
From: Arnd Bergmann @ 2010-10-27  7:21 UTC (permalink / raw)
  To: J. Bruce Fields; +Cc: Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel

On Wednesday 27 October 2010 00:11:56 J. Bruce Fields wrote:
> > BUG: sleeping function called from invalid context at mm/slab.c:3101
> > in_atomic(): 1, irqs_disabled(): 0, pid: 4345, name: lease_tests
> > 1 lock held by lease_tests/4345:
> >  #0:  (file_lock_lock){+.+.+.}, at: [<ffffffff81128be5>] lock_flocks+0x15/0x20
> > Pid: 4345, comm: lease_tests Not tainted 2.6.36-05858-gbd5e20b #1028
> > Call Trace:
> >  [<ffffffff8103141d>] __might_sleep+0x10d/0x140
> >  [<ffffffff810e3ad3>] kmem_cache_alloc+0x1f3/0x230
> >  [<ffffffff8112a4d2>] generic_setlease+0x112/0x2c0
> >  [<ffffffff8112a6b5>] __vfs_setlease+0x35/0x40
> >  [<ffffffff8112acfe>] fcntl_setlease+0xce/0x180
> >  [<ffffffff810f7c2e>] sys_fcntl+0x2fe/0x630
> >  [<ffffffff81961999>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> >  [<ffffffff81002658>] system_call_fastpath+0x16/0x1b
> > 
> > I'm testing a patch.
> 

Thanks for the report!

> @@ -1524,8 +1528,6 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg)
>         if (error)
>                 return error;
>  
> -       lock_flocks();
> -
>         error = __vfs_setlease(filp, arg, &flp);
>         if (error || arg == F_UNLCK)
>                 goto out_unlock;
> @@ -1541,7 +1543,6 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg)
>  
>         error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0);
>  out_unlock:
> -       unlock_flocks();
>         return error;
>  }

If you don't hold lock_flocks throughout fcntl_setlease, the flp variable
points to a flock that may get modified by another thread and you call
time_out_leases() without holding lock_flocks, which it requires.

The two alternatives I can see are to either use GFP_ATOMIC or to
take the lock inside of generic_setlease and drop it outside.
Neither of the two sounds particularly appealing.

	Arnd

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-27  7:21                 ` Arnd Bergmann
@ 2010-10-27  8:39                   ` Christoph Hellwig
  2010-10-27 13:39                     ` J. Bruce Fields
  0 siblings, 1 reply; 62+ messages in thread
From: Christoph Hellwig @ 2010-10-27  8:39 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: J. Bruce Fields, Linus Torvalds, Bryan Schumaker, linux-nfs,
	linux-kernel

> If you don't hold lock_flocks throughout fcntl_setlease, the flp variable
> points to a flock that may get modified by another thread and you call
> time_out_leases() without holding lock_flocks, which it requires.
> 
> The two alternatives I can see are to either use GFP_ATOMIC or to
> take the lock inside of generic_setlease and drop it outside.
> Neither of the two sounds particularly appealing.


Do locks_alloc_lock and initialization of the heap struct file_lock
in the caller.  This also avoids an entirely useless copy of the
lock structure.  free the passed in structure if we are modifying
an existing lock structure.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-27  8:39                   ` Christoph Hellwig
@ 2010-10-27 13:39                     ` J. Bruce Fields
  2010-10-27 13:46                       ` Arnd Bergmann
  0 siblings, 1 reply; 62+ messages in thread
From: J. Bruce Fields @ 2010-10-27 13:39 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel

Arnd Bergmann wrote:
> > If you don't hold lock_flocks throughout fcntl_setlease, the flp variable
> > points to a flock that may get modified by another thread and you call
> > time_out_leases() without holding lock_flocks, which it requires.

Whoops, thanks for catching that.

On Wed, Oct 27, 2010 at 04:39:24AM -0400, Christoph Hellwig wrote:
> Do locks_alloc_lock and initialization of the heap struct file_lock
> in the caller.  This also avoids an entirely useless copy of the
> lock structure.  free the passed in structure if we are modifying
> an existing lock structure.

That sounds good; I'll give it a try.

--b.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-27 13:39                     ` J. Bruce Fields
@ 2010-10-27 13:46                       ` Arnd Bergmann
  2010-10-27 14:55                         ` J. Bruce Fields
  0 siblings, 1 reply; 62+ messages in thread
From: Arnd Bergmann @ 2010-10-27 13:46 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Christoph Hellwig, Linus Torvalds, Bryan Schumaker, linux-nfs,
	linux-kernel

On Wednesday 27 October 2010, J. Bruce Fields wrote:
> On Wed, Oct 27, 2010 at 04:39:24AM -0400, Christoph Hellwig wrote:
> > Do locks_alloc_lock and initialization of the heap struct file_lock
> > in the caller.  This also avoids an entirely useless copy of the
> > lock structure.  free the passed in structure if we are modifying
> > an existing lock structure.
> 
> That sounds good; I'll give it a try.

I've already started, this is what I've come up with. I don't have
a good way to test it though, and don't know as much as you about this
code. Feel free to use it as a starting point if you like.
It does look correct to me, but I'm not very confident in this.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---

diff --git a/fs/locks.c b/fs/locks.c
index 8b2b6ad..c5ec771 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -162,10 +162,11 @@ EXPORT_SYMBOL_GPL(unlock_flocks);
 static struct kmem_cache *filelock_cache __read_mostly;
 
 /* Allocate an empty lock structure. */
-static struct file_lock *locks_alloc_lock(void)
+struct file_lock *locks_alloc_lock(void)
 {
 	return kmem_cache_alloc(filelock_cache, GFP_KERNEL);
 }
+EXPORT_SYMBOL_GPL(locks_alloc_lock);
 
 void locks_release_private(struct file_lock *fl)
 {
@@ -1365,7 +1366,6 @@ int fcntl_getlease(struct file *filp)
 int generic_setlease(struct file *filp, long arg, struct file_lock **flp)
 {
 	struct file_lock *fl, **before, **my_before = NULL, *lease;
-	struct file_lock *new_fl = NULL;
 	struct dentry *dentry = filp->f_path.dentry;
 	struct inode *inode = dentry->d_inode;
 	int error, rdlease_count = 0, wrlease_count = 0;
@@ -1385,11 +1385,6 @@ int generic_setlease(struct file *filp, long arg, struct file_lock **flp)
 	lease = *flp;
 
 	if (arg != F_UNLCK) {
-		error = -ENOMEM;
-		new_fl = locks_alloc_lock();
-		if (new_fl == NULL)
-			goto out;
-
 		error = -EAGAIN;
 		if ((arg == F_RDLCK) && (atomic_read(&inode->i_writecount) > 0))
 			goto out;
@@ -1434,23 +1429,19 @@ int generic_setlease(struct file *filp, long arg, struct file_lock **flp)
 		goto out;
 	}
 
-	error = 0;
 	if (arg == F_UNLCK)
-		goto out;
+		goto out_unlck;
 
 	error = -EINVAL;
 	if (!leases_enable)
 		goto out;
 
-	locks_copy_lock(new_fl, lease);
-	locks_insert_lock(before, new_fl);
-
-	*flp = new_fl;
+	locks_insert_lock(before, lease);
+out_unlck:
 	return 0;
 
 out:
-	if (new_fl != NULL)
-		locks_free_lock(new_fl);
+	locks_free_lock(lease);
 	return error;
 }
 EXPORT_SYMBOL(generic_setlease);
@@ -1514,26 +1505,31 @@ EXPORT_SYMBOL_GPL(vfs_setlease);
  */
 int fcntl_setlease(unsigned int fd, struct file *filp, long arg)
 {
-	struct file_lock fl, *flp = &fl;
+	struct file_lock *fl;
 	struct inode *inode = filp->f_path.dentry->d_inode;
 	int error;
 
-	locks_init_lock(&fl);
-	error = lease_init(filp, arg, &fl);
-	if (error)
+	fl = locks_alloc_lock();
+	if (!fl)
+		return -ENOMEM;
+
+	locks_init_lock(fl);
+	error = lease_init(filp, arg, fl);
+	if (error) {
+		locks_free_lock(fl);
 		return error;
+	}
 
 	lock_flocks();
-
-	error = __vfs_setlease(filp, arg, &flp);
+	error = __vfs_setlease(filp, arg, &fl);
 	if (error || arg == F_UNLCK)
 		goto out_unlock;
 
-	error = fasync_helper(fd, filp, 1, &flp->fl_fasync);
+	error = fasync_helper(fd, filp, 1, &fl->fl_fasync);
 	if (error < 0) {
 		/* remove lease just inserted by setlease */
-		flp->fl_type = F_UNLCK | F_INPROGRESS;
-		flp->fl_break_time = jiffies - 10;
+		fl->fl_type = F_UNLCK | F_INPROGRESS;
+		fl->fl_break_time = jiffies - 10;
 		time_out_leases(inode);
 		goto out_unlock;
 	}
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 9019e8e..56347e0 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2614,7 +2614,7 @@ nfs4_open_delegation(struct svc_fh *fh, struct nfsd4_open *open, struct nfs4_sta
 	struct nfs4_delegation *dp;
 	struct nfs4_stateowner *sop = stp->st_stateowner;
 	int cb_up = atomic_read(&sop->so_client->cl_cb_set);
-	struct file_lock fl, *flp = &fl;
+	struct file_lock *fl;
 	int status, flag = 0;
 
 	flag = NFS4_OPEN_DELEGATE_NONE;
@@ -2648,20 +2648,24 @@ nfs4_open_delegation(struct svc_fh *fh, struct nfsd4_open *open, struct nfs4_sta
 		flag = NFS4_OPEN_DELEGATE_NONE;
 		goto out;
 	}
-	locks_init_lock(&fl);
-	fl.fl_lmops = &nfsd_lease_mng_ops;
-	fl.fl_flags = FL_LEASE;
-	fl.fl_type = flag == NFS4_OPEN_DELEGATE_READ? F_RDLCK: F_WRLCK;
-	fl.fl_end = OFFSET_MAX;
-	fl.fl_owner =  (fl_owner_t)dp;
-	fl.fl_file = find_readable_file(stp->st_file);
-	BUG_ON(!fl.fl_file);
-	fl.fl_pid = current->tgid;
+	status = -ENOMEM;
+	fl = locks_alloc_lock();
+	if (!fl)
+		goto out;
+	locks_init_lock(fl);
+	fl->fl_lmops = &nfsd_lease_mng_ops;
+	fl->fl_flags = FL_LEASE;
+	fl->fl_type = flag == NFS4_OPEN_DELEGATE_READ? F_RDLCK: F_WRLCK;
+	fl->fl_end = OFFSET_MAX;
+	fl->fl_owner =  (fl_owner_t)dp;
+	fl->fl_file = find_readable_file(stp->st_file);
+	BUG_ON(!fl->fl_file);
+	fl->fl_pid = current->tgid;
 
 	/* vfs_setlease checks to see if delegation should be handed out.
 	 * the lock_manager callbacks fl_mylease and fl_change are used
 	 */
-	if ((status = vfs_setlease(fl.fl_file, fl.fl_type, &flp))) {
+	if ((status = vfs_setlease(fl->fl_file, fl->fl_type, &fl))) {
 		dprintk("NFSD: setlease failed [%d], no delegation\n", status);
 		unhash_delegation(dp);
 		flag = NFS4_OPEN_DELEGATE_NONE;

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-27 13:46                       ` Arnd Bergmann
@ 2010-10-27 14:55                         ` J. Bruce Fields
  2010-10-27 14:59                           ` Christoph Hellwig
  0 siblings, 1 reply; 62+ messages in thread
From: J. Bruce Fields @ 2010-10-27 14:55 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Christoph Hellwig, Linus Torvalds, Bryan Schumaker, linux-nfs,
	linux-kernel

On Wed, Oct 27, 2010 at 03:46:08PM +0200, Arnd Bergmann wrote:
> On Wednesday 27 October 2010, J. Bruce Fields wrote:
> > On Wed, Oct 27, 2010 at 04:39:24AM -0400, Christoph Hellwig wrote:
> > > Do locks_alloc_lock and initialization of the heap struct file_lock
> > > in the caller.  This also avoids an entirely useless copy of the
> > > lock structure.  free the passed in structure if we are modifying
> > > an existing lock structure.
> > 
> > That sounds good; I'll give it a try.
> 
> I've already started, this is what I've come up with. I don't have
> a good way to test it though, and don't know as much as you about this
> code. Feel free to use it as a starting point if you like.
> It does look correct to me, but I'm not very confident in this.

Hm, two problems:
	- We introduce the possibility of fcntl(fd, F_SETLEASE, F_UNLCK)
	  failing with ENOMEM.
	- fasync_helper(.,.,1,.) sleeps.  Argh.

--b.
> 
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
> 
> diff --git a/fs/locks.c b/fs/locks.c
> index 8b2b6ad..c5ec771 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -162,10 +162,11 @@ EXPORT_SYMBOL_GPL(unlock_flocks);
>  static struct kmem_cache *filelock_cache __read_mostly;
>  
>  /* Allocate an empty lock structure. */
> -static struct file_lock *locks_alloc_lock(void)
> +struct file_lock *locks_alloc_lock(void)
>  {
>  	return kmem_cache_alloc(filelock_cache, GFP_KERNEL);
>  }
> +EXPORT_SYMBOL_GPL(locks_alloc_lock);
>  
>  void locks_release_private(struct file_lock *fl)
>  {
> @@ -1365,7 +1366,6 @@ int fcntl_getlease(struct file *filp)
>  int generic_setlease(struct file *filp, long arg, struct file_lock **flp)
>  {
>  	struct file_lock *fl, **before, **my_before = NULL, *lease;
> -	struct file_lock *new_fl = NULL;
>  	struct dentry *dentry = filp->f_path.dentry;
>  	struct inode *inode = dentry->d_inode;
>  	int error, rdlease_count = 0, wrlease_count = 0;
> @@ -1385,11 +1385,6 @@ int generic_setlease(struct file *filp, long arg, struct file_lock **flp)
>  	lease = *flp;
>  
>  	if (arg != F_UNLCK) {
> -		error = -ENOMEM;
> -		new_fl = locks_alloc_lock();
> -		if (new_fl == NULL)
> -			goto out;
> -
>  		error = -EAGAIN;
>  		if ((arg == F_RDLCK) && (atomic_read(&inode->i_writecount) > 0))
>  			goto out;
> @@ -1434,23 +1429,19 @@ int generic_setlease(struct file *filp, long arg, struct file_lock **flp)
>  		goto out;
>  	}
>  
> -	error = 0;
>  	if (arg == F_UNLCK)
> -		goto out;
> +		goto out_unlck;
>  
>  	error = -EINVAL;
>  	if (!leases_enable)
>  		goto out;
>  
> -	locks_copy_lock(new_fl, lease);
> -	locks_insert_lock(before, new_fl);
> -
> -	*flp = new_fl;
> +	locks_insert_lock(before, lease);
> +out_unlck:
>  	return 0;
>  
>  out:
> -	if (new_fl != NULL)
> -		locks_free_lock(new_fl);
> +	locks_free_lock(lease);
>  	return error;
>  }
>  EXPORT_SYMBOL(generic_setlease);
> @@ -1514,26 +1505,31 @@ EXPORT_SYMBOL_GPL(vfs_setlease);
>   */
>  int fcntl_setlease(unsigned int fd, struct file *filp, long arg)
>  {
> -	struct file_lock fl, *flp = &fl;
> +	struct file_lock *fl;
>  	struct inode *inode = filp->f_path.dentry->d_inode;
>  	int error;
>  
> -	locks_init_lock(&fl);
> -	error = lease_init(filp, arg, &fl);
> -	if (error)
> +	fl = locks_alloc_lock();
> +	if (!fl)
> +		return -ENOMEM;
> +
> +	locks_init_lock(fl);
> +	error = lease_init(filp, arg, fl);
> +	if (error) {
> +		locks_free_lock(fl);
>  		return error;
> +	}
>  
>  	lock_flocks();
> -
> -	error = __vfs_setlease(filp, arg, &flp);
> +	error = __vfs_setlease(filp, arg, &fl);
>  	if (error || arg == F_UNLCK)
>  		goto out_unlock;
>  
> -	error = fasync_helper(fd, filp, 1, &flp->fl_fasync);
> +	error = fasync_helper(fd, filp, 1, &fl->fl_fasync);
>  	if (error < 0) {
>  		/* remove lease just inserted by setlease */
> -		flp->fl_type = F_UNLCK | F_INPROGRESS;
> -		flp->fl_break_time = jiffies - 10;
> +		fl->fl_type = F_UNLCK | F_INPROGRESS;
> +		fl->fl_break_time = jiffies - 10;
>  		time_out_leases(inode);
>  		goto out_unlock;
>  	}
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 9019e8e..56347e0 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -2614,7 +2614,7 @@ nfs4_open_delegation(struct svc_fh *fh, struct nfsd4_open *open, struct nfs4_sta
>  	struct nfs4_delegation *dp;
>  	struct nfs4_stateowner *sop = stp->st_stateowner;
>  	int cb_up = atomic_read(&sop->so_client->cl_cb_set);
> -	struct file_lock fl, *flp = &fl;
> +	struct file_lock *fl;
>  	int status, flag = 0;
>  
>  	flag = NFS4_OPEN_DELEGATE_NONE;
> @@ -2648,20 +2648,24 @@ nfs4_open_delegation(struct svc_fh *fh, struct nfsd4_open *open, struct nfs4_sta
>  		flag = NFS4_OPEN_DELEGATE_NONE;
>  		goto out;
>  	}
> -	locks_init_lock(&fl);
> -	fl.fl_lmops = &nfsd_lease_mng_ops;
> -	fl.fl_flags = FL_LEASE;
> -	fl.fl_type = flag == NFS4_OPEN_DELEGATE_READ? F_RDLCK: F_WRLCK;
> -	fl.fl_end = OFFSET_MAX;
> -	fl.fl_owner =  (fl_owner_t)dp;
> -	fl.fl_file = find_readable_file(stp->st_file);
> -	BUG_ON(!fl.fl_file);
> -	fl.fl_pid = current->tgid;
> +	status = -ENOMEM;
> +	fl = locks_alloc_lock();
> +	if (!fl)
> +		goto out;
> +	locks_init_lock(fl);
> +	fl->fl_lmops = &nfsd_lease_mng_ops;
> +	fl->fl_flags = FL_LEASE;
> +	fl->fl_type = flag == NFS4_OPEN_DELEGATE_READ? F_RDLCK: F_WRLCK;
> +	fl->fl_end = OFFSET_MAX;
> +	fl->fl_owner =  (fl_owner_t)dp;
> +	fl->fl_file = find_readable_file(stp->st_file);
> +	BUG_ON(!fl->fl_file);
> +	fl->fl_pid = current->tgid;
>  
>  	/* vfs_setlease checks to see if delegation should be handed out.
>  	 * the lock_manager callbacks fl_mylease and fl_change are used
>  	 */
> -	if ((status = vfs_setlease(fl.fl_file, fl.fl_type, &flp))) {
> +	if ((status = vfs_setlease(fl->fl_file, fl->fl_type, &fl))) {
>  		dprintk("NFSD: setlease failed [%d], no delegation\n", status);
>  		unhash_delegation(dp);
>  		flag = NFS4_OPEN_DELEGATE_NONE;

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-27 14:55                         ` J. Bruce Fields
@ 2010-10-27 14:59                           ` Christoph Hellwig
  2010-10-27 15:16                             ` J. Bruce Fields
                                               ` (2 more replies)
  0 siblings, 3 replies; 62+ messages in thread
From: Christoph Hellwig @ 2010-10-27 14:59 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Arnd Bergmann, Christoph Hellwig, Linus Torvalds,
	Bryan Schumaker, linux-nfs, linux-kernel

On Wed, Oct 27, 2010 at 10:55:39AM -0400, J. Bruce Fields wrote:
> Hm, two problems:
> 	- We introduce the possibility of fcntl(fd, F_SETLEASE, F_UNLCK)
> 	  failing with ENOMEM.

splitt ->setlease into ->add_least and ->delete_lease.  No need to pass
in a structure for the later.  No need to return one either.

> 	- fasync_helper(.,.,1,.) sleeps.  Argh.

That's not new..


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-27 14:59                           ` Christoph Hellwig
@ 2010-10-27 15:16                             ` J. Bruce Fields
  2010-10-27 15:19                               ` Christoph Hellwig
  2010-10-27 15:23                             ` Arnd Bergmann
  2010-10-30 21:25                             ` J. Bruce Fields
  2 siblings, 1 reply; 62+ messages in thread
From: J. Bruce Fields @ 2010-10-27 15:16 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel

On Wed, Oct 27, 2010 at 10:59:29AM -0400, Christoph Hellwig wrote:
> On Wed, Oct 27, 2010 at 10:55:39AM -0400, J. Bruce Fields wrote:
> > Hm, two problems:
> > 	- We introduce the possibility of fcntl(fd, F_SETLEASE, F_UNLCK)
> > 	  failing with ENOMEM.
> 
> splitt ->setlease into ->add_least and ->delete_lease.  No need to pass
> in a structure for the later.  No need to return one either.

Sounds fine to me.

> 
> > 	- fasync_helper(.,.,1,.) sleeps.  Argh.
> 
> That's not new..

So we could do

	unlock_flocks();
	error = fasync_helper(fd, filp, 1, &fl->fl_fasync);
	lock_flocks();

and say, hey, we didn't introduce any new bug there.  But....

I don't know, maybe add a version of fasync_add_entry() that takes a
preallocated fasync_struct??

--b.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-27 15:16                             ` J. Bruce Fields
@ 2010-10-27 15:19                               ` Christoph Hellwig
  0 siblings, 0 replies; 62+ messages in thread
From: Christoph Hellwig @ 2010-10-27 15:19 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Christoph Hellwig, Arnd Bergmann, Linus Torvalds,
	Bryan Schumaker, linux-nfs, linux-kernel

On Wed, Oct 27, 2010 at 11:16:46AM -0400, J. Bruce Fields wrote:
> On Wed, Oct 27, 2010 at 10:59:29AM -0400, Christoph Hellwig wrote:
> > On Wed, Oct 27, 2010 at 10:55:39AM -0400, J. Bruce Fields wrote:
> > > Hm, two problems:
> > > 	- We introduce the possibility of fcntl(fd, F_SETLEASE, F_UNLCK)
> > > 	  failing with ENOMEM.
> > 
> > splitt ->setlease into ->add_least and ->delete_lease.  No need to pass
> > in a structure for the later.  No need to return one either.
> 
> Sounds fine to me.
> 
> > 
> > > 	- fasync_helper(.,.,1,.) sleeps.  Argh.
> > 
> > That's not new..
> 
> So we could do
> 
> 	unlock_flocks();
> 	error = fasync_helper(fd, filp, 1, &fl->fl_fasync);
> 	lock_flocks();
> 
> and say, hey, we didn't introduce any new bug there.  But....
> 
> I don't know, maybe add a version of fasync_add_entry() that takes a
> preallocated fasync_struct??

Or just convert the lock to a sleeping mutex.  Now that we have adaptive
spinning the horrible behaviour that Willy saw years ago might not be
that bad any more.  That'll need some benchmarking, though.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-27 14:59                           ` Christoph Hellwig
  2010-10-27 15:16                             ` J. Bruce Fields
@ 2010-10-27 15:23                             ` Arnd Bergmann
  2010-10-27 15:28                               ` J. Bruce Fields
                                                 ` (2 more replies)
  2010-10-30 21:25                             ` J. Bruce Fields
  2 siblings, 3 replies; 62+ messages in thread
From: Arnd Bergmann @ 2010-10-27 15:23 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: J. Bruce Fields, Linus Torvalds, Bryan Schumaker, linux-nfs,
	linux-kernel

On Wednesday 27 October 2010, Christoph Hellwig wrote:
> On Wed, Oct 27, 2010 at 10:55:39AM -0400, J. Bruce Fields wrote:
> > Hm, two problems:
> > 	- We introduce the possibility of fcntl(fd, F_SETLEASE, F_UNLCK)
> > 	  failing with ENOMEM.
> 
> splitt ->setlease into ->add_least and ->delete_lease.  No need to pass
> in a structure for the later.  No need to return one either.

That sounds like a good way to get rid of a lot of special cases, too.

> > 	- fasync_helper(.,.,1,.) sleeps.  Argh.
> 
> That's not new..

It comes back to the original problem with Bruce's patch though:
fcntl_setlease wants to atomically add a lease or fail. If
fasync_helper fails, we want to remove the lease that was
just added before anyone can see it. At the very least we need
to keep the flock from getting freed in another thread while
we call fasync_helper without the lock.

locks_delete_lock is also called with lock_flocks held and calls
fasync_helper...

	Arnd

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-27 15:23                             ` Arnd Bergmann
@ 2010-10-27 15:28                               ` J. Bruce Fields
  2010-10-27 15:31                               ` Christoph Hellwig
  2010-10-27 16:12                               ` Linus Torvalds
  2 siblings, 0 replies; 62+ messages in thread
From: J. Bruce Fields @ 2010-10-27 15:28 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Christoph Hellwig, Linus Torvalds, Bryan Schumaker, linux-nfs,
	linux-kernel

On Wed, Oct 27, 2010 at 05:23:59PM +0200, Arnd Bergmann wrote:
> On Wednesday 27 October 2010, Christoph Hellwig wrote:
> > On Wed, Oct 27, 2010 at 10:55:39AM -0400, J. Bruce Fields wrote:
> > > Hm, two problems:
> > > 	- We introduce the possibility of fcntl(fd, F_SETLEASE, F_UNLCK)
> > > 	  failing with ENOMEM.
> > 
> > splitt ->setlease into ->add_least and ->delete_lease.  No need to pass
> > in a structure for the later.  No need to return one either.
> 
> That sounds like a good way to get rid of a lot of special cases, too.
> 
> > > 	- fasync_helper(.,.,1,.) sleeps.  Argh.
> > 
> > That's not new..
> 
> It comes back to the original problem with Bruce's patch though:
> fcntl_setlease wants to atomically add a lease or fail. If
> fasync_helper fails, we want to remove the lease that was
> just added before anyone can see it. At the very least we need
> to keep the flock from getting freed in another thread while
> we call fasync_helper without the lock.
> 
> locks_delete_lock is also called with lock_flocks held and calls
> fasync_helper...

Yeah, but just the fasync_remove_entry() case.

It would really seem nicer to me if people called
fasync_{add,remove}_entry() instead of having this silly fasync_helper()
and making the reader remember what the third argument means.

--b.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-27 15:23                             ` Arnd Bergmann
  2010-10-27 15:28                               ` J. Bruce Fields
@ 2010-10-27 15:31                               ` Christoph Hellwig
  2010-10-27 16:12                               ` Linus Torvalds
  2 siblings, 0 replies; 62+ messages in thread
From: Christoph Hellwig @ 2010-10-27 15:31 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Christoph Hellwig, J. Bruce Fields, Linus Torvalds,
	Bryan Schumaker, linux-nfs, linux-kernel

On Wed, Oct 27, 2010 at 05:23:59PM +0200, Arnd Bergmann wrote:
> It comes back to the original problem with Bruce's patch though:
> fcntl_setlease wants to atomically add a lease or fail. If
> fasync_helper fails, we want to remove the lease that was
> just added before anyone can see it. At the very least we need
> to keep the flock from getting freed in another thread while
> we call fasync_helper without the lock.

a variant of fasync_add_entry that takes a preallocated
fasync_struct would indeed take care of that.

> locks_delete_lock is also called with lock_flocks held and calls
> fasync_helper...

but with on = 0, which calls into fasync_remove_entry, which doesn't
sleep.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-27 15:23                             ` Arnd Bergmann
  2010-10-27 15:28                               ` J. Bruce Fields
  2010-10-27 15:31                               ` Christoph Hellwig
@ 2010-10-27 16:12                               ` Linus Torvalds
  2010-10-27 16:46                                   ` J. Bruce Fields
  2 siblings, 1 reply; 62+ messages in thread
From: Linus Torvalds @ 2010-10-27 16:12 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Christoph Hellwig, J. Bruce Fields, Bryan Schumaker, linux-nfs,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 870 bytes --]

On Wed, Oct 27, 2010 at 8:23 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>
> locks_delete_lock is also called with lock_flocks held and calls
> fasync_helper...

We don't really have to use fasync_helper.

In fact, the whole interface is pretty broken for something like file
locking, which isn't actually "fasync()". That whole "on/off as an
argument" is just crazy. It would be _trivial_ to expose a version of
fasync_helper() that takes a pre-allocated fasync_struct for add, and
that has separate helper functions for the add/delete case so that you
don't have the pointless crazy arguments (for "delete" the "fd"
argument is useless, and I do hate "modal" functions that take what
they should do as a flag).

Then fcntl_setlease() would trivially just allocate the dang thing before.

Something like the attached (UNTESTED!) perhaps?

                           Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 5305 bytes --]

 fs/fcntl.c         |   66 +++++++++++++++++++++++++++++++++++++++------------
 fs/locks.c         |   17 ++++++++++++-
 include/linux/fs.h |    5 ++++
 3 files changed, 71 insertions(+), 17 deletions(-)

diff --git a/fs/fcntl.c b/fs/fcntl.c
index f8cc34f..dcdbc6f 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -640,7 +640,7 @@ static void fasync_free_rcu(struct rcu_head *head)
  * match the state "is the filp on a fasync list".
  *
  */
-static int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp)
+int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp)
 {
 	struct fasync_struct *fa, **fp;
 	int result = 0;
@@ -666,21 +666,28 @@ static int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp)
 	return result;
 }
 
+struct fasync_struct *fasync_alloc(void)
+{
+	return kmem_cache_alloc(fasync_cache, GFP_KERNEL);
+}
+
 /*
- * Add a fasync entry. Return negative on error, positive if
- * added, and zero if did nothing but change an existing one.
- *
- * NOTE! It is very important that the FASYNC flag always
- * match the state "is the filp on a fasync list".
+ * NOTE! This can be used only for unused fasync entries:
+ * entries that actually got inserted on the fasync list
+ * need to be released by rcu - see fasync_remove_entry.
  */
-static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fapp)
+void fasync_free(struct fasync_struct *new)
 {
-	struct fasync_struct *new, *fa, **fp;
-	int result = 0;
+	kmem_cache_free(fasync_cache, new);
+}
 
-	new = kmem_cache_alloc(fasync_cache, GFP_KERNEL);
-	if (!new)
-		return -ENOMEM;
+/*
+ * Insert a new entry into the fasync list.  Return the pointer to the
+ * old one if we didn't use the new one.
+ */
+struct fasync_struct *fasync_insert_entry(int fd, struct file *filp, struct fasync_struct **fapp, struct fasync_struct *new)
+{
+        struct fasync_struct *fa, **fp;
 
 	spin_lock(&filp->f_lock);
 	spin_lock(&fasync_lock);
@@ -691,8 +698,6 @@ static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fa
 		spin_lock_irq(&fa->fa_lock);
 		fa->fa_fd = fd;
 		spin_unlock_irq(&fa->fa_lock);
-
-		kmem_cache_free(fasync_cache, new);
 		goto out;
 	}
 
@@ -702,13 +707,42 @@ static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fa
 	new->fa_fd = fd;
 	new->fa_next = *fapp;
 	rcu_assign_pointer(*fapp, new);
-	result = 1;
 	filp->f_flags |= FASYNC;
 
 out:
 	spin_unlock(&fasync_lock);
 	spin_unlock(&filp->f_lock);
-	return result;
+	return fa;
+}
+
+/*
+ * Add a fasync entry. Return negative on error, positive if
+ * added, and zero if did nothing but change an existing one.
+ *
+ * NOTE! It is very important that the FASYNC flag always
+ * match the state "is the filp on a fasync list".
+ */
+static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fapp)
+{
+	struct fasync_struct *new;
+
+	new = fasync_alloc();
+	if (!new)
+		return -ENOMEM;
+
+	/*
+	 * fasync_insert_entry() returns the old (update) entry if
+	 * it existed.
+	 *
+	 * So free the (unused) new entry and return 0 to let the
+	 * caller know that we didn't add any new fasync entries.
+	 */
+	if (fasync_insert_entry(fd, filp, fapp, new)) {
+		fasync_free(new);
+		return 0;
+	}
+
+	return 1;
 }
 
 /*
diff --git a/fs/locks.c b/fs/locks.c
index 4de3a26..9ff3f66 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1515,6 +1515,7 @@ EXPORT_SYMBOL_GPL(vfs_setlease);
 int fcntl_setlease(unsigned int fd, struct file *filp, long arg)
 {
 	struct file_lock fl, *flp = &fl;
+	struct fasync_struct *new;
 	struct inode *inode = filp->f_path.dentry->d_inode;
 	int error;
 
@@ -1523,13 +1524,25 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg)
 	if (error)
 		return error;
 
+	new = fasync_alloc();
+	if (!new)
+		return -ENOMEM;
+
 	lock_flocks();
 
 	error = __vfs_setlease(filp, arg, &flp);
 	if (error || arg == F_UNLCK)
 		goto out_unlock;
 
-	error = fasync_helper(fd, filp, 1, &flp->fl_fasync);
+	/*
+	 * fasync_insert_entry() returns the old entry if any.
+	 * If there was no old entry, then it used 'new' and
+	 * inserted it into the fasync list. Clear new so that
+	 * we don't release it here.
+	 */
+	if (!fasync_insert_entry(fd, filp, &flp->fl_fasync, new))
+		new = NULL;
+
 	if (error < 0) {
 		/* remove lease just inserted by setlease */
 		flp->fl_type = F_UNLCK | F_INPROGRESS;
@@ -1541,6 +1554,8 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg)
 	error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0);
 out_unlock:
 	unlock_flocks();
+	if (new)
+		fasync_free(new);
 	return error;
 }
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 240eb1d..d487772 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1310,6 +1310,11 @@ struct fasync_struct {
 
 /* SMP safe fasync helpers: */
 extern int fasync_helper(int, struct file *, int, struct fasync_struct **);
+extern struct fasync_struct *fasync_insert_entry(int, struct file *, struct fasync_struct **, struct fasync_struct *);
+extern int fasync_remove_entry(struct file *, struct fasync_struct **);
+extern struct fasync_struct *fasync_alloc(void);
+extern void fasync_free(struct fasync_struct *);
+
 /* can be called from interrupts */
 extern void kill_fasync(struct fasync_struct **, int, int);
 

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
@ 2010-10-27 16:46                                   ` J. Bruce Fields
  0 siblings, 0 replies; 62+ messages in thread
From: J. Bruce Fields @ 2010-10-27 16:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Arnd Bergmann, Christoph Hellwig, Bryan Schumaker, linux-nfs,
	linux-kernel

On Wed, Oct 27, 2010 at 09:12:06AM -0700, Linus Torvalds wrote:
> On Wed, Oct 27, 2010 at 8:23 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> >
> > locks_delete_lock is also called with lock_flocks held and calls
> > fasync_helper...
> 
> We don't really have to use fasync_helper.
> 
> In fact, the whole interface is pretty broken for something like file
> locking, which isn't actually "fasync()". That whole "on/off as an
> argument" is just crazy. It would be _trivial_ to expose a version of
> fasync_helper() that takes a pre-allocated fasync_struct for add, and
> that has separate helper functions for the add/delete case so that you
> don't have the pointless crazy arguments (for "delete" the "fd"
> argument is useless, and I do hate "modal" functions that take what
> they should do as a flag).
> 
> Then fcntl_setlease() would trivially just allocate the dang thing before.
> 
> Something like the attached (UNTESTED!) perhaps?

Makes sense to me.  Testing....

--b.

> 
>                            Linus

>  fs/fcntl.c         |   66 +++++++++++++++++++++++++++++++++++++++------------
>  fs/locks.c         |   17 ++++++++++++-
>  include/linux/fs.h |    5 ++++
>  3 files changed, 71 insertions(+), 17 deletions(-)
> 
> diff --git a/fs/fcntl.c b/fs/fcntl.c
> index f8cc34f..dcdbc6f 100644
> --- a/fs/fcntl.c
> +++ b/fs/fcntl.c
> @@ -640,7 +640,7 @@ static void fasync_free_rcu(struct rcu_head *head)
>   * match the state "is the filp on a fasync list".
>   *
>   */
> -static int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp)
> +int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp)
>  {
>  	struct fasync_struct *fa, **fp;
>  	int result = 0;
> @@ -666,21 +666,28 @@ static int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp)
>  	return result;
>  }
>  
> +struct fasync_struct *fasync_alloc(void)
> +{
> +	return kmem_cache_alloc(fasync_cache, GFP_KERNEL);
> +}
> +
>  /*
> - * Add a fasync entry. Return negative on error, positive if
> - * added, and zero if did nothing but change an existing one.
> - *
> - * NOTE! It is very important that the FASYNC flag always
> - * match the state "is the filp on a fasync list".
> + * NOTE! This can be used only for unused fasync entries:
> + * entries that actually got inserted on the fasync list
> + * need to be released by rcu - see fasync_remove_entry.
>   */
> -static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fapp)
> +void fasync_free(struct fasync_struct *new)
>  {
> -	struct fasync_struct *new, *fa, **fp;
> -	int result = 0;
> +	kmem_cache_free(fasync_cache, new);
> +}
>  
> -	new = kmem_cache_alloc(fasync_cache, GFP_KERNEL);
> -	if (!new)
> -		return -ENOMEM;
> +/*
> + * Insert a new entry into the fasync list.  Return the pointer to the
> + * old one if we didn't use the new one.
> + */
> +struct fasync_struct *fasync_insert_entry(int fd, struct file *filp, struct fasync_struct **fapp, struct fasync_struct *new)
> +{
> +        struct fasync_struct *fa, **fp;
>  
>  	spin_lock(&filp->f_lock);
>  	spin_lock(&fasync_lock);
> @@ -691,8 +698,6 @@ static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fa
>  		spin_lock_irq(&fa->fa_lock);
>  		fa->fa_fd = fd;
>  		spin_unlock_irq(&fa->fa_lock);
> -
> -		kmem_cache_free(fasync_cache, new);
>  		goto out;
>  	}
>  
> @@ -702,13 +707,42 @@ static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fa
>  	new->fa_fd = fd;
>  	new->fa_next = *fapp;
>  	rcu_assign_pointer(*fapp, new);
> -	result = 1;
>  	filp->f_flags |= FASYNC;
>  
>  out:
>  	spin_unlock(&fasync_lock);
>  	spin_unlock(&filp->f_lock);
> -	return result;
> +	return fa;
> +}
> +
> +/*
> + * Add a fasync entry. Return negative on error, positive if
> + * added, and zero if did nothing but change an existing one.
> + *
> + * NOTE! It is very important that the FASYNC flag always
> + * match the state "is the filp on a fasync list".
> + */
> +static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fapp)
> +{
> +	struct fasync_struct *new;
> +
> +	new = fasync_alloc();
> +	if (!new)
> +		return -ENOMEM;
> +
> +	/*
> +	 * fasync_insert_entry() returns the old (update) entry if
> +	 * it existed.
> +	 *
> +	 * So free the (unused) new entry and return 0 to let the
> +	 * caller know that we didn't add any new fasync entries.
> +	 */
> +	if (fasync_insert_entry(fd, filp, fapp, new)) {
> +		fasync_free(new);
> +		return 0;
> +	}
> +
> +	return 1;
>  }
>  
>  /*
> diff --git a/fs/locks.c b/fs/locks.c
> index 4de3a26..9ff3f66 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -1515,6 +1515,7 @@ EXPORT_SYMBOL_GPL(vfs_setlease);
>  int fcntl_setlease(unsigned int fd, struct file *filp, long arg)
>  {
>  	struct file_lock fl, *flp = &fl;
> +	struct fasync_struct *new;
>  	struct inode *inode = filp->f_path.dentry->d_inode;
>  	int error;
>  
> @@ -1523,13 +1524,25 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg)
>  	if (error)
>  		return error;
>  
> +	new = fasync_alloc();
> +	if (!new)
> +		return -ENOMEM;
> +
>  	lock_flocks();
>  
>  	error = __vfs_setlease(filp, arg, &flp);
>  	if (error || arg == F_UNLCK)
>  		goto out_unlock;
>  
> -	error = fasync_helper(fd, filp, 1, &flp->fl_fasync);
> +	/*
> +	 * fasync_insert_entry() returns the old entry if any.
> +	 * If there was no old entry, then it used 'new' and
> +	 * inserted it into the fasync list. Clear new so that
> +	 * we don't release it here.
> +	 */
> +	if (!fasync_insert_entry(fd, filp, &flp->fl_fasync, new))
> +		new = NULL;
> +
>  	if (error < 0) {
>  		/* remove lease just inserted by setlease */
>  		flp->fl_type = F_UNLCK | F_INPROGRESS;
> @@ -1541,6 +1554,8 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg)
>  	error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0);
>  out_unlock:
>  	unlock_flocks();
> +	if (new)
> +		fasync_free(new);
>  	return error;
>  }
>  
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 240eb1d..d487772 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1310,6 +1310,11 @@ struct fasync_struct {
>  
>  /* SMP safe fasync helpers: */
>  extern int fasync_helper(int, struct file *, int, struct fasync_struct **);
> +extern struct fasync_struct *fasync_insert_entry(int, struct file *, struct fasync_struct **, struct fasync_struct *);
> +extern int fasync_remove_entry(struct file *, struct fasync_struct **);
> +extern struct fasync_struct *fasync_alloc(void);
> +extern void fasync_free(struct fasync_struct *);
> +
>  /* can be called from interrupts */
>  extern void kill_fasync(struct fasync_struct **, int, int);
>  


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
@ 2010-10-27 16:46                                   ` J. Bruce Fields
  0 siblings, 0 replies; 62+ messages in thread
From: J. Bruce Fields @ 2010-10-27 16:46 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Arnd Bergmann, Christoph Hellwig, Bryan Schumaker, linux-nfs,
	linux-kernel

On Wed, Oct 27, 2010 at 09:12:06AM -0700, Linus Torvalds wrote:
> On Wed, Oct 27, 2010 at 8:23 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> >
> > locks_delete_lock is also called with lock_flocks held and calls
> > fasync_helper...
> 
> We don't really have to use fasync_helper.
> 
> In fact, the whole interface is pretty broken for something like file
> locking, which isn't actually "fasync()". That whole "on/off as an
> argument" is just crazy. It would be _trivial_ to expose a version of
> fasync_helper() that takes a pre-allocated fasync_struct for add, and
> that has separate helper functions for the add/delete case so that you
> don't have the pointless crazy arguments (for "delete" the "fd"
> argument is useless, and I do hate "modal" functions that take what
> they should do as a flag).
> 
> Then fcntl_setlease() would trivially just allocate the dang thing before.
> 
> Something like the attached (UNTESTED!) perhaps?

Makes sense to me.  Testing....

--b.

> 
>                            Linus

>  fs/fcntl.c         |   66 +++++++++++++++++++++++++++++++++++++++------------
>  fs/locks.c         |   17 ++++++++++++-
>  include/linux/fs.h |    5 ++++
>  3 files changed, 71 insertions(+), 17 deletions(-)
> 
> diff --git a/fs/fcntl.c b/fs/fcntl.c
> index f8cc34f..dcdbc6f 100644
> --- a/fs/fcntl.c
> +++ b/fs/fcntl.c
> @@ -640,7 +640,7 @@ static void fasync_free_rcu(struct rcu_head *head)
>   * match the state "is the filp on a fasync list".
>   *
>   */
> -static int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp)
> +int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp)
>  {
>  	struct fasync_struct *fa, **fp;
>  	int result = 0;
> @@ -666,21 +666,28 @@ static int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp)
>  	return result;
>  }
>  
> +struct fasync_struct *fasync_alloc(void)
> +{
> +	return kmem_cache_alloc(fasync_cache, GFP_KERNEL);
> +}
> +
>  /*
> - * Add a fasync entry. Return negative on error, positive if
> - * added, and zero if did nothing but change an existing one.
> - *
> - * NOTE! It is very important that the FASYNC flag always
> - * match the state "is the filp on a fasync list".
> + * NOTE! This can be used only for unused fasync entries:
> + * entries that actually got inserted on the fasync list
> + * need to be released by rcu - see fasync_remove_entry.
>   */
> -static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fapp)
> +void fasync_free(struct fasync_struct *new)
>  {
> -	struct fasync_struct *new, *fa, **fp;
> -	int result = 0;
> +	kmem_cache_free(fasync_cache, new);
> +}
>  
> -	new = kmem_cache_alloc(fasync_cache, GFP_KERNEL);
> -	if (!new)
> -		return -ENOMEM;
> +/*
> + * Insert a new entry into the fasync list.  Return the pointer to the
> + * old one if we didn't use the new one.
> + */
> +struct fasync_struct *fasync_insert_entry(int fd, struct file *filp, struct fasync_struct **fapp, struct fasync_struct *new)
> +{
> +        struct fasync_struct *fa, **fp;
>  
>  	spin_lock(&filp->f_lock);
>  	spin_lock(&fasync_lock);
> @@ -691,8 +698,6 @@ static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fa
>  		spin_lock_irq(&fa->fa_lock);
>  		fa->fa_fd = fd;
>  		spin_unlock_irq(&fa->fa_lock);
> -
> -		kmem_cache_free(fasync_cache, new);
>  		goto out;
>  	}
>  
> @@ -702,13 +707,42 @@ static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fa
>  	new->fa_fd = fd;
>  	new->fa_next = *fapp;
>  	rcu_assign_pointer(*fapp, new);
> -	result = 1;
>  	filp->f_flags |= FASYNC;
>  
>  out:
>  	spin_unlock(&fasync_lock);
>  	spin_unlock(&filp->f_lock);
> -	return result;
> +	return fa;
> +}
> +
> +/*
> + * Add a fasync entry. Return negative on error, positive if
> + * added, and zero if did nothing but change an existing one.
> + *
> + * NOTE! It is very important that the FASYNC flag always
> + * match the state "is the filp on a fasync list".
> + */
> +static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fapp)
> +{
> +	struct fasync_struct *new;
> +
> +	new = fasync_alloc();
> +	if (!new)
> +		return -ENOMEM;
> +
> +	/*
> +	 * fasync_insert_entry() returns the old (update) entry if
> +	 * it existed.
> +	 *
> +	 * So free the (unused) new entry and return 0 to let the
> +	 * caller know that we didn't add any new fasync entries.
> +	 */
> +	if (fasync_insert_entry(fd, filp, fapp, new)) {
> +		fasync_free(new);
> +		return 0;
> +	}
> +
> +	return 1;
>  }
>  
>  /*
> diff --git a/fs/locks.c b/fs/locks.c
> index 4de3a26..9ff3f66 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -1515,6 +1515,7 @@ EXPORT_SYMBOL_GPL(vfs_setlease);
>  int fcntl_setlease(unsigned int fd, struct file *filp, long arg)
>  {
>  	struct file_lock fl, *flp = &fl;
> +	struct fasync_struct *new;
>  	struct inode *inode = filp->f_path.dentry->d_inode;
>  	int error;
>  
> @@ -1523,13 +1524,25 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg)
>  	if (error)
>  		return error;
>  
> +	new = fasync_alloc();
> +	if (!new)
> +		return -ENOMEM;
> +
>  	lock_flocks();
>  
>  	error = __vfs_setlease(filp, arg, &flp);
>  	if (error || arg == F_UNLCK)
>  		goto out_unlock;
>  
> -	error = fasync_helper(fd, filp, 1, &flp->fl_fasync);
> +	/*
> +	 * fasync_insert_entry() returns the old entry if any.
> +	 * If there was no old entry, then it used 'new' and
> +	 * inserted it into the fasync list. Clear new so that
> +	 * we don't release it here.
> +	 */
> +	if (!fasync_insert_entry(fd, filp, &flp->fl_fasync, new))
> +		new = NULL;
> +
>  	if (error < 0) {
>  		/* remove lease just inserted by setlease */
>  		flp->fl_type = F_UNLCK | F_INPROGRESS;
> @@ -1541,6 +1554,8 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg)
>  	error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0);
>  out_unlock:
>  	unlock_flocks();
> +	if (new)
> +		fasync_free(new);
>  	return error;
>  }
>  
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 240eb1d..d487772 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1310,6 +1310,11 @@ struct fasync_struct {
>  
>  /* SMP safe fasync helpers: */
>  extern int fasync_helper(int, struct file *, int, struct fasync_struct **);
> +extern struct fasync_struct *fasync_insert_entry(int, struct file *, struct fasync_struct **, struct fasync_struct *);
> +extern int fasync_remove_entry(struct file *, struct fasync_struct **);
> +extern struct fasync_struct *fasync_alloc(void);
> +extern void fasync_free(struct fasync_struct *);
> +
>  /* can be called from interrupts */
>  extern void kill_fasync(struct fasync_struct **, int, int);
>  


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-27 16:46                                   ` J. Bruce Fields
  (?)
@ 2010-10-27 17:32                                   ` Linus Torvalds
  2010-10-27 17:40                                     ` J. Bruce Fields
  -1 siblings, 1 reply; 62+ messages in thread
From: Linus Torvalds @ 2010-10-27 17:32 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Arnd Bergmann, Christoph Hellwig, Bryan Schumaker, linux-nfs,
	linux-kernel

On Wed, Oct 27, 2010 at 9:46 AM, J. Bruce Fields <bfields@fieldses.org> wrote:
> On Wed, Oct 27, 2010 at 09:12:06AM -0700, Linus Torvalds wrote:
>>
>> Something like the attached (UNTESTED!) perhaps?
>
> Makes sense to me.  Testing....

So I found a buglet in the patch: the

  NOTE! It is very important that the FASYNC flag always
  match the state "is the filp on a fasync list".

comment should be moved to be associated with "fasync_insert_entry()"
rather than "fasync_add_entry()", since it's the insert-entry thing
that does the actual FASYNC flag handling.

But that incorrect comment placement shouldn't affect testing, obviously ;)

Btw, who is going to collect these things assuming it passes testing?
Arnd? You? I'll happily sign off on the fasync patch (with the comment
movement) assuming it tests out ok, but there's all the other patches
too that have been passed around. I really do want to get this into
the merge window, because it would be a big shame if we couldn't
effectively get rid of the BKL now just because of these kinds of
smallish final details, so I'm just checking who wants to step up to
the plate to collect it all together and make sure I have it?

                      Linus

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-27 17:32                                   ` Linus Torvalds
@ 2010-10-27 17:40                                     ` J. Bruce Fields
  2010-10-27 18:20                                       ` Arnd Bergmann
  0 siblings, 1 reply; 62+ messages in thread
From: J. Bruce Fields @ 2010-10-27 17:40 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Arnd Bergmann, Christoph Hellwig, Bryan Schumaker, linux-nfs,
	linux-kernel

On Wed, Oct 27, 2010 at 10:32:20AM -0700, Linus Torvalds wrote:
> On Wed, Oct 27, 2010 at 9:46 AM, J. Bruce Fields <bfields@fieldses.org> wrote:
> > On Wed, Oct 27, 2010 at 09:12:06AM -0700, Linus Torvalds wrote:
> >>
> >> Something like the attached (UNTESTED!) perhaps?
> >
> > Makes sense to me.  Testing....
> 
> So I found a buglet in the patch: the
> 
>   NOTE! It is very important that the FASYNC flag always
>   match the state "is the filp on a fasync list".
> 
> comment should be moved to be associated with "fasync_insert_entry()"
> rather than "fasync_add_entry()", since it's the insert-entry thing
> that does the actual FASYNC flag handling.
> 
> But that incorrect comment placement shouldn't affect testing, obviously ;)
> 
> Btw, who is going to collect these things assuming it passes testing?
> Arnd? You? I'll happily sign off on the fasync patch (with the comment
> movement) assuming it tests out ok, but there's all the other patches
> too that have been passed around. I really do want to get this into
> the merge window, because it would be a big shame if we couldn't
> effectively get rid of the BKL now just because of these kinds of
> smallish final details, so I'm just checking who wants to step up to
> the plate to collect it all together and make sure I have it?

Doesn't matter to me.  What I've currently got (Arnd's patches with some
minor fixes from me, and your patch, all in a poorly changelogged
jumble) is in

	git://linux-nfs.org/~bfields/linux-topics.git TMP-BKL-TESTING

You wouldn't want to merge that as is, but it does seem to work.

--b.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-27 17:40                                     ` J. Bruce Fields
@ 2010-10-27 18:20                                       ` Arnd Bergmann
  2010-10-27 18:42                                         ` Linus Torvalds
  0 siblings, 1 reply; 62+ messages in thread
From: Arnd Bergmann @ 2010-10-27 18:20 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Linus Torvalds, Christoph Hellwig, Bryan Schumaker, linux-nfs,
	linux-kernel

On Wednesday 27 October 2010 19:40:16 J. Bruce Fields wrote:
> Doesn't matter to me.  What I've currently got (Arnd's patches with some
> minor fixes from me, and your patch, all in a poorly changelogged
> jumble) is in
> 
>         git://linux-nfs.org/~bfields/linux-topics.git TMP-BKL-TESTING
> 
> You wouldn't want to merge that as is, but it does seem to work.

Ok, then let's use your tree with proper changelogs. That should be the
easiest way to make sure that the code you test is the one that gets in.

	Arnd

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-27 18:20                                       ` Arnd Bergmann
@ 2010-10-27 18:42                                         ` Linus Torvalds
  2010-10-27 18:43                                           ` Linus Torvalds
  0 siblings, 1 reply; 62+ messages in thread
From: Linus Torvalds @ 2010-10-27 18:42 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: J. Bruce Fields, Christoph Hellwig, Bryan Schumaker, linux-nfs,
	linux-kernel

[-- Attachment #1: Type: text/plain, Size: 669 bytes --]

On Wed, Oct 27, 2010 at 11:20 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>
> Ok, then let's use your tree with proper changelogs. That should be the
> easiest way to make sure that the code you test is the one that gets in.

Half of them are yours, and the one that is mine needs the comment
fixup. Can you organize it all for me to pull?

Here's my patch again, with the comment move and a suggested commit
message (and my sign-off and Bruce's tested-by: the only thing that
changed in the patch was literally some comment placement so the
tested-by should still be perfectly valid).

Feel free to edit the message/patch to your hearts content.

  Thanks,
        Linus

[-- Attachment #2: patch.diff --]
[-- Type: text/x-patch, Size: 6020 bytes --]

From: Linus Torvalds <torvalds@linux-foundation.org>
Subject: fasync: re-organize fasync entry insertion to allow it under a spinlock

You currently cannot use "fasync_helper()" in an atomic environment to
insert a new fasync entry, because it will need to allocate the new
"struct fasync_struct". 

Yet fcntl_setlease() wants to call this under lock_flocks(), which is in
the process of being converted from the BKL to a spinlock. 

In order to fix this, this abstracts out the actual fasync list
insertion and the fasync allocations into functions of their own, and
teaches fs/locks.c to pre-allocate the fasync_struct entry.  That way
the actual list insertion can happen while holding the required
spinlock. 

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tested-by: Bruce Fields <bfields@fieldses.org>
---
 fs/fcntl.c         |   62 ++++++++++++++++++++++++++++++++++++++++-----------
 fs/locks.c         |   17 +++++++++++++-
 include/linux/fs.h |    5 ++++
 3 files changed, 69 insertions(+), 15 deletions(-)

diff --git a/fs/fcntl.c b/fs/fcntl.c
index f8cc34f..ecc8b39 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -640,7 +640,7 @@ static void fasync_free_rcu(struct rcu_head *head)
  * match the state "is the filp on a fasync list".
  *
  */
-static int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp)
+int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp)
 {
 	struct fasync_struct *fa, **fp;
 	int result = 0;
@@ -666,21 +666,31 @@ static int fasync_remove_entry(struct file *filp, struct fasync_struct **fapp)
 	return result;
 }
 
+struct fasync_struct *fasync_alloc(void)
+{
+	return kmem_cache_alloc(fasync_cache, GFP_KERNEL);
+}
+
 /*
- * Add a fasync entry. Return negative on error, positive if
- * added, and zero if did nothing but change an existing one.
+ * NOTE! This can be used only for unused fasync entries:
+ * entries that actually got inserted on the fasync list
+ * need to be released by rcu - see fasync_remove_entry.
+ */
+void fasync_free(struct fasync_struct *new)
+{
+	kmem_cache_free(fasync_cache, new);
+}
+
+/*
+ * Insert a new entry into the fasync list.  Return the pointer to the
+ * old one if we didn't use the new one.
  *
  * NOTE! It is very important that the FASYNC flag always
  * match the state "is the filp on a fasync list".
  */
-static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fapp)
+struct fasync_struct *fasync_insert_entry(int fd, struct file *filp, struct fasync_struct **fapp, struct fasync_struct *new)
 {
-	struct fasync_struct *new, *fa, **fp;
-	int result = 0;
-
-	new = kmem_cache_alloc(fasync_cache, GFP_KERNEL);
-	if (!new)
-		return -ENOMEM;
+        struct fasync_struct *fa, **fp;
 
 	spin_lock(&filp->f_lock);
 	spin_lock(&fasync_lock);
@@ -691,8 +701,6 @@ static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fa
 		spin_lock_irq(&fa->fa_lock);
 		fa->fa_fd = fd;
 		spin_unlock_irq(&fa->fa_lock);
-
-		kmem_cache_free(fasync_cache, new);
 		goto out;
 	}
 
@@ -702,13 +710,39 @@ static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fa
 	new->fa_fd = fd;
 	new->fa_next = *fapp;
 	rcu_assign_pointer(*fapp, new);
-	result = 1;
 	filp->f_flags |= FASYNC;
 
 out:
 	spin_unlock(&fasync_lock);
 	spin_unlock(&filp->f_lock);
-	return result;
+	return fa;
+}
+
+/*
+ * Add a fasync entry. Return negative on error, positive if
+ * added, and zero if did nothing but change an existing one.
+ */
+static int fasync_add_entry(int fd, struct file *filp, struct fasync_struct **fapp)
+{
+	struct fasync_struct *new;
+
+	new = fasync_alloc();
+	if (!new)
+		return -ENOMEM;
+
+	/*
+	 * fasync_insert_entry() returns the old (update) entry if
+	 * it existed.
+	 *
+	 * So free the (unused) new entry and return 0 to let the
+	 * caller know that we didn't add any new fasync entries.
+	 */
+	if (fasync_insert_entry(fd, filp, fapp, new)) {
+		fasync_free(new);
+		return 0;
+	}
+
+	return 1;
 }
 
 /*
diff --git a/fs/locks.c b/fs/locks.c
index 4de3a26..9ff3f66 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1515,6 +1515,7 @@ EXPORT_SYMBOL_GPL(vfs_setlease);
 int fcntl_setlease(unsigned int fd, struct file *filp, long arg)
 {
 	struct file_lock fl, *flp = &fl;
+	struct fasync_struct *new;
 	struct inode *inode = filp->f_path.dentry->d_inode;
 	int error;
 
@@ -1523,13 +1524,25 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg)
 	if (error)
 		return error;
 
+	new = fasync_alloc();
+	if (!new)
+		return -ENOMEM;
+
 	lock_flocks();
 
 	error = __vfs_setlease(filp, arg, &flp);
 	if (error || arg == F_UNLCK)
 		goto out_unlock;
 
-	error = fasync_helper(fd, filp, 1, &flp->fl_fasync);
+	/*
+	 * fasync_insert_entry() returns the old entry if any.
+	 * If there was no old entry, then it used 'new' and
+	 * inserted it into the fasync list. Clear new so that
+	 * we don't release it here.
+	 */
+	if (!fasync_insert_entry(fd, filp, &flp->fl_fasync, new))
+		new = NULL;
+
 	if (error < 0) {
 		/* remove lease just inserted by setlease */
 		flp->fl_type = F_UNLCK | F_INPROGRESS;
@@ -1541,6 +1554,8 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg)
 	error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0);
 out_unlock:
 	unlock_flocks();
+	if (new)
+		fasync_free(new);
 	return error;
 }
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 240eb1d..d487772 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1310,6 +1310,11 @@ struct fasync_struct {
 
 /* SMP safe fasync helpers: */
 extern int fasync_helper(int, struct file *, int, struct fasync_struct **);
+extern struct fasync_struct *fasync_insert_entry(int, struct file *, struct fasync_struct **, struct fasync_struct *);
+extern int fasync_remove_entry(struct file *, struct fasync_struct **);
+extern struct fasync_struct *fasync_alloc(void);
+extern void fasync_free(struct fasync_struct *);
+
 /* can be called from interrupts */
 extern void kill_fasync(struct fasync_struct **, int, int);
 

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-27 18:42                                         ` Linus Torvalds
@ 2010-10-27 18:43                                           ` Linus Torvalds
  2010-10-27 19:48                                             ` Arnd Bergmann
  0 siblings, 1 reply; 62+ messages in thread
From: Linus Torvalds @ 2010-10-27 18:43 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: J. Bruce Fields, Christoph Hellwig, Bryan Schumaker, linux-nfs,
	linux-kernel

On Wed, Oct 27, 2010 at 11:42 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Feel free to edit the message/patch to your hearts content.

Oh, and if you want me to just commit this part, I can do so. It
doesn't make much sense without the other parts to actually make it
useful, though, so it probably makes more sense to come with them.

                    Linus

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-27 18:43                                           ` Linus Torvalds
@ 2010-10-27 19:48                                             ` Arnd Bergmann
  2010-10-27 20:01                                               ` J. Bruce Fields
  0 siblings, 1 reply; 62+ messages in thread
From: Arnd Bergmann @ 2010-10-27 19:48 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: J. Bruce Fields, Christoph Hellwig, Bryan Schumaker, linux-nfs,
	linux-kernel

On Wednesday 27 October 2010 20:43:59 Linus Torvalds wrote:
> On Wed, Oct 27, 2010 at 11:42 AM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > Feel free to edit the message/patch to your hearts content.
> 
> Oh, and if you want me to just commit this part, I can do so. It
> doesn't make much sense without the other parts to actually make it
> useful, though, so it probably makes more sense to come with them.

Once Bruce is happy with the test results, you can pull it from

git+ssh://master.kernel.org/pub/scm/linux/kernel/git/arnd/bkl.git flock

I've rewritten all the changelogs to make more sense as a series,
and split out the bits that turn lock_flocks into a spinlock to
come last, so we get a bisectable series.

The contents are left unchanged.

	Arnd
---

commit b3426739cc8f7c7dd127ca8dad5e25195930cac1
Author: Arnd Bergmann <arnd@arndb.de>
Date:   Wed Oct 27 21:39:58 2010 +0200

    locks: turn lock_flocks into a spinlock
    
    Nothing depends on lock_flocks using the BKL
    any more, so we can do the switch over to
    a private spinlock.
    
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>

 fs/Kconfig |    1 -
 fs/locks.c |    5 +++--
 2 files changed, 3 insertions(+), 3 deletions(-)

commit 55d3ff97c5c0b3dce39f705e2b1fe85818891822
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Wed Oct 27 12:38:12 2010 -0400

    locks: avoid fasync allocation under lock_flocks
    
    This splits fasync_helper into four functions: fasync_alloc,
    fasync_free, fasync_insert_entry and fasync_remove_entry,
    in order to allow the lease handling to call them directly
    instead of going through fasync_helper.
    
    The fasync_helper interface really doesn't make sense outside
    of ->fasync file operations, which use the same calling
    conventions of passing flags in and out as the helper.
    
    After the change, fcntl_setlease can simply allocate the
    new fasync_struct outside of lock_flocks, which is required
    to turn that into a spinlock.
    
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    [bfields@redhat.com: rebase on top of my changes to Arnd's patch]
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    [arnd: rewrite changelog text]
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>

 fs/fcntl.c         |   66 +++++++++++++++++++++++++++++++++++++++------------
 fs/locks.c         |   18 +++++++++++++-
 include/linux/fs.h |    5 ++++
 3 files changed, 72 insertions(+), 17 deletions(-)

commit c5b1f0d92c36851aca09ac6c7c0c4f9690ac14f3
Author: Arnd Bergmann <arnd@arndb.de>
Date:   Wed Oct 27 15:46:08 2010 +0200

    locks/nfsd: allocate file lock outside of spinlock
    
    As suggested by Christoph Hellwig, this moves allocation
    of new file locks out of generic_setlease into the
    callers, nfs4_open_delegation and fcntl_setlease in order
    to allow GFP_KERNEL allocations when lock_flocks has
    become a spinlock.
    
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Acked-by: J. Bruce Fields <bfields@redhat.com>

 fs/locks.c          |   36 ++++++++++++------------------------
 fs/nfsd/nfs4state.c |   26 +++++++++++++++-----------
 include/linux/fs.h  |    1 +
 3 files changed, 28 insertions(+), 35 deletions(-)

commit a282a1fa6b23bd21ba0b86e53ed2a316b001836f
Author: J. Bruce Fields <bfields@redhat.com>
Date:   Tue Oct 26 18:25:30 2010 -0400

    lockd: fix nlmsvc_notify_blocked locking
    
    nlmsvc_notify_blocked walks the nlm_blocked list,
    which requires nlm_blocked_lock.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>

 fs/lockd/svclock.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

commit 763641d81202834e9d64de2019d1edec12868f4f
Author: Arnd Bergmann <arnd@arndb.de>
Date:   Tue Oct 26 22:55:40 2010 +0200

    lockd: push lock_flocks down
    
    lockd should use lock_flocks() instead of lock_kernel()
    to lock against posix locks accessing the i_flock list.
    
    This is a prerequisite to turning lock_flocks into a
    spinlock.
    
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Acked-by: J. Bruce Fields <bfields@redhat.com>

 fs/lockd/svc.c     |   11 -----------
 fs/lockd/svcsubs.c |    9 ++++++++-
 fs/nfs/Kconfig     |    1 -
 fs/nfsd/Kconfig    |    1 -
 4 files changed, 8 insertions(+), 14 deletions(-)

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-27 19:48                                             ` Arnd Bergmann
@ 2010-10-27 20:01                                               ` J. Bruce Fields
  2010-10-27 20:20                                                 ` Arnd Bergmann
  0 siblings, 1 reply; 62+ messages in thread
From: J. Bruce Fields @ 2010-10-27 20:01 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Linus Torvalds, Christoph Hellwig, Bryan Schumaker, linux-nfs,
	linux-kernel

On Wed, Oct 27, 2010 at 09:48:47PM +0200, Arnd Bergmann wrote:
> On Wednesday 27 October 2010 20:43:59 Linus Torvalds wrote:
> > On Wed, Oct 27, 2010 at 11:42 AM, Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> > >
> > > Feel free to edit the message/patch to your hearts content.
> > 
> > Oh, and if you want me to just commit this part, I can do so. It
> > doesn't make much sense without the other parts to actually make it
> > useful, though, so it probably makes more sense to come with them.
> 
> Once Bruce is happy with the test results, you can pull it from
> 
> git+ssh://master.kernel.org/pub/scm/linux/kernel/git/arnd/bkl.git flock

Not the useful url for us hoi polloi out here.  OK, looking at
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl.git flock....

> I've rewritten all the changelogs to make more sense as a series,
> and split out the bits that turn lock_flocks into a spinlock to
> come last, so we get a bisectable series.
> 
> The contents are left unchanged.

Looks fine to me!

I haven't retested that, but I can see that the result is identical to
what I tested.

(You're missing Linus's comment fix, though, unless I'm confused.)

--b.


> 
> 	Arnd
> ---
> 
> commit b3426739cc8f7c7dd127ca8dad5e25195930cac1
> Author: Arnd Bergmann <arnd@arndb.de>
> Date:   Wed Oct 27 21:39:58 2010 +0200
> 
>     locks: turn lock_flocks into a spinlock
>     
>     Nothing depends on lock_flocks using the BKL
>     any more, so we can do the switch over to
>     a private spinlock.
>     
>     Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> 
>  fs/Kconfig |    1 -
>  fs/locks.c |    5 +++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> commit 55d3ff97c5c0b3dce39f705e2b1fe85818891822
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date:   Wed Oct 27 12:38:12 2010 -0400
> 
>     locks: avoid fasync allocation under lock_flocks
>     
>     This splits fasync_helper into four functions: fasync_alloc,
>     fasync_free, fasync_insert_entry and fasync_remove_entry,
>     in order to allow the lease handling to call them directly
>     instead of going through fasync_helper.
>     
>     The fasync_helper interface really doesn't make sense outside
>     of ->fasync file operations, which use the same calling
>     conventions of passing flags in and out as the helper.
>     
>     After the change, fcntl_setlease can simply allocate the
>     new fasync_struct outside of lock_flocks, which is required
>     to turn that into a spinlock.
>     
>     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
>     [bfields@redhat.com: rebase on top of my changes to Arnd's patch]
>     Signed-off-by: J. Bruce Fields <bfields@redhat.com>
>     [arnd: rewrite changelog text]
>     Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> 
>  fs/fcntl.c         |   66 +++++++++++++++++++++++++++++++++++++++------------
>  fs/locks.c         |   18 +++++++++++++-
>  include/linux/fs.h |    5 ++++
>  3 files changed, 72 insertions(+), 17 deletions(-)
> 
> commit c5b1f0d92c36851aca09ac6c7c0c4f9690ac14f3
> Author: Arnd Bergmann <arnd@arndb.de>
> Date:   Wed Oct 27 15:46:08 2010 +0200
> 
>     locks/nfsd: allocate file lock outside of spinlock
>     
>     As suggested by Christoph Hellwig, this moves allocation
>     of new file locks out of generic_setlease into the
>     callers, nfs4_open_delegation and fcntl_setlease in order
>     to allow GFP_KERNEL allocations when lock_flocks has
>     become a spinlock.
>     
>     Signed-off-by: Arnd Bergmann <arnd@arndb.de>
>     Acked-by: J. Bruce Fields <bfields@redhat.com>
> 
>  fs/locks.c          |   36 ++++++++++++------------------------
>  fs/nfsd/nfs4state.c |   26 +++++++++++++++-----------
>  include/linux/fs.h  |    1 +
>  3 files changed, 28 insertions(+), 35 deletions(-)
> 
> commit a282a1fa6b23bd21ba0b86e53ed2a316b001836f
> Author: J. Bruce Fields <bfields@redhat.com>
> Date:   Tue Oct 26 18:25:30 2010 -0400
> 
>     lockd: fix nlmsvc_notify_blocked locking
>     
>     nlmsvc_notify_blocked walks the nlm_blocked list,
>     which requires nlm_blocked_lock.
>     
>     Signed-off-by: J. Bruce Fields <bfields@redhat.com>
>     Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> 
>  fs/lockd/svclock.c |    6 ++++--
>  1 files changed, 4 insertions(+), 2 deletions(-)
> 
> commit 763641d81202834e9d64de2019d1edec12868f4f
> Author: Arnd Bergmann <arnd@arndb.de>
> Date:   Tue Oct 26 22:55:40 2010 +0200
> 
>     lockd: push lock_flocks down
>     
>     lockd should use lock_flocks() instead of lock_kernel()
>     to lock against posix locks accessing the i_flock list.
>     
>     This is a prerequisite to turning lock_flocks into a
>     spinlock.
>     
>     Signed-off-by: Arnd Bergmann <arnd@arndb.de>
>     Acked-by: J. Bruce Fields <bfields@redhat.com>
> 
>  fs/lockd/svc.c     |   11 -----------
>  fs/lockd/svcsubs.c |    9 ++++++++-
>  fs/nfs/Kconfig     |    1 -
>  fs/nfsd/Kconfig    |    1 -
>  4 files changed, 8 insertions(+), 14 deletions(-)

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-27 20:01                                               ` J. Bruce Fields
@ 2010-10-27 20:20                                                 ` Arnd Bergmann
  2010-10-27 20:24                                                   ` J. Bruce Fields
  0 siblings, 1 reply; 62+ messages in thread
From: Arnd Bergmann @ 2010-10-27 20:20 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Linus Torvalds, Christoph Hellwig, Bryan Schumaker, linux-nfs,
	linux-kernel

On Wednesday 27 October 2010 22:01:45 J. Bruce Fields wrote:
> (You're missing Linus's comment fix, though, unless I'm confused.)

Right, I didn't look into the attachment, so I hadn't noticed
the new changelog. I've updated the changelog to Linus' version,
as it's more verbose than mine.

I also checked that the version in your tree is actually the
right one, so I left the code untouched.

	Arnd

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-27 20:20                                                 ` Arnd Bergmann
@ 2010-10-27 20:24                                                   ` J. Bruce Fields
  0 siblings, 0 replies; 62+ messages in thread
From: J. Bruce Fields @ 2010-10-27 20:24 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Linus Torvalds, Christoph Hellwig, Bryan Schumaker, linux-nfs,
	linux-kernel

On Wed, Oct 27, 2010 at 10:20:20PM +0200, Arnd Bergmann wrote:
> On Wednesday 27 October 2010 22:01:45 J. Bruce Fields wrote:
> > (You're missing Linus's comment fix, though, unless I'm confused.)
> 
> Right, I didn't look into the attachment, so I hadn't noticed
> the new changelog. I've updated the changelog to Linus' version,
> as it's more verbose than mine.

He fixed a code comment too, though.  No big deal.

> I also checked that the version in your tree is actually the
> right one, so I left the code untouched.

Yup, looks good.

--b.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-27 14:59                           ` Christoph Hellwig
  2010-10-27 15:16                             ` J. Bruce Fields
  2010-10-27 15:23                             ` Arnd Bergmann
@ 2010-10-30 21:25                             ` J. Bruce Fields
  2010-10-30 21:31                               ` [PATCH 1/4] locks: prevent ENOMEM on lease unlock J. Bruce Fields
                                                 ` (6 more replies)
  2 siblings, 7 replies; 62+ messages in thread
From: J. Bruce Fields @ 2010-10-30 21:25 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel

I noticed a couple problems with the lease patches.  Also:

On Wed, Oct 27, 2010 at 10:59:29AM -0400, Christoph Hellwig wrote:
> On Wed, Oct 27, 2010 at 10:55:39AM -0400, J. Bruce Fields wrote:
> > Hm, two problems:
> > 	- We introduce the possibility of fcntl(fd, F_SETLEASE, F_UNLCK)
> > 	  failing with ENOMEM.
> 
> splitt ->setlease into ->add_least and ->delete_lease.  No need to pass
> in a structure for the later.  No need to return one either.

We still need to do this part.  I just did the minimum required to fix
the problem--I figure splitting out a separate ->delete_lease method
(which I agree would be cleaner) could wait till the next merge window.

Patches follow, tested on top of the merge of Arnd's patches; if they
look OK they can go in, or I can commit them on top of a few pending
nfsd fixes and send another pull request.

It'd be nice if we get this fixed by -rc1, but not the end of the world
if we don't.

--b.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 1/4] locks: prevent ENOMEM on lease unlock
  2010-10-30 21:25                             ` J. Bruce Fields
@ 2010-10-30 21:31                               ` J. Bruce Fields
  2010-10-30 21:31                               ` [PATCH 2/4] locks: fix leaks on setlease errors J. Bruce Fields
                                                 ` (5 subsequent siblings)
  6 siblings, 0 replies; 62+ messages in thread
From: J. Bruce Fields @ 2010-10-30 21:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christoph Hellwig, Arnd Bergmann, Bryan Schumaker, linux-nfs,
	linux-kernel, J. Bruce Fields

Removing a lock shouldn't require any allocations; a failure due to
ENOMEM leaves the caller with a choice between retrying or giving up and
leaking an unused lease.

Next we should split the other lease calls into add and delete cases.
I wanted to start with just the bugfix.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/locks.c |   43 ++++++++++++++++++++++++++++++-------------
 1 files changed, 30 insertions(+), 13 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 50ec159..06c7773 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1441,7 +1441,8 @@ int generic_setlease(struct file *filp, long arg, struct file_lock **flp)
 	return 0;
 
 out:
-	locks_free_lock(lease);
+	if (arg != F_UNLCK)
+		locks_free_lock(lease);
 	return error;
 }
 EXPORT_SYMBOL(generic_setlease);
@@ -1493,17 +1494,16 @@ int vfs_setlease(struct file *filp, long arg, struct file_lock **lease)
 }
 EXPORT_SYMBOL_GPL(vfs_setlease);
 
-/**
- *	fcntl_setlease	-	sets a lease on an open file
- *	@fd: open file descriptor
- *	@filp: file pointer
- *	@arg: type of lease to obtain
- *
- *	Call this fcntl to establish a lease on the file.
- *	Note that you also need to call %F_SETSIG to
- *	receive a signal when the lease is broken.
- */
-int fcntl_setlease(unsigned int fd, struct file *filp, long arg)
+static int do_fcntl_delete_lease(struct file *filp)
+{
+	struct file_lock fl, *flp = &fl;
+
+	lease_init(filp, F_UNLCK, flp);
+
+	return vfs_setlease(filp, F_UNLCK, &flp);
+}
+
+static int do_fcntl_add_lease(unsigned int fd, struct file *filp, long arg)
 {
 	struct file_lock *fl;
 	struct fasync_struct *new;
@@ -1521,7 +1521,7 @@ int fcntl_setlease(unsigned int fd, struct file *filp, long arg)
 	}
 	lock_flocks();
 	error = __vfs_setlease(filp, arg, &fl);
-	if (error || arg == F_UNLCK)
+	if (error)
 		goto out_unlock;
 
 	/*
@@ -1550,6 +1550,23 @@ out_unlock:
 }
 
 /**
+ *	fcntl_setlease	-	sets a lease on an open file
+ *	@fd: open file descriptor
+ *	@filp: file pointer
+ *	@arg: type of lease to obtain
+ *
+ *	Call this fcntl to establish a lease on the file.
+ *	Note that you also need to call %F_SETSIG to
+ *	receive a signal when the lease is broken.
+ */
+int fcntl_setlease(unsigned int fd, struct file *filp, long arg)
+{
+	if (arg == F_UNLCK)
+		return do_fcntl_delete_lease(filp);
+	return do_fcntl_add_lease(fd, filp, arg);
+}
+
+/**
  * flock_lock_file_wait - Apply a FLOCK-style lock to a file
  * @filp: The file to apply the lock to
  * @fl: The lock to be applied
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 2/4] locks: fix leaks on setlease errors
  2010-10-30 21:25                             ` J. Bruce Fields
  2010-10-30 21:31                               ` [PATCH 1/4] locks: prevent ENOMEM on lease unlock J. Bruce Fields
@ 2010-10-30 21:31                               ` J. Bruce Fields
  2010-10-31 11:10                                 ` Christoph Hellwig
  2010-10-30 21:31                               ` [PATCH 3/4] locks: fix setlease methods to free passed-in lock J. Bruce Fields
                                                 ` (4 subsequent siblings)
  6 siblings, 1 reply; 62+ messages in thread
From: J. Bruce Fields @ 2010-10-30 21:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christoph Hellwig, Arnd Bergmann, Bryan Schumaker, linux-nfs,
	linux-kernel, J. Bruce Fields

We're depending on setlease to free the passed-in lease on failure.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/locks.c |   12 +++++++-----
 1 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 06c7773..63fbc41 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1371,20 +1371,22 @@ int generic_setlease(struct file *filp, long arg, struct file_lock **flp)
 	struct inode *inode = dentry->d_inode;
 	int error, rdlease_count = 0, wrlease_count = 0;
 
+	lease = *flp;
+
+	error = -EACCES;
 	if ((current_fsuid() != inode->i_uid) && !capable(CAP_LEASE))
-		return -EACCES;
+		goto out;
+	error = -EINVAL;
 	if (!S_ISREG(inode->i_mode))
-		return -EINVAL;
+		goto out;
 	error = security_file_lock(filp, arg);
 	if (error)
-		return error;
+		goto out;
 
 	time_out_leases(inode);
 
 	BUG_ON(!(*flp)->fl_lmops->fl_break);
 
-	lease = *flp;
-
 	if (arg != F_UNLCK) {
 		error = -EAGAIN;
 		if ((arg == F_RDLCK) && (atomic_read(&inode->i_writecount) > 0))
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 3/4] locks: fix setlease methods to free passed-in lock
  2010-10-30 21:25                             ` J. Bruce Fields
  2010-10-30 21:31                               ` [PATCH 1/4] locks: prevent ENOMEM on lease unlock J. Bruce Fields
  2010-10-30 21:31                               ` [PATCH 2/4] locks: fix leaks on setlease errors J. Bruce Fields
@ 2010-10-30 21:31                               ` J. Bruce Fields
  2010-10-30 21:31                               ` [PATCH 4/4] nfsd4: initialize delegation pointer to lease J. Bruce Fields
                                                 ` (3 subsequent siblings)
  6 siblings, 0 replies; 62+ messages in thread
From: J. Bruce Fields @ 2010-10-30 21:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christoph Hellwig, Arnd Bergmann, Bryan Schumaker, linux-nfs,
	linux-kernel, J. Bruce Fields, Steven Whitehouse, Steve French,
	Trond Myklebust

We modified setlease to require the caller to allocate the new lease in
the case of creating a new lease, but forgot to fix up the filesystem
methods.

Cc: Steven Whitehouse <swhiteho@redhat.com>
Cc: Steve French <sfrench@samba.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/cifs/cifsfs.c   |    5 ++++-
 fs/gfs2/file.c     |    2 ++
 fs/locks.c         |    3 ++-
 fs/nfs/file.c      |    3 ++-
 include/linux/fs.h |    1 +
 5 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
index 75c4eaa..54745b6 100644
--- a/fs/cifs/cifsfs.c
+++ b/fs/cifs/cifsfs.c
@@ -625,8 +625,11 @@ static int cifs_setlease(struct file *file, long arg, struct file_lock **lease)
 		   knows that the file won't be changed on the server
 		   by anyone else */
 		return generic_setlease(file, arg, lease);
-	else
+	else {
+		if (arg != F_UNLCK)
+			locks_free_lock(*lease);
 		return -EAGAIN;
+	}
 }
 
 struct file_system_type cifs_fs_type = {
diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index aa99647..ac943c1 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -629,6 +629,8 @@ static ssize_t gfs2_file_aio_write(struct kiocb *iocb, const struct iovec *iov,
 
 static int gfs2_setlease(struct file *file, long arg, struct file_lock **fl)
 {
+	if (arg != F_UNLCK)
+		locks_free_lock(*fl);
 	return -EINVAL;
 }
 
diff --git a/fs/locks.c b/fs/locks.c
index 63fbc41..5b526a9 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -186,7 +186,7 @@ void locks_release_private(struct file_lock *fl)
 EXPORT_SYMBOL_GPL(locks_release_private);
 
 /* Free a lock which is not in use. */
-static void locks_free_lock(struct file_lock *fl)
+void locks_free_lock(struct file_lock *fl)
 {
 	BUG_ON(waitqueue_active(&fl->fl_wait));
 	BUG_ON(!list_empty(&fl->fl_block));
@@ -195,6 +195,7 @@ static void locks_free_lock(struct file_lock *fl)
 	locks_release_private(fl);
 	kmem_cache_free(filelock_cache, fl);
 }
+EXPORT_SYMBOL(locks_free_lock);
 
 void locks_init_lock(struct file_lock *fl)
 {
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index e756075..1e524fb 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -884,6 +884,7 @@ static int nfs_setlease(struct file *file, long arg, struct file_lock **fl)
 	dprintk("NFS: setlease(%s/%s, arg=%ld)\n",
 			file->f_path.dentry->d_parent->d_name.name,
 			file->f_path.dentry->d_name.name, arg);
-
+	if (arg != F_UNLCK)
+		locks_free_lock(*fl);
 	return -EINVAL;
 }
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 7b7b507..1eb2939 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1129,6 +1129,7 @@ extern int fcntl_setlease(unsigned int fd, struct file *filp, long arg);
 extern int fcntl_getlease(struct file *filp);
 
 /* fs/locks.c */
+void locks_free_lock(struct file_lock *fl);
 extern void locks_init_lock(struct file_lock *);
 extern struct file_lock * locks_alloc_lock(void);
 extern void locks_copy_lock(struct file_lock *, struct file_lock *);
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH 4/4] nfsd4: initialize delegation pointer to lease
  2010-10-30 21:25                             ` J. Bruce Fields
                                                 ` (2 preceding siblings ...)
  2010-10-30 21:31                               ` [PATCH 3/4] locks: fix setlease methods to free passed-in lock J. Bruce Fields
@ 2010-10-30 21:31                               ` J. Bruce Fields
  2010-10-31  2:04                                 ` Christoph Hellwig
  2010-10-30 21:40                               ` nfsd changes for 2.6.37 Arnd Bergmann
                                                 ` (2 subsequent siblings)
  6 siblings, 1 reply; 62+ messages in thread
From: J. Bruce Fields @ 2010-10-30 21:31 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christoph Hellwig, Arnd Bergmann, Bryan Schumaker, linux-nfs,
	linux-kernel, J. Bruce Fields

The NFSv4 server was initializing the dp->dl_flock pointer by the
somewhat ridiculous method of a locks_copy_lock callback.

Now that setlease uses the passed-in lock instead of doing a copy,
dl_flock no longer gets set, resulting in the lock leaking on delegation
release, and later possible hangs (among other problems).

So, initialize dl_flock and get rid of the callback.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/nfsd/nfs4state.c |   19 ++-----------------
 1 files changed, 2 insertions(+), 17 deletions(-)

diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 56347e0..b7f818b 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -2310,22 +2310,6 @@ void nfsd_release_deleg_cb(struct file_lock *fl)
 }
 
 /*
- * Set the delegation file_lock back pointer.
- *
- * Called from setlease() with lock_kernel() held.
- */
-static
-void nfsd_copy_lock_deleg_cb(struct file_lock *new, struct file_lock *fl)
-{
-	struct nfs4_delegation *dp = (struct nfs4_delegation *)new->fl_owner;
-
-	dprintk("NFSD: nfsd_copy_lock_deleg_cb: new fl %p dp %p\n", new, dp);
-	if (!dp)
-		return;
-	dp->dl_flock = new;
-}
-
-/*
  * Called from setlease() with lock_kernel() held
  */
 static
@@ -2355,7 +2339,6 @@ int nfsd_change_deleg_cb(struct file_lock **onlist, int arg)
 static const struct lock_manager_operations nfsd_lease_mng_ops = {
 	.fl_break = nfsd_break_deleg_cb,
 	.fl_release_private = nfsd_release_deleg_cb,
-	.fl_copy_lock = nfsd_copy_lock_deleg_cb,
 	.fl_mylease = nfsd_same_client_deleg_cb,
 	.fl_change = nfsd_change_deleg_cb,
 };
@@ -2661,12 +2644,14 @@ nfs4_open_delegation(struct svc_fh *fh, struct nfsd4_open *open, struct nfs4_sta
 	fl->fl_file = find_readable_file(stp->st_file);
 	BUG_ON(!fl->fl_file);
 	fl->fl_pid = current->tgid;
+	dp->dl_flock = fl;
 
 	/* vfs_setlease checks to see if delegation should be handed out.
 	 * the lock_manager callbacks fl_mylease and fl_change are used
 	 */
 	if ((status = vfs_setlease(fl->fl_file, fl->fl_type, &fl))) {
 		dprintk("NFSD: setlease failed [%d], no delegation\n", status);
+		dp->dl_flock = NULL;
 		unhash_delegation(dp);
 		flag = NFS4_OPEN_DELEGATE_NONE;
 		goto out;
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-30 21:25                             ` J. Bruce Fields
                                                 ` (3 preceding siblings ...)
  2010-10-30 21:31                               ` [PATCH 4/4] nfsd4: initialize delegation pointer to lease J. Bruce Fields
@ 2010-10-30 21:40                               ` Arnd Bergmann
  2010-10-31  2:07                               ` Christoph Hellwig
  2010-10-31 12:34                               ` Christoph Hellwig
  6 siblings, 0 replies; 62+ messages in thread
From: Arnd Bergmann @ 2010-10-30 21:40 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Christoph Hellwig, Linus Torvalds, Bryan Schumaker, linux-nfs,
	linux-kernel

On Saturday 30 October 2010, J. Bruce Fields wrote:
> We still need to do this part.  I just did the minimum required to fix
> the problem--I figure splitting out a separate ->delete_lease method
> (which I agree would be cleaner) could wait till the next merge window.
> 
> Patches follow, tested on top of the merge of Arnd's patches; if they
> look OK they can go in, or I can commit them on top of a few pending
> nfsd fixes and send another pull request.

All four patches look good AFAICT.

Thanks for the follow-up!

	Arnd

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 4/4] nfsd4: initialize delegation pointer to lease
  2010-10-30 21:31                               ` [PATCH 4/4] nfsd4: initialize delegation pointer to lease J. Bruce Fields
@ 2010-10-31  2:04                                 ` Christoph Hellwig
  2010-10-31  3:04                                   ` J. Bruce Fields
  0 siblings, 1 reply; 62+ messages in thread
From: Christoph Hellwig @ 2010-10-31  2:04 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Linus Torvalds, Christoph Hellwig, Arnd Bergmann,
	Bryan Schumaker, linux-nfs, linux-kernel

On Sat, Oct 30, 2010 at 05:31:16PM -0400, J. Bruce Fields wrote:
> The NFSv4 server was initializing the dp->dl_flock pointer by the
> somewhat ridiculous method of a locks_copy_lock callback.
> 
> Now that setlease uses the passed-in lock instead of doing a copy,
> dl_flock no longer gets set, resulting in the lock leaking on delegation
> release, and later possible hangs (among other problems).
> 
> So, initialize dl_flock and get rid of the callback.

>From what I can see this was the only instance of 
lock_manager_operations.fl_copy_lock.  Please kill it while you're at
it.

Also lock_manager_operations.fl_release_private has only exact one
instance in nfs4d which is part of the same abuse scheme.  Please also
get rid of it.   I recently noticed this while updating
Documentation/filesystems/Locking for the grand new BKL-less world.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-30 21:25                             ` J. Bruce Fields
                                                 ` (4 preceding siblings ...)
  2010-10-30 21:40                               ` nfsd changes for 2.6.37 Arnd Bergmann
@ 2010-10-31  2:07                               ` Christoph Hellwig
  2010-10-31  3:05                                 ` J. Bruce Fields
  2010-10-31 12:34                               ` Christoph Hellwig
  6 siblings, 1 reply; 62+ messages in thread
From: Christoph Hellwig @ 2010-10-31  2:07 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Christoph Hellwig, Arnd Bergmann, Linus Torvalds,
	Bryan Schumaker, linux-nfs, linux-kernel

On Sat, Oct 30, 2010 at 05:25:00PM -0400, J. Bruce Fields wrote:
> We still need to do this part.  I just did the minimum required to fix
> the problem--I figure splitting out a separate ->delete_lease method
> (which I agree would be cleaner) could wait till the next merge window.

I'd say please do it now.  The pre-allocated file_lock changes make
the semantics inside ->setlease really nasty right now.  I think it's a
dumb idea to leave it that way even for a single merge window.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 4/4] nfsd4: initialize delegation pointer to lease
  2010-10-31  2:04                                 ` Christoph Hellwig
@ 2010-10-31  3:04                                   ` J. Bruce Fields
  0 siblings, 0 replies; 62+ messages in thread
From: J. Bruce Fields @ 2010-10-31  3:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: J. Bruce Fields, Linus Torvalds, Arnd Bergmann, Bryan Schumaker,
	linux-nfs, linux-kernel

On Sat, Oct 30, 2010 at 10:04:31PM -0400, Christoph Hellwig wrote:
> On Sat, Oct 30, 2010 at 05:31:16PM -0400, J. Bruce Fields wrote:
> > The NFSv4 server was initializing the dp->dl_flock pointer by the
> > somewhat ridiculous method of a locks_copy_lock callback.
> > 
> > Now that setlease uses the passed-in lock instead of doing a copy,
> > dl_flock no longer gets set, resulting in the lock leaking on delegation
> > release, and later possible hangs (among other problems).
> > 
> > So, initialize dl_flock and get rid of the callback.
> 
> >From what I can see this was the only instance of 
> lock_manager_operations.fl_copy_lock.  Please kill it while you're at
> it.
> 
> Also lock_manager_operations.fl_release_private has only exact one
> instance in nfs4d which is part of the same abuse scheme.  Please also
> get rid of it.   I recently noticed this while updating
> Documentation/filesystems/Locking for the grand new BKL-less world.

Yeah, I wanted to maximize chances of getting the minimal fixes into
-rc1, but I've got patches for that queued up too.  I was planning on
waiting for the next merge window, but I'm happy enough to send them in
now.

--b.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-31  2:07                               ` Christoph Hellwig
@ 2010-10-31  3:05                                 ` J. Bruce Fields
  0 siblings, 0 replies; 62+ messages in thread
From: J. Bruce Fields @ 2010-10-31  3:05 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel

On Sat, Oct 30, 2010 at 10:07:08PM -0400, Christoph Hellwig wrote:
> On Sat, Oct 30, 2010 at 05:25:00PM -0400, J. Bruce Fields wrote:
> > We still need to do this part.  I just did the minimum required to fix
> > the problem--I figure splitting out a separate ->delete_lease method
> > (which I agree would be cleaner) could wait till the next merge window.
> 
> I'd say please do it now.  The pre-allocated file_lock changes make
> the semantics inside ->setlease really nasty right now.  I think it's a
> dumb idea to leave it that way even for a single merge window.

OK, will do.

--b.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 2/4] locks: fix leaks on setlease errors
  2010-10-30 21:31                               ` [PATCH 2/4] locks: fix leaks on setlease errors J. Bruce Fields
@ 2010-10-31 11:10                                 ` Christoph Hellwig
  2010-11-01 17:24                                   ` J. Bruce Fields
  0 siblings, 1 reply; 62+ messages in thread
From: Christoph Hellwig @ 2010-10-31 11:10 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Linus Torvalds, Christoph Hellwig, Arnd Bergmann,
	Bryan Schumaker, linux-nfs, linux-kernel

On Sat, Oct 30, 2010 at 05:31:14PM -0400, J. Bruce Fields wrote:
> We're depending on setlease to free the passed-in lease on failure.

But we would be much better to just free it in the caller.  I'ts much
more natural - caller allocates, caller frees, and it's also simpler.

I'll send a patch to do so shortly, together with sorting out the
remaining nfs4d lock_manager_operations abuses.  I think we're set with
that for 2.6.37, the setlease split can wait once we've sorted out the
lock freeing issue.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-30 21:25                             ` J. Bruce Fields
                                                 ` (5 preceding siblings ...)
  2010-10-31  2:07                               ` Christoph Hellwig
@ 2010-10-31 12:34                               ` Christoph Hellwig
  2010-10-31 12:35                                 ` [PATCH 1/2] locks: let the caller free file_lock on ->setlease failure Christoph Hellwig
                                                   ` (2 more replies)
  6 siblings, 3 replies; 62+ messages in thread
From: Christoph Hellwig @ 2010-10-31 12:34 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel

And two more fixups on top of the set sent by Bruce that already are
in Linus' tree.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 1/2] locks: let the caller free file_lock on ->setlease failure
  2010-10-31 12:34                               ` Christoph Hellwig
@ 2010-10-31 12:35                                 ` Christoph Hellwig
  2010-11-03 20:41                                   ` J. Bruce Fields
  2010-10-31 12:35                                 ` [PATCH 2/2] locks: remove fl_copy_lock lock_manager operation Christoph Hellwig
  2010-11-01 15:02                                 ` nfsd changes for 2.6.37 J. Bruce Fields
  2 siblings, 1 reply; 62+ messages in thread
From: Christoph Hellwig @ 2010-10-31 12:35 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel

The caller allocated it, the caller should free it.  The only issue so
far is that we could change the flp pointer even on an error return if
the fl_change callback failed.  But we can simply move the flp assignment
after the fl_change invocation, as the callers don't care about the
flp return value if the setlease call failed.

Signed-off-by: Christoph Hellwig <hch@lst.de>

Index: linux-2.6/fs/cifs/cifsfs.c
===================================================================
--- linux-2.6.orig/fs/cifs/cifsfs.c	2010-10-31 07:10:07.636004223 -0400
+++ linux-2.6/fs/cifs/cifsfs.c	2010-10-31 07:10:10.922004154 -0400
@@ -625,11 +625,8 @@ static int cifs_setlease(struct file *fi
 		   knows that the file won't be changed on the server
 		   by anyone else */
 		return generic_setlease(file, arg, lease);
-	else {
-		if (arg != F_UNLCK)
-			locks_free_lock(*lease);
+	else
 		return -EAGAIN;
-	}
 }
 
 struct file_system_type cifs_fs_type = {
Index: linux-2.6/fs/gfs2/file.c
===================================================================
--- linux-2.6.orig/fs/gfs2/file.c	2010-10-31 07:10:07.643004363 -0400
+++ linux-2.6/fs/gfs2/file.c	2010-10-31 07:10:10.923003665 -0400
@@ -629,8 +629,6 @@ static ssize_t gfs2_file_aio_write(struc
 
 static int gfs2_setlease(struct file *file, long arg, struct file_lock **fl)
 {
-	if (arg != F_UNLCK)
-		locks_free_lock(*fl);
 	return -EINVAL;
 }
 
Index: linux-2.6/fs/locks.c
===================================================================
--- linux-2.6.orig/fs/locks.c	2010-10-31 07:10:07.649004084 -0400
+++ linux-2.6/fs/locks.c	2010-10-31 07:34:10.102255587 -0400
@@ -1428,8 +1425,9 @@ int generic_setlease(struct file *filp,
 		goto out;
 
 	if (my_before != NULL) {
-		*flp = *my_before;
 		error = lease->fl_lmops->fl_change(my_before, arg);
+		if (!error)
+			*flp = *my_before;
 		goto out;
 	}
 
@@ -1444,8 +1442,6 @@ int generic_setlease(struct file *filp,
 	return 0;
 
 out:
-	if (arg != F_UNLCK)
-		locks_free_lock(lease);
 	return error;
 }
 EXPORT_SYMBOL(generic_setlease);
@@ -1524,8 +1520,11 @@ static int do_fcntl_add_lease(unsigned i
 	}
 	lock_flocks();
 	error = __vfs_setlease(filp, arg, &fl);
-	if (error)
-		goto out_unlock;
+	if (error) {
+		unlock_flocks();
+		locks_free_lock(fl);
+		goto out_free_fasync;
+	}
 
 	/*
 	 * fasync_insert_entry() returns the old entry if any.
@@ -1541,12 +1540,12 @@ static int do_fcntl_add_lease(unsigned i
 		fl->fl_type = F_UNLCK | F_INPROGRESS;
 		fl->fl_break_time = jiffies - 10;
 		time_out_leases(inode);
-		goto out_unlock;
+	} else {
+		error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0);
 	}
-
-	error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0);
-out_unlock:
 	unlock_flocks();
+
+out_free_fasync:
 	if (new)
 		fasync_free(new);
 	return error;
Index: linux-2.6/fs/nfs/file.c
===================================================================
--- linux-2.6.orig/fs/nfs/file.c	2010-10-31 07:10:07.658003804 -0400
+++ linux-2.6/fs/nfs/file.c	2010-10-31 07:10:10.936003734 -0400
@@ -884,7 +884,5 @@ static int nfs_setlease(struct file *fil
 	dprintk("NFS: setlease(%s/%s, arg=%ld)\n",
 			file->f_path.dentry->d_parent->d_name.name,
 			file->f_path.dentry->d_name.name, arg);
-	if (arg != F_UNLCK)
-		locks_free_lock(*fl);
 	return -EINVAL;
 }
Index: linux-2.6/fs/nfsd/nfs4state.c
===================================================================
--- linux-2.6.orig/fs/nfsd/nfs4state.c	2010-10-31 07:10:07.666004084 -0400
+++ linux-2.6/fs/nfsd/nfs4state.c	2010-10-31 07:32:56.906254608 -0400
@@ -2652,6 +2652,7 @@ nfs4_open_delegation(struct svc_fh *fh,
 	if ((status = vfs_setlease(fl->fl_file, fl->fl_type, &fl))) {
 		dprintk("NFSD: setlease failed [%d], no delegation\n", status);
 		dp->dl_flock = NULL;
+		locks_free_lock(fl);
 		unhash_delegation(dp);
 		flag = NFS4_OPEN_DELEGATE_NONE;
 		goto out;

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH 2/2] locks: remove fl_copy_lock lock_manager operation
  2010-10-31 12:34                               ` Christoph Hellwig
  2010-10-31 12:35                                 ` [PATCH 1/2] locks: let the caller free file_lock on ->setlease failure Christoph Hellwig
@ 2010-10-31 12:35                                 ` Christoph Hellwig
  2010-11-01 15:02                                 ` nfsd changes for 2.6.37 J. Bruce Fields
  2 siblings, 0 replies; 62+ messages in thread
From: Christoph Hellwig @ 2010-10-31 12:35 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel

This one was only used for a nasty hack in nfsd, which has recently
been removed.

Signed-off-by: Christoph Hellwig <hch@lst.de>

Index: linux-2.6/fs/locks.c
===================================================================
--- linux-2.6.orig/fs/locks.c	2010-10-31 07:10:07.649004084 -0400
+++ linux-2.6/fs/locks.c	2010-10-31 07:34:10.102255587 -0400
@@ -235,11 +235,8 @@ static void locks_copy_private(struct fi
 			fl->fl_ops->fl_copy_lock(new, fl);
 		new->fl_ops = fl->fl_ops;
 	}
-	if (fl->fl_lmops) {
-		if (fl->fl_lmops->fl_copy_lock)
-			fl->fl_lmops->fl_copy_lock(new, fl);
+	if (fl->fl_lmops)
 		new->fl_lmops = fl->fl_lmops;
-	}
 }
 
 /*
Index: linux-2.6/Documentation/filesystems/Locking
===================================================================
--- linux-2.6.orig/Documentation/filesystems/Locking	2010-10-31 07:34:13.269012536 -0400
+++ linux-2.6/Documentation/filesystems/Locking	2010-10-31 07:34:20.567292713 -0400
@@ -322,7 +322,6 @@ fl_release_private:	yes	yes
 prototypes:
 	int (*fl_compare_owner)(struct file_lock *, struct file_lock *);
 	void (*fl_notify)(struct file_lock *);  /* unblock callback */
-	void (*fl_copy_lock)(struct file_lock *, struct file_lock *);
 	void (*fl_release_private)(struct file_lock *);
 	void (*fl_break)(struct file_lock *); /* break_lease callback */
 
@@ -330,7 +329,6 @@ locking rules:
 			BKL	may block
 fl_compare_owner:	yes	no
 fl_notify:		yes	no
-fl_copy_lock:		yes	no
 fl_release_private:	yes	yes
 fl_break:		yes	no
 
Index: linux-2.6/include/linux/fs.h
===================================================================
--- linux-2.6.orig/include/linux/fs.h	2010-10-31 07:33:13.131262640 -0400
+++ linux-2.6/include/linux/fs.h	2010-10-31 07:33:25.959005833 -0400
@@ -1056,7 +1056,6 @@ struct lock_manager_operations {
 	int (*fl_compare_owner)(struct file_lock *, struct file_lock *);
 	void (*fl_notify)(struct file_lock *);	/* unblock callback */
 	int (*fl_grant)(struct file_lock *, struct file_lock *, int);
-	void (*fl_copy_lock)(struct file_lock *, struct file_lock *);
 	void (*fl_release_private)(struct file_lock *);
 	void (*fl_break)(struct file_lock *);
 	int (*fl_mylease)(struct file_lock *, struct file_lock *);

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-10-31 12:34                               ` Christoph Hellwig
  2010-10-31 12:35                                 ` [PATCH 1/2] locks: let the caller free file_lock on ->setlease failure Christoph Hellwig
  2010-10-31 12:35                                 ` [PATCH 2/2] locks: remove fl_copy_lock lock_manager operation Christoph Hellwig
@ 2010-11-01 15:02                                 ` J. Bruce Fields
  2010-11-06 19:04                                   ` Christoph Hellwig
  2 siblings, 1 reply; 62+ messages in thread
From: J. Bruce Fields @ 2010-11-01 15:02 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel

On Sun, Oct 31, 2010 at 08:34:20AM -0400, Christoph Hellwig wrote:
> And two more fixups on top of the set sent by Bruce that already are
> in Linus' tree.

Both look good to me, thanks!

--b.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 2/4] locks: fix leaks on setlease errors
  2010-10-31 11:10                                 ` Christoph Hellwig
@ 2010-11-01 17:24                                   ` J. Bruce Fields
  2010-11-01 17:41                                     ` Christoph Hellwig
  0 siblings, 1 reply; 62+ messages in thread
From: J. Bruce Fields @ 2010-11-01 17:24 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: J. Bruce Fields, Linus Torvalds, Arnd Bergmann, Bryan Schumaker,
	linux-nfs, linux-kernel

On Sun, Oct 31, 2010 at 07:10:57AM -0400, Christoph Hellwig wrote:
> On Sat, Oct 30, 2010 at 05:31:14PM -0400, J. Bruce Fields wrote:
> > We're depending on setlease to free the passed-in lease on failure.
> 
> But we would be much better to just free it in the caller.  I'ts much
> more natural - caller allocates, caller frees, and it's also simpler.
> 
> I'll send a patch to do so shortly, together with sorting out the
> remaining nfs4d lock_manager_operations abuses.  I think we're set with
> that for 2.6.37, the setlease split can wait once we've sorted out the
> lock freeing issue.

I also have patches that get rid of fl_release_private, fl_mylease, and
(almost done) fl_change.

Unless you've a better suggestion I'll probably send them out for review
and then queue them up with other nfsd changes for 2.6.38.

--b.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 2/4] locks: fix leaks on setlease errors
  2010-11-01 17:24                                   ` J. Bruce Fields
@ 2010-11-01 17:41                                     ` Christoph Hellwig
  2010-11-01 18:34                                       ` J. Bruce Fields
  0 siblings, 1 reply; 62+ messages in thread
From: Christoph Hellwig @ 2010-11-01 17:41 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Christoph Hellwig, J. Bruce Fields, Linus Torvalds,
	Arnd Bergmann, Bryan Schumaker, linux-nfs, linux-kernel

On Mon, Nov 01, 2010 at 01:24:40PM -0400, J. Bruce Fields wrote:
> I also have patches that get rid of fl_release_private, fl_mylease, and
> (almost done) fl_change.
> 
> Unless you've a better suggestion I'll probably send them out for review
> and then queue them up with other nfsd changes for 2.6.38.

Sounds good.  I was also wondering if we can get rid of ->setlease
entirely.  The file_lock_operations should be enough abstraction to
reject the leases in theory, but I need to look into it a bit more.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 2/4] locks: fix leaks on setlease errors
  2010-11-01 17:41                                     ` Christoph Hellwig
@ 2010-11-01 18:34                                       ` J. Bruce Fields
  0 siblings, 0 replies; 62+ messages in thread
From: J. Bruce Fields @ 2010-11-01 18:34 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: J. Bruce Fields, Linus Torvalds, Arnd Bergmann, Bryan Schumaker,
	linux-nfs, linux-kernel

On Mon, Nov 01, 2010 at 01:41:22PM -0400, Christoph Hellwig wrote:
> On Mon, Nov 01, 2010 at 01:24:40PM -0400, J. Bruce Fields wrote:
> > I also have patches that get rid of fl_release_private, fl_mylease, and
> > (almost done) fl_change.
> > 
> > Unless you've a better suggestion I'll probably send them out for review
> > and then queue them up with other nfsd changes for 2.6.38.
> 
> Sounds good.  I was also wondering if we can get rid of ->setlease
> entirely.  The file_lock_operations should be enough abstraction to
> reject the leases in theory, but I need to look into it a bit more.

Note that cifs has a real lease operation now.  No idea how it works or
if it's really correct.

--b.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 1/2] locks: let the caller free file_lock on ->setlease failure
  2010-10-31 12:35                                 ` [PATCH 1/2] locks: let the caller free file_lock on ->setlease failure Christoph Hellwig
@ 2010-11-03 20:41                                   ` J. Bruce Fields
  2010-11-04  1:40                                     ` J. Bruce Fields
  0 siblings, 1 reply; 62+ messages in thread
From: J. Bruce Fields @ 2010-11-03 20:41 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel

On Sun, Oct 31, 2010 at 08:35:10AM -0400, Christoph Hellwig wrote:
> The caller allocated it, the caller should free it.  The only issue so
> far is that we could change the flp pointer even on an error return if
> the fl_change callback failed.  But we can simply move the flp assignment
> after the fl_change invocation, as the callers don't care about the
> flp return value if the setlease call failed.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> 
> Index: linux-2.6/fs/cifs/cifsfs.c
> ===================================================================
> --- linux-2.6.orig/fs/cifs/cifsfs.c	2010-10-31 07:10:07.636004223 -0400
> +++ linux-2.6/fs/cifs/cifsfs.c	2010-10-31 07:10:10.922004154 -0400
> @@ -625,11 +625,8 @@ static int cifs_setlease(struct file *fi
>  		   knows that the file won't be changed on the server
>  		   by anyone else */
>  		return generic_setlease(file, arg, lease);
> -	else {
> -		if (arg != F_UNLCK)
> -			locks_free_lock(*lease);
> +	else
>  		return -EAGAIN;
> -	}
>  }
>  
>  struct file_system_type cifs_fs_type = {
> Index: linux-2.6/fs/gfs2/file.c
> ===================================================================
> --- linux-2.6.orig/fs/gfs2/file.c	2010-10-31 07:10:07.643004363 -0400
> +++ linux-2.6/fs/gfs2/file.c	2010-10-31 07:10:10.923003665 -0400
> @@ -629,8 +629,6 @@ static ssize_t gfs2_file_aio_write(struc
>  
>  static int gfs2_setlease(struct file *file, long arg, struct file_lock **fl)
>  {
> -	if (arg != F_UNLCK)
> -		locks_free_lock(*fl);
>  	return -EINVAL;
>  }
>  
> Index: linux-2.6/fs/locks.c
> ===================================================================
> --- linux-2.6.orig/fs/locks.c	2010-10-31 07:10:07.649004084 -0400
> +++ linux-2.6/fs/locks.c	2010-10-31 07:34:10.102255587 -0400
> @@ -1428,8 +1425,9 @@ int generic_setlease(struct file *filp,
>  		goto out;
>  
>  	if (my_before != NULL) {
> -		*flp = *my_before;
>  		error = lease->fl_lmops->fl_change(my_before, arg);
> +		if (!error)
> +			*flp = *my_before;

Argh, missed this: we're leaking the passed-in lease in this case.

--b.

>  		goto out;
>  	}
>  
> @@ -1444,8 +1442,6 @@ int generic_setlease(struct file *filp,
>  	return 0;
>  
>  out:
> -	if (arg != F_UNLCK)
> -		locks_free_lock(lease);
>  	return error;
>  }
>  EXPORT_SYMBOL(generic_setlease);
> @@ -1524,8 +1520,11 @@ static int do_fcntl_add_lease(unsigned i
>  	}
>  	lock_flocks();
>  	error = __vfs_setlease(filp, arg, &fl);
> -	if (error)
> -		goto out_unlock;
> +	if (error) {
> +		unlock_flocks();
> +		locks_free_lock(fl);
> +		goto out_free_fasync;
> +	}
>  
>  	/*
>  	 * fasync_insert_entry() returns the old entry if any.
> @@ -1541,12 +1540,12 @@ static int do_fcntl_add_lease(unsigned i
>  		fl->fl_type = F_UNLCK | F_INPROGRESS;
>  		fl->fl_break_time = jiffies - 10;
>  		time_out_leases(inode);
> -		goto out_unlock;
> +	} else {
> +		error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0);
>  	}
> -
> -	error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0);
> -out_unlock:
>  	unlock_flocks();
> +
> +out_free_fasync:
>  	if (new)
>  		fasync_free(new);
>  	return error;
> Index: linux-2.6/fs/nfs/file.c
> ===================================================================
> --- linux-2.6.orig/fs/nfs/file.c	2010-10-31 07:10:07.658003804 -0400
> +++ linux-2.6/fs/nfs/file.c	2010-10-31 07:10:10.936003734 -0400
> @@ -884,7 +884,5 @@ static int nfs_setlease(struct file *fil
>  	dprintk("NFS: setlease(%s/%s, arg=%ld)\n",
>  			file->f_path.dentry->d_parent->d_name.name,
>  			file->f_path.dentry->d_name.name, arg);
> -	if (arg != F_UNLCK)
> -		locks_free_lock(*fl);
>  	return -EINVAL;
>  }
> Index: linux-2.6/fs/nfsd/nfs4state.c
> ===================================================================
> --- linux-2.6.orig/fs/nfsd/nfs4state.c	2010-10-31 07:10:07.666004084 -0400
> +++ linux-2.6/fs/nfsd/nfs4state.c	2010-10-31 07:32:56.906254608 -0400
> @@ -2652,6 +2652,7 @@ nfs4_open_delegation(struct svc_fh *fh,
>  	if ((status = vfs_setlease(fl->fl_file, fl->fl_type, &fl))) {
>  		dprintk("NFSD: setlease failed [%d], no delegation\n", status);
>  		dp->dl_flock = NULL;
> +		locks_free_lock(fl);
>  		unhash_delegation(dp);
>  		flag = NFS4_OPEN_DELEGATE_NONE;
>  		goto out;

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 1/2] locks: let the caller free file_lock on ->setlease failure
  2010-11-03 20:41                                   ` J. Bruce Fields
@ 2010-11-04  1:40                                     ` J. Bruce Fields
  2010-11-04  1:41                                       ` J. Bruce Fields
  2010-11-06 19:03                                       ` Christoph Hellwig
  0 siblings, 2 replies; 62+ messages in thread
From: J. Bruce Fields @ 2010-11-04  1:40 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel

On Wed, Nov 03, 2010 at 04:41:48PM -0400, J. Bruce Fields wrote:
> On Sun, Oct 31, 2010 at 08:35:10AM -0400, Christoph Hellwig wrote:
> > Index: linux-2.6/fs/locks.c
> > ===================================================================
> > --- linux-2.6.orig/fs/locks.c	2010-10-31 07:10:07.649004084 -0400
> > +++ linux-2.6/fs/locks.c	2010-10-31 07:34:10.102255587 -0400
> > @@ -1428,8 +1425,9 @@ int generic_setlease(struct file *filp,
> >  		goto out;
> >  
> >  	if (my_before != NULL) {
> > -		*flp = *my_before;
> >  		error = lease->fl_lmops->fl_change(my_before, arg);
> > +		if (!error)
> > +			*flp = *my_before;
> 
> Argh, missed this: we're leaking the passed-in lease in this case.

We could do something like this.

The irritating thing is that the only lease user I understand is the
nfsd code, and it doesn't want this lease-merging behavior; the only
reason that fl_change is there is so it can just turn this case into an
error every time.

And I have no idea what the requirements are of any other users: do
leases behave like this on purpose, or was it just an arbitrary choice,
and does anyone depend on it now?

In the end maybe it would be better just to leave leases as they are and
define a new lock type for nfsd.

We'd probably have to do that eventually anyway, and it'd save me trying
to guess what the lease semantics are supposed to be....

--b.

>From f5c6d51ae638af213d8bda31504f4b2287b8a801 Mon Sep 17 00:00:00 2001
From: J. Bruce Fields <bfields@redhat.com>
Date: Wed, 3 Nov 2010 16:49:44 -0400
Subject: [PATCH 1/2] locks: fix leak on merging leases

We must also free the passed-in lease in the case it wasn't used because
an existing lease was upgrade/downgraded or already existed.

Note the nfsd caller doesn't care because it's fl_change callback
returns an error in those cases.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/locks.c |    7 +++++--
 1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 65765cb..61c22f7 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1504,7 +1504,7 @@ static int do_fcntl_delete_lease(struct file *filp)
 
 static int do_fcntl_add_lease(unsigned int fd, struct file *filp, long arg)
 {
-	struct file_lock *fl;
+	struct file_lock *fl, *ret;
 	struct fasync_struct *new;
 	struct inode *inode = filp->f_path.dentry->d_inode;
 	int error;
@@ -1518,6 +1518,7 @@ static int do_fcntl_add_lease(unsigned int fd, struct file *filp, long arg)
 		locks_free_lock(fl);
 		return -ENOMEM;
 	}
+	ret = fl;
 	lock_flocks();
 	error = __vfs_setlease(filp, arg, &fl);
 	if (error) {
@@ -1525,6 +1526,8 @@ static int do_fcntl_add_lease(unsigned int fd, struct file *filp, long arg)
 		locks_free_lock(fl);
 		goto out_free_fasync;
 	}
+	if (ret != fl)
+		locks_free_lock(fl);
 
 	/*
 	 * fasync_insert_entry() returns the old entry if any.
@@ -1532,7 +1535,7 @@ static int do_fcntl_add_lease(unsigned int fd, struct file *filp, long arg)
 	 * inserted it into the fasync list. Clear new so that
 	 * we don't release it here.
 	 */
-	if (!fasync_insert_entry(fd, filp, &fl->fl_fasync, new))
+	if (!fasync_insert_entry(fd, filp, &ret->fl_fasync, new))
 		new = NULL;
 
 	if (error < 0) {
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH 1/2] locks: let the caller free file_lock on ->setlease failure
  2010-11-04  1:40                                     ` J. Bruce Fields
@ 2010-11-04  1:41                                       ` J. Bruce Fields
  2010-11-06 19:03                                         ` Christoph Hellwig
  2010-11-06 19:03                                       ` Christoph Hellwig
  1 sibling, 1 reply; 62+ messages in thread
From: J. Bruce Fields @ 2010-11-04  1:41 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs, linux-kernel

On Wed, Nov 03, 2010 at 09:40:24PM -0400, J. Bruce Fields wrote:
> On Wed, Nov 03, 2010 at 04:41:48PM -0400, J. Bruce Fields wrote:
> > On Sun, Oct 31, 2010 at 08:35:10AM -0400, Christoph Hellwig wrote:
> > > Index: linux-2.6/fs/locks.c
> > > ===================================================================
> > > --- linux-2.6.orig/fs/locks.c	2010-10-31 07:10:07.649004084 -0400
> > > +++ linux-2.6/fs/locks.c	2010-10-31 07:34:10.102255587 -0400
> > > @@ -1428,8 +1425,9 @@ int generic_setlease(struct file *filp,
> > >  		goto out;
> > >  
> > >  	if (my_before != NULL) {
> > > -		*flp = *my_before;
> > >  		error = lease->fl_lmops->fl_change(my_before, arg);
> > > +		if (!error)
> > > +			*flp = *my_before;
> > 
> > Argh, missed this: we're leaking the passed-in lease in this case.
> 
> We could do something like this.

And while I was here I noticed:

--b.

>From 072be1f975c6f839661cfc4f6c45ccb944417eea Mon Sep 17 00:00:00 2001
From: J. Bruce Fields <bfields@redhat.com>
Date: Wed, 3 Nov 2010 18:09:18 -0400
Subject: [PATCH 2/2] locks: remove dead lease error-handling code

A minor oversight from f7347ce4ee7c65415f84be915c018473e7076f31,
"fasync: re-organize fasync entry insertion to allow it under a
spinlock": this cleanup-on-error was only needed to handle -ENOMEM.  Now
that we're preallocating it's unneeded.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
---
 fs/locks.c |   12 ++----------
 1 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 61c22f7..0e62dd3 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1506,7 +1506,6 @@ static int do_fcntl_add_lease(unsigned int fd, struct file *filp, long arg)
 {
 	struct file_lock *fl, *ret;
 	struct fasync_struct *new;
-	struct inode *inode = filp->f_path.dentry->d_inode;
 	int error;
 
 	fl = lease_alloc(filp, arg);
@@ -1520,7 +1519,7 @@ static int do_fcntl_add_lease(unsigned int fd, struct file *filp, long arg)
 	}
 	ret = fl;
 	lock_flocks();
-	error = __vfs_setlease(filp, arg, &fl);
+	error = __vfs_setlease(filp, arg, &ret);
 	if (error) {
 		unlock_flocks();
 		locks_free_lock(fl);
@@ -1538,14 +1537,7 @@ static int do_fcntl_add_lease(unsigned int fd, struct file *filp, long arg)
 	if (!fasync_insert_entry(fd, filp, &ret->fl_fasync, new))
 		new = NULL;
 
-	if (error < 0) {
-		/* remove lease just inserted by setlease */
-		fl->fl_type = F_UNLCK | F_INPROGRESS;
-		fl->fl_break_time = jiffies - 10;
-		time_out_leases(inode);
-	} else {
-		error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0);
-	}
+	error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0);
 	unlock_flocks();
 
 out_free_fasync:
-- 
1.7.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH 1/2] locks: let the caller free file_lock on ->setlease failure
  2010-11-04  1:40                                     ` J. Bruce Fields
  2010-11-04  1:41                                       ` J. Bruce Fields
@ 2010-11-06 19:03                                       ` Christoph Hellwig
  2010-11-08 16:10                                         ` J. Bruce Fields
  1 sibling, 1 reply; 62+ messages in thread
From: Christoph Hellwig @ 2010-11-06 19:03 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Christoph Hellwig, Arnd Bergmann, Linus Torvalds,
	Bryan Schumaker, linux-nfs, linux-kernel, willy, sfr

On Wed, Nov 03, 2010 at 09:40:24PM -0400, J. Bruce Fields wrote:
> The irritating thing is that the only lease user I understand is the
> nfsd code, and it doesn't want this lease-merging behavior; the only
> reason that fl_change is there is so it can just turn this case into an
> error every time.

Yes.

> And I have no idea what the requirements are of any other users: do
> leases behave like this on purpose, or was it just an arbitrary choice,
> and does anyone depend on it now?

Adding Willy and Stephen to the Cc list as they wrote the code.

> In the end maybe it would be better just to leave leases as they are and
> define a new lock type for nfsd.
>
> We'd probably have to do that eventually anyway, and it'd save me trying
> to guess what the lease semantics are supposed to be....

I'd rather see both leases and the nfs4 delegations detangled from the
locks.c code.  It's far too much of a mess already anyway.

> Subject: [PATCH 1/2] locks: fix leak on merging leases
> 
> We must also free the passed-in lease in the case it wasn't used because
> an existing lease was upgrade/downgraded or already existed.
> 
> Note the nfsd caller doesn't care because it's fl_change callback
> returns an error in those cases.

The patch looks good to me.  Care to feed it to Linus?


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 1/2] locks: let the caller free file_lock on ->setlease failure
  2010-11-04  1:41                                       ` J. Bruce Fields
@ 2010-11-06 19:03                                         ` Christoph Hellwig
  0 siblings, 0 replies; 62+ messages in thread
From: Christoph Hellwig @ 2010-11-06 19:03 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Christoph Hellwig, Arnd Bergmann, Linus Torvalds,
	Bryan Schumaker, linux-nfs, linux-kernel

> A minor oversight from f7347ce4ee7c65415f84be915c018473e7076f31,
> "fasync: re-organize fasync entry insertion to allow it under a
> spinlock": this cleanup-on-error was only needed to handle -ENOMEM.  Now
> that we're preallocating it's unneeded.

Looks good to me, too.


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: nfsd changes for 2.6.37
  2010-11-01 15:02                                 ` nfsd changes for 2.6.37 J. Bruce Fields
@ 2010-11-06 19:04                                   ` Christoph Hellwig
  0 siblings, 0 replies; 62+ messages in thread
From: Christoph Hellwig @ 2010-11-06 19:04 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Christoph Hellwig, Arnd Bergmann, Linus Torvalds,
	Bryan Schumaker, linux-nfs, linux-kernel

Btw, the nfsd delegation code still has a few comments referring to the
BKL.  This should probably be cleaned up:

fs/nfsd/nfs4state.c: * Called from break_lease() with lock_kernel() held.
fs/nfsd/nfs4state.c:    /* only place dl_time is set. protected by lock_kernel*/
fs/nfsd/nfs4state.c: * Called by locks_free_lock() with lock_kernel() held.
fs/nfsd/nfs4state.c: * Called from setlease() with lock_kernel() held


^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH 1/2] locks: let the caller free file_lock on ->setlease failure
  2010-11-06 19:03                                       ` Christoph Hellwig
@ 2010-11-08 16:10                                         ` J. Bruce Fields
  0 siblings, 0 replies; 62+ messages in thread
From: J. Bruce Fields @ 2010-11-08 16:10 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Arnd Bergmann, Linus Torvalds, Bryan Schumaker, linux-nfs,
	linux-kernel, willy, sfr

On Sat, Nov 06, 2010 at 03:03:32PM -0400, Christoph Hellwig wrote:
> On Wed, Nov 03, 2010 at 09:40:24PM -0400, J. Bruce Fields wrote:
> > The irritating thing is that the only lease user I understand is the
> > nfsd code, and it doesn't want this lease-merging behavior; the only
> > reason that fl_change is there is so it can just turn this case into an
> > error every time.
> 
> Yes.
> 
> > And I have no idea what the requirements are of any other users: do
> > leases behave like this on purpose, or was it just an arbitrary choice,
> > and does anyone depend on it now?
> 
> Adding Willy and Stephen to the Cc list as they wrote the code.
> 
> > In the end maybe it would be better just to leave leases as they are and
> > define a new lock type for nfsd.
> >
> > We'd probably have to do that eventually anyway, and it'd save me trying
> > to guess what the lease semantics are supposed to be....
> 
> I'd rather see both leases and the nfs4 delegations detangled from the
> locks.c code.

What are you thinking of?

> It's far too much of a mess already anyway.
> 
> > Subject: [PATCH 1/2] locks: fix leak on merging leases
> > 
> > We must also free the passed-in lease in the case it wasn't used because
> > an existing lease was upgrade/downgraded or already existed.
> > 
> > Note the nfsd caller doesn't care because it's fl_change callback
> > returns an error in those cases.
> 
> The patch looks good to me.  Care to feed it to Linus?

Yep, will do.

--b.

^ permalink raw reply	[flat|nested] 62+ messages in thread

end of thread, other threads:[~2010-11-08 16:11 UTC | newest]

Thread overview: 62+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-10-26 16:45 nfsd changes for 2.6.37 J. Bruce Fields
2010-10-26 17:22 ` J. Bruce Fields
2010-10-26 17:39   ` Linus Torvalds
2010-10-26 17:46     ` J. Bruce Fields
2010-10-26 17:46       ` J. Bruce Fields
2010-10-26 20:18 ` Arnd Bergmann
2010-10-26 20:35   ` Bryan Schumaker
2010-10-26 20:55     ` Arnd Bergmann
2010-10-26 21:02       ` Linus Torvalds
2010-10-26 21:24         ` J. Bruce Fields
2010-10-26 21:37           ` Linus Torvalds
2010-10-26 21:44             ` J. Bruce Fields
2010-10-26 22:11               ` J. Bruce Fields
2010-10-26 22:41                 ` J. Bruce Fields
2010-10-27  7:21                 ` Arnd Bergmann
2010-10-27  8:39                   ` Christoph Hellwig
2010-10-27 13:39                     ` J. Bruce Fields
2010-10-27 13:46                       ` Arnd Bergmann
2010-10-27 14:55                         ` J. Bruce Fields
2010-10-27 14:59                           ` Christoph Hellwig
2010-10-27 15:16                             ` J. Bruce Fields
2010-10-27 15:19                               ` Christoph Hellwig
2010-10-27 15:23                             ` Arnd Bergmann
2010-10-27 15:28                               ` J. Bruce Fields
2010-10-27 15:31                               ` Christoph Hellwig
2010-10-27 16:12                               ` Linus Torvalds
2010-10-27 16:46                                 ` J. Bruce Fields
2010-10-27 16:46                                   ` J. Bruce Fields
2010-10-27 17:32                                   ` Linus Torvalds
2010-10-27 17:40                                     ` J. Bruce Fields
2010-10-27 18:20                                       ` Arnd Bergmann
2010-10-27 18:42                                         ` Linus Torvalds
2010-10-27 18:43                                           ` Linus Torvalds
2010-10-27 19:48                                             ` Arnd Bergmann
2010-10-27 20:01                                               ` J. Bruce Fields
2010-10-27 20:20                                                 ` Arnd Bergmann
2010-10-27 20:24                                                   ` J. Bruce Fields
2010-10-30 21:25                             ` J. Bruce Fields
2010-10-30 21:31                               ` [PATCH 1/4] locks: prevent ENOMEM on lease unlock J. Bruce Fields
2010-10-30 21:31                               ` [PATCH 2/4] locks: fix leaks on setlease errors J. Bruce Fields
2010-10-31 11:10                                 ` Christoph Hellwig
2010-11-01 17:24                                   ` J. Bruce Fields
2010-11-01 17:41                                     ` Christoph Hellwig
2010-11-01 18:34                                       ` J. Bruce Fields
2010-10-30 21:31                               ` [PATCH 3/4] locks: fix setlease methods to free passed-in lock J. Bruce Fields
2010-10-30 21:31                               ` [PATCH 4/4] nfsd4: initialize delegation pointer to lease J. Bruce Fields
2010-10-31  2:04                                 ` Christoph Hellwig
2010-10-31  3:04                                   ` J. Bruce Fields
2010-10-30 21:40                               ` nfsd changes for 2.6.37 Arnd Bergmann
2010-10-31  2:07                               ` Christoph Hellwig
2010-10-31  3:05                                 ` J. Bruce Fields
2010-10-31 12:34                               ` Christoph Hellwig
2010-10-31 12:35                                 ` [PATCH 1/2] locks: let the caller free file_lock on ->setlease failure Christoph Hellwig
2010-11-03 20:41                                   ` J. Bruce Fields
2010-11-04  1:40                                     ` J. Bruce Fields
2010-11-04  1:41                                       ` J. Bruce Fields
2010-11-06 19:03                                         ` Christoph Hellwig
2010-11-06 19:03                                       ` Christoph Hellwig
2010-11-08 16:10                                         ` J. Bruce Fields
2010-10-31 12:35                                 ` [PATCH 2/2] locks: remove fl_copy_lock lock_manager operation Christoph Hellwig
2010-11-01 15:02                                 ` nfsd changes for 2.6.37 J. Bruce Fields
2010-11-06 19:04                                   ` Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.