* [PATCH 00/41] missing patches for lustre 2.7.50 to 2.7.55
From: James Simmons @ 2016-10-03  2:27 UTC
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, James Simmons

Another batch of cleanups and fixes missing up to Lustre
version 2.7.55.

Alex Zhuravlev (1):
  staging: lustre: echo: request pages in batches

Andreas Dilger (1):
  staging: lustre: ptlrpc: remove old protocol compatibility

Bobi Jam (3):
  staging: lustre: llite: remove duplicate fiemap defines
  staging: lustre: clio: get rid of lov_stripe_md reference
  staging: lustre: llite: restart short read/write for normal IO

Bruno Faccini (1):
  staging: lustre: obdclass: fix race during key quiescency

Chris Horn (1):
  staging: lustre: ptlrpc: Move NRS structures out of lustre_net.h

Gregoire Pichon (3):
  staging: lustre: ptlrpc: Add OBD_CONNECT_MULTIMODRPCS flag
  staging: lustre: ptlrpc: Add a tag field to ptlrpc messages
  staging: lustre: mdc: add max modify RPCs in flight variable

Henri Doreau (3):
  staging: lustre: llite: Report first encountered error
  staging: lustre: mdc: Removed unneeded NULL check
  staging: lustre: hsm: Use file lease to implement migration

James Simmons (1):
  staging: lustre: lov: copy_to_user uses wrong casting

Jinshan Xiong (2):
  staging: lustre: clio: Revise read ahead implementation
  staging: lustre: llite: remove lli_has_smd

John L. Hammond (12):
  staging: lustre: llite: remove client Size on MDS support
  staging: lustre: obd: remove client Size on MDS support
  staging: lustre: clio: use CIT_SETATTR for FSFILT_IOC_SETFLAGS
  staging: lustre: remove Size on MDS support
  staging: lustre: obd: remove unused LSM parameters
  staging: lustre: clio: add CIT_DATA_VERSION and remove IOC_LOV_GETINFO
  staging: lustre: lov: add cl_object_layout_get()
  staging: lustre: llite: add cl_object_maxbytes()
  staging: lustre: obd: remove destroy cookie handling
  staging: lustre: lov: use obd_get_info() to get def/max LOV EA sizes
  staging: lustre: osc: remove remaining bits for capa support
  staging: lustre: lov: move LSM to LOV layer

Liang Zhen (1):
  staging: lustre: libcfs: shortcut to create CPT from NUMA topology

Mikhail Pershin (1):
  staging: lustre: hsm: make HSM modification requests replayable

Niu Yawei (2):
  staging: lustre: quota: remove obsolete quota code
  staging: lustre: ldlm: cancel aged locks for LRUR

Patrick Farrell (1):
  staging: lustre: ldlm: Do not use cbpending for group locks

Patrick Valentin (1):
  staging: lustre: obdclass: Add synchro in lu_context_key_degister()

Sebastien Buisson (2):
  staging: lustre: ptlrpc: ret -ECONNREFUSED if not context found in req
  staging: lustre: ptlrpc: dont take unwrap in req_waittime calculation

Vitaly Fertman (1):
  staging: lustre: ldlm: interval tree search in ldlm_lock_match()

Wu Libin (1):
  staging: lustre: osc: fix bug when setting max_pages_per_rpc

frank zago (1):
  staging: lustre: ldlm: remove unnecessary EXPORT_SYMBOL

wang di (2):
  staging: lustre: llite: default dir stripe index only for mkdir
  staging: lustre: mgc: MGC should retry for invalid import

 drivers/staging/lustre/lnet/libcfs/fail.c          |    1 +
 .../staging/lustre/lnet/libcfs/linux/linux-cpu.c   |   48 +-
 drivers/staging/lustre/lustre/include/cl_object.h  |  113 ++-
 .../lustre/lustre/include/lustre/ll_fiemap.h       |   75 +--
 .../lustre/lustre/include/lustre/lustre_idl.h      |  107 +--
 .../lustre/lustre/include/lustre/lustre_ioctl.h    |    4 +-
 .../lustre/lustre/include/lustre/lustre_user.h     |   24 +-
 .../staging/lustre/lustre/include/lustre_compat.h  |    2 +
 drivers/staging/lustre/lustre/include/lustre_dlm.h |    2 +-
 drivers/staging/lustre/lustre/include/lustre_ha.h  |    1 +
 drivers/staging/lustre/lustre/include/lustre_mdc.h |   20 +-
 drivers/staging/lustre/lustre/include/lustre_net.h |  714 +------------------
 drivers/staging/lustre/lustre/include/lustre_nrs.h |  717 ++++++++++++++++++
 .../lustre/lustre/include/lustre_nrs_fifo.h        |   70 ++
 .../lustre/lustre/include/lustre_req_layout.h      |    6 +-
 drivers/staging/lustre/lustre/include/obd.h        |  174 +----
 drivers/staging/lustre/lustre/include/obd_class.h  |  181 +----
 .../staging/lustre/lustre/include/obd_support.h    |   12 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_flock.c    |   18 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_internal.h |    7 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_lib.c      |    5 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_lock.c     |  282 +++++---
 drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c    |   26 -
 drivers/staging/lustre/lustre/ldlm/ldlm_pool.c     |    6 -
 drivers/staging/lustre/lustre/ldlm/ldlm_request.c  |   11 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_resource.c |    4 -
 drivers/staging/lustre/lustre/llite/Makefile       |    2 +-
 drivers/staging/lustre/lustre/llite/dir.c          |  157 +----
 drivers/staging/lustre/lustre/llite/file.c         |  781 ++++++++------------
 drivers/staging/lustre/lustre/llite/glimpse.c      |  139 ++---
 drivers/staging/lustre/lustre/llite/lcommon_cl.c   |   28 +-
 drivers/staging/lustre/lustre/llite/lcommon_misc.c |   37 +-
 drivers/staging/lustre/lustre/llite/llite_close.c  |  395 ----------
 .../staging/lustre/lustre/llite/llite_internal.h   |   87 +--
 drivers/staging/lustre/lustre/llite/llite_lib.c    |  246 ++-----
 drivers/staging/lustre/lustre/llite/namei.c        |    8 -
 drivers/staging/lustre/lustre/llite/rw.c           |  218 ++++---
 drivers/staging/lustre/lustre/llite/rw26.c         |    4 -
 drivers/staging/lustre/lustre/llite/vvp_dev.c      |    3 +-
 drivers/staging/lustre/lustre/llite/vvp_internal.h |   33 +-
 drivers/staging/lustre/lustre/llite/vvp_io.c       |  170 +++--
 drivers/staging/lustre/lustre/llite/vvp_object.c   |   25 +-
 drivers/staging/lustre/lustre/llite/vvp_page.c     |   40 +-
 drivers/staging/lustre/lustre/llite/vvp_req.c      |   13 +-
 drivers/staging/lustre/lustre/llite/xattr.c        |  234 +++---
 drivers/staging/lustre/lustre/lmv/lmv_obd.c        |  198 +-----
 .../staging/lustre/lustre/lov/lov_cl_internal.h    |    1 -
 drivers/staging/lustre/lustre/lov/lov_ea.c         |  195 +++---
 drivers/staging/lustre/lustre/lov/lov_internal.h   |   94 ++-
 drivers/staging/lustre/lustre/lov/lov_io.c         |  109 +++-
 drivers/staging/lustre/lustre/lov/lov_merge.c      |   50 --
 drivers/staging/lustre/lustre/lov/lov_obd.c        |  677 +----------------
 drivers/staging/lustre/lustre/lov/lov_object.c     |  570 ++++++++++++++-
 drivers/staging/lustre/lustre/lov/lov_pack.c       |  184 ++---
 drivers/staging/lustre/lustre/lov/lov_page.c       |   46 --
 drivers/staging/lustre/lustre/lov/lov_request.c    |  292 --------
 drivers/staging/lustre/lustre/mdc/lproc_mdc.c      |   38 +
 drivers/staging/lustre/lustre/mdc/mdc_internal.h   |    5 +-
 drivers/staging/lustre/lustre/mdc/mdc_lib.c        |   57 +-
 drivers/staging/lustre/lustre/mdc/mdc_locks.c      |    2 -
 drivers/staging/lustre/lustre/mdc/mdc_reint.c      |   62 +--
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |  204 +-----
 drivers/staging/lustre/lustre/mgc/mgc_request.c    |   94 ++-
 drivers/staging/lustre/lustre/obdclass/Makefile    |    2 +-
 drivers/staging/lustre/lustre/obdclass/cl_io.c     |   58 +--
 drivers/staging/lustre/lustre/obdclass/cl_object.c |   61 ++
 drivers/staging/lustre/lustre/obdclass/cl_page.c   |   47 --
 drivers/staging/lustre/lustre/obdclass/genops.c    |   57 ++
 .../lustre/lustre/obdclass/linux/linux-obdo.c      |   80 --
 .../lustre/lustre/obdclass/lprocfs_status.c        |    7 +-
 drivers/staging/lustre/lustre/obdclass/lu_object.c |   63 ++-
 drivers/staging/lustre/lustre/obdclass/obd_mount.c |    2 +-
 drivers/staging/lustre/lustre/obdclass/obdo.c      |   65 --
 .../staging/lustre/lustre/obdecho/echo_client.c    |  102 +--
 drivers/staging/lustre/lustre/osc/lproc_osc.c      |    3 +-
 drivers/staging/lustre/lustre/osc/osc_cache.c      |    3 +-
 .../staging/lustre/lustre/osc/osc_cl_internal.h    |    1 -
 drivers/staging/lustre/lustre/osc/osc_internal.h   |   32 +-
 drivers/staging/lustre/lustre/osc/osc_io.c         |  184 +++++-
 drivers/staging/lustre/lustre/osc/osc_lock.c       |   14 +-
 drivers/staging/lustre/lustre/osc/osc_object.c     |   91 +++-
 drivers/staging/lustre/lustre/osc/osc_page.c       |   20 -
 drivers/staging/lustre/lustre/osc/osc_quota.c      |   44 --
 drivers/staging/lustre/lustre/osc/osc_request.c    |  363 +--------
 drivers/staging/lustre/lustre/ptlrpc/client.c      |    8 +-
 drivers/staging/lustre/lustre/ptlrpc/import.c      |   25 +-
 drivers/staging/lustre/lustre/ptlrpc/layout.c      |   36 +-
 .../staging/lustre/lustre/ptlrpc/pack_generic.c    |   57 ++-
 drivers/staging/lustre/lustre/ptlrpc/sec.c         |    2 +-
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c    |  248 +++----
 90 files changed, 3936 insertions(+), 5867 deletions(-)
 create mode 100644 drivers/staging/lustre/lustre/include/lustre_nrs.h
 create mode 100644 drivers/staging/lustre/lustre/include/lustre_nrs_fifo.h
 delete mode 100644 drivers/staging/lustre/lustre/llite/llite_close.c
 delete mode 100644 drivers/staging/lustre/lustre/obdclass/linux/linux-obdo.c

* [PATCH 01/41] staging: lustre: obdclass: fix race during key quiescency
From: James Simmons @ 2016-10-03  2:27 UTC
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Bruno Faccini, James Simmons

From: Bruno Faccini <bruno.faccini@intel.com>

On umount, presumably of the last device using the same OSD back-end,
lu_context_key_quiesce() is run in preparation for module unload to
remove all of the module's key references from every context linked
on the lu_context_remembered list.
A thread exiting its context must therefore protect itself against
this list traversal.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5264
Reviewed-on: http://review.whamcloud.com/13103
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
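Reviewer's sketch, not part of the patch: the conditional locking in the
hunks below can be modelled in user space roughly as follows. It assumes a
C11 toolchain. The names keys_guard, ctx_exit(), key_quiesce(), TAG_REMEMBER
and count_exit() are hypothetical stand-ins for lu_keys_guard,
lu_context_exit(), lu_context_key_quiesce(), LCT_REMEMBER and a key's
lct_exit hook, and the atomic_flag loop only imitates a kernel spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_KEYS      4     /* hypothetical; stands in for ARRAY_SIZE(lu_keys) */
#define TAG_REMEMBER 0x1u  /* hypothetical; stands in for LCT_REMEMBER */

struct ctx {
	unsigned int tags;
	void *value[NR_KEYS];
};

typedef void (*exit_hook)(struct ctx *c, int slot);

/* Stand-in for lu_keys_guard: a minimal test-and-set spinlock. */
static atomic_flag keys_guard = ATOMIC_FLAG_INIT;
exit_hook key_exit[NR_KEYS];   /* per-key exit hooks (model of lct_exit) */
int exit_calls;                /* how many hooks have run so far */

void guard_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&keys_guard,
						 memory_order_acquire))
		; /* spin until the flag is released */
}

void guard_unlock(void)
{
	atomic_flag_clear_explicit(&keys_guard, memory_order_release);
}

/* Example hook: only counts invocations. */
void count_exit(struct ctx *c, int slot)
{
	(void)c;
	(void)slot;
	exit_calls++;
}

/*
 * Model of the patched exit path: the guard is taken only for contexts
 * that are on the remembered list, because only those can be modified
 * concurrently by key_quiesce(). Returns the number of hooks run.
 */
int ctx_exit(struct ctx *c)
{
	int ran = 0;

	for (int i = 0; i < NR_KEYS; i++) {
		int shared = (c->tags & TAG_REMEMBER) != 0;

		if (shared)
			guard_lock();
		if (c->value[i] && key_exit[i]) {
			key_exit[i](c, i);
			ran++;
		}
		if (shared)
			guard_unlock();
	}
	return ran;
}

/* Model of the quiesce side: drop one slot's value under the guard. */
void key_quiesce(struct ctx *c, int slot)
{
	assert(slot >= 0 && slot < NR_KEYS);
	guard_lock();
	c->value[slot] = NULL;
	guard_unlock();
}
```

In this model a context without TAG_REMEMBER is private to its thread, so
ctx_exit() never touches the lock for it; presumably that is why the patch
makes the locking conditional rather than unconditional.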
 drivers/staging/lustre/lustre/obdclass/lu_object.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/staging/lustre/lustre/obdclass/lu_object.c b/drivers/staging/lustre/lustre/obdclass/lu_object.c
index 054e567..f0e74c6 100644
--- a/drivers/staging/lustre/lustre/obdclass/lu_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/lu_object.c
@@ -1663,6 +1663,9 @@ void lu_context_exit(struct lu_context *ctx)
 	ctx->lc_state = LCS_LEFT;
 	if (ctx->lc_tags & LCT_HAS_EXIT && ctx->lc_value) {
 		for (i = 0; i < ARRAY_SIZE(lu_keys); ++i) {
+			/* could race with key quiescency */
+			if (ctx->lc_tags & LCT_REMEMBER)
+				spin_lock(&lu_keys_guard);
 			if (ctx->lc_value[i]) {
 				struct lu_context_key *key;
 
@@ -1671,6 +1674,8 @@ void lu_context_exit(struct lu_context *ctx)
 					key->lct_exit(ctx,
 						      key, ctx->lc_value[i]);
 			}
+			if (ctx->lc_tags & LCT_REMEMBER)
+				spin_unlock(&lu_keys_guard);
 		}
 	}
 }
-- 
1.7.1

* [PATCH 02/41] staging: lustre: obdclass: Add synchro in lu_context_key_degister()
From: James Simmons @ 2016-10-03  2:27 UTC
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Patrick Valentin, Gregoire Pichon, James Simmons

From: Patrick Valentin <patrick.valentin@bull.net>

When unloading a module, lu_context_key_degister() may remove a key
while a thread is either registering it in a new context
(lu_context_init(), lu_context_refill()) or using it when exiting
from a context (lu_context_exit(), lu_context_fini()).

In these cases, we reference a key which no longer exists, and
the system crashes either because we use a *POISON'ed* pointer
in key_fini() -> key->lct_fini(), or because one of the following
assertions fails:
 - lu_context_key_degister():
        ASSERTION(cfs_atomic_read(&key->lct_used) == 1)
                  failed: key has instances: 2

 - lu_context_exit():
        ASSERTION(key != NULL)

 - key_fini():
        ASSERTION(atomic_read(&key->lct_used) > 1)

This can also lead to SLAB objects that are not freed:
        slab error in kmem_cache_destroy(): cache `echo_thread_kmem':
                   Can't free all objects

Note: ptlrpc service threads need to call lu_context_init/fini in
each loop (for each RPC), so serializing them with a spinlock that
must be locked and unlocked several times per RPC could be a big
performance issue on fat SMP machines.

So the aim of this patch, which only affects some infrequently used
functions, is:
  1) to add synchronization in lu_context_key_quiesce(), also called
     by lu_context_key_degister(), to wait until all key::lct_init()
     methods have completed, by serializing with keys_fill()
  2) to add synchronization in lu_context_key_degister(), to wait
     until all transient contexts referencing this key have run
     the key::lct_fini() method

Signed-off-by: Patrick Valentin <patrick.valentin@bull.net>
Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6049
Reviewed-on: http://review.whamcloud.com/13164
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
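Reviewer's sketch, not part of the patch: point 1) of the description above
can be modelled in user space roughly as follows. It assumes a C11 toolchain
and a POSIX sched_yield(). The names fill_begin(), fill_end(), key_usable(),
quiesce_poll() and quiesce_wait() are hypothetical; initing_cnt and
keys_guard stand in for lu_key_initing_cnt and lu_keys_guard, and
key_quiescent models the LCT_QUIESCENT tag of a single key.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag keys_guard = ATOMIC_FLAG_INIT; /* models lu_keys_guard */
static atomic_int initing_cnt;                    /* models lu_key_initing_cnt */
static atomic_bool key_quiescent;                 /* models LCT_QUIESCENT */

void guard_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&keys_guard,
						 memory_order_acquire))
		; /* spin until the flag is released */
}

void guard_unlock(void)
{
	atomic_flag_clear_explicit(&keys_guard, memory_order_release);
}

/* Entry of the fill path: announce an in-flight fill under the guard. */
void fill_begin(void)
{
	guard_lock();
	atomic_fetch_add(&initing_cnt, 1);
	guard_unlock();
}

/* A fill may only initialise a key that is not being quiesced. */
bool key_usable(void)
{
	return !atomic_load(&key_quiescent);
}

/* Exit of the fill path: the decrement is atomic, so no lock is needed. */
void fill_end(void)
{
	int old = atomic_fetch_sub(&initing_cnt, 1);

	assert(old > 0);
	(void)old;
}

/*
 * One polling step of the quiesce side: mark the key quiescent and report
 * whether any fill is still in flight. Setting the flag again on a later
 * poll is harmless.
 */
bool quiesce_poll(void)
{
	bool idle;

	guard_lock();
	atomic_store(&key_quiescent, true);
	idle = atomic_load(&initing_cnt) == 0;
	guard_unlock();
	return idle;
}

/* Drop the guard between polls so that fills can finish, then retry. */
void quiesce_wait(void)
{
	while (!quiesce_poll())
		sched_yield();
}
```

Because the increment happens under the same lock that quiesce_poll() holds
while setting the flag and reading the counter, a fill is either counted
before the flag is set, and is then waited for, or it starts afterwards and
sees the flag. The decrement takes no lock, which is the reason the patch
comment gives for keeping the counter an atomic_t.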
 drivers/staging/lustre/lustre/obdclass/lu_object.c |   58 ++++++++++++++++++--
 1 files changed, 54 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/lustre/lustre/obdclass/lu_object.c b/drivers/staging/lustre/lustre/obdclass/lu_object.c
index f0e74c6..e031fd2 100644
--- a/drivers/staging/lustre/lustre/obdclass/lu_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/lu_object.c
@@ -1311,6 +1311,7 @@ enum {
 static struct lu_context_key *lu_keys[LU_CONTEXT_KEY_NR] = { NULL, };
 
 static DEFINE_SPINLOCK(lu_keys_guard);
+static atomic_t lu_key_initing_cnt = ATOMIC_INIT(0);
 
 /**
  * Global counter incremented whenever key is registered, unregistered,
@@ -1385,6 +1386,19 @@ void lu_context_key_degister(struct lu_context_key *key)
 	++key_set_version;
 	spin_lock(&lu_keys_guard);
 	key_fini(&lu_shrink_env.le_ctx, key->lct_index);
+
+	/**
+	 * Wait until all transient contexts referencing this key have
+	 * run lu_context_key::lct_fini() method.
+	 */
+	while (atomic_read(&key->lct_used) > 1) {
+		spin_unlock(&lu_keys_guard);
+		CDEBUG(D_INFO, "lu_context_key_degister: \"%s\" %p, %d\n",
+		       key->lct_owner ? key->lct_owner->name : "", key,
+		       atomic_read(&key->lct_used));
+		schedule();
+		spin_lock(&lu_keys_guard);
+	}
 	if (lu_keys[key->lct_index]) {
 		lu_keys[key->lct_index] = NULL;
 		lu_ref_fini(&key->lct_reference);
@@ -1510,11 +1524,26 @@ void lu_context_key_quiesce(struct lu_context_key *key)
 		 * XXX layering violation.
 		 */
 		cl_env_cache_purge(~0);
-		key->lct_tags |= LCT_QUIESCENT;
 		/*
 		 * XXX memory barrier has to go here.
 		 */
 		spin_lock(&lu_keys_guard);
+		key->lct_tags |= LCT_QUIESCENT;
+
+		/**
+		 * Wait until all lu_context_key::lct_init() methods
+		 * have completed.
+		 */
+		while (atomic_read(&lu_key_initing_cnt) > 0) {
+			spin_unlock(&lu_keys_guard);
+			CDEBUG(D_INFO, "lu_context_key_quiesce: \"%s\" %p, %d (%d)\n",
+			       key->lct_owner ? key->lct_owner->name : "",
+			       key, atomic_read(&key->lct_used),
+			atomic_read(&lu_key_initing_cnt));
+			schedule();
+			spin_lock(&lu_keys_guard);
+		}
+
 		list_for_each_entry(ctx, &lu_context_remembered, lc_remember)
 			key_fini(ctx, key->lct_index);
 		spin_unlock(&lu_keys_guard);
@@ -1546,6 +1575,19 @@ static int keys_fill(struct lu_context *ctx)
 {
 	unsigned int i;
 
+	/*
+	 * A serialisation with lu_context_key_quiesce() is needed, but some
+	 * "key->lct_init()" are calling kernel memory allocation routine and
+	 * can't be called while holding a spin_lock.
+	 * "lu_keys_guard" is held while incrementing "lu_key_initing_cnt"
+	 * to ensure the start of the serialisation.
+	 * An atomic_t variable is still used, in order not to reacquire the
+	 * lock when decrementing the counter.
+	 */
+	spin_lock(&lu_keys_guard);
+	atomic_inc(&lu_key_initing_cnt);
+	spin_unlock(&lu_keys_guard);
+
 	LINVRNT(ctx->lc_value);
 	for (i = 0; i < ARRAY_SIZE(lu_keys); ++i) {
 		struct lu_context_key *key;
@@ -1563,12 +1605,19 @@ static int keys_fill(struct lu_context *ctx)
 			LINVRNT(key->lct_init);
 			LINVRNT(key->lct_index == i);
 
+			LASSERT(key->lct_owner);
+			if (!(ctx->lc_tags & LCT_NOREF) &&
+			    !try_module_get(key->lct_owner)) {
+				/* module is unloading, skip this key */
+				continue;
+			}
+
 			value = key->lct_init(ctx, key);
-			if (IS_ERR(value))
+			if (unlikely(IS_ERR(value))) {
+				atomic_dec(&lu_key_initing_cnt);
 				return PTR_ERR(value);
+			}
 
-			if (!(ctx->lc_tags & LCT_NOREF))
-				try_module_get(key->lct_owner);
 			lu_ref_add_atomic(&key->lct_reference, "ctx", ctx);
 			atomic_inc(&key->lct_used);
 			/*
@@ -1582,6 +1631,7 @@ static int keys_fill(struct lu_context *ctx)
 		}
 		ctx->lc_version = key_set_version;
 	}
+	atomic_dec(&lu_key_initing_cnt);
 	return 0;
 }
 
-- 
1.7.1


* [lustre-devel] [PATCH 02/41] staging: lustre: obdclass: Add synchro in lu_context_key_degister()
From: James Simmons @ 2016-10-03  2:27 UTC
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Patrick Valentin, Gregoire Pichon, James Simmons

From: Patrick Valentin <patrick.valentin@bull.net>

When unloading a module, lu_context_key_degister() may remove a key
while a thread is either registering it in a new context
(lu_context_init(), lu_context_refill()) or using it when exiting
from a context (lu_context_exit(), lu_context_fini()).

In these cases, we reference a key which no longer exists, and
the system crashes either because we use a *POISON'ed* pointer
in key_fini() -> key->lct_fini(), or because one of the following
assertions fails:
 - lu_context_key_degister():
        ASSERTION(cfs_atomic_read(&key->lct_used) == 1)
                  failed: key has instances: 2

 - lu_context_exit():
        ASSERTION(key != NULL)

 - key_fini():
        ASSERTION(atomic_read(&key->lct_used) > 1)

This can also lead to SLAB objects that are not freed:
        slab error in kmem_cache_destroy(): cache `echo_thread_kmem':
                   Can't free all objects

Note: ptlrpc service threads need to call lu_context_init/fini in
each loop (for each RPC), so serializing them with a spinlock that
must be locked and unlocked several times per RPC could be a big
performance issue on fat SMP machines.

So the aim of this patch, which only affects some infrequently used
functions, is:
  1) to add synchronization in lu_context_key_quiesce(), also called
     by lu_context_key_degister(), to wait until all key::lct_init()
     methods have completed, by serializing with keys_fill()
  2) to add synchronization in lu_context_key_degister(), to wait
     until all transient contexts referencing this key have run
     the key::lct_fini() method

Signed-off-by: Patrick Valentin <patrick.valentin@bull.net>
Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6049
Reviewed-on: http://review.whamcloud.com/13164
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/obdclass/lu_object.c |   58 ++++++++++++++++++--
 1 files changed, 54 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/lustre/lustre/obdclass/lu_object.c b/drivers/staging/lustre/lustre/obdclass/lu_object.c
index f0e74c6..e031fd2 100644
--- a/drivers/staging/lustre/lustre/obdclass/lu_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/lu_object.c
@@ -1311,6 +1311,7 @@ enum {
 static struct lu_context_key *lu_keys[LU_CONTEXT_KEY_NR] = { NULL, };
 
 static DEFINE_SPINLOCK(lu_keys_guard);
+static atomic_t lu_key_initing_cnt = ATOMIC_INIT(0);
 
 /**
  * Global counter incremented whenever key is registered, unregistered,
@@ -1385,6 +1386,19 @@ void lu_context_key_degister(struct lu_context_key *key)
 	++key_set_version;
 	spin_lock(&lu_keys_guard);
 	key_fini(&lu_shrink_env.le_ctx, key->lct_index);
+
+	/**
+	 * Wait until all transient contexts referencing this key have
+	 * run lu_context_key::lct_fini() method.
+	 */
+	while (atomic_read(&key->lct_used) > 1) {
+		spin_unlock(&lu_keys_guard);
+		CDEBUG(D_INFO, "lu_context_key_degister: \"%s\" %p, %d\n",
+		       key->lct_owner ? key->lct_owner->name : "", key,
+		       atomic_read(&key->lct_used));
+		schedule();
+		spin_lock(&lu_keys_guard);
+	}
 	if (lu_keys[key->lct_index]) {
 		lu_keys[key->lct_index] = NULL;
 		lu_ref_fini(&key->lct_reference);
@@ -1510,11 +1524,26 @@ void lu_context_key_quiesce(struct lu_context_key *key)
 		 * XXX layering violation.
 		 */
 		cl_env_cache_purge(~0);
-		key->lct_tags |= LCT_QUIESCENT;
 		/*
 		 * XXX memory barrier has to go here.
 		 */
 		spin_lock(&lu_keys_guard);
+		key->lct_tags |= LCT_QUIESCENT;
+
+		/**
+		 * Wait until all lu_context_key::lct_init() methods
+		 * have completed.
+		 */
+		while (atomic_read(&lu_key_initing_cnt) > 0) {
+			spin_unlock(&lu_keys_guard);
+			CDEBUG(D_INFO, "lu_context_key_quiesce: \"%s\" %p, %d (%d)\n",
+			       key->lct_owner ? key->lct_owner->name : "",
+			       key, atomic_read(&key->lct_used),
+			atomic_read(&lu_key_initing_cnt));
+			schedule();
+			spin_lock(&lu_keys_guard);
+		}
+
 		list_for_each_entry(ctx, &lu_context_remembered, lc_remember)
 			key_fini(ctx, key->lct_index);
 		spin_unlock(&lu_keys_guard);
@@ -1546,6 +1575,19 @@ static int keys_fill(struct lu_context *ctx)
 {
 	unsigned int i;
 
+	/*
+	 * A serialisation with lu_context_key_quiesce() is needed, but some
+	 * "key->lct_init()" are calling kernel memory allocation routine and
+	 * can't be called while holding a spin_lock.
+	 * "lu_keys_guard" is held while incrementing "lu_key_initing_cnt"
+	 * to ensure the start of the serialisation.
+	 * An atomic_t variable is still used, in order not to reacquire the
+	 * lock when decrementing the counter.
+	 */
+	spin_lock(&lu_keys_guard);
+	atomic_inc(&lu_key_initing_cnt);
+	spin_unlock(&lu_keys_guard);
+
 	LINVRNT(ctx->lc_value);
 	for (i = 0; i < ARRAY_SIZE(lu_keys); ++i) {
 		struct lu_context_key *key;
@@ -1563,12 +1605,19 @@ static int keys_fill(struct lu_context *ctx)
 			LINVRNT(key->lct_init);
 			LINVRNT(key->lct_index == i);
 
+			LASSERT(key->lct_owner);
+			if (!(ctx->lc_tags & LCT_NOREF) &&
+			    !try_module_get(key->lct_owner)) {
+				/* module is unloading, skip this key */
+				continue;
+			}
+
 			value = key->lct_init(ctx, key);
-			if (IS_ERR(value))
+			if (unlikely(IS_ERR(value))) {
+				atomic_dec(&lu_key_initing_cnt);
 				return PTR_ERR(value);
+			}
 
-			if (!(ctx->lc_tags & LCT_NOREF))
-				try_module_get(key->lct_owner);
 			lu_ref_add_atomic(&key->lct_reference, "ctx", ctx);
 			atomic_inc(&key->lct_used);
 			/*
@@ -1582,6 +1631,7 @@ static int keys_fill(struct lu_context *ctx)
 		}
 		ctx->lc_version = key_set_version;
 	}
+	atomic_dec(&lu_key_initing_cnt);
 	return 0;
 }
 
-- 
1.7.1

* [PATCH 03/41] staging: lustre: llite: remove client Size on MDS support
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:27   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:27 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, James Simmons

From: John L. Hammond <john.hammond@intel.com>

Size on MDS support has been in preview since at least 2.0.0. Remove
support for it from lustre/llite/.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6047
Reviewed-on: http://review.whamcloud.com/13126
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/Makefile       |    2 +-
 drivers/staging/lustre/lustre/llite/file.c         |   97 +-----
 drivers/staging/lustre/lustre/llite/glimpse.c      |  105 +++---
 drivers/staging/lustre/lustre/llite/llite_close.c  |  395 --------------------
 .../staging/lustre/lustre/llite/llite_internal.h   |   56 +---
 drivers/staging/lustre/lustre/llite/llite_lib.c    |  127 +------
 drivers/staging/lustre/lustre/llite/namei.c        |    8 -
 drivers/staging/lustre/lustre/llite/vvp_dev.c      |    3 +-
 drivers/staging/lustre/lustre/llite/vvp_internal.h |   18 +-
 drivers/staging/lustre/lustre/llite/vvp_io.c       |   11 -
 drivers/staging/lustre/lustre/llite/vvp_object.c   |    4 +-
 drivers/staging/lustre/lustre/llite/vvp_page.c     |   24 +--
 drivers/staging/lustre/lustre/llite/vvp_req.c      |   13 +-
 13 files changed, 89 insertions(+), 774 deletions(-)
 delete mode 100644 drivers/staging/lustre/lustre/llite/llite_close.c

diff --git a/drivers/staging/lustre/lustre/llite/Makefile b/drivers/staging/lustre/lustre/llite/Makefile
index 1ac0940..3690bee 100644
--- a/drivers/staging/lustre/lustre/llite/Makefile
+++ b/drivers/staging/lustre/lustre/llite/Makefile
@@ -1,5 +1,5 @@
 obj-$(CONFIG_LUSTRE_FS) += lustre.o
-lustre-y := dcache.o dir.o file.o llite_close.o llite_lib.o llite_nfs.o \
+lustre-y := dcache.o dir.o file.o llite_lib.o llite_nfs.o \
 	    rw.o namei.o symlink.o llite_mmap.o range_lock.o \
 	    xattr.o xattr_cache.o rw26.o super25.o statahead.o \
 	    glimpse.o lcommon_cl.o lcommon_misc.o \
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 6e3a188..b2058c6 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -86,7 +86,6 @@ void ll_pack_inode2opdata(struct inode *inode, struct md_op_data *op_data,
 	op_data->op_attr.ia_size = i_size_read(inode);
 	op_data->op_attr_blocks = inode->i_blocks;
 	op_data->op_attr_flags = ll_inode_to_ext_flags(inode->i_flags);
-	op_data->op_ioepoch = ll_i2info(inode)->lli_ioepoch;
 	if (fh)
 		op_data->op_handle = *fh;
 
@@ -95,8 +94,7 @@ void ll_pack_inode2opdata(struct inode *inode, struct md_op_data *op_data,
 }
 
 /**
- * Closes the IO epoch and packs all the attributes into @op_data for
- * the CLOSE rpc.
+ * Packs all the attributes into @op_data for the CLOSE rpc.
  */
 static void ll_prepare_close(struct inode *inode, struct md_op_data *op_data,
 			     struct obd_client_handle *och)
@@ -108,11 +106,7 @@ static void ll_prepare_close(struct inode *inode, struct md_op_data *op_data,
 	if (!(och->och_flags & FMODE_WRITE))
 		goto out;
 
-	if (!exp_connect_som(ll_i2mdexp(inode)) || !S_ISREG(inode->i_mode))
-		op_data->op_attr.ia_valid |= ATTR_SIZE | ATTR_BLOCKS;
-	else
-		ll_ioepoch_close(inode, op_data, &och, 0);
-
+	op_data->op_attr.ia_valid |= ATTR_SIZE | ATTR_BLOCKS;
 out:
 	ll_pack_inode2opdata(inode, op_data, &och->och_fh);
 	ll_prep_md_op_data(op_data, inode, NULL, NULL,
@@ -128,7 +122,6 @@ static int ll_close_inode_openhandle(struct obd_export *md_exp,
 	struct md_op_data *op_data;
 	struct ptlrpc_request *req = NULL;
 	struct obd_device *obd = class_exp2obd(exp);
-	int epoch_close = 1;
 	int rc;
 
 	if (!obd) {
@@ -157,22 +150,9 @@ static int ll_close_inode_openhandle(struct obd_export *md_exp,
 		op_data->op_lease_handle = och->och_lease_handle;
 		op_data->op_attr.ia_valid |= ATTR_SIZE | ATTR_BLOCKS;
 	}
-	epoch_close = op_data->op_flags & MF_EPOCH_CLOSE;
+
 	rc = md_close(md_exp, op_data, och->och_mod, &req);
-	if (rc == -EAGAIN) {
-		/* This close must have the epoch closed. */
-		LASSERT(epoch_close);
-		/* MDS has instructed us to obtain Size-on-MDS attribute from
-		 * OSTs and send setattr to back to MDS.
-		 */
-		rc = ll_som_update(inode, op_data);
-		if (rc) {
-			CERROR("%s: inode "DFID" mdc Size-on-MDS update failed: rc = %d\n",
-			       ll_i2mdexp(inode)->exp_obd->obd_name,
-			       PFID(ll_inode2fid(inode)), rc);
-			rc = 0;
-		}
-	} else if (rc) {
+	if (rc) {
 		CERROR("%s: inode "DFID" mdc close failed: rc = %d\n",
 		       ll_i2mdexp(inode)->exp_obd->obd_name,
 		       PFID(ll_inode2fid(inode)), rc);
@@ -200,15 +180,10 @@ static int ll_close_inode_openhandle(struct obd_export *md_exp,
 	ll_finish_md_op_data(op_data);
 
 out:
-	if (exp_connect_som(exp) && !epoch_close &&
-	    S_ISREG(inode->i_mode) && (och->och_flags & FMODE_WRITE)) {
-		ll_queue_done_writing(inode, LLIF_DONE_WRITING);
-	} else {
-		md_clear_open_replay_data(md_exp, och);
-		/* Free @och if it is not waiting for DONE_WRITING. */
-		och->och_fh.cookie = DEAD_HANDLE_MAGIC;
-		kfree(och);
-	}
+	md_clear_open_replay_data(md_exp, och);
+	och->och_fh.cookie = DEAD_HANDLE_MAGIC;
+	kfree(och);
+
 	if (req) /* This is close request */
 		ptlrpc_req_finished(req);
 	return rc;
@@ -437,20 +412,6 @@ out:
 	return rc;
 }
 
-/**
- * Assign an obtained @ioepoch to client's inode. No lock is needed, MDS does
- * not believe attributes if a few ioepoch holders exist. Attributes for
- * previous ioepoch if new one is opened are also skipped by MDS.
- */
-void ll_ioepoch_open(struct ll_inode_info *lli, __u64 ioepoch)
-{
-	if (ioepoch && lli->lli_ioepoch != ioepoch) {
-		lli->lli_ioepoch = ioepoch;
-		CDEBUG(D_INODE, "Epoch %llu opened on "DFID"\n",
-		       ioepoch, PFID(&lli->lli_fid));
-	}
-}
-
 static int ll_och_fill(struct obd_export *md_exp, struct lookup_intent *it,
 		       struct obd_client_handle *och)
 {
@@ -470,23 +431,17 @@ static int ll_local_open(struct file *file, struct lookup_intent *it,
 			 struct ll_file_data *fd, struct obd_client_handle *och)
 {
 	struct inode *inode = file_inode(file);
-	struct ll_inode_info *lli = ll_i2info(inode);
 
 	LASSERT(!LUSTRE_FPRIVATE(file));
 
 	LASSERT(fd);
 
 	if (och) {
-		struct mdt_body *body;
 		int rc;
 
 		rc = ll_och_fill(ll_i2sbi(inode)->ll_md_exp, it, och);
 		if (rc != 0)
 			return rc;
-
-		body = req_capsule_server_get(&it->it_request->rq_pill,
-					      &RMF_MDT_BODY);
-		ll_ioepoch_open(lli, body->mbo_ioepoch);
 	}
 
 	LUSTRE_FPRIVATE(file) = fd;
@@ -912,7 +867,7 @@ static int ll_lease_close(struct obd_client_handle *och, struct inode *inode,
 
 /* Fills the obdo with the attributes for the lsm */
 static int ll_lsm_getattr(struct lov_stripe_md *lsm, struct obd_export *exp,
-			  struct obdo *obdo, __u64 ioepoch, int dv_flags)
+			  struct obdo *obdo, int dv_flags)
 {
 	struct ptlrpc_request_set *set;
 	struct obd_info	    oinfo = { };
@@ -924,13 +879,11 @@ static int ll_lsm_getattr(struct lov_stripe_md *lsm, struct obd_export *exp,
 	oinfo.oi_oa = obdo;
 	oinfo.oi_oa->o_oi = lsm->lsm_oi;
 	oinfo.oi_oa->o_mode = S_IFREG;
-	oinfo.oi_oa->o_ioepoch = ioepoch;
 	oinfo.oi_oa->o_valid = OBD_MD_FLID | OBD_MD_FLTYPE |
 			       OBD_MD_FLSIZE | OBD_MD_FLBLOCKS |
 			       OBD_MD_FLBLKSZ | OBD_MD_FLATIME |
 			       OBD_MD_FLMTIME | OBD_MD_FLCTIME |
-			       OBD_MD_FLGROUP | OBD_MD_FLEPOCH |
-			       OBD_MD_FLDATAVERSION;
+			       OBD_MD_FLGROUP | OBD_MD_FLDATAVERSION;
 	if (dv_flags & (LL_DV_WR_FLUSH | LL_DV_RD_FLUSH)) {
 		oinfo.oi_oa->o_valid |= OBD_MD_FLFLAGS;
 		oinfo.oi_oa->o_flags |= OBD_FL_SRVLOCK;
@@ -961,32 +914,6 @@ static int ll_lsm_getattr(struct lov_stripe_md *lsm, struct obd_export *exp,
 	return rc;
 }
 
-/**
-  * Performs the getattr on the inode and updates its fields.
-  * If @sync != 0, perform the getattr under the server-side lock.
-  */
-int ll_inode_getattr(struct inode *inode, struct obdo *obdo,
-		     __u64 ioepoch, int sync)
-{
-	struct lov_stripe_md *lsm;
-	int rc;
-
-	lsm = ccc_inode_lsm_get(inode);
-	rc = ll_lsm_getattr(lsm, ll_i2dtexp(inode),
-			    obdo, ioepoch, sync ? LL_DV_RD_FLUSH : 0);
-	if (rc == 0) {
-		struct ost_id *oi = lsm ? &lsm->lsm_oi : &obdo->o_oi;
-
-		obdo_refresh_inode(inode, obdo, obdo->o_valid);
-		CDEBUG(D_INODE, "objid " DOSTID " size %llu, blocks %llu, blksize %lu\n",
-		       POSTID(oi), i_size_read(inode),
-		       (unsigned long long)inode->i_blocks,
-		       1UL << inode->i_blkbits);
-	}
-	ccc_inode_lsm_put(inode, lsm);
-	return rc;
-}
-
 int ll_merge_attr(const struct lu_env *env, struct inode *inode)
 {
 	struct ll_inode_info *lli = ll_i2info(inode);
@@ -1049,7 +976,7 @@ int ll_glimpse_ioctl(struct ll_sb_info *sbi, struct lov_stripe_md *lsm,
 	struct obdo obdo = { 0 };
 	int rc;
 
-	rc = ll_lsm_getattr(lsm, sbi->ll_dt_exp, &obdo, 0, 0);
+	rc = ll_lsm_getattr(lsm, sbi->ll_dt_exp, &obdo, 0);
 	if (rc == 0) {
 		st->st_size   = obdo.o_size;
 		st->st_blocks = obdo.o_blocks;
@@ -1823,7 +1750,7 @@ int ll_data_version(struct inode *inode, __u64 *data_version, int flags)
 		goto out;
 	}
 
-	rc = ll_lsm_getattr(lsm, sbi->ll_dt_exp, obdo, 0, flags);
+	rc = ll_lsm_getattr(lsm, sbi->ll_dt_exp, obdo, flags);
 	if (rc == 0) {
 		if (!(obdo->o_valid & OBD_MD_FLDATAVERSION))
 			rc = -EOPNOTSUPP;
diff --git a/drivers/staging/lustre/lustre/llite/glimpse.c b/drivers/staging/lustre/lustre/llite/glimpse.c
index 22507b9..0d1ffad 100644
--- a/drivers/staging/lustre/lustre/llite/glimpse.c
+++ b/drivers/staging/lustre/lustre/llite/glimpse.c
@@ -82,65 +82,62 @@ int cl_glimpse_lock(const struct lu_env *env, struct cl_io *io,
 {
 	struct ll_inode_info *lli   = ll_i2info(inode);
 	const struct lu_fid  *fid   = lu_object_fid(&clob->co_lu);
-	int result;
+	int result = 0;
 
-	result = 0;
-	if (!(lli->lli_flags & LLIF_MDS_SIZE_LOCK)) {
-		CDEBUG(D_DLMTRACE, "Glimpsing inode " DFID "\n", PFID(fid));
-		if (lli->lli_has_smd) {
-			struct cl_lock *lock = vvp_env_lock(env);
-			struct cl_lock_descr *descr = &lock->cll_descr;
-
-			/* NOTE: this looks like DLM lock request, but it may
-			 *       not be one. Due to CEF_ASYNC flag (translated
-			 *       to LDLM_FL_HAS_INTENT by osc), this is
-			 *       glimpse request, that won't revoke any
-			 *       conflicting DLM locks held. Instead,
-			 *       ll_glimpse_callback() will be called on each
-			 *       client holding a DLM lock against this file,
-			 *       and resulting size will be returned for each
-			 *       stripe. DLM lock on [0, EOF] is acquired only
-			 *       if there were no conflicting locks. If there
-			 *       were conflicting locks, enqueuing or waiting
-			 *       fails with -ENAVAIL, but valid inode
-			 *       attributes are returned anyway.
-			 */
-			*descr = whole_file;
-			descr->cld_obj   = clob;
-			descr->cld_mode  = CLM_READ;
-			descr->cld_enq_flags = CEF_ASYNC | CEF_MUST;
-			if (agl)
-				descr->cld_enq_flags |= CEF_AGL;
-			/*
-			 * CEF_ASYNC is used because glimpse sub-locks cannot
-			 * deadlock (because they never conflict with other
-			 * locks) and, hence, can be enqueued out-of-order.
-			 *
-			 * CEF_MUST protects glimpse lock from conversion into
-			 * a lockless mode.
-			 */
-			result = cl_lock_request(env, io, lock);
-			if (result < 0)
-				return result;
-
-			if (!agl) {
-				ll_merge_attr(env, inode);
-				if (i_size_read(inode) > 0 &&
-				    inode->i_blocks == 0) {
-					/*
-					 * LU-417: Add dirty pages block count
-					 * lest i_blocks reports 0, some "cp" or
-					 * "tar" may think it's a completely
-					 * sparse file and skip it.
-					 */
-					inode->i_blocks = dirty_cnt(inode);
-				}
-			}
-			cl_lock_release(env, lock);
-		} else {
-			CDEBUG(D_DLMTRACE, "No objects for inode\n");
+	CDEBUG(D_DLMTRACE, "Glimpsing inode " DFID "\n", PFID(fid));
+	if (lli->lli_has_smd) {
+		struct cl_lock *lock = vvp_env_lock(env);
+		struct cl_lock_descr *descr = &lock->cll_descr;
+
+		/* NOTE: this looks like DLM lock request, but it may
+		 *       not be one. Due to CEF_ASYNC flag (translated
+		 *       to LDLM_FL_HAS_INTENT by osc), this is
+		 *       glimpse request, that won't revoke any
+		 *       conflicting DLM locks held. Instead,
+		 *       ll_glimpse_callback() will be called on each
+		 *       client holding a DLM lock against this file,
+		 *       and resulting size will be returned for each
+		 *       stripe. DLM lock on [0, EOF] is acquired only
+		 *       if there were no conflicting locks. If there
+		 *       were conflicting locks, enqueuing or waiting
+		 *       fails with -ENAVAIL, but valid inode
+		 *       attributes are returned anyway.
+		 */
+		*descr = whole_file;
+		descr->cld_obj = clob;
+		descr->cld_mode = CLM_READ;
+		descr->cld_enq_flags = CEF_ASYNC | CEF_MUST;
+		if (agl)
+			descr->cld_enq_flags |= CEF_AGL;
+		/*
+		 * CEF_ASYNC is used because glimpse sub-locks cannot
+		 * deadlock (because they never conflict with other
+		 * locks) and, hence, can be enqueued out-of-order.
+		 *
+		 * CEF_MUST protects glimpse lock from conversion into
+		 * a lockless mode.
+		 */
+		result = cl_lock_request(env, io, lock);
+		if (result < 0)
+			return result;
+
+		if (!agl) {
 			ll_merge_attr(env, inode);
+			if (i_size_read(inode) > 0 && !inode->i_blocks) {
+				/*
+				 * LU-417: Add dirty pages block count
+				 * lest i_blocks reports 0, some "cp" or
+				 * "tar" may think it's a completely
+				 * sparse file and skip it.
+				 */
+				inode->i_blocks = dirty_cnt(inode);
+			}
 		}
+
+		cl_lock_release(env, lock);
+	} else {
+		CDEBUG(D_DLMTRACE, "No objects for inode\n");
+		ll_merge_attr(env, inode);
 	}
 
 	return result;
diff --git a/drivers/staging/lustre/lustre/llite/llite_close.c b/drivers/staging/lustre/lustre/llite/llite_close.c
deleted file mode 100644
index 8644631..0000000
--- a/drivers/staging/lustre/lustre/llite/llite_close.c
+++ /dev/null
@@ -1,395 +0,0 @@
-/*
- * GPL HEADER START
- *
- * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 only,
- * as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License version 2 for more details (a copy is included
- * in the LICENSE file that accompanied this code).
- *
- * You should have received a copy of the GNU General Public License
- * version 2 along with this program; If not, see
- * http://www.gnu.org/licenses/gpl-2.0.html
- *
- * GPL HEADER END
- */
-/*
- * Copyright (c) 2003, 2010, Oracle and/or its affiliates. All rights reserved.
- * Use is subject to license terms.
- *
- * Copyright (c) 2011, 2012, Intel Corporation.
- */
-/*
- * This file is part of Lustre, http://www.lustre.org/
- * Lustre is a trademark of Sun Microsystems, Inc.
- *
- * lustre/llite/llite_close.c
- *
- * Lustre Lite routines to issue a secondary close after writeback
- */
-
-#include <linux/module.h>
-
-#define DEBUG_SUBSYSTEM S_LLITE
-
-#include "llite_internal.h"
-
-/** records that a write is in flight */
-void vvp_write_pending(struct vvp_object *club, struct vvp_page *page)
-{
-	struct ll_inode_info *lli = ll_i2info(club->vob_inode);
-
-	spin_lock(&lli->lli_lock);
-	lli->lli_flags |= LLIF_SOM_DIRTY;
-	if (page && list_empty(&page->vpg_pending_linkage))
-		list_add(&page->vpg_pending_linkage, &club->vob_pending_list);
-	spin_unlock(&lli->lli_lock);
-}
-
-/** records that a write has completed */
-void vvp_write_complete(struct vvp_object *club, struct vvp_page *page)
-{
-	struct ll_inode_info *lli = ll_i2info(club->vob_inode);
-	int rc = 0;
-
-	spin_lock(&lli->lli_lock);
-	if (page && !list_empty(&page->vpg_pending_linkage)) {
-		list_del_init(&page->vpg_pending_linkage);
-		rc = 1;
-	}
-	spin_unlock(&lli->lli_lock);
-	if (rc)
-		ll_queue_done_writing(club->vob_inode, 0);
-}
-
-/** Queues DONE_WRITING if
- * - done writing is allowed;
- * - inode has no no dirty pages;
- */
-void ll_queue_done_writing(struct inode *inode, unsigned long flags)
-{
-	struct ll_inode_info *lli = ll_i2info(inode);
-	struct vvp_object *club = cl2vvp(ll_i2info(inode)->lli_clob);
-
-	spin_lock(&lli->lli_lock);
-	lli->lli_flags |= flags;
-
-	if ((lli->lli_flags & LLIF_DONE_WRITING) &&
-	    list_empty(&club->vob_pending_list)) {
-		struct ll_close_queue *lcq = ll_i2sbi(inode)->ll_lcq;
-
-		if (lli->lli_flags & LLIF_MDS_SIZE_LOCK)
-			CWARN("%s: file "DFID"(flags %u) Size-on-MDS valid, done writing allowed and no diry pages\n",
-			      ll_get_fsname(inode->i_sb, NULL, 0),
-			      PFID(ll_inode2fid(inode)), lli->lli_flags);
-		/* DONE_WRITING is allowed and inode has no dirty page. */
-		spin_lock(&lcq->lcq_lock);
-
-		LASSERT(list_empty(&lli->lli_close_list));
-		CDEBUG(D_INODE, "adding inode "DFID" to close list\n",
-		       PFID(ll_inode2fid(inode)));
-		list_add_tail(&lli->lli_close_list, &lcq->lcq_head);
-
-		/* Avoid a concurrent insertion into the close thread queue:
-		 * an inode is already in the close thread, open(), write(),
-		 * close() happen, epoch is closed as the inode is marked as
-		 * LLIF_EPOCH_PENDING. When pages are written inode should not
-		 * be inserted into the queue again, clear this flag to avoid
-		 * it.
-		 */
-		lli->lli_flags &= ~LLIF_DONE_WRITING;
-
-		wake_up(&lcq->lcq_waitq);
-		spin_unlock(&lcq->lcq_lock);
-	}
-	spin_unlock(&lli->lli_lock);
-}
-
-/** Pack SOM attributes info @opdata for CLOSE, DONE_WRITING rpc. */
-void ll_done_writing_attr(struct inode *inode, struct md_op_data *op_data)
-{
-	struct ll_inode_info *lli = ll_i2info(inode);
-
-	op_data->op_flags |= MF_SOM_CHANGE;
-	/* Check if Size-on-MDS attributes are valid. */
-	if (lli->lli_flags & LLIF_MDS_SIZE_LOCK)
-		CERROR("%s: inode "DFID"(flags %u) MDS holds lock on Size-on-MDS attributes\n",
-		       ll_get_fsname(inode->i_sb, NULL, 0),
-		       PFID(ll_inode2fid(inode)), lli->lli_flags);
-
-	if (!cl_local_size(inode)) {
-		/* Send Size-on-MDS Attributes if valid. */
-		op_data->op_attr.ia_valid |= ATTR_MTIME_SET | ATTR_CTIME_SET |
-				ATTR_ATIME_SET | ATTR_SIZE | ATTR_BLOCKS;
-	}
-}
-
-/** Closes ioepoch and packs Size-on-MDS attribute if needed into @op_data. */
-void ll_ioepoch_close(struct inode *inode, struct md_op_data *op_data,
-		      struct obd_client_handle **och, unsigned long flags)
-{
-	struct ll_inode_info *lli = ll_i2info(inode);
-	struct vvp_object *club = cl2vvp(ll_i2info(inode)->lli_clob);
-
-	spin_lock(&lli->lli_lock);
-	if (!(list_empty(&club->vob_pending_list))) {
-		if (!(lli->lli_flags & LLIF_EPOCH_PENDING)) {
-			LASSERT(*och);
-			LASSERT(!lli->lli_pending_och);
-			/* Inode is dirty and there is no pending write done
-			 * request yet, DONE_WRITE is to be sent later.
-			 */
-			lli->lli_flags |= LLIF_EPOCH_PENDING;
-			lli->lli_pending_och = *och;
-			spin_unlock(&lli->lli_lock);
-
-			inode = igrab(inode);
-			LASSERT(inode);
-			goto out;
-		}
-		if (flags & LLIF_DONE_WRITING) {
-			/* Some pages are still dirty, it is early to send
-			 * DONE_WRITE. Wait until all pages will be flushed
-			 * and try DONE_WRITE again later.
-			 */
-			LASSERT(!(lli->lli_flags & LLIF_DONE_WRITING));
-			lli->lli_flags |= LLIF_DONE_WRITING;
-			spin_unlock(&lli->lli_lock);
-
-			inode = igrab(inode);
-			LASSERT(inode);
-			goto out;
-		}
-	}
-	CDEBUG(D_INODE, "Epoch %llu closed on "DFID"\n",
-	       ll_i2info(inode)->lli_ioepoch, PFID(&lli->lli_fid));
-	op_data->op_flags |= MF_EPOCH_CLOSE;
-
-	if (flags & LLIF_DONE_WRITING) {
-		LASSERT(lli->lli_flags & LLIF_SOM_DIRTY);
-		LASSERT(!(lli->lli_flags & LLIF_DONE_WRITING));
-		*och = lli->lli_pending_och;
-		lli->lli_pending_och = NULL;
-		lli->lli_flags &= ~LLIF_EPOCH_PENDING;
-	} else {
-		/* Pack Size-on-MDS inode attributes only if they has changed */
-		if (!(lli->lli_flags & LLIF_SOM_DIRTY)) {
-			spin_unlock(&lli->lli_lock);
-			goto out;
-		}
-
-		/* There is a pending DONE_WRITE -- close epoch with no
-		 * attribute change.
-		 */
-		if (lli->lli_flags & LLIF_EPOCH_PENDING) {
-			spin_unlock(&lli->lli_lock);
-			goto out;
-		}
-	}
-
-	LASSERT(list_empty(&club->vob_pending_list));
-	lli->lli_flags &= ~LLIF_SOM_DIRTY;
-	spin_unlock(&lli->lli_lock);
-	ll_done_writing_attr(inode, op_data);
-
-out:
-	return;
-}
-
-/**
- * Cliens updates SOM attributes on MDS (including llog cookies):
- * obd_getattr with no lock and md_setattr.
- */
-int ll_som_update(struct inode *inode, struct md_op_data *op_data)
-{
-	struct ll_inode_info *lli = ll_i2info(inode);
-	struct ptlrpc_request *request = NULL;
-	__u32 old_flags;
-	struct obdo *oa;
-	int rc;
-
-	LASSERT(op_data);
-	if (lli->lli_flags & LLIF_MDS_SIZE_LOCK)
-		CERROR("%s: inode "DFID"(flags %u) MDS holds lock on Size-on-MDS attributes\n",
-		       ll_get_fsname(inode->i_sb, NULL, 0),
-		       PFID(ll_inode2fid(inode)), lli->lli_flags);
-
-	oa = kmem_cache_zalloc(obdo_cachep, GFP_NOFS);
-	if (!oa) {
-		CERROR("can't allocate memory for Size-on-MDS update.\n");
-		return -ENOMEM;
-	}
-
-	old_flags = op_data->op_flags;
-	op_data->op_flags = MF_SOM_CHANGE;
-
-	/* If inode is already in another epoch, skip getattr from OSTs. */
-	if (lli->lli_ioepoch == op_data->op_ioepoch) {
-		rc = ll_inode_getattr(inode, oa, op_data->op_ioepoch,
-				      old_flags & MF_GETATTR_LOCK);
-		if (rc) {
-			oa->o_valid = 0;
-			if (rc != -ENOENT)
-				CERROR("%s: inode_getattr failed - unable to send a Size-on-MDS attribute update for inode "DFID": rc = %d\n",
-				       ll_get_fsname(inode->i_sb, NULL, 0),
-				       PFID(ll_inode2fid(inode)), rc);
-		} else {
-			CDEBUG(D_INODE, "Size-on-MDS update on "DFID"\n",
-			       PFID(&lli->lli_fid));
-		}
-		/* Install attributes into op_data. */
-		md_from_obdo(op_data, oa, oa->o_valid);
-	}
-
-	rc = md_setattr(ll_i2sbi(inode)->ll_md_exp, op_data,
-			NULL, 0, NULL, 0, &request, NULL);
-	ptlrpc_req_finished(request);
-
-	kmem_cache_free(obdo_cachep, oa);
-	return rc;
-}
-
-/**
- * Closes the ioepoch and packs all the attributes into @op_data for
- * DONE_WRITING rpc.
- */
-static void ll_prepare_done_writing(struct inode *inode,
-				    struct md_op_data *op_data,
-				    struct obd_client_handle **och)
-{
-	ll_ioepoch_close(inode, op_data, och, LLIF_DONE_WRITING);
-	/* If there is no @och, we do not do D_W yet. */
-	if (!*och)
-		return;
-
-	ll_pack_inode2opdata(inode, op_data, &(*och)->och_fh);
-	ll_prep_md_op_data(op_data, inode, NULL, NULL,
-			   0, 0, LUSTRE_OPC_ANY, NULL);
-}
-
-/** Send a DONE_WRITING rpc. */
-static void ll_done_writing(struct inode *inode)
-{
-	struct obd_client_handle *och = NULL;
-	struct md_op_data *op_data;
-	int rc;
-
-	LASSERT(exp_connect_som(ll_i2mdexp(inode)));
-
-	op_data = kzalloc(sizeof(*op_data), GFP_NOFS);
-	if (!op_data)
-		return;
-
-	ll_prepare_done_writing(inode, op_data, &och);
-	/* If there is no @och, we do not do D_W yet. */
-	if (!och)
-		goto out;
-
-	rc = md_done_writing(ll_i2sbi(inode)->ll_md_exp, op_data, NULL);
-	if (rc == -EAGAIN)
-		/* MDS has instructed us to obtain Size-on-MDS attribute from
-		 * OSTs and send setattr to back to MDS.
-		 */
-		rc = ll_som_update(inode, op_data);
-	else if (rc) {
-		CERROR("%s: inode "DFID" mdc done_writing failed: rc = %d\n",
-		       ll_get_fsname(inode->i_sb, NULL, 0),
-		       PFID(ll_inode2fid(inode)), rc);
-	}
-out:
-	ll_finish_md_op_data(op_data);
-	if (och) {
-		md_clear_open_replay_data(ll_i2sbi(inode)->ll_md_exp, och);
-		kfree(och);
-	}
-}
-
-static struct ll_inode_info *ll_close_next_lli(struct ll_close_queue *lcq)
-{
-	struct ll_inode_info *lli = NULL;
-
-	spin_lock(&lcq->lcq_lock);
-
-	if (!list_empty(&lcq->lcq_head)) {
-		lli = list_entry(lcq->lcq_head.next, struct ll_inode_info,
-				 lli_close_list);
-		list_del_init(&lli->lli_close_list);
-	} else if (atomic_read(&lcq->lcq_stop)) {
-		lli = ERR_PTR(-EALREADY);
-	}
-
-	spin_unlock(&lcq->lcq_lock);
-	return lli;
-}
-
-static int ll_close_thread(void *arg)
-{
-	struct ll_close_queue *lcq = arg;
-
-	complete(&lcq->lcq_comp);
-
-	while (1) {
-		struct l_wait_info lwi = { 0 };
-		struct ll_inode_info *lli;
-		struct inode *inode;
-
-		l_wait_event_exclusive(lcq->lcq_waitq,
-				       (lli = ll_close_next_lli(lcq)) != NULL,
-				       &lwi);
-		if (IS_ERR(lli))
-			break;
-
-		inode = ll_info2i(lli);
-		CDEBUG(D_INFO, "done_writing for inode "DFID"\n",
-		       PFID(ll_inode2fid(inode)));
-		ll_done_writing(inode);
-		iput(inode);
-	}
-
-	CDEBUG(D_INFO, "ll_close exiting\n");
-	complete(&lcq->lcq_comp);
-	return 0;
-}
-
-int ll_close_thread_start(struct ll_close_queue **lcq_ret)
-{
-	struct ll_close_queue *lcq;
-	struct task_struct *task;
-
-	if (OBD_FAIL_CHECK(OBD_FAIL_LDLM_CLOSE_THREAD))
-		return -EINTR;
-
-	lcq = kzalloc(sizeof(*lcq), GFP_NOFS);
-	if (!lcq)
-		return -ENOMEM;
-
-	spin_lock_init(&lcq->lcq_lock);
-	INIT_LIST_HEAD(&lcq->lcq_head);
-	init_waitqueue_head(&lcq->lcq_waitq);
-	init_completion(&lcq->lcq_comp);
-
-	task = kthread_run(ll_close_thread, lcq, "ll_close");
-	if (IS_ERR(task)) {
-		kfree(lcq);
-		return PTR_ERR(task);
-	}
-
-	wait_for_completion(&lcq->lcq_comp);
-	*lcq_ret = lcq;
-	return 0;
-}
-
-void ll_close_thread_shutdown(struct ll_close_queue *lcq)
-{
-	init_completion(&lcq->lcq_comp);
-	atomic_inc(&lcq->lcq_stop);
-	wake_up(&lcq->lcq_waitq);
-	wait_for_completion(&lcq->lcq_comp);
-	kfree(lcq);
-}
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 3e98bd6..cd89926 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -98,28 +98,17 @@ struct ll_grouplock {
 };
 
 enum lli_flags {
-	/* MDS has an authority for the Size-on-MDS attributes. */
-	LLIF_MDS_SIZE_LOCK      = (1 << 0),
-	/* Epoch close is postponed. */
-	LLIF_EPOCH_PENDING      = (1 << 1),
-	/* DONE WRITING is allowed. */
-	LLIF_DONE_WRITING       = (1 << 2),
-	/* Sizeon-on-MDS attributes are changed. An attribute update needs to
-	 * be sent to MDS.
-	 */
-	LLIF_SOM_DIRTY	  = (1 << 3),
 	/* File data is modified. */
-	LLIF_DATA_MODIFIED      = (1 << 4),
+	LLIF_DATA_MODIFIED	= BIT(0),
 	/* File is being restored */
-	LLIF_FILE_RESTORING	= (1 << 5),
+	LLIF_FILE_RESTORING	= BIT(1),
 	/* Xattr cache is attached to the file */
-	LLIF_XATTR_CACHE	= (1 << 6),
+	LLIF_XATTR_CACHE	= BIT(2),
 };
 
 struct ll_inode_info {
 	__u32				lli_inode_magic;
 	__u32				lli_flags;
-	__u64				lli_ioepoch;
 
 	spinlock_t			lli_lock;
 	struct posix_acl		*lli_posix_acl;
@@ -129,14 +118,6 @@ struct ll_inode_info {
 	/* master inode fid for stripe directory */
 	struct lu_fid		   lli_pfid;
 
-	struct list_head	      lli_close_list;
-
-	/* handle is to be sent to MDS later on done_writing and setattr.
-	 * Open handle data are needed for the recovery to reconstruct
-	 * the inode state on the MDS. XXX: recovery is not ready yet.
-	 */
-	struct obd_client_handle       *lli_pending_och;
-
 	/* We need all three because every inode may be opened in different
 	 * modes
 	 */
@@ -400,7 +381,7 @@ enum stats_track_type {
 #define LL_SBI_LOCALFLOCK       0x200 /* Local flocks support by kernel */
 #define LL_SBI_LRU_RESIZE       0x400 /* lru resize support */
 #define LL_SBI_LAZYSTATFS       0x800 /* lazystatfs mount option */
-#define LL_SBI_SOM_PREVIEW     0x1000 /* SOM preview mount option */
+/*	LL_SBI_SOM_PREVIEW     0x1000    SOM preview mount option, obsolete */
 #define LL_SBI_32BIT_API       0x2000 /* generate 32 bit inodes. */
 #define LL_SBI_64BIT_HASH      0x4000 /* support 64-bits dir hash/offset */
 #define LL_SBI_AGL_ENABLED     0x8000 /* enable agl */
@@ -466,10 +447,10 @@ struct ll_sb_info {
 
 	int		       ll_flags;
 	unsigned int		  ll_umounting:1,
-				  ll_xattr_cache_enabled:1;
-	struct lustre_client_ocd  ll_lco;
+				  ll_xattr_cache_enabled:1,
+				  ll_client_common_fill_super_succeeded:1;
 
-	struct ll_close_queue    *ll_lcq;
+	struct lustre_client_ocd  ll_lco;
 
 	struct lprocfs_stats     *ll_stats; /* lprocfs stats counter */
 
@@ -764,15 +745,8 @@ int ll_file_open(struct inode *inode, struct file *file);
 int ll_file_release(struct inode *inode, struct file *file);
 int ll_glimpse_ioctl(struct ll_sb_info *sbi,
 		     struct lov_stripe_md *lsm, lstat_t *st);
-void ll_ioepoch_open(struct ll_inode_info *lli, __u64 ioepoch);
 int ll_release_openhandle(struct inode *, struct lookup_intent *);
 int ll_md_real_close(struct inode *inode, fmode_t fmode);
-void ll_ioepoch_close(struct inode *inode, struct md_op_data *op_data,
-		      struct obd_client_handle **och, unsigned long flags);
-void ll_done_writing_attr(struct inode *inode, struct md_op_data *op_data);
-int ll_som_update(struct inode *inode, struct md_op_data *op_data);
-int ll_inode_getattr(struct inode *inode, struct obdo *obdo,
-		     __u64 ioepoch, int sync);
 void ll_pack_inode2opdata(struct inode *inode, struct md_op_data *op_data,
 			  struct lustre_handle *fh);
 int ll_getattr(struct vfsmount *mnt, struct dentry *de, struct kstat *stat);
@@ -891,18 +865,6 @@ int ll_dir_get_parent_fid(struct inode *dir, struct lu_fid *parent_fid);
 /* llite/symlink.c */
 extern const struct inode_operations ll_fast_symlink_inode_operations;
 
-/* llite/llite_close.c */
-struct ll_close_queue {
-	spinlock_t		lcq_lock;
-	struct list_head		lcq_head;
-	wait_queue_head_t		lcq_waitq;
-	struct completion	lcq_comp;
-	atomic_t		lcq_stop;
-};
-
-void vvp_write_pending(struct vvp_object *club, struct vvp_page *page);
-void vvp_write_complete(struct vvp_object *club, struct vvp_page *page);
-
 /**
  * IO arguments for various VFS I/O interfaces.
  */
@@ -956,10 +918,6 @@ static inline struct vvp_io_args *ll_env_args(const struct lu_env *env,
 	return via;
 }
 
-void ll_queue_done_writing(struct inode *inode, unsigned long flags);
-void ll_close_thread_shutdown(struct ll_close_queue *lcq);
-int ll_close_thread_start(struct ll_close_queue **lcq_ret);
-
 /* llite/llite_mmap.c */
 
 int ll_teardown_mmaps(struct address_space *mapping, __u64 first, __u64 last);
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 6bb41b0..4f83275 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -193,9 +193,6 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt,
 				  OBD_CONNECT_OPEN_BY_FID |
 				  OBD_CONNECT_DIR_STRIPE;
 
-	if (sbi->ll_flags & LL_SBI_SOM_PREVIEW)
-		data->ocd_connect_flags |= OBD_CONNECT_SOM;
-
 	if (sbi->ll_flags & LL_SBI_LRU_RESIZE)
 		data->ocd_connect_flags |= OBD_CONNECT_LRU_RESIZE;
 #ifdef CONFIG_FS_POSIX_ACL
@@ -357,9 +354,6 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt,
 				  OBD_CONNECT_JOBSTATS | OBD_CONNECT_LVB_TYPE |
 				  OBD_CONNECT_LAYOUTLOCK | OBD_CONNECT_PINGLESS;
 
-	if (sbi->ll_flags & LL_SBI_SOM_PREVIEW)
-		data->ocd_connect_flags |= OBD_CONNECT_SOM;
-
 	if (!OBD_FAIL_CHECK(OBD_FAIL_OSC_CONNECT_CKSUM)) {
 		/* OBD_CONNECT_CKSUM should always be set, even if checksums are
 		 * disabled by default, because it can still be enabled on the
@@ -488,12 +482,6 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt,
 		goto out_root;
 	}
 
-	err = ll_close_thread_start(&sbi->ll_lcq);
-	if (err) {
-		CERROR("cannot start close thread: rc %d\n", err);
-		goto out_root;
-	}
-
 	checksum = sbi->ll_flags & LL_SBI_CHECKSUM;
 	err = obd_set_info_async(NULL, sbi->ll_dt_exp, sizeof(KEY_CHECKSUM),
 				 KEY_CHECKSUM, sizeof(checksum), &checksum,
@@ -633,8 +621,6 @@ static void client_common_put_super(struct super_block *sb)
 {
 	struct ll_sb_info *sbi = ll_s2sbi(sb);
 
-	ll_close_thread_shutdown(sbi->ll_lcq);
-
 	cl_sb_fini(sb);
 
 	obd_fid_fini(sbi->ll_dt_exp->exp_obd);
@@ -766,11 +752,6 @@ static int ll_options(char *options, int *flags)
 			*flags &= ~tmp;
 			goto next;
 		}
-		tmp = ll_set_opt("som_preview", s1, LL_SBI_SOM_PREVIEW);
-		if (tmp) {
-			*flags |= tmp;
-			goto next;
-		}
 		tmp = ll_set_opt("32bitapi", s1, LL_SBI_32BIT_API);
 		if (tmp) {
 			*flags |= tmp;
@@ -804,14 +785,11 @@ void ll_lli_init(struct ll_inode_info *lli)
 {
 	lli->lli_inode_magic = LLI_INODE_MAGIC;
 	lli->lli_flags = 0;
-	lli->lli_ioepoch = 0;
 	lli->lli_maxbytes = MAX_LFS_FILESIZE;
 	spin_lock_init(&lli->lli_lock);
 	lli->lli_posix_acl = NULL;
 	/* Do not set lli_fid, it has been initialized already. */
 	fid_zero(&lli->lli_pfid);
-	INIT_LIST_HEAD(&lli->lli_close_list);
-	lli->lli_pending_och = NULL;
 	lli->lli_mds_read_och = NULL;
 	lli->lli_mds_write_och = NULL;
 	lli->lli_mds_exec_och = NULL;
@@ -941,6 +919,8 @@ int ll_fill_super(struct super_block *sb, struct vfsmount *mnt)
 
 	/* connections, registrations, sb setup */
 	err = client_common_fill_super(sb, md, dt, mnt);
+	if (!err)
+		sbi->ll_client_common_fill_super_succeeded = 1;
 
 out_free:
 	kfree(md);
@@ -1002,7 +982,7 @@ void ll_put_super(struct super_block *sb)
 		}
 	}
 
-	if (sbi->ll_lcq) {
+	if (sbi->ll_client_common_fill_super_succeeded) {
 		/* Only if client_common_fill_super succeeded */
 		client_common_put_super(sb);
 	}
@@ -1272,9 +1252,6 @@ void ll_clear_inode(struct inode *inode)
 		LASSERT(lli->lli_opendir_pid == 0);
 	}
 
-	spin_lock(&lli->lli_lock);
-	ll_i2info(inode)->lli_flags &= ~LLIF_MDS_SIZE_LOCK;
-	spin_unlock(&lli->lli_lock);
 	md_null_inode(sbi->ll_md_exp, ll_inode2fid(inode));
 
 	LASSERT(!lli->lli_open_fd_write_count);
@@ -1369,48 +1346,12 @@ static int ll_md_setattr(struct dentry *dentry, struct md_op_data *op_data,
 	rc = simple_setattr(dentry, &op_data->op_attr);
 	op_data->op_attr.ia_valid = ia_valid;
 
-	/* Extract epoch data if obtained. */
-	op_data->op_handle = md.body->mbo_handle;
-	op_data->op_ioepoch = md.body->mbo_ioepoch;
-
 	rc = ll_update_inode(inode, &md);
 	ptlrpc_req_finished(request);
 
 	return rc;
 }
 
-/* Close IO epoch and send Size-on-MDS attribute update. */
-static int ll_setattr_done_writing(struct inode *inode,
-				   struct md_op_data *op_data,
-				   struct md_open_data *mod)
-{
-	struct ll_inode_info *lli = ll_i2info(inode);
-	int rc = 0;
-
-	if (!S_ISREG(inode->i_mode))
-		return 0;
-
-	CDEBUG(D_INODE, "Epoch %llu closed on "DFID" for truncate\n",
-	       op_data->op_ioepoch, PFID(&lli->lli_fid));
-
-	op_data->op_flags = MF_EPOCH_CLOSE;
-	ll_done_writing_attr(inode, op_data);
-	ll_pack_inode2opdata(inode, op_data, NULL);
-
-	rc = md_done_writing(ll_i2sbi(inode)->ll_md_exp, op_data, mod);
-	if (rc == -EAGAIN)
-		/* MDS has instructed us to obtain Size-on-MDS attribute
-		 * from OSTs and send setattr to back to MDS.
-		 */
-		rc = ll_som_update(inode, op_data);
-	else if (rc) {
-		CERROR("%s: inode "DFID" mdc truncate failed: rc = %d\n",
-		       ll_i2sbi(inode)->ll_md_exp->exp_obd->obd_name,
-		       PFID(ll_inode2fid(inode)), rc);
-	}
-	return rc;
-}
-
 /* If this inode has objects allocated to it (lsm != NULL), then the OST
  * object(s) determine the file size and mtime.  Otherwise, the MDS will
  * keep these values until such a time that objects are allocated for it.
@@ -1433,7 +1374,7 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import)
 	struct md_op_data *op_data = NULL;
 	struct md_open_data *mod = NULL;
 	bool file_is_released = false;
-	int rc = 0, rc1 = 0;
+	int rc = 0;
 
 	CDEBUG(D_VFSTRACE, "%s: setattr inode "DFID"(%p) from %llu to %llu, valid %x, hsm_import %d\n",
 	       ll_get_fsname(inode->i_sb, NULL, 0), PFID(&lli->lli_fid), inode,
@@ -1536,11 +1477,6 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import)
 
 	memcpy(&op_data->op_attr, attr, sizeof(*attr));
 
-	/* Open epoch for truncate. */
-	if (exp_connect_som(ll_i2mdexp(inode)) && !hsm_import &&
-	    (attr->ia_valid & (ATTR_SIZE | ATTR_MTIME | ATTR_MTIME_SET)))
-		op_data->op_flags = MF_EPOCH_OPEN;
-
 	rc = ll_md_setattr(dentry, op_data, &mod);
 	if (rc)
 		goto out;
@@ -1552,7 +1488,6 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import)
 		spin_unlock(&lli->lli_lock);
 	}
 
-	ll_ioepoch_open(lli, op_data->op_ioepoch);
 	if (!S_ISREG(inode->i_mode) || file_is_released) {
 		rc = 0;
 		goto out;
@@ -1575,12 +1510,8 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import)
 			up_write(&lli->lli_trunc_sem);
 	}
 out:
-	if (op_data->op_ioepoch) {
-		rc1 = ll_setattr_done_writing(inode, op_data, mod);
-		if (!rc)
-			rc = rc1;
-	}
-	ll_finish_md_op_data(op_data);
+	if (op_data)
+		ll_finish_md_op_data(op_data);
 
 	if (!S_ISDIR(inode->i_mode)) {
 		inode_lock(inode);
@@ -1828,48 +1759,11 @@ int ll_update_inode(struct inode *inode, struct lustre_md *md)
 	LASSERT(fid_seq(&lli->lli_fid) != 0);
 
 	if (body->mbo_valid & OBD_MD_FLSIZE) {
-		if (exp_connect_som(ll_i2mdexp(inode)) &&
-		    S_ISREG(inode->i_mode)) {
-			struct lustre_handle lockh;
-			enum ldlm_mode mode;
-
-			/* As it is possible a blocking ast has been processed
-			 * by this time, we need to check there is an UPDATE
-			 * lock on the client and set LLIF_MDS_SIZE_LOCK holding
-			 * it.
-			 */
-			mode = ll_take_md_lock(inode, MDS_INODELOCK_UPDATE,
-					       &lockh, LDLM_FL_CBPENDING,
-					       LCK_CR | LCK_CW |
-					       LCK_PR | LCK_PW);
-			if (mode) {
-				if (lli->lli_flags & (LLIF_DONE_WRITING |
-						      LLIF_EPOCH_PENDING |
-						      LLIF_SOM_DIRTY)) {
-					CERROR("%s: inode "DFID" flags %u still has size authority! do not trust the size got from MDS\n",
-					       sbi->ll_md_exp->exp_obd->obd_name,
-					       PFID(ll_inode2fid(inode)),
-					       lli->lli_flags);
-				} else {
-					/* Use old size assignment to avoid
-					 * deadlock bz14138 & bz14326
-					 */
-					i_size_write(inode, body->mbo_size);
-					spin_lock(&lli->lli_lock);
-					lli->lli_flags |= LLIF_MDS_SIZE_LOCK;
-					spin_unlock(&lli->lli_lock);
-				}
-				ldlm_lock_decref(&lockh, mode);
-			}
-		} else {
-			/* Use old size assignment to avoid
-			 * deadlock bz14138 & bz14326
-			 */
-			i_size_write(inode, body->mbo_size);
+		i_size_write(inode, body->mbo_size);
 
-			CDEBUG(D_VFSTRACE, "inode=%lu, updating i_size %llu\n",
-			       inode->i_ino, (unsigned long long)body->mbo_size);
-		}
+		CDEBUG(D_VFSTRACE, "inode=" DFID ", updating i_size %llu\n",
+		       PFID(ll_inode2fid(inode)),
+		       (unsigned long long)body->mbo_size);
 
 		if (body->mbo_valid & OBD_MD_FLBLOCKS)
 			inode->i_blocks = body->mbo_blocks;
@@ -2164,7 +2058,6 @@ void ll_open_cleanup(struct super_block *sb, struct ptlrpc_request *open_req)
 		return;
 
 	op_data->op_fid1 = body->mbo_fid1;
-	op_data->op_ioepoch = body->mbo_ioepoch;
 	op_data->op_handle = body->mbo_handle;
 	op_data->op_mod_time = get_seconds();
 	md_close(exp, op_data, NULL, &close_req);
diff --git a/drivers/staging/lustre/lustre/llite/namei.c b/drivers/staging/lustre/lustre/llite/namei.c
index dfa36d3..9cc4bb4 100644
--- a/drivers/staging/lustre/lustre/llite/namei.c
+++ b/drivers/staging/lustre/lustre/llite/namei.c
@@ -254,14 +254,6 @@ int ll_md_blocking_ast(struct ldlm_lock *lock, struct ldlm_lock_desc *desc,
 				       PFID(ll_inode2fid(inode)), rc);
 		}
 
-		if (bits & MDS_INODELOCK_UPDATE) {
-			struct ll_inode_info *lli = ll_i2info(inode);
-
-			spin_lock(&lli->lli_lock);
-			lli->lli_flags &= ~LLIF_MDS_SIZE_LOCK;
-			spin_unlock(&lli->lli_lock);
-		}
-
 		if ((bits & MDS_INODELOCK_UPDATE) && S_ISDIR(inode->i_mode)) {
 			struct ll_inode_info *lli = ll_i2info(inode);
 
diff --git a/drivers/staging/lustre/lustre/llite/vvp_dev.c b/drivers/staging/lustre/lustre/llite/vvp_dev.c
index 8aa8ecc..cab95ac 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_dev.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_dev.c
@@ -521,11 +521,10 @@ static void vvp_pgcache_page_show(const struct lu_env *env,
 
 	vpg = cl2vvp_page(cl_page_at(page, &vvp_device_type));
 	vmpage = vpg->vpg_page;
-	seq_printf(seq, " %5i | %p %p %s %s %s %s | %p "DFID"(%p) %lu %u [",
+	seq_printf(seq, " %5i | %p %p %s %s %s | %p " DFID "(%p) %lu %u [",
 		   0 /* gen */,
 		   vpg, page,
 		   "none",
-		   vpg->vpg_write_queued ? "wq" : "- ",
 		   vpg->vpg_defer_uptodate ? "du" : "- ",
 		   PageWriteback(vmpage) ? "wb" : "-",
 		   vmpage, PFID(ll_inode2fid(vmpage->mapping->host)),
diff --git a/drivers/staging/lustre/lustre/llite/vvp_internal.h b/drivers/staging/lustre/lustre/llite/vvp_internal.h
index 5802da8..47d035e 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_internal.h
+++ b/drivers/staging/lustre/lustre/llite/vvp_internal.h
@@ -209,14 +209,6 @@ struct vvp_object {
 	struct inode           *vob_inode;
 
 	/**
-	 * A list of dirty pages pending IO in the cache. Used by
-	 * SOM. Protected by ll_inode_info::lli_lock.
-	 *
-	 * \see vvp_page::vpg_pending_linkage
-	 */
-	struct list_head	vob_pending_list;
-
-	/**
 	 * Number of transient pages.  This is no longer protected by i_sem,
 	 * and needs to be atomic.  This is not actually used for anything,
 	 * and can probably be removed.
@@ -249,15 +241,7 @@ struct vvp_object {
 struct vvp_page {
 	struct cl_page_slice vpg_cl;
 	unsigned int	vpg_defer_uptodate:1,
-			vpg_ra_used:1,
-			vpg_write_queued:1;
-	/**
-	 * Non-empty iff this page is already counted in
-	 * vvp_object::vob_pending_list. This list is only used as a flag,
-	 * that is, never iterated through, only checked for list_empty(), but
-	 * having a list is useful for debugging.
-	 */
-	struct list_head	   vpg_pending_linkage;
+			vpg_ra_used:1;
 	/** VM page */
 	struct page	  *vpg_page;
 };
diff --git a/drivers/staging/lustre/lustre/llite/vvp_io.c b/drivers/staging/lustre/lustre/llite/vvp_io.c
index 2ab4503..dbc4c26 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_io.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_io.c
@@ -807,16 +807,11 @@ static int vvp_io_commit_sync(const struct lu_env *env, struct cl_io *io,
 static void write_commit_callback(const struct lu_env *env, struct cl_io *io,
 				  struct cl_page *page)
 {
-	struct vvp_page *vpg;
 	struct page *vmpage = page->cp_vmpage;
-	struct cl_object *clob = cl_io_top(io)->ci_obj;
 
 	SetPageUptodate(vmpage);
 	set_page_dirty(vmpage);
 
-	vpg = cl2vvp_page(cl_object_page_slice(clob, page));
-	vvp_write_pending(cl2vvp(clob), vpg);
-
 	cl_page_disown(env, io, page);
 
 	/* held in ll_cl_init() */
@@ -1051,13 +1046,7 @@ static int vvp_io_kernel_fault(struct vvp_fault_io *cfio)
 static void mkwrite_commit_callback(const struct lu_env *env, struct cl_io *io,
 				    struct cl_page *page)
 {
-	struct vvp_page *vpg;
-	struct cl_object *clob = cl_io_top(io)->ci_obj;
-
 	set_page_dirty(page->cp_vmpage);
-
-	vpg = cl2vvp_page(cl_object_page_slice(clob, page));
-	vvp_write_pending(cl2vvp(clob), vpg);
 }
 
 static int vvp_io_fault_start(const struct lu_env *env,
diff --git a/drivers/staging/lustre/lustre/llite/vvp_object.c b/drivers/staging/lustre/lustre/llite/vvp_object.c
index b57195d..3214885 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_object.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_object.c
@@ -65,8 +65,7 @@ static int vvp_object_print(const struct lu_env *env, void *cookie,
 	struct inode	 *inode = obj->vob_inode;
 	struct ll_inode_info *lli;
 
-	(*p)(env, cookie, "(%s %d %d) inode: %p ",
-	     list_empty(&obj->vob_pending_list) ? "-" : "+",
+	(*p)(env, cookie, "(%d %d) inode: %p ",
 	     atomic_read(&obj->vob_transient_pages),
 	     atomic_read(&obj->vob_mmap_cnt), inode);
 	if (inode) {
@@ -240,7 +239,6 @@ static int vvp_object_init(const struct lu_env *env, struct lu_object *obj,
 		const struct cl_object_conf *cconf;
 
 		cconf = lu2cl_conf(conf);
-		INIT_LIST_HEAD(&vob->vob_pending_list);
 		lu_object_add(obj, below);
 		result = vvp_object_init0(env, vob, cconf);
 	} else {
diff --git a/drivers/staging/lustre/lustre/llite/vvp_page.c b/drivers/staging/lustre/lustre/llite/vvp_page.c
index 5d79efc..68f8990 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_page.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_page.c
@@ -162,8 +162,6 @@ static void vvp_page_delete(const struct lu_env *env,
 	LASSERT((struct cl_page *)vmpage->private == page);
 	LASSERT(inode == vvp_object_inode(obj));
 
-	vvp_write_complete(cl2vvp(obj), cl2vvp_page(slice));
-
 	/* Drop the reference count held in vvp_page_init */
 	refc = atomic_dec_return(&page->cp_ref);
 	LASSERTF(refc >= 1, "page = %p, refc = %d\n", page, refc);
@@ -221,8 +219,6 @@ static int vvp_page_prep_write(const struct lu_env *env,
 	if (!pg->cp_sync_io)
 		set_page_writeback(vmpage);
 
-	vvp_write_pending(cl2vvp(slice->cpl_obj), cl2vvp_page(slice));
-
 	return 0;
 }
 
@@ -290,19 +286,6 @@ static void vvp_page_completion_write(const struct lu_env *env,
 
 	CL_PAGE_HEADER(D_PAGE, env, pg, "completing WRITE with %d\n", ioret);
 
-	/*
-	 * TODO: Actually it makes sense to add the page into oap pending
-	 * list again and so that we don't need to take the page out from
-	 * SoM write pending list, if we just meet a recoverable error,
-	 * -ENOMEM, etc.
-	 * To implement this, we just need to return a non zero value in
-	 * ->cpo_completion method. The underlying transfer should be notified
-	 * and then re-add the page into pending transfer queue.  -jay
-	 */
-
-	vpg->vpg_write_queued = 0;
-	vvp_write_complete(cl2vvp(slice->cpl_obj), vpg);
-
 	if (pg->cp_sync_io) {
 		LASSERT(PageLocked(vmpage));
 		LASSERT(!PageWriteback(vmpage));
@@ -344,7 +327,6 @@ static int vvp_page_make_ready(const struct lu_env *env,
 		LASSERT(pg->cp_state == CPS_CACHED);
 		/* This actually clears the dirty bit in the radix tree. */
 		set_page_writeback(vmpage);
-		vvp_write_pending(cl2vvp(slice->cpl_obj), cl2vvp_page(slice));
 		CL_PAGE_HEADER(D_PAGE, env, pg, "readied\n");
 	} else if (pg->cp_state == CPS_PAGEOUT) {
 		/* is it possible for osc_flush_async_page() to already
@@ -381,9 +363,8 @@ static int vvp_page_print(const struct lu_env *env,
 	struct vvp_page *vpg = cl2vvp_page(slice);
 	struct page     *vmpage = vpg->vpg_page;
 
-	(*printer)(env, cookie, LUSTRE_VVP_NAME "-page@%p(%d:%d:%d) vm@%p ",
-		   vpg, vpg->vpg_defer_uptodate, vpg->vpg_ra_used,
-		   vpg->vpg_write_queued, vmpage);
+	(*printer)(env, cookie, LUSTRE_VVP_NAME "-page@%p(%d:%d) vm@%p ",
+		   vpg, vpg->vpg_defer_uptodate, vpg->vpg_ra_used, vmpage);
 	if (vmpage) {
 		(*printer)(env, cookie, "%lx %d:%d %lx %lu %slru",
 			   (long)vmpage->flags, page_count(vmpage),
@@ -542,7 +523,6 @@ int vvp_page_init(const struct lu_env *env, struct cl_object *obj,
 	vpg->vpg_page = vmpage;
 	get_page(vmpage);
 
-	INIT_LIST_HEAD(&vpg->vpg_pending_linkage);
 	if (page->cp_type == CPT_CACHEABLE) {
 		/* in cache, decref in vvp_page_delete */
 		atomic_inc(&page->cp_ref);
diff --git a/drivers/staging/lustre/lustre/llite/vvp_req.c b/drivers/staging/lustre/lustre/llite/vvp_req.c
index e3f4c79..a8892e4 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_req.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_req.c
@@ -56,8 +56,6 @@ static inline struct vvp_req *cl2vvp_req(const struct cl_req_slice *slice)
  *
  *    - o_parent_ver
  *
- *    - o_ioepoch,
- *
  */
 static void vvp_req_attr_set(const struct lu_env *env,
 			     const struct cl_req_slice *slice,
@@ -72,14 +70,9 @@ static void vvp_req_attr_set(const struct lu_env *env,
 	inode = vvp_object_inode(obj);
 	valid_flags = OBD_MD_FLTYPE;
 
-	if (slice->crs_req->crq_type == CRT_WRITE) {
-		if (flags & OBD_MD_FLEPOCH) {
-			oa->o_valid |= OBD_MD_FLEPOCH;
-			oa->o_ioepoch = ll_i2info(inode)->lli_ioepoch;
-			valid_flags |= OBD_MD_FLMTIME | OBD_MD_FLCTIME |
-				       OBD_MD_FLUID | OBD_MD_FLGID;
-		}
-	}
+	if (slice->crs_req->crq_type == CRT_WRITE)
+		valid_flags |= OBD_MD_FLMTIME | OBD_MD_FLCTIME |
+			       OBD_MD_FLUID | OBD_MD_FLGID;
 	obdo_from_inode(oa, inode, valid_flags & flags);
 	obdo_set_parent_fid(oa, &ll_i2info(inode)->lli_fid);
 	if (OBD_FAIL_CHECK(OBD_FAIL_LFSCK_INVALID_PFID))
-- 
1.7.1
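
A note on the ll_put_super() hunk above: with sbi->ll_lcq gone, the patch
records success of client_common_fill_super() in a one-bit field and tests
that bit before calling client_common_put_super(). The sketch below is a
user-space illustration of that guard pattern only. Every name prefixed with
sketch_ is hypothetical, the struct is a stand-in for struct ll_sb_info, and
the error value is arbitrary; none of it is taken from the Lustre tree.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct ll_sb_info; only the flag matters here. */
struct sketch_sbi {
	unsigned int	umounting:1,
			fill_super_succeeded:1;
	int		teardown_calls;	/* how often teardown ran */
};

/* Stand-in for client_common_fill_super(): 0 on success, negative on error. */
static int sketch_common_fill_super(struct sketch_sbi *sbi, bool fail)
{
	(void)sbi;
	return fail ? -5 : 0;	/* arbitrary negative error code */
}

/* Stand-in for client_common_put_super(): valid only after a successful setup. */
static void sketch_common_put_super(struct sketch_sbi *sbi)
{
	assert(sbi->fill_super_succeeded);
	sbi->teardown_calls++;
}

/* Mirrors the ll_fill_super() hunk: set the bit only when setup returned 0. */
static int sketch_fill_super(struct sketch_sbi *sbi, bool fail)
{
	int err = sketch_common_fill_super(sbi, fail);

	if (!err)
		sbi->fill_super_succeeded = 1;
	return err;
}

/* Mirrors the ll_put_super() hunk: tear down only what was set up. */
static void sketch_put_super(struct sketch_sbi *sbi)
{
	if (sbi->fill_super_succeeded)
		sketch_common_put_super(sbi);
}
```

The explicit bit keeps the "did setup finish" question independent of any
particular pointer, which is presumably why it replaces the ll_lcq check once
the close thread is removed.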


* [lustre-devel] [PATCH 03/41] staging: lustre: llite: remove client Size on MDS support
@ 2016-10-03  2:27   ` James Simmons
  0 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:27 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, James Simmons

From: John L. Hammond <john.hammond@intel.com>

Size on MDS support has been in preview since at least 2.0.0. Remove
support for it from lustre/llite/.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6047
Reviewed-on: http://review.whamcloud.com/13126
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/Makefile       |    2 +-
 drivers/staging/lustre/lustre/llite/file.c         |   97 +-----
 drivers/staging/lustre/lustre/llite/glimpse.c      |  105 +++---
 drivers/staging/lustre/lustre/llite/llite_close.c  |  395 --------------------
 .../staging/lustre/lustre/llite/llite_internal.h   |   56 +---
 drivers/staging/lustre/lustre/llite/llite_lib.c    |  127 +------
 drivers/staging/lustre/lustre/llite/namei.c        |    8 -
 drivers/staging/lustre/lustre/llite/vvp_dev.c      |    3 +-
 drivers/staging/lustre/lustre/llite/vvp_internal.h |   18 +-
 drivers/staging/lustre/lustre/llite/vvp_io.c       |   11 -
 drivers/staging/lustre/lustre/llite/vvp_object.c   |    4 +-
 drivers/staging/lustre/lustre/llite/vvp_page.c     |   24 +--
 drivers/staging/lustre/lustre/llite/vvp_req.c      |   13 +-
 13 files changed, 89 insertions(+), 774 deletions(-)
 delete mode 100644 drivers/staging/lustre/lustre/llite/llite_close.c
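
A note on the enum lli_flags hunk in llite_internal.h: dropping the four SOM
flags lets the remaining three be renumbered from bits 4..6 down to bits 0..2
and written with BIT(). The sketch below shows the resulting values and the
usual set/clear/test operations on a 32-bit flags word. It is a user-space
illustration under assumptions: SKETCH_BIT() is a local stand-in for the
kernel's BIT() macro, all sketch_ names are hypothetical, and the locking
around lli_flags (lli_lock in the real code) is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel's BIT() macro. */
#define SKETCH_BIT(nr)	(1U << (nr))

/* Same three flags as the patched enum lli_flags, under hypothetical names. */
enum sketch_lli_flags {
	SKETCH_DATA_MODIFIED	= SKETCH_BIT(0),	/* file data is modified */
	SKETCH_FILE_RESTORING	= SKETCH_BIT(1),	/* file is being restored */
	SKETCH_XATTR_CACHE	= SKETCH_BIT(2),	/* xattr cache is attached */
};

/* After renumbering, the highest flag is bit 2, i.e. the value 4. */
static_assert(SKETCH_XATTR_CACHE == 4, "flags occupy bits 0..2");

static inline void sketch_flag_set(uint32_t *flags, enum sketch_lli_flags f)
{
	*flags |= (uint32_t)f;
}

static inline void sketch_flag_clear(uint32_t *flags, enum sketch_lli_flags f)
{
	*flags &= ~(uint32_t)f;
}

static inline bool sketch_flag_test(uint32_t flags, enum sketch_lli_flags f)
{
	return (flags & (uint32_t)f) != 0;
}
```

Renumbering is only safe if nothing outside the client depends on the old bit
positions. That appears to hold here because lli_flags lives in the in-memory
struct ll_inode_info, but it is an assumption worth checking against any
debugging tools that decode the field.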

diff --git a/drivers/staging/lustre/lustre/llite/Makefile b/drivers/staging/lustre/lustre/llite/Makefile
index 1ac0940..3690bee 100644
--- a/drivers/staging/lustre/lustre/llite/Makefile
+++ b/drivers/staging/lustre/lustre/llite/Makefile
@@ -1,5 +1,5 @@
 obj-$(CONFIG_LUSTRE_FS) += lustre.o
-lustre-y := dcache.o dir.o file.o llite_close.o llite_lib.o llite_nfs.o \
+lustre-y := dcache.o dir.o file.o llite_lib.o llite_nfs.o \
 	    rw.o namei.o symlink.o llite_mmap.o range_lock.o \
 	    xattr.o xattr_cache.o rw26.o super25.o statahead.o \
 	    glimpse.o lcommon_cl.o lcommon_misc.o \
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 6e3a188..b2058c6 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -86,7 +86,6 @@ void ll_pack_inode2opdata(struct inode *inode, struct md_op_data *op_data,
 	op_data->op_attr.ia_size = i_size_read(inode);
 	op_data->op_attr_blocks = inode->i_blocks;
 	op_data->op_attr_flags = ll_inode_to_ext_flags(inode->i_flags);
-	op_data->op_ioepoch = ll_i2info(inode)->lli_ioepoch;
 	if (fh)
 		op_data->op_handle = *fh;
 
@@ -95,8 +94,7 @@ void ll_pack_inode2opdata(struct inode *inode, struct md_op_data *op_data,
 }
 
 /**
- * Closes the IO epoch and packs all the attributes into @op_data for
- * the CLOSE rpc.
+ * Packs all the attributes into @op_data for the CLOSE rpc.
  */
 static void ll_prepare_close(struct inode *inode, struct md_op_data *op_data,
 			     struct obd_client_handle *och)
@@ -108,11 +106,7 @@ static void ll_prepare_close(struct inode *inode, struct md_op_data *op_data,
 	if (!(och->och_flags & FMODE_WRITE))
 		goto out;
 
-	if (!exp_connect_som(ll_i2mdexp(inode)) || !S_ISREG(inode->i_mode))
-		op_data->op_attr.ia_valid |= ATTR_SIZE | ATTR_BLOCKS;
-	else
-		ll_ioepoch_close(inode, op_data, &och, 0);
-
+	op_data->op_attr.ia_valid |= ATTR_SIZE | ATTR_BLOCKS;
 out:
 	ll_pack_inode2opdata(inode, op_data, &och->och_fh);
 	ll_prep_md_op_data(op_data, inode, NULL, NULL,
@@ -128,7 +122,6 @@ static int ll_close_inode_openhandle(struct obd_export *md_exp,
 	struct md_op_data *op_data;
 	struct ptlrpc_request *req = NULL;
 	struct obd_device *obd = class_exp2obd(exp);
-	int epoch_close = 1;
 	int rc;
 
 	if (!obd) {
@@ -157,22 +150,9 @@ static int ll_close_inode_openhandle(struct obd_export *md_exp,
 		op_data->op_lease_handle = och->och_lease_handle;
 		op_data->op_attr.ia_valid |= ATTR_SIZE | ATTR_BLOCKS;
 	}
-	epoch_close = op_data->op_flags & MF_EPOCH_CLOSE;
+
 	rc = md_close(md_exp, op_data, och->och_mod, &req);
-	if (rc == -EAGAIN) {
-		/* This close must have the epoch closed. */
-		LASSERT(epoch_close);
-		/* MDS has instructed us to obtain Size-on-MDS attribute from
-		 * OSTs and send setattr to back to MDS.
-		 */
-		rc = ll_som_update(inode, op_data);
-		if (rc) {
-			CERROR("%s: inode "DFID" mdc Size-on-MDS update failed: rc = %d\n",
-			       ll_i2mdexp(inode)->exp_obd->obd_name,
-			       PFID(ll_inode2fid(inode)), rc);
-			rc = 0;
-		}
-	} else if (rc) {
+	if (rc) {
 		CERROR("%s: inode "DFID" mdc close failed: rc = %d\n",
 		       ll_i2mdexp(inode)->exp_obd->obd_name,
 		       PFID(ll_inode2fid(inode)), rc);
@@ -200,15 +180,10 @@ static int ll_close_inode_openhandle(struct obd_export *md_exp,
 	ll_finish_md_op_data(op_data);
 
 out:
-	if (exp_connect_som(exp) && !epoch_close &&
-	    S_ISREG(inode->i_mode) && (och->och_flags & FMODE_WRITE)) {
-		ll_queue_done_writing(inode, LLIF_DONE_WRITING);
-	} else {
-		md_clear_open_replay_data(md_exp, och);
-		/* Free @och if it is not waiting for DONE_WRITING. */
-		och->och_fh.cookie = DEAD_HANDLE_MAGIC;
-		kfree(och);
-	}
+	md_clear_open_replay_data(md_exp, och);
+	och->och_fh.cookie = DEAD_HANDLE_MAGIC;
+	kfree(och);
+
 	if (req) /* This is close request */
 		ptlrpc_req_finished(req);
 	return rc;
@@ -437,20 +412,6 @@ out:
 	return rc;
 }
 
-/**
- * Assign an obtained @ioepoch to client's inode. No lock is needed, MDS does
- * not believe attributes if a few ioepoch holders exist. Attributes for
- * previous ioepoch if new one is opened are also skipped by MDS.
- */
-void ll_ioepoch_open(struct ll_inode_info *lli, __u64 ioepoch)
-{
-	if (ioepoch && lli->lli_ioepoch != ioepoch) {
-		lli->lli_ioepoch = ioepoch;
-		CDEBUG(D_INODE, "Epoch %llu opened on "DFID"\n",
-		       ioepoch, PFID(&lli->lli_fid));
-	}
-}
-
 static int ll_och_fill(struct obd_export *md_exp, struct lookup_intent *it,
 		       struct obd_client_handle *och)
 {
@@ -470,23 +431,17 @@ static int ll_local_open(struct file *file, struct lookup_intent *it,
 			 struct ll_file_data *fd, struct obd_client_handle *och)
 {
 	struct inode *inode = file_inode(file);
-	struct ll_inode_info *lli = ll_i2info(inode);
 
 	LASSERT(!LUSTRE_FPRIVATE(file));
 
 	LASSERT(fd);
 
 	if (och) {
-		struct mdt_body *body;
 		int rc;
 
 		rc = ll_och_fill(ll_i2sbi(inode)->ll_md_exp, it, och);
 		if (rc != 0)
 			return rc;
-
-		body = req_capsule_server_get(&it->it_request->rq_pill,
-					      &RMF_MDT_BODY);
-		ll_ioepoch_open(lli, body->mbo_ioepoch);
 	}
 
 	LUSTRE_FPRIVATE(file) = fd;
@@ -912,7 +867,7 @@ static int ll_lease_close(struct obd_client_handle *och, struct inode *inode,
 
 /* Fills the obdo with the attributes for the lsm */
 static int ll_lsm_getattr(struct lov_stripe_md *lsm, struct obd_export *exp,
-			  struct obdo *obdo, __u64 ioepoch, int dv_flags)
+			  struct obdo *obdo, int dv_flags)
 {
 	struct ptlrpc_request_set *set;
 	struct obd_info	    oinfo = { };
@@ -924,13 +879,11 @@ static int ll_lsm_getattr(struct lov_stripe_md *lsm, struct obd_export *exp,
 	oinfo.oi_oa = obdo;
 	oinfo.oi_oa->o_oi = lsm->lsm_oi;
 	oinfo.oi_oa->o_mode = S_IFREG;
-	oinfo.oi_oa->o_ioepoch = ioepoch;
 	oinfo.oi_oa->o_valid = OBD_MD_FLID | OBD_MD_FLTYPE |
 			       OBD_MD_FLSIZE | OBD_MD_FLBLOCKS |
 			       OBD_MD_FLBLKSZ | OBD_MD_FLATIME |
 			       OBD_MD_FLMTIME | OBD_MD_FLCTIME |
-			       OBD_MD_FLGROUP | OBD_MD_FLEPOCH |
-			       OBD_MD_FLDATAVERSION;
+			       OBD_MD_FLGROUP | OBD_MD_FLDATAVERSION;
 	if (dv_flags & (LL_DV_WR_FLUSH | LL_DV_RD_FLUSH)) {
 		oinfo.oi_oa->o_valid |= OBD_MD_FLFLAGS;
 		oinfo.oi_oa->o_flags |= OBD_FL_SRVLOCK;
@@ -961,32 +914,6 @@ static int ll_lsm_getattr(struct lov_stripe_md *lsm, struct obd_export *exp,
 	return rc;
 }
 
-/**
-  * Performs the getattr on the inode and updates its fields.
-  * If @sync != 0, perform the getattr under the server-side lock.
-  */
-int ll_inode_getattr(struct inode *inode, struct obdo *obdo,
-		     __u64 ioepoch, int sync)
-{
-	struct lov_stripe_md *lsm;
-	int rc;
-
-	lsm = ccc_inode_lsm_get(inode);
-	rc = ll_lsm_getattr(lsm, ll_i2dtexp(inode),
-			    obdo, ioepoch, sync ? LL_DV_RD_FLUSH : 0);
-	if (rc == 0) {
-		struct ost_id *oi = lsm ? &lsm->lsm_oi : &obdo->o_oi;
-
-		obdo_refresh_inode(inode, obdo, obdo->o_valid);
-		CDEBUG(D_INODE, "objid " DOSTID " size %llu, blocks %llu, blksize %lu\n",
-		       POSTID(oi), i_size_read(inode),
-		       (unsigned long long)inode->i_blocks,
-		       1UL << inode->i_blkbits);
-	}
-	ccc_inode_lsm_put(inode, lsm);
-	return rc;
-}
-
 int ll_merge_attr(const struct lu_env *env, struct inode *inode)
 {
 	struct ll_inode_info *lli = ll_i2info(inode);
@@ -1049,7 +976,7 @@ int ll_glimpse_ioctl(struct ll_sb_info *sbi, struct lov_stripe_md *lsm,
 	struct obdo obdo = { 0 };
 	int rc;
 
-	rc = ll_lsm_getattr(lsm, sbi->ll_dt_exp, &obdo, 0, 0);
+	rc = ll_lsm_getattr(lsm, sbi->ll_dt_exp, &obdo, 0);
 	if (rc == 0) {
 		st->st_size   = obdo.o_size;
 		st->st_blocks = obdo.o_blocks;
@@ -1823,7 +1750,7 @@ int ll_data_version(struct inode *inode, __u64 *data_version, int flags)
 		goto out;
 	}
 
-	rc = ll_lsm_getattr(lsm, sbi->ll_dt_exp, obdo, 0, flags);
+	rc = ll_lsm_getattr(lsm, sbi->ll_dt_exp, obdo, flags);
 	if (rc == 0) {
 		if (!(obdo->o_valid & OBD_MD_FLDATAVERSION))
 			rc = -EOPNOTSUPP;
diff --git a/drivers/staging/lustre/lustre/llite/glimpse.c b/drivers/staging/lustre/lustre/llite/glimpse.c
index 22507b9..0d1ffad 100644
--- a/drivers/staging/lustre/lustre/llite/glimpse.c
+++ b/drivers/staging/lustre/lustre/llite/glimpse.c
@@ -82,65 +82,62 @@ int cl_glimpse_lock(const struct lu_env *env, struct cl_io *io,
 {
 	struct ll_inode_info *lli   = ll_i2info(inode);
 	const struct lu_fid  *fid   = lu_object_fid(&clob->co_lu);
-	int result;
+	int result = 0;
 
-	result = 0;
-	if (!(lli->lli_flags & LLIF_MDS_SIZE_LOCK)) {
-		CDEBUG(D_DLMTRACE, "Glimpsing inode " DFID "\n", PFID(fid));
-		if (lli->lli_has_smd) {
-			struct cl_lock *lock = vvp_env_lock(env);
-			struct cl_lock_descr *descr = &lock->cll_descr;
-
-			/* NOTE: this looks like DLM lock request, but it may
-			 *       not be one. Due to CEF_ASYNC flag (translated
-			 *       to LDLM_FL_HAS_INTENT by osc), this is
-			 *       glimpse request, that won't revoke any
-			 *       conflicting DLM locks held. Instead,
-			 *       ll_glimpse_callback() will be called on each
-			 *       client holding a DLM lock against this file,
-			 *       and resulting size will be returned for each
-			 *       stripe. DLM lock on [0, EOF] is acquired only
-			 *       if there were no conflicting locks. If there
-			 *       were conflicting locks, enqueuing or waiting
-			 *       fails with -ENAVAIL, but valid inode
-			 *       attributes are returned anyway.
-			 */
-			*descr = whole_file;
-			descr->cld_obj   = clob;
-			descr->cld_mode  = CLM_READ;
-			descr->cld_enq_flags = CEF_ASYNC | CEF_MUST;
-			if (agl)
-				descr->cld_enq_flags |= CEF_AGL;
-			/*
-			 * CEF_ASYNC is used because glimpse sub-locks cannot
-			 * deadlock (because they never conflict with other
-			 * locks) and, hence, can be enqueued out-of-order.
-			 *
-			 * CEF_MUST protects glimpse lock from conversion into
-			 * a lockless mode.
-			 */
-			result = cl_lock_request(env, io, lock);
-			if (result < 0)
-				return result;
-
-			if (!agl) {
-				ll_merge_attr(env, inode);
-				if (i_size_read(inode) > 0 &&
-				    inode->i_blocks == 0) {
-					/*
-					 * LU-417: Add dirty pages block count
-					 * lest i_blocks reports 0, some "cp" or
-					 * "tar" may think it's a completely
-					 * sparse file and skip it.
-					 */
-					inode->i_blocks = dirty_cnt(inode);
-				}
-			}
-			cl_lock_release(env, lock);
-		} else {
-			CDEBUG(D_DLMTRACE, "No objects for inode\n");
+	CDEBUG(D_DLMTRACE, "Glimpsing inode " DFID "\n", PFID(fid));
+	if (lli->lli_has_smd) {
+		struct cl_lock *lock = vvp_env_lock(env);
+		struct cl_lock_descr *descr = &lock->cll_descr;
+
+		/* NOTE: this looks like DLM lock request, but it may
+		 *       not be one. Due to CEF_ASYNC flag (translated
+		 *       to LDLM_FL_HAS_INTENT by osc), this is
+		 *       glimpse request, that won't revoke any
+		 *       conflicting DLM locks held. Instead,
+		 *       ll_glimpse_callback() will be called on each
+		 *       client holding a DLM lock against this file,
+		 *       and resulting size will be returned for each
+		 *       stripe. DLM lock on [0, EOF] is acquired only
+		 *       if there were no conflicting locks. If there
+		 *       were conflicting locks, enqueuing or waiting
+		 *       fails with -ENAVAIL, but valid inode
+		 *       attributes are returned anyway.
+		 */
+		*descr = whole_file;
+		descr->cld_obj = clob;
+		descr->cld_mode = CLM_READ;
+		descr->cld_enq_flags = CEF_ASYNC | CEF_MUST;
+		if (agl)
+			descr->cld_enq_flags |= CEF_AGL;
+		/*
+		 * CEF_ASYNC is used because glimpse sub-locks cannot
+		 * deadlock (because they never conflict with other
+		 * locks) and, hence, can be enqueued out-of-order.
+		 *
+		 * CEF_MUST protects glimpse lock from conversion into
+		 * a lockless mode.
+		 */
+		result = cl_lock_request(env, io, lock);
+		if (result < 0)
+			return result;
+
+		if (!agl) {
 			ll_merge_attr(env, inode);
+			if (i_size_read(inode) > 0 && !inode->i_blocks) {
+				/*
+				 * LU-417: Add dirty pages block count
+				 * lest i_blocks reports 0, some "cp" or
+				 * "tar" may think it's a completely
+				 * sparse file and skip it.
+				 */
+				inode->i_blocks = dirty_cnt(inode);
+			}
 		}
+
+		cl_lock_release(env, lock);
+	} else {
+		CDEBUG(D_DLMTRACE, "No objects for inode\n");
+		ll_merge_attr(env, inode);
 	}
 
 	return result;
diff --git a/drivers/staging/lustre/lustre/llite/llite_close.c b/drivers/staging/lustre/lustre/llite/llite_close.c
deleted file mode 100644
index 8644631..0000000
--- a/drivers/staging/lustre/lustre/llite/llite_close.c
+++ /dev/null
@@ -1,395 +0,0 @@
-/*
- * GPL HEADER START
- *
- * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 only,
- * as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License version 2 for more details (a copy is included
- * in the LICENSE file that accompanied this code).
- *
- * You should have received a copy of the GNU General Public License
- * version 2 along with this program; If not, see
- * http://www.gnu.org/licenses/gpl-2.0.html
- *
- * GPL HEADER END
- */
-/*
- * Copyright (c) 2003, 2010, Oracle and/or its affiliates. All rights reserved.
- * Use is subject to license terms.
- *
- * Copyright (c) 2011, 2012, Intel Corporation.
- */
-/*
- * This file is part of Lustre, http://www.lustre.org/
- * Lustre is a trademark of Sun Microsystems, Inc.
- *
- * lustre/llite/llite_close.c
- *
- * Lustre Lite routines to issue a secondary close after writeback
- */
-
-#include <linux/module.h>
-
-#define DEBUG_SUBSYSTEM S_LLITE
-
-#include "llite_internal.h"
-
-/** records that a write is in flight */
-void vvp_write_pending(struct vvp_object *club, struct vvp_page *page)
-{
-	struct ll_inode_info *lli = ll_i2info(club->vob_inode);
-
-	spin_lock(&lli->lli_lock);
-	lli->lli_flags |= LLIF_SOM_DIRTY;
-	if (page && list_empty(&page->vpg_pending_linkage))
-		list_add(&page->vpg_pending_linkage, &club->vob_pending_list);
-	spin_unlock(&lli->lli_lock);
-}
-
-/** records that a write has completed */
-void vvp_write_complete(struct vvp_object *club, struct vvp_page *page)
-{
-	struct ll_inode_info *lli = ll_i2info(club->vob_inode);
-	int rc = 0;
-
-	spin_lock(&lli->lli_lock);
-	if (page && !list_empty(&page->vpg_pending_linkage)) {
-		list_del_init(&page->vpg_pending_linkage);
-		rc = 1;
-	}
-	spin_unlock(&lli->lli_lock);
-	if (rc)
-		ll_queue_done_writing(club->vob_inode, 0);
-}
-
-/** Queues DONE_WRITING if
- * - done writing is allowed;
- * - inode has no no dirty pages;
- */
-void ll_queue_done_writing(struct inode *inode, unsigned long flags)
-{
-	struct ll_inode_info *lli = ll_i2info(inode);
-	struct vvp_object *club = cl2vvp(ll_i2info(inode)->lli_clob);
-
-	spin_lock(&lli->lli_lock);
-	lli->lli_flags |= flags;
-
-	if ((lli->lli_flags & LLIF_DONE_WRITING) &&
-	    list_empty(&club->vob_pending_list)) {
-		struct ll_close_queue *lcq = ll_i2sbi(inode)->ll_lcq;
-
-		if (lli->lli_flags & LLIF_MDS_SIZE_LOCK)
-			CWARN("%s: file "DFID"(flags %u) Size-on-MDS valid, done writing allowed and no diry pages\n",
-			      ll_get_fsname(inode->i_sb, NULL, 0),
-			      PFID(ll_inode2fid(inode)), lli->lli_flags);
-		/* DONE_WRITING is allowed and inode has no dirty page. */
-		spin_lock(&lcq->lcq_lock);
-
-		LASSERT(list_empty(&lli->lli_close_list));
-		CDEBUG(D_INODE, "adding inode "DFID" to close list\n",
-		       PFID(ll_inode2fid(inode)));
-		list_add_tail(&lli->lli_close_list, &lcq->lcq_head);
-
-		/* Avoid a concurrent insertion into the close thread queue:
-		 * an inode is already in the close thread, open(), write(),
-		 * close() happen, epoch is closed as the inode is marked as
-		 * LLIF_EPOCH_PENDING. When pages are written inode should not
-		 * be inserted into the queue again, clear this flag to avoid
-		 * it.
-		 */
-		lli->lli_flags &= ~LLIF_DONE_WRITING;
-
-		wake_up(&lcq->lcq_waitq);
-		spin_unlock(&lcq->lcq_lock);
-	}
-	spin_unlock(&lli->lli_lock);
-}
-
-/** Pack SOM attributes info @opdata for CLOSE, DONE_WRITING rpc. */
-void ll_done_writing_attr(struct inode *inode, struct md_op_data *op_data)
-{
-	struct ll_inode_info *lli = ll_i2info(inode);
-
-	op_data->op_flags |= MF_SOM_CHANGE;
-	/* Check if Size-on-MDS attributes are valid. */
-	if (lli->lli_flags & LLIF_MDS_SIZE_LOCK)
-		CERROR("%s: inode "DFID"(flags %u) MDS holds lock on Size-on-MDS attributes\n",
-		       ll_get_fsname(inode->i_sb, NULL, 0),
-		       PFID(ll_inode2fid(inode)), lli->lli_flags);
-
-	if (!cl_local_size(inode)) {
-		/* Send Size-on-MDS Attributes if valid. */
-		op_data->op_attr.ia_valid |= ATTR_MTIME_SET | ATTR_CTIME_SET |
-				ATTR_ATIME_SET | ATTR_SIZE | ATTR_BLOCKS;
-	}
-}
-
-/** Closes ioepoch and packs Size-on-MDS attribute if needed into @op_data. */
-void ll_ioepoch_close(struct inode *inode, struct md_op_data *op_data,
-		      struct obd_client_handle **och, unsigned long flags)
-{
-	struct ll_inode_info *lli = ll_i2info(inode);
-	struct vvp_object *club = cl2vvp(ll_i2info(inode)->lli_clob);
-
-	spin_lock(&lli->lli_lock);
-	if (!(list_empty(&club->vob_pending_list))) {
-		if (!(lli->lli_flags & LLIF_EPOCH_PENDING)) {
-			LASSERT(*och);
-			LASSERT(!lli->lli_pending_och);
-			/* Inode is dirty and there is no pending write done
-			 * request yet, DONE_WRITE is to be sent later.
-			 */
-			lli->lli_flags |= LLIF_EPOCH_PENDING;
-			lli->lli_pending_och = *och;
-			spin_unlock(&lli->lli_lock);
-
-			inode = igrab(inode);
-			LASSERT(inode);
-			goto out;
-		}
-		if (flags & LLIF_DONE_WRITING) {
-			/* Some pages are still dirty, it is early to send
-			 * DONE_WRITE. Wait until all pages will be flushed
-			 * and try DONE_WRITE again later.
-			 */
-			LASSERT(!(lli->lli_flags & LLIF_DONE_WRITING));
-			lli->lli_flags |= LLIF_DONE_WRITING;
-			spin_unlock(&lli->lli_lock);
-
-			inode = igrab(inode);
-			LASSERT(inode);
-			goto out;
-		}
-	}
-	CDEBUG(D_INODE, "Epoch %llu closed on "DFID"\n",
-	       ll_i2info(inode)->lli_ioepoch, PFID(&lli->lli_fid));
-	op_data->op_flags |= MF_EPOCH_CLOSE;
-
-	if (flags & LLIF_DONE_WRITING) {
-		LASSERT(lli->lli_flags & LLIF_SOM_DIRTY);
-		LASSERT(!(lli->lli_flags & LLIF_DONE_WRITING));
-		*och = lli->lli_pending_och;
-		lli->lli_pending_och = NULL;
-		lli->lli_flags &= ~LLIF_EPOCH_PENDING;
-	} else {
-		/* Pack Size-on-MDS inode attributes only if they has changed */
-		if (!(lli->lli_flags & LLIF_SOM_DIRTY)) {
-			spin_unlock(&lli->lli_lock);
-			goto out;
-		}
-
-		/* There is a pending DONE_WRITE -- close epoch with no
-		 * attribute change.
-		 */
-		if (lli->lli_flags & LLIF_EPOCH_PENDING) {
-			spin_unlock(&lli->lli_lock);
-			goto out;
-		}
-	}
-
-	LASSERT(list_empty(&club->vob_pending_list));
-	lli->lli_flags &= ~LLIF_SOM_DIRTY;
-	spin_unlock(&lli->lli_lock);
-	ll_done_writing_attr(inode, op_data);
-
-out:
-	return;
-}
-
-/**
- * Cliens updates SOM attributes on MDS (including llog cookies):
- * obd_getattr with no lock and md_setattr.
- */
-int ll_som_update(struct inode *inode, struct md_op_data *op_data)
-{
-	struct ll_inode_info *lli = ll_i2info(inode);
-	struct ptlrpc_request *request = NULL;
-	__u32 old_flags;
-	struct obdo *oa;
-	int rc;
-
-	LASSERT(op_data);
-	if (lli->lli_flags & LLIF_MDS_SIZE_LOCK)
-		CERROR("%s: inode "DFID"(flags %u) MDS holds lock on Size-on-MDS attributes\n",
-		       ll_get_fsname(inode->i_sb, NULL, 0),
-		       PFID(ll_inode2fid(inode)), lli->lli_flags);
-
-	oa = kmem_cache_zalloc(obdo_cachep, GFP_NOFS);
-	if (!oa) {
-		CERROR("can't allocate memory for Size-on-MDS update.\n");
-		return -ENOMEM;
-	}
-
-	old_flags = op_data->op_flags;
-	op_data->op_flags = MF_SOM_CHANGE;
-
-	/* If inode is already in another epoch, skip getattr from OSTs. */
-	if (lli->lli_ioepoch == op_data->op_ioepoch) {
-		rc = ll_inode_getattr(inode, oa, op_data->op_ioepoch,
-				      old_flags & MF_GETATTR_LOCK);
-		if (rc) {
-			oa->o_valid = 0;
-			if (rc != -ENOENT)
-				CERROR("%s: inode_getattr failed - unable to send a Size-on-MDS attribute update for inode "DFID": rc = %d\n",
-				       ll_get_fsname(inode->i_sb, NULL, 0),
-				       PFID(ll_inode2fid(inode)), rc);
-		} else {
-			CDEBUG(D_INODE, "Size-on-MDS update on "DFID"\n",
-			       PFID(&lli->lli_fid));
-		}
-		/* Install attributes into op_data. */
-		md_from_obdo(op_data, oa, oa->o_valid);
-	}
-
-	rc = md_setattr(ll_i2sbi(inode)->ll_md_exp, op_data,
-			NULL, 0, NULL, 0, &request, NULL);
-	ptlrpc_req_finished(request);
-
-	kmem_cache_free(obdo_cachep, oa);
-	return rc;
-}
-
-/**
- * Closes the ioepoch and packs all the attributes into @op_data for
- * DONE_WRITING rpc.
- */
-static void ll_prepare_done_writing(struct inode *inode,
-				    struct md_op_data *op_data,
-				    struct obd_client_handle **och)
-{
-	ll_ioepoch_close(inode, op_data, och, LLIF_DONE_WRITING);
-	/* If there is no @och, we do not do D_W yet. */
-	if (!*och)
-		return;
-
-	ll_pack_inode2opdata(inode, op_data, &(*och)->och_fh);
-	ll_prep_md_op_data(op_data, inode, NULL, NULL,
-			   0, 0, LUSTRE_OPC_ANY, NULL);
-}
-
-/** Send a DONE_WRITING rpc. */
-static void ll_done_writing(struct inode *inode)
-{
-	struct obd_client_handle *och = NULL;
-	struct md_op_data *op_data;
-	int rc;
-
-	LASSERT(exp_connect_som(ll_i2mdexp(inode)));
-
-	op_data = kzalloc(sizeof(*op_data), GFP_NOFS);
-	if (!op_data)
-		return;
-
-	ll_prepare_done_writing(inode, op_data, &och);
-	/* If there is no @och, we do not do D_W yet. */
-	if (!och)
-		goto out;
-
-	rc = md_done_writing(ll_i2sbi(inode)->ll_md_exp, op_data, NULL);
-	if (rc == -EAGAIN)
-		/* MDS has instructed us to obtain Size-on-MDS attribute from
-		 * OSTs and send setattr to back to MDS.
-		 */
-		rc = ll_som_update(inode, op_data);
-	else if (rc) {
-		CERROR("%s: inode "DFID" mdc done_writing failed: rc = %d\n",
-		       ll_get_fsname(inode->i_sb, NULL, 0),
-		       PFID(ll_inode2fid(inode)), rc);
-	}
-out:
-	ll_finish_md_op_data(op_data);
-	if (och) {
-		md_clear_open_replay_data(ll_i2sbi(inode)->ll_md_exp, och);
-		kfree(och);
-	}
-}
-
-static struct ll_inode_info *ll_close_next_lli(struct ll_close_queue *lcq)
-{
-	struct ll_inode_info *lli = NULL;
-
-	spin_lock(&lcq->lcq_lock);
-
-	if (!list_empty(&lcq->lcq_head)) {
-		lli = list_entry(lcq->lcq_head.next, struct ll_inode_info,
-				 lli_close_list);
-		list_del_init(&lli->lli_close_list);
-	} else if (atomic_read(&lcq->lcq_stop)) {
-		lli = ERR_PTR(-EALREADY);
-	}
-
-	spin_unlock(&lcq->lcq_lock);
-	return lli;
-}
-
-static int ll_close_thread(void *arg)
-{
-	struct ll_close_queue *lcq = arg;
-
-	complete(&lcq->lcq_comp);
-
-	while (1) {
-		struct l_wait_info lwi = { 0 };
-		struct ll_inode_info *lli;
-		struct inode *inode;
-
-		l_wait_event_exclusive(lcq->lcq_waitq,
-				       (lli = ll_close_next_lli(lcq)) != NULL,
-				       &lwi);
-		if (IS_ERR(lli))
-			break;
-
-		inode = ll_info2i(lli);
-		CDEBUG(D_INFO, "done_writing for inode "DFID"\n",
-		       PFID(ll_inode2fid(inode)));
-		ll_done_writing(inode);
-		iput(inode);
-	}
-
-	CDEBUG(D_INFO, "ll_close exiting\n");
-	complete(&lcq->lcq_comp);
-	return 0;
-}
-
-int ll_close_thread_start(struct ll_close_queue **lcq_ret)
-{
-	struct ll_close_queue *lcq;
-	struct task_struct *task;
-
-	if (OBD_FAIL_CHECK(OBD_FAIL_LDLM_CLOSE_THREAD))
-		return -EINTR;
-
-	lcq = kzalloc(sizeof(*lcq), GFP_NOFS);
-	if (!lcq)
-		return -ENOMEM;
-
-	spin_lock_init(&lcq->lcq_lock);
-	INIT_LIST_HEAD(&lcq->lcq_head);
-	init_waitqueue_head(&lcq->lcq_waitq);
-	init_completion(&lcq->lcq_comp);
-
-	task = kthread_run(ll_close_thread, lcq, "ll_close");
-	if (IS_ERR(task)) {
-		kfree(lcq);
-		return PTR_ERR(task);
-	}
-
-	wait_for_completion(&lcq->lcq_comp);
-	*lcq_ret = lcq;
-	return 0;
-}
-
-void ll_close_thread_shutdown(struct ll_close_queue *lcq)
-{
-	init_completion(&lcq->lcq_comp);
-	atomic_inc(&lcq->lcq_stop);
-	wake_up(&lcq->lcq_waitq);
-	wait_for_completion(&lcq->lcq_comp);
-	kfree(lcq);
-}
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 3e98bd6..cd89926 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -98,28 +98,17 @@ struct ll_grouplock {
 };
 
 enum lli_flags {
-	/* MDS has an authority for the Size-on-MDS attributes. */
-	LLIF_MDS_SIZE_LOCK      = (1 << 0),
-	/* Epoch close is postponed. */
-	LLIF_EPOCH_PENDING      = (1 << 1),
-	/* DONE WRITING is allowed. */
-	LLIF_DONE_WRITING       = (1 << 2),
-	/* Sizeon-on-MDS attributes are changed. An attribute update needs to
-	 * be sent to MDS.
-	 */
-	LLIF_SOM_DIRTY	  = (1 << 3),
 	/* File data is modified. */
-	LLIF_DATA_MODIFIED      = (1 << 4),
+	LLIF_DATA_MODIFIED	= BIT(0),
 	/* File is being restored */
-	LLIF_FILE_RESTORING	= (1 << 5),
+	LLIF_FILE_RESTORING	= BIT(1),
 	/* Xattr cache is attached to the file */
-	LLIF_XATTR_CACHE	= (1 << 6),
+	LLIF_XATTR_CACHE	= BIT(2),
 };
 
 struct ll_inode_info {
 	__u32				lli_inode_magic;
 	__u32				lli_flags;
-	__u64				lli_ioepoch;
 
 	spinlock_t			lli_lock;
 	struct posix_acl		*lli_posix_acl;
@@ -129,14 +118,6 @@ struct ll_inode_info {
 	/* master inode fid for stripe directory */
 	struct lu_fid		   lli_pfid;
 
-	struct list_head	      lli_close_list;
-
-	/* handle is to be sent to MDS later on done_writing and setattr.
-	 * Open handle data are needed for the recovery to reconstruct
-	 * the inode state on the MDS. XXX: recovery is not ready yet.
-	 */
-	struct obd_client_handle       *lli_pending_och;
-
 	/* We need all three because every inode may be opened in different
 	 * modes
 	 */
@@ -400,7 +381,7 @@ enum stats_track_type {
 #define LL_SBI_LOCALFLOCK       0x200 /* Local flocks support by kernel */
 #define LL_SBI_LRU_RESIZE       0x400 /* lru resize support */
 #define LL_SBI_LAZYSTATFS       0x800 /* lazystatfs mount option */
-#define LL_SBI_SOM_PREVIEW     0x1000 /* SOM preview mount option */
+/*	LL_SBI_SOM_PREVIEW     0x1000    SOM preview mount option, obsolete */
 #define LL_SBI_32BIT_API       0x2000 /* generate 32 bit inodes. */
 #define LL_SBI_64BIT_HASH      0x4000 /* support 64-bits dir hash/offset */
 #define LL_SBI_AGL_ENABLED     0x8000 /* enable agl */
@@ -466,10 +447,10 @@ struct ll_sb_info {
 
 	int		       ll_flags;
 	unsigned int		  ll_umounting:1,
-				  ll_xattr_cache_enabled:1;
-	struct lustre_client_ocd  ll_lco;
+				  ll_xattr_cache_enabled:1,
+				  ll_client_common_fill_super_succeeded:1;
 
-	struct ll_close_queue    *ll_lcq;
+	struct lustre_client_ocd  ll_lco;
 
 	struct lprocfs_stats     *ll_stats; /* lprocfs stats counter */
 
@@ -764,15 +745,8 @@ int ll_file_open(struct inode *inode, struct file *file);
 int ll_file_release(struct inode *inode, struct file *file);
 int ll_glimpse_ioctl(struct ll_sb_info *sbi,
 		     struct lov_stripe_md *lsm, lstat_t *st);
-void ll_ioepoch_open(struct ll_inode_info *lli, __u64 ioepoch);
 int ll_release_openhandle(struct inode *, struct lookup_intent *);
 int ll_md_real_close(struct inode *inode, fmode_t fmode);
-void ll_ioepoch_close(struct inode *inode, struct md_op_data *op_data,
-		      struct obd_client_handle **och, unsigned long flags);
-void ll_done_writing_attr(struct inode *inode, struct md_op_data *op_data);
-int ll_som_update(struct inode *inode, struct md_op_data *op_data);
-int ll_inode_getattr(struct inode *inode, struct obdo *obdo,
-		     __u64 ioepoch, int sync);
 void ll_pack_inode2opdata(struct inode *inode, struct md_op_data *op_data,
 			  struct lustre_handle *fh);
 int ll_getattr(struct vfsmount *mnt, struct dentry *de, struct kstat *stat);
@@ -891,18 +865,6 @@ int ll_dir_get_parent_fid(struct inode *dir, struct lu_fid *parent_fid);
 /* llite/symlink.c */
 extern const struct inode_operations ll_fast_symlink_inode_operations;
 
-/* llite/llite_close.c */
-struct ll_close_queue {
-	spinlock_t		lcq_lock;
-	struct list_head		lcq_head;
-	wait_queue_head_t		lcq_waitq;
-	struct completion	lcq_comp;
-	atomic_t		lcq_stop;
-};
-
-void vvp_write_pending(struct vvp_object *club, struct vvp_page *page);
-void vvp_write_complete(struct vvp_object *club, struct vvp_page *page);
-
 /**
  * IO arguments for various VFS I/O interfaces.
  */
@@ -956,10 +918,6 @@ static inline struct vvp_io_args *ll_env_args(const struct lu_env *env,
 	return via;
 }
 
-void ll_queue_done_writing(struct inode *inode, unsigned long flags);
-void ll_close_thread_shutdown(struct ll_close_queue *lcq);
-int ll_close_thread_start(struct ll_close_queue **lcq_ret);
-
 /* llite/llite_mmap.c */
 
 int ll_teardown_mmaps(struct address_space *mapping, __u64 first, __u64 last);
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 6bb41b0..4f83275 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -193,9 +193,6 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt,
 				  OBD_CONNECT_OPEN_BY_FID |
 				  OBD_CONNECT_DIR_STRIPE;
 
-	if (sbi->ll_flags & LL_SBI_SOM_PREVIEW)
-		data->ocd_connect_flags |= OBD_CONNECT_SOM;
-
 	if (sbi->ll_flags & LL_SBI_LRU_RESIZE)
 		data->ocd_connect_flags |= OBD_CONNECT_LRU_RESIZE;
 #ifdef CONFIG_FS_POSIX_ACL
@@ -357,9 +354,6 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt,
 				  OBD_CONNECT_JOBSTATS | OBD_CONNECT_LVB_TYPE |
 				  OBD_CONNECT_LAYOUTLOCK | OBD_CONNECT_PINGLESS;
 
-	if (sbi->ll_flags & LL_SBI_SOM_PREVIEW)
-		data->ocd_connect_flags |= OBD_CONNECT_SOM;
-
 	if (!OBD_FAIL_CHECK(OBD_FAIL_OSC_CONNECT_CKSUM)) {
 		/* OBD_CONNECT_CKSUM should always be set, even if checksums are
 		 * disabled by default, because it can still be enabled on the
@@ -488,12 +482,6 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt,
 		goto out_root;
 	}
 
-	err = ll_close_thread_start(&sbi->ll_lcq);
-	if (err) {
-		CERROR("cannot start close thread: rc %d\n", err);
-		goto out_root;
-	}
-
 	checksum = sbi->ll_flags & LL_SBI_CHECKSUM;
 	err = obd_set_info_async(NULL, sbi->ll_dt_exp, sizeof(KEY_CHECKSUM),
 				 KEY_CHECKSUM, sizeof(checksum), &checksum,
@@ -633,8 +621,6 @@ static void client_common_put_super(struct super_block *sb)
 {
 	struct ll_sb_info *sbi = ll_s2sbi(sb);
 
-	ll_close_thread_shutdown(sbi->ll_lcq);
-
 	cl_sb_fini(sb);
 
 	obd_fid_fini(sbi->ll_dt_exp->exp_obd);
@@ -766,11 +752,6 @@ static int ll_options(char *options, int *flags)
 			*flags &= ~tmp;
 			goto next;
 		}
-		tmp = ll_set_opt("som_preview", s1, LL_SBI_SOM_PREVIEW);
-		if (tmp) {
-			*flags |= tmp;
-			goto next;
-		}
 		tmp = ll_set_opt("32bitapi", s1, LL_SBI_32BIT_API);
 		if (tmp) {
 			*flags |= tmp;
@@ -804,14 +785,11 @@ void ll_lli_init(struct ll_inode_info *lli)
 {
 	lli->lli_inode_magic = LLI_INODE_MAGIC;
 	lli->lli_flags = 0;
-	lli->lli_ioepoch = 0;
 	lli->lli_maxbytes = MAX_LFS_FILESIZE;
 	spin_lock_init(&lli->lli_lock);
 	lli->lli_posix_acl = NULL;
 	/* Do not set lli_fid, it has been initialized already. */
 	fid_zero(&lli->lli_pfid);
-	INIT_LIST_HEAD(&lli->lli_close_list);
-	lli->lli_pending_och = NULL;
 	lli->lli_mds_read_och = NULL;
 	lli->lli_mds_write_och = NULL;
 	lli->lli_mds_exec_och = NULL;
@@ -941,6 +919,8 @@ int ll_fill_super(struct super_block *sb, struct vfsmount *mnt)
 
 	/* connections, registrations, sb setup */
 	err = client_common_fill_super(sb, md, dt, mnt);
+	if (!err)
+		sbi->ll_client_common_fill_super_succeeded = 1;
 
 out_free:
 	kfree(md);
@@ -1002,7 +982,7 @@ void ll_put_super(struct super_block *sb)
 		}
 	}
 
-	if (sbi->ll_lcq) {
+	if (sbi->ll_client_common_fill_super_succeeded) {
 		/* Only if client_common_fill_super succeeded */
 		client_common_put_super(sb);
 	}
@@ -1272,9 +1252,6 @@ void ll_clear_inode(struct inode *inode)
 		LASSERT(lli->lli_opendir_pid == 0);
 	}
 
-	spin_lock(&lli->lli_lock);
-	ll_i2info(inode)->lli_flags &= ~LLIF_MDS_SIZE_LOCK;
-	spin_unlock(&lli->lli_lock);
 	md_null_inode(sbi->ll_md_exp, ll_inode2fid(inode));
 
 	LASSERT(!lli->lli_open_fd_write_count);
@@ -1369,48 +1346,12 @@ static int ll_md_setattr(struct dentry *dentry, struct md_op_data *op_data,
 	rc = simple_setattr(dentry, &op_data->op_attr);
 	op_data->op_attr.ia_valid = ia_valid;
 
-	/* Extract epoch data if obtained. */
-	op_data->op_handle = md.body->mbo_handle;
-	op_data->op_ioepoch = md.body->mbo_ioepoch;
-
 	rc = ll_update_inode(inode, &md);
 	ptlrpc_req_finished(request);
 
 	return rc;
 }
 
-/* Close IO epoch and send Size-on-MDS attribute update. */
-static int ll_setattr_done_writing(struct inode *inode,
-				   struct md_op_data *op_data,
-				   struct md_open_data *mod)
-{
-	struct ll_inode_info *lli = ll_i2info(inode);
-	int rc = 0;
-
-	if (!S_ISREG(inode->i_mode))
-		return 0;
-
-	CDEBUG(D_INODE, "Epoch %llu closed on "DFID" for truncate\n",
-	       op_data->op_ioepoch, PFID(&lli->lli_fid));
-
-	op_data->op_flags = MF_EPOCH_CLOSE;
-	ll_done_writing_attr(inode, op_data);
-	ll_pack_inode2opdata(inode, op_data, NULL);
-
-	rc = md_done_writing(ll_i2sbi(inode)->ll_md_exp, op_data, mod);
-	if (rc == -EAGAIN)
-		/* MDS has instructed us to obtain Size-on-MDS attribute
-		 * from OSTs and send setattr to back to MDS.
-		 */
-		rc = ll_som_update(inode, op_data);
-	else if (rc) {
-		CERROR("%s: inode "DFID" mdc truncate failed: rc = %d\n",
-		       ll_i2sbi(inode)->ll_md_exp->exp_obd->obd_name,
-		       PFID(ll_inode2fid(inode)), rc);
-	}
-	return rc;
-}
-
 /* If this inode has objects allocated to it (lsm != NULL), then the OST
  * object(s) determine the file size and mtime.  Otherwise, the MDS will
  * keep these values until such a time that objects are allocated for it.
@@ -1433,7 +1374,7 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import)
 	struct md_op_data *op_data = NULL;
 	struct md_open_data *mod = NULL;
 	bool file_is_released = false;
-	int rc = 0, rc1 = 0;
+	int rc = 0;
 
 	CDEBUG(D_VFSTRACE, "%s: setattr inode "DFID"(%p) from %llu to %llu, valid %x, hsm_import %d\n",
 	       ll_get_fsname(inode->i_sb, NULL, 0), PFID(&lli->lli_fid), inode,
@@ -1536,11 +1477,6 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import)
 
 	memcpy(&op_data->op_attr, attr, sizeof(*attr));
 
-	/* Open epoch for truncate. */
-	if (exp_connect_som(ll_i2mdexp(inode)) && !hsm_import &&
-	    (attr->ia_valid & (ATTR_SIZE | ATTR_MTIME | ATTR_MTIME_SET)))
-		op_data->op_flags = MF_EPOCH_OPEN;
-
 	rc = ll_md_setattr(dentry, op_data, &mod);
 	if (rc)
 		goto out;
@@ -1552,7 +1488,6 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import)
 		spin_unlock(&lli->lli_lock);
 	}
 
-	ll_ioepoch_open(lli, op_data->op_ioepoch);
 	if (!S_ISREG(inode->i_mode) || file_is_released) {
 		rc = 0;
 		goto out;
@@ -1575,12 +1510,8 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import)
 			up_write(&lli->lli_trunc_sem);
 	}
 out:
-	if (op_data->op_ioepoch) {
-		rc1 = ll_setattr_done_writing(inode, op_data, mod);
-		if (!rc)
-			rc = rc1;
-	}
-	ll_finish_md_op_data(op_data);
+	if (op_data)
+		ll_finish_md_op_data(op_data);
 
 	if (!S_ISDIR(inode->i_mode)) {
 		inode_lock(inode);
@@ -1828,48 +1759,11 @@ int ll_update_inode(struct inode *inode, struct lustre_md *md)
 	LASSERT(fid_seq(&lli->lli_fid) != 0);
 
 	if (body->mbo_valid & OBD_MD_FLSIZE) {
-		if (exp_connect_som(ll_i2mdexp(inode)) &&
-		    S_ISREG(inode->i_mode)) {
-			struct lustre_handle lockh;
-			enum ldlm_mode mode;
-
-			/* As it is possible a blocking ast has been processed
-			 * by this time, we need to check there is an UPDATE
-			 * lock on the client and set LLIF_MDS_SIZE_LOCK holding
-			 * it.
-			 */
-			mode = ll_take_md_lock(inode, MDS_INODELOCK_UPDATE,
-					       &lockh, LDLM_FL_CBPENDING,
-					       LCK_CR | LCK_CW |
-					       LCK_PR | LCK_PW);
-			if (mode) {
-				if (lli->lli_flags & (LLIF_DONE_WRITING |
-						      LLIF_EPOCH_PENDING |
-						      LLIF_SOM_DIRTY)) {
-					CERROR("%s: inode "DFID" flags %u still has size authority! do not trust the size got from MDS\n",
-					       sbi->ll_md_exp->exp_obd->obd_name,
-					       PFID(ll_inode2fid(inode)),
-					       lli->lli_flags);
-				} else {
-					/* Use old size assignment to avoid
-					 * deadlock bz14138 & bz14326
-					 */
-					i_size_write(inode, body->mbo_size);
-					spin_lock(&lli->lli_lock);
-					lli->lli_flags |= LLIF_MDS_SIZE_LOCK;
-					spin_unlock(&lli->lli_lock);
-				}
-				ldlm_lock_decref(&lockh, mode);
-			}
-		} else {
-			/* Use old size assignment to avoid
-			 * deadlock bz14138 & bz14326
-			 */
-			i_size_write(inode, body->mbo_size);
+		i_size_write(inode, body->mbo_size);
 
-			CDEBUG(D_VFSTRACE, "inode=%lu, updating i_size %llu\n",
-			       inode->i_ino, (unsigned long long)body->mbo_size);
-		}
+		CDEBUG(D_VFSTRACE, "inode=" DFID ", updating i_size %llu\n",
+		       PFID(ll_inode2fid(inode)),
+		       (unsigned long long)body->mbo_size);
 
 		if (body->mbo_valid & OBD_MD_FLBLOCKS)
 			inode->i_blocks = body->mbo_blocks;
@@ -2164,7 +2058,6 @@ void ll_open_cleanup(struct super_block *sb, struct ptlrpc_request *open_req)
 		return;
 
 	op_data->op_fid1 = body->mbo_fid1;
-	op_data->op_ioepoch = body->mbo_ioepoch;
 	op_data->op_handle = body->mbo_handle;
 	op_data->op_mod_time = get_seconds();
 	md_close(exp, op_data, NULL, &close_req);
diff --git a/drivers/staging/lustre/lustre/llite/namei.c b/drivers/staging/lustre/lustre/llite/namei.c
index dfa36d3..9cc4bb4 100644
--- a/drivers/staging/lustre/lustre/llite/namei.c
+++ b/drivers/staging/lustre/lustre/llite/namei.c
@@ -254,14 +254,6 @@ int ll_md_blocking_ast(struct ldlm_lock *lock, struct ldlm_lock_desc *desc,
 				       PFID(ll_inode2fid(inode)), rc);
 		}
 
-		if (bits & MDS_INODELOCK_UPDATE) {
-			struct ll_inode_info *lli = ll_i2info(inode);
-
-			spin_lock(&lli->lli_lock);
-			lli->lli_flags &= ~LLIF_MDS_SIZE_LOCK;
-			spin_unlock(&lli->lli_lock);
-		}
-
 		if ((bits & MDS_INODELOCK_UPDATE) && S_ISDIR(inode->i_mode)) {
 			struct ll_inode_info *lli = ll_i2info(inode);
 
diff --git a/drivers/staging/lustre/lustre/llite/vvp_dev.c b/drivers/staging/lustre/lustre/llite/vvp_dev.c
index 8aa8ecc..cab95ac 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_dev.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_dev.c
@@ -521,11 +521,10 @@ static void vvp_pgcache_page_show(const struct lu_env *env,
 
 	vpg = cl2vvp_page(cl_page_at(page, &vvp_device_type));
 	vmpage = vpg->vpg_page;
-	seq_printf(seq, " %5i | %p %p %s %s %s %s | %p "DFID"(%p) %lu %u [",
+	seq_printf(seq, " %5i | %p %p %s %s %s | %p " DFID "(%p) %lu %u [",
 		   0 /* gen */,
 		   vpg, page,
 		   "none",
-		   vpg->vpg_write_queued ? "wq" : "- ",
 		   vpg->vpg_defer_uptodate ? "du" : "- ",
 		   PageWriteback(vmpage) ? "wb" : "-",
 		   vmpage, PFID(ll_inode2fid(vmpage->mapping->host)),
diff --git a/drivers/staging/lustre/lustre/llite/vvp_internal.h b/drivers/staging/lustre/lustre/llite/vvp_internal.h
index 5802da8..47d035e 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_internal.h
+++ b/drivers/staging/lustre/lustre/llite/vvp_internal.h
@@ -209,14 +209,6 @@ struct vvp_object {
 	struct inode           *vob_inode;
 
 	/**
-	 * A list of dirty pages pending IO in the cache. Used by
-	 * SOM. Protected by ll_inode_info::lli_lock.
-	 *
-	 * \see vvp_page::vpg_pending_linkage
-	 */
-	struct list_head	vob_pending_list;
-
-	/**
 	 * Number of transient pages.  This is no longer protected by i_sem,
 	 * and needs to be atomic.  This is not actually used for anything,
 	 * and can probably be removed.
@@ -249,15 +241,7 @@ struct vvp_object {
 struct vvp_page {
 	struct cl_page_slice vpg_cl;
 	unsigned int	vpg_defer_uptodate:1,
-			vpg_ra_used:1,
-			vpg_write_queued:1;
-	/**
-	 * Non-empty iff this page is already counted in
-	 * vvp_object::vob_pending_list. This list is only used as a flag,
-	 * that is, never iterated through, only checked for list_empty(), but
-	 * having a list is useful for debugging.
-	 */
-	struct list_head	   vpg_pending_linkage;
+			vpg_ra_used:1;
 	/** VM page */
 	struct page	  *vpg_page;
 };
diff --git a/drivers/staging/lustre/lustre/llite/vvp_io.c b/drivers/staging/lustre/lustre/llite/vvp_io.c
index 2ab4503..dbc4c26 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_io.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_io.c
@@ -807,16 +807,11 @@ static int vvp_io_commit_sync(const struct lu_env *env, struct cl_io *io,
 static void write_commit_callback(const struct lu_env *env, struct cl_io *io,
 				  struct cl_page *page)
 {
-	struct vvp_page *vpg;
 	struct page *vmpage = page->cp_vmpage;
-	struct cl_object *clob = cl_io_top(io)->ci_obj;
 
 	SetPageUptodate(vmpage);
 	set_page_dirty(vmpage);
 
-	vpg = cl2vvp_page(cl_object_page_slice(clob, page));
-	vvp_write_pending(cl2vvp(clob), vpg);
-
 	cl_page_disown(env, io, page);
 
 	/* held in ll_cl_init() */
@@ -1051,13 +1046,7 @@ static int vvp_io_kernel_fault(struct vvp_fault_io *cfio)
 static void mkwrite_commit_callback(const struct lu_env *env, struct cl_io *io,
 				    struct cl_page *page)
 {
-	struct vvp_page *vpg;
-	struct cl_object *clob = cl_io_top(io)->ci_obj;
-
 	set_page_dirty(page->cp_vmpage);
-
-	vpg = cl2vvp_page(cl_object_page_slice(clob, page));
-	vvp_write_pending(cl2vvp(clob), vpg);
 }
 
 static int vvp_io_fault_start(const struct lu_env *env,
diff --git a/drivers/staging/lustre/lustre/llite/vvp_object.c b/drivers/staging/lustre/lustre/llite/vvp_object.c
index b57195d..3214885 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_object.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_object.c
@@ -65,8 +65,7 @@ static int vvp_object_print(const struct lu_env *env, void *cookie,
 	struct inode	 *inode = obj->vob_inode;
 	struct ll_inode_info *lli;
 
-	(*p)(env, cookie, "(%s %d %d) inode: %p ",
-	     list_empty(&obj->vob_pending_list) ? "-" : "+",
+	(*p)(env, cookie, "(%d %d) inode: %p ",
 	     atomic_read(&obj->vob_transient_pages),
 	     atomic_read(&obj->vob_mmap_cnt), inode);
 	if (inode) {
@@ -240,7 +239,6 @@ static int vvp_object_init(const struct lu_env *env, struct lu_object *obj,
 		const struct cl_object_conf *cconf;
 
 		cconf = lu2cl_conf(conf);
-		INIT_LIST_HEAD(&vob->vob_pending_list);
 		lu_object_add(obj, below);
 		result = vvp_object_init0(env, vob, cconf);
 	} else {
diff --git a/drivers/staging/lustre/lustre/llite/vvp_page.c b/drivers/staging/lustre/lustre/llite/vvp_page.c
index 5d79efc..68f8990 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_page.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_page.c
@@ -162,8 +162,6 @@ static void vvp_page_delete(const struct lu_env *env,
 	LASSERT((struct cl_page *)vmpage->private == page);
 	LASSERT(inode == vvp_object_inode(obj));
 
-	vvp_write_complete(cl2vvp(obj), cl2vvp_page(slice));
-
 	/* Drop the reference count held in vvp_page_init */
 	refc = atomic_dec_return(&page->cp_ref);
 	LASSERTF(refc >= 1, "page = %p, refc = %d\n", page, refc);
@@ -221,8 +219,6 @@ static int vvp_page_prep_write(const struct lu_env *env,
 	if (!pg->cp_sync_io)
 		set_page_writeback(vmpage);
 
-	vvp_write_pending(cl2vvp(slice->cpl_obj), cl2vvp_page(slice));
-
 	return 0;
 }
 
@@ -290,19 +286,6 @@ static void vvp_page_completion_write(const struct lu_env *env,
 
 	CL_PAGE_HEADER(D_PAGE, env, pg, "completing WRITE with %d\n", ioret);
 
-	/*
-	 * TODO: Actually it makes sense to add the page into oap pending
-	 * list again and so that we don't need to take the page out from
-	 * SoM write pending list, if we just meet a recoverable error,
-	 * -ENOMEM, etc.
-	 * To implement this, we just need to return a non zero value in
-	 * ->cpo_completion method. The underlying transfer should be notified
-	 * and then re-add the page into pending transfer queue.  -jay
-	 */
-
-	vpg->vpg_write_queued = 0;
-	vvp_write_complete(cl2vvp(slice->cpl_obj), vpg);
-
 	if (pg->cp_sync_io) {
 		LASSERT(PageLocked(vmpage));
 		LASSERT(!PageWriteback(vmpage));
@@ -344,7 +327,6 @@ static int vvp_page_make_ready(const struct lu_env *env,
 		LASSERT(pg->cp_state == CPS_CACHED);
 		/* This actually clears the dirty bit in the radix tree. */
 		set_page_writeback(vmpage);
-		vvp_write_pending(cl2vvp(slice->cpl_obj), cl2vvp_page(slice));
 		CL_PAGE_HEADER(D_PAGE, env, pg, "readied\n");
 	} else if (pg->cp_state == CPS_PAGEOUT) {
 		/* is it possible for osc_flush_async_page() to already
@@ -381,9 +363,8 @@ static int vvp_page_print(const struct lu_env *env,
 	struct vvp_page *vpg = cl2vvp_page(slice);
 	struct page     *vmpage = vpg->vpg_page;
 
-	(*printer)(env, cookie, LUSTRE_VVP_NAME "-page@%p(%d:%d:%d) vm@%p ",
-		   vpg, vpg->vpg_defer_uptodate, vpg->vpg_ra_used,
-		   vpg->vpg_write_queued, vmpage);
+	(*printer)(env, cookie, LUSTRE_VVP_NAME "-page@%p(%d:%d) vm@%p ",
+		   vpg, vpg->vpg_defer_uptodate, vpg->vpg_ra_used, vmpage);
 	if (vmpage) {
 		(*printer)(env, cookie, "%lx %d:%d %lx %lu %slru",
 			   (long)vmpage->flags, page_count(vmpage),
@@ -542,7 +523,6 @@ int vvp_page_init(const struct lu_env *env, struct cl_object *obj,
 	vpg->vpg_page = vmpage;
 	get_page(vmpage);
 
-	INIT_LIST_HEAD(&vpg->vpg_pending_linkage);
 	if (page->cp_type == CPT_CACHEABLE) {
 		/* in cache, decref in vvp_page_delete */
 		atomic_inc(&page->cp_ref);
diff --git a/drivers/staging/lustre/lustre/llite/vvp_req.c b/drivers/staging/lustre/lustre/llite/vvp_req.c
index e3f4c79..a8892e4 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_req.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_req.c
@@ -56,8 +56,6 @@ static inline struct vvp_req *cl2vvp_req(const struct cl_req_slice *slice)
  *
  *    - o_parent_ver
  *
- *    - o_ioepoch,
- *
  */
 static void vvp_req_attr_set(const struct lu_env *env,
 			     const struct cl_req_slice *slice,
@@ -72,14 +70,9 @@ static void vvp_req_attr_set(const struct lu_env *env,
 	inode = vvp_object_inode(obj);
 	valid_flags = OBD_MD_FLTYPE;
 
-	if (slice->crs_req->crq_type == CRT_WRITE) {
-		if (flags & OBD_MD_FLEPOCH) {
-			oa->o_valid |= OBD_MD_FLEPOCH;
-			oa->o_ioepoch = ll_i2info(inode)->lli_ioepoch;
-			valid_flags |= OBD_MD_FLMTIME | OBD_MD_FLCTIME |
-				       OBD_MD_FLUID | OBD_MD_FLGID;
-		}
-	}
+	if (slice->crs_req->crq_type == CRT_WRITE)
+		valid_flags |= OBD_MD_FLMTIME | OBD_MD_FLCTIME |
+			       OBD_MD_FLUID | OBD_MD_FLGID;
 	obdo_from_inode(oa, inode, valid_flags & flags);
 	obdo_set_parent_fid(oa, &ll_i2info(inode)->lli_fid);
 	if (OBD_FAIL_CHECK(OBD_FAIL_LFSCK_INVALID_PFID))
-- 
1.7.1

* [PATCH 04/41] staging: lustre: obd: remove client Size on MDS support
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, James Simmons

From: John L. Hammond <john.hammond@intel.com>

Remove the unused OBD MD API method md_done_writing() together with
its lmv and mdc implementations. Remove the unused logcookie (ea2,
ea2len) and struct md_open_data ** parameters from md_setattr().
Remove the unused functions iattr_from_obdo(), md_from_obdo(), and
obdo_refresh_inode().

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6047
Reviewed-on: http://review.whamcloud.com/13169
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../lustre/lustre/include/lustre/lustre_idl.h      |    2 -
 drivers/staging/lustre/lustre/include/obd.h        |    7 +--
 drivers/staging/lustre/lustre/include/obd_class.h  |   22 +----
 drivers/staging/lustre/lustre/llite/dir.c          |    3 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c    |   12 +--
 drivers/staging/lustre/lustre/lmv/lmv_obd.c        |   31 +------
 drivers/staging/lustre/lustre/lov/lov_request.c    |   14 ---
 drivers/staging/lustre/lustre/mdc/mdc_internal.h   |    5 +-
 drivers/staging/lustre/lustre/mdc/mdc_lib.c        |   20 +----
 drivers/staging/lustre/lustre/mdc/mdc_reint.c      |   58 ++-----------
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |   93 +-------------------
 drivers/staging/lustre/lustre/obdclass/Makefile    |    2 +-
 .../lustre/lustre/obdclass/linux/linux-obdo.c      |   80 -----------------
 drivers/staging/lustre/lustre/obdclass/obdo.c      |   65 --------------
 14 files changed, 26 insertions(+), 388 deletions(-)
 delete mode 100644 drivers/staging/lustre/lustre/obdclass/linux/linux-obdo.c
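
Note for reviewers (not part of the commit): the effect of the narrower
md_setattr() prototype on callers can be sketched in user space as below.
This is only an illustration under stated assumptions. The struct bodies,
exp_ops, demo_setattr(), and demo_md_ops are hypothetical stand-ins, not
the kernel definitions, and the real wrapper dispatches through the
EXP_CHECK_MD_OP() and MDP() macros rather than a plain pointer check.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real layouts. */
struct obd_export;
struct md_op_data { int op_placeholder; };
struct ptlrpc_request { int rq_status; };

/* Method table with the five-argument setattr used after this patch. */
struct md_ops {
	int (*setattr)(struct obd_export *, struct md_op_data *, void *,
		       size_t, struct ptlrpc_request **);
};

struct obd_export { const struct md_ops *exp_ops; };

static struct ptlrpc_request demo_req;

/* Hypothetical backend: records ealen so a caller can observe the call. */
static int demo_setattr(struct obd_export *exp, struct md_op_data *op_data,
			void *ea, size_t ealen,
			struct ptlrpc_request **request)
{
	(void)exp;
	(void)op_data;
	(void)ea;
	demo_req.rq_status = (int)ealen;
	*request = &demo_req;
	return 0;
}

static const struct md_ops demo_md_ops = { .setattr = demo_setattr };

/* Wrapper: no ea2/ea2len and no struct md_open_data ** to thread through. */
static int md_setattr(struct obd_export *exp, struct md_op_data *op_data,
		      void *ea, size_t ealen, struct ptlrpc_request **request)
{
	assert(request != NULL);
	if (!exp->exp_ops || !exp->exp_ops->setattr)
		return -EOPNOTSUPP;
	return exp->exp_ops->setattr(exp, op_data, ea, ealen, request);
}
```

A caller that used to pass "NULL, 0, &req, NULL" for the trailing
arguments now passes just "&req", which matches the call-site changes in
dir.c and llite_lib.c in this patch.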

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index 72eaee9..d164545 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -2049,8 +2049,6 @@ enum md_op_flags {
 	MF_GET_MDT_IDX	  = (1 << 9),
 };
 
-#define MF_SOM_LOCAL_FLAGS (MF_SOM_CHANGE | MF_EPOCH_OPEN | MF_EPOCH_CLOSE)
-
 #define LUSTRE_BFLAG_UNCOMMITTED_WRITES   0x1
 
 /* these should be identical to their EXT4_*_FL counterparts, they are
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index f6fc4dd..51d5487 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -789,8 +789,6 @@ struct md_op_data {
 	__u64		   op_valid;
 	loff_t		  op_attr_blocks;
 
-	/* Size-on-MDS epoch and flags. */
-	__u64		   op_ioepoch;
 	__u32		   op_flags;
 
 	/* Various operation flags. */
@@ -992,8 +990,6 @@ struct md_ops {
 	int (*create)(struct obd_export *, struct md_op_data *,
 		      const void *, size_t, umode_t, uid_t, gid_t,
 		      cfs_cap_t, __u64, struct ptlrpc_request **);
-	int (*done_writing)(struct obd_export *, struct md_op_data  *,
-			    struct md_open_data *);
 	int (*enqueue)(struct obd_export *, struct ldlm_enqueue_info *,
 		       const ldlm_policy_data_t *,
 		       struct lookup_intent *, struct md_op_data *,
@@ -1012,8 +1008,7 @@ struct md_ops {
 		      const char *, size_t, const char *, size_t,
 		      struct ptlrpc_request **);
 	int (*setattr)(struct obd_export *, struct md_op_data *, void *,
-		       size_t, void *, size_t, struct ptlrpc_request **,
-			 struct md_open_data **mod);
+		       size_t, struct ptlrpc_request **);
 	int (*sync)(struct obd_export *, const struct lu_fid *,
 		    struct ptlrpc_request **);
 	int (*read_page)(struct obd_export *, struct md_op_data *,
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index 16094db..2ea102d 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -269,10 +269,8 @@ static inline int lprocfs_climp_check(struct obd_device *obd)
 struct inode;
 struct lu_attr;
 struct obdo;
-void obdo_refresh_inode(struct inode *dst, const struct obdo *src, u32 valid);
 
 void obdo_to_ioobj(const struct obdo *oa, struct obd_ioobj *ioobj);
-void md_from_obdo(struct md_op_data *op_data, const struct obdo *oa, u32 valid);
 
 #define OBT(dev)	(dev)->obd_type
 #define OBP(dev, op)    (dev)->obd_type->typ_dt_ops->op
@@ -1346,18 +1344,6 @@ static inline int md_create(struct obd_export *exp, struct md_op_data *op_data,
 	return rc;
 }
 
-static inline int md_done_writing(struct obd_export *exp,
-				  struct md_op_data *op_data,
-				  struct md_open_data *mod)
-{
-	int rc;
-
-	EXP_CHECK_MD_OP(exp, done_writing);
-	EXP_MD_COUNTER_INCREMENT(exp, done_writing);
-	rc = MDP(exp->exp_obd, done_writing)(exp, op_data, mod);
-	return rc;
-}
-
 static inline int md_enqueue(struct obd_export *exp,
 			     struct ldlm_enqueue_info *einfo,
 			     const ldlm_policy_data_t *policy,
@@ -1428,16 +1414,14 @@ static inline int md_rename(struct obd_export *exp, struct md_op_data *op_data,
 }
 
 static inline int md_setattr(struct obd_export *exp, struct md_op_data *op_data,
-			     void *ea, size_t ealen, void *ea2, size_t ea2len,
-			     struct ptlrpc_request **request,
-			     struct md_open_data **mod)
+			     void *ea, size_t ealen,
+			     struct ptlrpc_request **request)
 {
 	int rc;
 
 	EXP_CHECK_MD_OP(exp, setattr);
 	EXP_MD_COUNTER_INCREMENT(exp, setattr);
-	rc = MDP(exp->exp_obd, setattr)(exp, op_data, ea, ealen,
-					ea2, ea2len, request, mod);
+	rc = MDP(exp->exp_obd, setattr)(exp, op_data, ea, ealen, request);
 	return rc;
 }
 
diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index 7f32a53..3641327 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -501,8 +501,7 @@ int ll_dir_setstripe(struct inode *inode, struct lov_user_md *lump,
 		return PTR_ERR(op_data);
 
 	/* swabbing is done in lov_setstripe() on server side */
-	rc = md_setattr(sbi->ll_md_exp, op_data, lump, lum_size,
-			NULL, 0, &req, NULL);
+	rc = md_setattr(sbi->ll_md_exp, op_data, lump, lum_size, &req);
 	ll_finish_md_op_data(op_data);
 	ptlrpc_req_finished(req);
 	if (rc) {
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 4f83275..e75ab2f 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -1295,8 +1295,7 @@ void ll_clear_inode(struct inode *inode)
 
 #define TIMES_SET_FLAGS (ATTR_MTIME_SET | ATTR_ATIME_SET | ATTR_TIMES_SET)
 
-static int ll_md_setattr(struct dentry *dentry, struct md_op_data *op_data,
-			 struct md_open_data **mod)
+static int ll_md_setattr(struct dentry *dentry, struct md_op_data *op_data)
 {
 	struct lustre_md md;
 	struct inode *inode = d_inode(dentry);
@@ -1309,8 +1308,7 @@ static int ll_md_setattr(struct dentry *dentry, struct md_op_data *op_data,
 	if (IS_ERR(op_data))
 		return PTR_ERR(op_data);
 
-	rc = md_setattr(sbi->ll_md_exp, op_data, NULL, 0, NULL, 0,
-			&request, mod);
+	rc = md_setattr(sbi->ll_md_exp, op_data, NULL, 0, &request);
 	if (rc) {
 		ptlrpc_req_finished(request);
 		if (rc == -ENOENT) {
@@ -1372,7 +1370,6 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import)
 	struct inode *inode = d_inode(dentry);
 	struct ll_inode_info *lli = ll_i2info(inode);
 	struct md_op_data *op_data = NULL;
-	struct md_open_data *mod = NULL;
 	bool file_is_released = false;
 	int rc = 0;
 
@@ -1477,7 +1474,7 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import)
 
 	memcpy(&op_data->op_attr, attr, sizeof(*attr));
 
-	rc = ll_md_setattr(dentry, op_data, &mod);
+	rc = ll_md_setattr(dentry, op_data);
 	if (rc)
 		goto out;
 
@@ -1896,8 +1893,7 @@ int ll_iocontrol(struct inode *inode, struct file *file,
 
 		op_data->op_attr_flags = flags;
 		op_data->op_attr.ia_valid |= ATTR_ATTR_FLAG;
-		rc = md_setattr(sbi->ll_md_exp, op_data,
-				NULL, 0, NULL, 0, &req, NULL);
+		rc = md_setattr(sbi->ll_md_exp, op_data, NULL, 0, &req);
 		ll_finish_md_op_data(op_data);
 		ptlrpc_req_finished(req);
 		if (rc)
diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index 7dbb2b9..b401ffb 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -1728,27 +1728,6 @@ static int lmv_create(struct obd_export *exp, struct md_op_data *op_data,
 	return rc;
 }
 
-static int lmv_done_writing(struct obd_export *exp,
-			    struct md_op_data *op_data,
-			    struct md_open_data *mod)
-{
-	struct obd_device     *obd = exp->exp_obd;
-	struct lmv_obd	*lmv = &obd->u.lmv;
-	struct lmv_tgt_desc   *tgt;
-	int		    rc;
-
-	rc = lmv_check_connect(obd);
-	if (rc)
-		return rc;
-
-	tgt = lmv_find_target(lmv, &op_data->op_fid1);
-	if (IS_ERR(tgt))
-		return PTR_ERR(tgt);
-
-	rc = md_done_writing(tgt->ltd_exp, op_data, mod);
-	return rc;
-}
-
 static int
 lmv_enqueue(struct obd_export *exp, struct ldlm_enqueue_info *einfo,
 	    const ldlm_policy_data_t *policy,
@@ -2065,9 +2044,7 @@ static int lmv_rename(struct obd_export *exp, struct md_op_data *op_data,
 }
 
 static int lmv_setattr(struct obd_export *exp, struct md_op_data *op_data,
-		       void *ea, size_t ealen, void *ea2, size_t ea2len,
-		       struct ptlrpc_request **request,
-		       struct md_open_data **mod)
+		       void *ea, size_t ealen, struct ptlrpc_request **request)
 {
 	struct obd_device       *obd = exp->exp_obd;
 	struct lmv_obd	  *lmv = &obd->u.lmv;
@@ -2086,10 +2063,7 @@ static int lmv_setattr(struct obd_export *exp, struct md_op_data *op_data,
 	if (IS_ERR(tgt))
 		return PTR_ERR(tgt);
 
-	rc = md_setattr(tgt->ltd_exp, op_data, ea, ealen, ea2,
-			ea2len, request, mod);
-
-	return rc;
+	return md_setattr(tgt->ltd_exp, op_data, ea, ealen, request);
 }
 
 static int lmv_sync(struct obd_export *exp, const struct lu_fid *fid,
@@ -3363,7 +3337,6 @@ static struct md_ops lmv_md_ops = {
 	.null_inode		= lmv_null_inode,
 	.close			= lmv_close,
 	.create			= lmv_create,
-	.done_writing		= lmv_done_writing,
 	.enqueue		= lmv_enqueue,
 	.getattr		= lmv_getattr,
 	.getxattr		= lmv_getxattr,
diff --git a/drivers/staging/lustre/lustre/lov/lov_request.c b/drivers/staging/lustre/lustre/lov/lov_request.c
index 09dcaf4..8e40702 100644
--- a/drivers/staging/lustre/lustre/lov/lov_request.c
+++ b/drivers/staging/lustre/lustre/lov/lov_request.c
@@ -214,15 +214,6 @@ static int common_attr_done(struct lov_request_set *set)
 		CERROR("No stripes had valid attrs\n");
 		rc = -EIO;
 	}
-	if ((set->set_oi->oi_oa->o_valid & OBD_MD_FLEPOCH) &&
-	    (set->set_oi->oi_md->lsm_stripe_count != attrset)) {
-		/* When we take attributes of some epoch, we require all the
-		 * ost to be active.
-		 */
-		CERROR("Not all the stripes had valid attrs\n");
-		rc = -EIO;
-		goto out;
-	}
 
 	tmp_oa->o_oi = set->set_oi->oi_oa->o_oi;
 	memcpy(set->set_oi->oi_oa, tmp_oa, sizeof(*set->set_oi->oi_oa));
@@ -284,11 +275,6 @@ int lov_prep_getattr_set(struct obd_export *exp, struct obd_info *oinfo,
 
 		if (!lov_check_and_wait_active(lov, loi->loi_ost_idx)) {
 			CDEBUG(D_HA, "lov idx %d inactive\n", loi->loi_ost_idx);
-			if (oinfo->oi_oa->o_valid & OBD_MD_FLEPOCH) {
-				/* SOM requires all the OSTs to be active. */
-				rc = -EIO;
-				goto out_set;
-			}
 			continue;
 		}
 
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_internal.h b/drivers/staging/lustre/lustre/mdc/mdc_internal.h
index f446c1c..d2af8e7 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_internal.h
+++ b/drivers/staging/lustre/lustre/mdc/mdc_internal.h
@@ -46,7 +46,7 @@ void mdc_readdir_pack(struct ptlrpc_request *req, __u64 pgoff, size_t size,
 void mdc_getattr_pack(struct ptlrpc_request *req, __u64 valid, u32 flags,
 		      struct md_op_data *data, size_t ea_size);
 void mdc_setattr_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
-		      void *ea, size_t ealen, void *ea2, size_t ea2len);
+		      void *ea, size_t ealen);
 void mdc_create_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 		     const void *data, size_t datalen, umode_t mode, uid_t uid,
 		     gid_t gid, cfs_cap_t capability, __u64 rdev);
@@ -105,8 +105,7 @@ int mdc_rename(struct obd_export *exp, struct md_op_data *op_data,
 	       const char *new, size_t newlen,
 	       struct ptlrpc_request **request);
 int mdc_setattr(struct obd_export *exp, struct md_op_data *op_data,
-		void *ea, size_t ealen, void *ea2, size_t ea2len,
-		struct ptlrpc_request **request, struct md_open_data **mod);
+		void *ea, size_t ealen, struct ptlrpc_request **request);
 int mdc_unlink(struct obd_export *exp, struct md_op_data *op_data,
 	       struct ptlrpc_request **request);
 int mdc_cancel_unused(struct obd_export *exp, const struct lu_fid *fid,
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_lib.c b/drivers/staging/lustre/lustre/mdc/mdc_lib.c
index aac7e04..709440b 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_lib.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_lib.c
@@ -139,7 +139,7 @@ void mdc_create_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 	rec->cr_time     = op_data->op_mod_time;
 	rec->cr_suppgid1 = op_data->op_suppgids[0];
 	rec->cr_suppgid2 = op_data->op_suppgids[1];
-	flags = op_data->op_flags & MF_SOM_LOCAL_FLAGS;
+	flags = 0;
 	if (op_data->op_bias & MDS_CREATE_VOLATILE)
 		flags |= MDS_OPEN_VOLATILE;
 	set_mrc_cr_flags(rec, flags);
@@ -302,15 +302,14 @@ static void mdc_ioepoch_pack(struct mdt_ioepoch *epoch,
 			     struct md_op_data *op_data)
 {
 	memcpy(&epoch->handle, &op_data->op_handle, sizeof(epoch->handle));
-	epoch->ioepoch = op_data->op_ioepoch;
-	epoch->flags = op_data->op_flags & MF_SOM_LOCAL_FLAGS;
+	epoch->ioepoch = 0;
+	epoch->flags = 0;
 }
 
 void mdc_setattr_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
-		      void *ea, size_t ealen, void *ea2, size_t ea2len)
+		      void *ea, size_t ealen)
 {
 	struct mdt_rec_setattr *rec;
-	struct mdt_ioepoch *epoch;
 	struct lov_user_md *lum = NULL;
 
 	CLASSERT(sizeof(struct mdt_rec_reint) ==
@@ -318,11 +317,6 @@ void mdc_setattr_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 	rec = req_capsule_client_get(&req->rq_pill, &RMF_REC_REINT);
 	mdc_setattr_pack_rec(rec, op_data);
 
-	if (op_data->op_flags & (MF_SOM_CHANGE | MF_EPOCH_OPEN)) {
-		epoch = req_capsule_client_get(&req->rq_pill, &RMF_MDT_EPOCH);
-		mdc_ioepoch_pack(epoch, op_data);
-	}
-
 	if (ealen == 0)
 		return;
 
@@ -335,12 +329,6 @@ void mdc_setattr_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 	} else {
 		memcpy(lum, ea, ealen);
 	}
-
-	if (ea2len == 0)
-		return;
-
-	memcpy(req_capsule_client_get(&req->rq_pill, &RMF_LOGCOOKIES), ea2,
-	       ea2len);
 }
 
 void mdc_unlink_pack(struct ptlrpc_request *req, struct md_op_data *op_data)
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_reint.c b/drivers/staging/lustre/lustre/mdc/mdc_reint.c
index c921e47..6f62a95 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_reint.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_reint.c
@@ -99,8 +99,7 @@ int mdc_resource_get_unused(struct obd_export *exp, const struct lu_fid *fid,
 }
 
 int mdc_setattr(struct obd_export *exp, struct md_op_data *op_data,
-		void *ea, size_t ealen, void *ea2, size_t ea2len,
-		struct ptlrpc_request **request, struct md_open_data **mod)
+		void *ea, size_t ealen, struct ptlrpc_request **request)
 {
 	LIST_HEAD(cancels);
 	struct ptlrpc_request *req;
@@ -122,12 +121,9 @@ int mdc_setattr(struct obd_export *exp, struct md_op_data *op_data,
 		ldlm_lock_list_put(&cancels, l_bl_ast, count);
 		return -ENOMEM;
 	}
-	if ((op_data->op_flags & (MF_SOM_CHANGE | MF_EPOCH_OPEN)) == 0)
-		req_capsule_set_size(&req->rq_pill, &RMF_MDT_EPOCH, RCL_CLIENT,
-				     0);
+	req_capsule_set_size(&req->rq_pill, &RMF_MDT_EPOCH, RCL_CLIENT, 0);
 	req_capsule_set_size(&req->rq_pill, &RMF_EADATA, RCL_CLIENT, ealen);
-	req_capsule_set_size(&req->rq_pill, &RMF_LOGCOOKIES, RCL_CLIENT,
-			     ea2len);
+	req_capsule_set_size(&req->rq_pill, &RMF_LOGCOOKIES, RCL_CLIENT, 0);
 
 	rc = mdc_prep_elc_req(exp, req, MDS_REINT, &cancels, count);
 	if (rc) {
@@ -141,57 +137,17 @@ int mdc_setattr(struct obd_export *exp, struct md_op_data *op_data,
 		CDEBUG(D_INODE, "setting mtime %ld, ctime %ld\n",
 		       LTIME_S(op_data->op_attr.ia_mtime),
 		       LTIME_S(op_data->op_attr.ia_ctime));
-	mdc_setattr_pack(req, op_data, ea, ealen, ea2, ea2len);
+	mdc_setattr_pack(req, op_data, ea, ealen);
 
 	ptlrpc_request_set_replen(req);
-	if (mod && (op_data->op_flags & MF_EPOCH_OPEN) &&
-	    req->rq_import->imp_replayable) {
-		LASSERT(!*mod);
-
-		*mod = obd_mod_alloc();
-		if (!*mod) {
-			DEBUG_REQ(D_ERROR, req, "Can't allocate md_open_data");
-		} else {
-			req->rq_replay = 1;
-			req->rq_cb_data = *mod;
-			(*mod)->mod_open_req = req;
-			req->rq_commit_cb = mdc_commit_open;
-			(*mod)->mod_is_create = true;
-			/**
-			 * Take an extra reference on \var mod, it protects \var
-			 * mod from being freed on eviction (commit callback is
-			 * called despite rq_replay flag).
-			 * Will be put on mdc_done_writing().
-			 */
-			obd_mod_get(*mod);
-		}
-	}
 
 	rc = mdc_reint(req, rpc_lock, LUSTRE_IMP_FULL);
 
-	/* Save the obtained info in the original RPC for the replay case. */
-	if (rc == 0 && (op_data->op_flags & MF_EPOCH_OPEN)) {
-		struct mdt_ioepoch *epoch;
-		struct mdt_body  *body;
-
-		epoch = req_capsule_client_get(&req->rq_pill, &RMF_MDT_EPOCH);
-		body = req_capsule_server_get(&req->rq_pill, &RMF_MDT_BODY);
-		epoch->handle = body->mbo_handle;
-		epoch->ioepoch = body->mbo_ioepoch;
-		req->rq_replay_cb = mdc_replay_open;
-	/** bug 3633, open may be committed and estale answer is not error */
-	} else if (rc == -ESTALE && (op_data->op_flags & MF_SOM_CHANGE)) {
-		rc = 0;
-	} else if (rc == -ERESTARTSYS) {
+	if (rc == -ERESTARTSYS)
 		rc = 0;
-	}
+
 	*request = req;
-	if (rc && req->rq_commit_cb) {
-		/* Put an extra reference on \var mod on error case. */
-		if (mod && *mod)
-			obd_mod_put(*mod);
-		req->rq_commit_cb(req);
-	}
+
 	return rc;
 }
 
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index f56ea64..3ef1bae 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -528,10 +528,6 @@ static int mdc_free_lustre_md(struct obd_export *exp, struct lustre_md *md)
 	return 0;
 }
 
-/**
- * Handles both OPEN and SETATTR RPCs for OPEN-CLOSE and SETATTR-DONE_WRITING
- * RPC chains.
- */
 void mdc_replay_open(struct ptlrpc_request *req)
 {
 	struct md_open_data *mod = req->rq_cb_data;
@@ -565,7 +561,7 @@ void mdc_replay_open(struct ptlrpc_request *req)
 		__u32 opc = lustre_msg_get_opc(close_req->rq_reqmsg);
 		struct mdt_ioepoch *epoch;
 
-		LASSERT(opc == MDS_CLOSE || opc == MDS_DONE_WRITING);
+		LASSERT(opc == MDS_CLOSE);
 		epoch = req_capsule_client_get(&close_req->rq_pill,
 					       &RMF_MDT_EPOCH);
 		LASSERT(epoch);
@@ -715,22 +711,6 @@ static int mdc_clear_open_replay_data(struct obd_export *exp,
 	return 0;
 }
 
-/* Prepares the request for the replay by the given reply */
-static void mdc_close_handle_reply(struct ptlrpc_request *req,
-				   struct md_op_data *op_data, int rc) {
-	struct mdt_body  *repbody;
-	struct mdt_ioepoch *epoch;
-
-	if (req && rc == -EAGAIN) {
-		repbody = req_capsule_server_get(&req->rq_pill, &RMF_MDT_BODY);
-		epoch = req_capsule_client_get(&req->rq_pill, &RMF_MDT_EPOCH);
-
-		epoch->flags |= MF_SOM_AU;
-		if (repbody->mbo_valid & OBD_MD_FLGETATTRLOCK)
-			op_data->op_flags |= MF_GETATTR_LOCK;
-	}
-}
-
 static int mdc_close(struct obd_export *exp, struct md_op_data *op_data,
 		     struct md_open_data *mod, struct ptlrpc_request **request)
 {
@@ -857,79 +837,9 @@ out:
 		obd_mod_put(mod);
 	}
 	*request = req;
-	mdc_close_handle_reply(req, op_data, rc);
 	return rc < 0 ? rc : saved_rc;
 }
 
-static int mdc_done_writing(struct obd_export *exp, struct md_op_data *op_data,
-			    struct md_open_data *mod)
-{
-	struct obd_device     *obd = class_exp2obd(exp);
-	struct ptlrpc_request *req;
-	int		    rc;
-
-	req = ptlrpc_request_alloc(class_exp2cliimp(exp),
-				   &RQF_MDS_DONE_WRITING);
-	if (!req)
-		return -ENOMEM;
-
-	rc = ptlrpc_request_pack(req, LUSTRE_MDS_VERSION, MDS_DONE_WRITING);
-	if (rc) {
-		ptlrpc_request_free(req);
-		return rc;
-	}
-
-	if (mod) {
-		LASSERTF(mod->mod_open_req &&
-			 mod->mod_open_req->rq_type != LI_POISON,
-			 "POISONED setattr %p!\n", mod->mod_open_req);
-
-		mod->mod_close_req = req;
-		DEBUG_REQ(D_HA, mod->mod_open_req, "matched setattr");
-		/* We no longer want to preserve this setattr for replay even
-		 * though the open was committed. b=3632, b=3633
-		 */
-		spin_lock(&mod->mod_open_req->rq_lock);
-		mod->mod_open_req->rq_replay = 0;
-		spin_unlock(&mod->mod_open_req->rq_lock);
-	}
-
-	mdc_close_pack(req, op_data);
-	ptlrpc_request_set_replen(req);
-
-	mdc_get_rpc_lock(obd->u.cli.cl_close_lock, NULL);
-	rc = ptlrpc_queue_wait(req);
-	mdc_put_rpc_lock(obd->u.cli.cl_close_lock, NULL);
-
-	if (rc == -ESTALE) {
-		/**
-		 * it can be allowed error after 3633 if open or setattr were
-		 * committed and server failed before close was sent.
-		 * Let's check if mod exists and return no error in that case
-		 */
-		if (mod) {
-			if (mod->mod_open_req->rq_committed)
-				rc = 0;
-		}
-	}
-
-	if (mod) {
-		if (rc != 0)
-			mod->mod_close_req = NULL;
-		LASSERT(mod->mod_open_req);
-		mdc_free_open(mod);
-
-		/* Since now, mod is accessed through setattr req only,
-		 * thus DW req does not keep a reference on mod anymore.
-		 */
-		obd_mod_put(mod);
-	}
-
-	mdc_close_handle_reply(req, op_data, rc);
-	ptlrpc_req_finished(req);
-	return rc;
-}
-
 static int mdc_getpage(struct obd_export *exp, const struct lu_fid *fid,
 		       u64 offset, struct page **pages, int npages,
 		       struct ptlrpc_request **request)
@@ -2889,7 +2799,6 @@ static struct md_ops mdc_md_ops = {
 	.null_inode		= mdc_null_inode,
 	.close			= mdc_close,
 	.create			= mdc_create,
-	.done_writing		= mdc_done_writing,
 	.enqueue		= mdc_enqueue,
 	.getattr		= mdc_getattr,
 	.getattr_name		= mdc_getattr_name,
diff --git a/drivers/staging/lustre/lustre/obdclass/Makefile b/drivers/staging/lustre/lustre/obdclass/Makefile
index b42e109..af570c0 100644
--- a/drivers/staging/lustre/lustre/obdclass/Makefile
+++ b/drivers/staging/lustre/lustre/obdclass/Makefile
@@ -1,6 +1,6 @@
 obj-$(CONFIG_LUSTRE_FS) += obdclass.o
 
-obdclass-y := linux/linux-module.o linux/linux-obdo.o linux/linux-sysctl.o \
+obdclass-y := linux/linux-module.o linux/linux-sysctl.o \
 	      llog.o llog_cat.o llog_obd.o llog_swab.o class_obd.o debug.o \
 	      genops.o uuid.o lprocfs_status.o lprocfs_counters.o \
 	      lustre_handles.o lustre_peer.o statfs_pack.o linkea.o \
diff --git a/drivers/staging/lustre/lustre/obdclass/linux/linux-obdo.c b/drivers/staging/lustre/lustre/obdclass/linux/linux-obdo.c
deleted file mode 100644
index 41b77a3..0000000
--- a/drivers/staging/lustre/lustre/obdclass/linux/linux-obdo.c
+++ /dev/null
@@ -1,80 +0,0 @@
-/*
- * GPL HEADER START
- *
- * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 only,
- * as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License version 2 for more details (a copy is included
- * in the LICENSE file that accompanied this code).
- *
- * You should have received a copy of the GNU General Public License
- * version 2 along with this program; If not, see
- * http://www.gnu.org/licenses/gpl-2.0.html
- *
- * GPL HEADER END
- */
-/*
- * Copyright (c) 2007, 2010, Oracle and/or its affiliates. All rights reserved.
- * Use is subject to license terms.
- *
- * Copyright (c) 2011, 2012, Intel Corporation.
- */
-/*
- * This file is part of Lustre, http://www.lustre.org/
- * Lustre is a trademark of Sun Microsystems, Inc.
- *
- * lustre/obdclass/linux/linux-obdo.c
- *
- * Object Devices Class Driver
- * These are the only exported functions, they provide some generic
- * infrastructure for managing object devices
- */
-
-#define DEBUG_SUBSYSTEM S_CLASS
-
-#include <linux/module.h>
-#include "../../include/obd_class.h"
-#include "../../include/lustre/lustre_idl.h"
-
-#include <linux/fs.h>
-
-void obdo_refresh_inode(struct inode *dst, const struct obdo *src, u32 valid)
-{
-	valid &= src->o_valid;
-
-	if (valid & (OBD_MD_FLCTIME | OBD_MD_FLMTIME))
-		CDEBUG(D_INODE,
-		       "valid %#llx, cur time %lu/%lu, new %llu/%llu\n",
-		       src->o_valid, LTIME_S(dst->i_mtime),
-		       LTIME_S(dst->i_ctime), src->o_mtime, src->o_ctime);
-
-	if (valid & OBD_MD_FLATIME && src->o_atime > LTIME_S(dst->i_atime))
-		LTIME_S(dst->i_atime) = src->o_atime;
-	if (valid & OBD_MD_FLMTIME && src->o_mtime > LTIME_S(dst->i_mtime))
-		LTIME_S(dst->i_mtime) = src->o_mtime;
-	if (valid & OBD_MD_FLCTIME && src->o_ctime > LTIME_S(dst->i_ctime))
-		LTIME_S(dst->i_ctime) = src->o_ctime;
-	if (valid & OBD_MD_FLSIZE)
-		i_size_write(dst, src->o_size);
-	/* optimum IO size */
-	if (valid & OBD_MD_FLBLKSZ && src->o_blksize > (1 << dst->i_blkbits))
-		dst->i_blkbits = ffs(src->o_blksize) - 1;
-
-	if (dst->i_blkbits < PAGE_SHIFT)
-		dst->i_blkbits = PAGE_SHIFT;
-
-	/* allocation of space */
-	if (valid & OBD_MD_FLBLOCKS && src->o_blocks > dst->i_blocks)
-		/*
-		 * XXX shouldn't overflow be checked here like in
-		 * obdo_to_inode().
-		 */
-		dst->i_blocks = src->o_blocks;
-}
-EXPORT_SYMBOL(obdo_refresh_inode);
diff --git a/drivers/staging/lustre/lustre/obdclass/obdo.c b/drivers/staging/lustre/lustre/obdclass/obdo.c
index 79104a6..c52b9e0 100644
--- a/drivers/staging/lustre/lustre/obdclass/obdo.c
+++ b/drivers/staging/lustre/lustre/obdclass/obdo.c
@@ -124,68 +124,3 @@ void obdo_to_ioobj(const struct obdo *oa, struct obd_ioobj *ioobj)
 	ioobj->ioo_max_brw = 0;
 }
 EXPORT_SYMBOL(obdo_to_ioobj);
-
-static void iattr_from_obdo(struct iattr *attr, const struct obdo *oa,
-			    u32 valid)
-{
-	valid &= oa->o_valid;
-
-	if (valid & (OBD_MD_FLCTIME | OBD_MD_FLMTIME))
-		CDEBUG(D_INODE, "valid %#llx, new time %llu/%llu\n",
-		       oa->o_valid, oa->o_mtime, oa->o_ctime);
-
-	attr->ia_valid = 0;
-	if (valid & OBD_MD_FLATIME) {
-		LTIME_S(attr->ia_atime) = oa->o_atime;
-		attr->ia_valid |= ATTR_ATIME;
-	}
-	if (valid & OBD_MD_FLMTIME) {
-		LTIME_S(attr->ia_mtime) = oa->o_mtime;
-		attr->ia_valid |= ATTR_MTIME;
-	}
-	if (valid & OBD_MD_FLCTIME) {
-		LTIME_S(attr->ia_ctime) = oa->o_ctime;
-		attr->ia_valid |= ATTR_CTIME;
-	}
-	if (valid & OBD_MD_FLSIZE) {
-		attr->ia_size = oa->o_size;
-		attr->ia_valid |= ATTR_SIZE;
-	}
-#if 0   /* you shouldn't be able to change a file's type with setattr */
-	if (valid & OBD_MD_FLTYPE) {
-		attr->ia_mode = (attr->ia_mode & ~S_IFMT) |
-				(oa->o_mode & S_IFMT);
-		attr->ia_valid |= ATTR_MODE;
-	}
-#endif
-	if (valid & OBD_MD_FLMODE) {
-		attr->ia_mode = (attr->ia_mode & S_IFMT) |
-				(oa->o_mode & ~S_IFMT);
-		attr->ia_valid |= ATTR_MODE;
-		if (!in_group_p(make_kgid(&init_user_ns, oa->o_gid)) &&
-		    !capable(CFS_CAP_FSETID))
-			attr->ia_mode &= ~S_ISGID;
-	}
-	if (valid & OBD_MD_FLUID) {
-		attr->ia_uid = make_kuid(&init_user_ns, oa->o_uid);
-		attr->ia_valid |= ATTR_UID;
-	}
-	if (valid & OBD_MD_FLGID) {
-		attr->ia_gid = make_kgid(&init_user_ns, oa->o_gid);
-		attr->ia_valid |= ATTR_GID;
-	}
-}
-
-void md_from_obdo(struct md_op_data *op_data, const struct obdo *oa, u32 valid)
-{
-	iattr_from_obdo(&op_data->op_attr, oa, valid);
-	if (valid & OBD_MD_FLBLOCKS) {
-		op_data->op_attr_blocks = oa->o_blocks;
-		op_data->op_attr.ia_valid |= ATTR_BLOCKS;
-	}
-	if (valid & OBD_MD_FLFLAGS) {
-		op_data->op_attr_flags = oa->o_flags;
-		op_data->op_attr.ia_valid |= ATTR_ATTR_FLAG;
-	}
-}
-EXPORT_SYMBOL(md_from_obdo);
-- 
1.7.1

* [lustre-devel] [PATCH 04/41] staging: lustre: obd: remove client Size on MDS support
@ 2016-10-03  2:28   ` James Simmons
  0 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, James Simmons

From: John L. Hammond <john.hammond@intel.com>

Remove the unused OBD MD API method md_done_writing(). Remove the
unused logcookie and struct md_open_data ** parameters from
md_setattr(). Remove the unused functions iattr_from_obdo(),
md_from_obdo(), and obdo_refresh_inode().
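
The shape of the md_setattr() change can be sketched in user space as
follows. This is only an illustration: every demo_* name is hypothetical
and none of it is Lustre code. It shows a method reached through an ops
table, with a forwarding layer and a backend layer that share one short
argument list once the unused parameters are gone.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_export;

/* Hypothetical ops table; only the slimmed-down setattr slot is modelled. */
struct demo_ops {
	int (*setattr)(struct demo_export *exp, void *ea, size_t ealen);
};

struct demo_export {
	const struct demo_ops *ops;
	struct demo_export *target;	/* next layer down, if any */
	size_t last_ealen;		/* records what the backend saw */
};

/* Generic wrapper: every layer calls the method with the same short list. */
static int demo_setattr(struct demo_export *exp, void *ea, size_t ealen)
{
	assert(exp);
	if (!exp->ops || !exp->ops->setattr)
		return -EOPNOTSUPP;
	return exp->ops->setattr(exp, ea, ealen);
}

/* Forwarding layer (LMV-like): picks a target and passes arguments through. */
static int demo_fwd_setattr(struct demo_export *exp, void *ea, size_t ealen)
{
	return demo_setattr(exp->target, ea, ealen);
}

/* Backend layer (MDC-like): here it only records the EA length. */
static int demo_backend_setattr(struct demo_export *exp, void *ea, size_t ealen)
{
	(void)ea;
	exp->last_ealen = ealen;
	return 0;
}

static const struct demo_ops demo_fwd_ops = { .setattr = demo_fwd_setattr };
static const struct demo_ops demo_backend_ops = { .setattr = demo_backend_setattr };
```

Because the wrapper, the forwarding layer and the backend all share one
prototype, dropping a parameter means touching each of them together,
which is why the diff below edits obd.h, obd_class.h, lmv_obd.c and the
mdc files in a single patch.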

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6047
Reviewed-on: http://review.whamcloud.com/13169
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../lustre/lustre/include/lustre/lustre_idl.h      |    2 -
 drivers/staging/lustre/lustre/include/obd.h        |    7 +--
 drivers/staging/lustre/lustre/include/obd_class.h  |   22 +----
 drivers/staging/lustre/lustre/llite/dir.c          |    3 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c    |   12 +--
 drivers/staging/lustre/lustre/lmv/lmv_obd.c        |   31 +------
 drivers/staging/lustre/lustre/lov/lov_request.c    |   14 ---
 drivers/staging/lustre/lustre/mdc/mdc_internal.h   |    5 +-
 drivers/staging/lustre/lustre/mdc/mdc_lib.c        |   20 +----
 drivers/staging/lustre/lustre/mdc/mdc_reint.c      |   58 ++-----------
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |   93 +-------------------
 drivers/staging/lustre/lustre/obdclass/Makefile    |    2 +-
 .../lustre/lustre/obdclass/linux/linux-obdo.c      |   80 -----------------
 drivers/staging/lustre/lustre/obdclass/obdo.c      |   65 --------------
 14 files changed, 26 insertions(+), 388 deletions(-)
 delete mode 100644 drivers/staging/lustre/lustre/obdclass/linux/linux-obdo.c

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index 72eaee9..d164545 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -2049,8 +2049,6 @@ enum md_op_flags {
 	MF_GET_MDT_IDX	  = (1 << 9),
 };
 
-#define MF_SOM_LOCAL_FLAGS (MF_SOM_CHANGE | MF_EPOCH_OPEN | MF_EPOCH_CLOSE)
-
 #define LUSTRE_BFLAG_UNCOMMITTED_WRITES   0x1
 
 /* these should be identical to their EXT4_*_FL counterparts, they are
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index f6fc4dd..51d5487 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -789,8 +789,6 @@ struct md_op_data {
 	__u64		   op_valid;
 	loff_t		  op_attr_blocks;
 
-	/* Size-on-MDS epoch and flags. */
-	__u64		   op_ioepoch;
 	__u32		   op_flags;
 
 	/* Various operation flags. */
@@ -992,8 +990,6 @@ struct md_ops {
 	int (*create)(struct obd_export *, struct md_op_data *,
 		      const void *, size_t, umode_t, uid_t, gid_t,
 		      cfs_cap_t, __u64, struct ptlrpc_request **);
-	int (*done_writing)(struct obd_export *, struct md_op_data  *,
-			    struct md_open_data *);
 	int (*enqueue)(struct obd_export *, struct ldlm_enqueue_info *,
 		       const ldlm_policy_data_t *,
 		       struct lookup_intent *, struct md_op_data *,
@@ -1012,8 +1008,7 @@ struct md_ops {
 		      const char *, size_t, const char *, size_t,
 		      struct ptlrpc_request **);
 	int (*setattr)(struct obd_export *, struct md_op_data *, void *,
-		       size_t, void *, size_t, struct ptlrpc_request **,
-			 struct md_open_data **mod);
+		       size_t, struct ptlrpc_request **);
 	int (*sync)(struct obd_export *, const struct lu_fid *,
 		    struct ptlrpc_request **);
 	int (*read_page)(struct obd_export *, struct md_op_data *,
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index 16094db..2ea102d 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -269,10 +269,8 @@ static inline int lprocfs_climp_check(struct obd_device *obd)
 struct inode;
 struct lu_attr;
 struct obdo;
-void obdo_refresh_inode(struct inode *dst, const struct obdo *src, u32 valid);
 
 void obdo_to_ioobj(const struct obdo *oa, struct obd_ioobj *ioobj);
-void md_from_obdo(struct md_op_data *op_data, const struct obdo *oa, u32 valid);
 
 #define OBT(dev)	(dev)->obd_type
 #define OBP(dev, op)    (dev)->obd_type->typ_dt_ops->op
@@ -1346,18 +1344,6 @@ static inline int md_create(struct obd_export *exp, struct md_op_data *op_data,
 	return rc;
 }
 
-static inline int md_done_writing(struct obd_export *exp,
-				  struct md_op_data *op_data,
-				  struct md_open_data *mod)
-{
-	int rc;
-
-	EXP_CHECK_MD_OP(exp, done_writing);
-	EXP_MD_COUNTER_INCREMENT(exp, done_writing);
-	rc = MDP(exp->exp_obd, done_writing)(exp, op_data, mod);
-	return rc;
-}
-
 static inline int md_enqueue(struct obd_export *exp,
 			     struct ldlm_enqueue_info *einfo,
 			     const ldlm_policy_data_t *policy,
@@ -1428,16 +1414,14 @@ static inline int md_rename(struct obd_export *exp, struct md_op_data *op_data,
 }
 
 static inline int md_setattr(struct obd_export *exp, struct md_op_data *op_data,
-			     void *ea, size_t ealen, void *ea2, size_t ea2len,
-			     struct ptlrpc_request **request,
-			     struct md_open_data **mod)
+			     void *ea, size_t ealen,
+			     struct ptlrpc_request **request)
 {
 	int rc;
 
 	EXP_CHECK_MD_OP(exp, setattr);
 	EXP_MD_COUNTER_INCREMENT(exp, setattr);
-	rc = MDP(exp->exp_obd, setattr)(exp, op_data, ea, ealen,
-					ea2, ea2len, request, mod);
+	rc = MDP(exp->exp_obd, setattr)(exp, op_data, ea, ealen, request);
 	return rc;
 }
 
diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index 7f32a53..3641327 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -501,8 +501,7 @@ int ll_dir_setstripe(struct inode *inode, struct lov_user_md *lump,
 		return PTR_ERR(op_data);
 
 	/* swabbing is done in lov_setstripe() on server side */
-	rc = md_setattr(sbi->ll_md_exp, op_data, lump, lum_size,
-			NULL, 0, &req, NULL);
+	rc = md_setattr(sbi->ll_md_exp, op_data, lump, lum_size, &req);
 	ll_finish_md_op_data(op_data);
 	ptlrpc_req_finished(req);
 	if (rc) {
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 4f83275..e75ab2f 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -1295,8 +1295,7 @@ void ll_clear_inode(struct inode *inode)
 
 #define TIMES_SET_FLAGS (ATTR_MTIME_SET | ATTR_ATIME_SET | ATTR_TIMES_SET)
 
-static int ll_md_setattr(struct dentry *dentry, struct md_op_data *op_data,
-			 struct md_open_data **mod)
+static int ll_md_setattr(struct dentry *dentry, struct md_op_data *op_data)
 {
 	struct lustre_md md;
 	struct inode *inode = d_inode(dentry);
@@ -1309,8 +1308,7 @@ static int ll_md_setattr(struct dentry *dentry, struct md_op_data *op_data,
 	if (IS_ERR(op_data))
 		return PTR_ERR(op_data);
 
-	rc = md_setattr(sbi->ll_md_exp, op_data, NULL, 0, NULL, 0,
-			&request, mod);
+	rc = md_setattr(sbi->ll_md_exp, op_data, NULL, 0, &request);
 	if (rc) {
 		ptlrpc_req_finished(request);
 		if (rc == -ENOENT) {
@@ -1372,7 +1370,6 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import)
 	struct inode *inode = d_inode(dentry);
 	struct ll_inode_info *lli = ll_i2info(inode);
 	struct md_op_data *op_data = NULL;
-	struct md_open_data *mod = NULL;
 	bool file_is_released = false;
 	int rc = 0;
 
@@ -1477,7 +1474,7 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import)
 
 	memcpy(&op_data->op_attr, attr, sizeof(*attr));
 
-	rc = ll_md_setattr(dentry, op_data, &mod);
+	rc = ll_md_setattr(dentry, op_data);
 	if (rc)
 		goto out;
 
@@ -1896,8 +1893,7 @@ int ll_iocontrol(struct inode *inode, struct file *file,
 
 		op_data->op_attr_flags = flags;
 		op_data->op_attr.ia_valid |= ATTR_ATTR_FLAG;
-		rc = md_setattr(sbi->ll_md_exp, op_data,
-				NULL, 0, NULL, 0, &req, NULL);
+		rc = md_setattr(sbi->ll_md_exp, op_data, NULL, 0, &req);
 		ll_finish_md_op_data(op_data);
 		ptlrpc_req_finished(req);
 		if (rc)
diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index 7dbb2b9..b401ffb 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -1728,27 +1728,6 @@ static int lmv_create(struct obd_export *exp, struct md_op_data *op_data,
 	return rc;
 }
 
-static int lmv_done_writing(struct obd_export *exp,
-			    struct md_op_data *op_data,
-			    struct md_open_data *mod)
-{
-	struct obd_device     *obd = exp->exp_obd;
-	struct lmv_obd	*lmv = &obd->u.lmv;
-	struct lmv_tgt_desc   *tgt;
-	int		    rc;
-
-	rc = lmv_check_connect(obd);
-	if (rc)
-		return rc;
-
-	tgt = lmv_find_target(lmv, &op_data->op_fid1);
-	if (IS_ERR(tgt))
-		return PTR_ERR(tgt);
-
-	rc = md_done_writing(tgt->ltd_exp, op_data, mod);
-	return rc;
-}
-
 static int
 lmv_enqueue(struct obd_export *exp, struct ldlm_enqueue_info *einfo,
 	    const ldlm_policy_data_t *policy,
@@ -2065,9 +2044,7 @@ static int lmv_rename(struct obd_export *exp, struct md_op_data *op_data,
 }
 
 static int lmv_setattr(struct obd_export *exp, struct md_op_data *op_data,
-		       void *ea, size_t ealen, void *ea2, size_t ea2len,
-		       struct ptlrpc_request **request,
-		       struct md_open_data **mod)
+		       void *ea, size_t ealen, struct ptlrpc_request **request)
 {
 	struct obd_device       *obd = exp->exp_obd;
 	struct lmv_obd	  *lmv = &obd->u.lmv;
@@ -2086,10 +2063,7 @@ static int lmv_setattr(struct obd_export *exp, struct md_op_data *op_data,
 	if (IS_ERR(tgt))
 		return PTR_ERR(tgt);
 
-	rc = md_setattr(tgt->ltd_exp, op_data, ea, ealen, ea2,
-			ea2len, request, mod);
-
-	return rc;
+	return md_setattr(tgt->ltd_exp, op_data, ea, ealen, request);
 }
 
 static int lmv_sync(struct obd_export *exp, const struct lu_fid *fid,
@@ -3363,7 +3337,6 @@ static struct md_ops lmv_md_ops = {
 	.null_inode		= lmv_null_inode,
 	.close			= lmv_close,
 	.create			= lmv_create,
-	.done_writing		= lmv_done_writing,
 	.enqueue		= lmv_enqueue,
 	.getattr		= lmv_getattr,
 	.getxattr		= lmv_getxattr,
diff --git a/drivers/staging/lustre/lustre/lov/lov_request.c b/drivers/staging/lustre/lustre/lov/lov_request.c
index 09dcaf4..8e40702 100644
--- a/drivers/staging/lustre/lustre/lov/lov_request.c
+++ b/drivers/staging/lustre/lustre/lov/lov_request.c
@@ -214,15 +214,6 @@ static int common_attr_done(struct lov_request_set *set)
 		CERROR("No stripes had valid attrs\n");
 		rc = -EIO;
 	}
-	if ((set->set_oi->oi_oa->o_valid & OBD_MD_FLEPOCH) &&
-	    (set->set_oi->oi_md->lsm_stripe_count != attrset)) {
-		/* When we take attributes of some epoch, we require all the
-		 * ost to be active.
-		 */
-		CERROR("Not all the stripes had valid attrs\n");
-		rc = -EIO;
-		goto out;
-	}
 
 	tmp_oa->o_oi = set->set_oi->oi_oa->o_oi;
 	memcpy(set->set_oi->oi_oa, tmp_oa, sizeof(*set->set_oi->oi_oa));
@@ -284,11 +275,6 @@ int lov_prep_getattr_set(struct obd_export *exp, struct obd_info *oinfo,
 
 		if (!lov_check_and_wait_active(lov, loi->loi_ost_idx)) {
 			CDEBUG(D_HA, "lov idx %d inactive\n", loi->loi_ost_idx);
-			if (oinfo->oi_oa->o_valid & OBD_MD_FLEPOCH) {
-				/* SOM requires all the OSTs to be active. */
-				rc = -EIO;
-				goto out_set;
-			}
 			continue;
 		}
 
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_internal.h b/drivers/staging/lustre/lustre/mdc/mdc_internal.h
index f446c1c..d2af8e7 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_internal.h
+++ b/drivers/staging/lustre/lustre/mdc/mdc_internal.h
@@ -46,7 +46,7 @@ void mdc_readdir_pack(struct ptlrpc_request *req, __u64 pgoff, size_t size,
 void mdc_getattr_pack(struct ptlrpc_request *req, __u64 valid, u32 flags,
 		      struct md_op_data *data, size_t ea_size);
 void mdc_setattr_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
-		      void *ea, size_t ealen, void *ea2, size_t ea2len);
+		      void *ea, size_t ealen);
 void mdc_create_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 		     const void *data, size_t datalen, umode_t mode, uid_t uid,
 		     gid_t gid, cfs_cap_t capability, __u64 rdev);
@@ -105,8 +105,7 @@ int mdc_rename(struct obd_export *exp, struct md_op_data *op_data,
 	       const char *new, size_t newlen,
 	       struct ptlrpc_request **request);
 int mdc_setattr(struct obd_export *exp, struct md_op_data *op_data,
-		void *ea, size_t ealen, void *ea2, size_t ea2len,
-		struct ptlrpc_request **request, struct md_open_data **mod);
+		void *ea, size_t ealen, struct ptlrpc_request **request);
 int mdc_unlink(struct obd_export *exp, struct md_op_data *op_data,
 	       struct ptlrpc_request **request);
 int mdc_cancel_unused(struct obd_export *exp, const struct lu_fid *fid,
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_lib.c b/drivers/staging/lustre/lustre/mdc/mdc_lib.c
index aac7e04..709440b 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_lib.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_lib.c
@@ -139,7 +139,7 @@ void mdc_create_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 	rec->cr_time     = op_data->op_mod_time;
 	rec->cr_suppgid1 = op_data->op_suppgids[0];
 	rec->cr_suppgid2 = op_data->op_suppgids[1];
-	flags = op_data->op_flags & MF_SOM_LOCAL_FLAGS;
+	flags = 0;
 	if (op_data->op_bias & MDS_CREATE_VOLATILE)
 		flags |= MDS_OPEN_VOLATILE;
 	set_mrc_cr_flags(rec, flags);
@@ -302,15 +302,14 @@ static void mdc_ioepoch_pack(struct mdt_ioepoch *epoch,
 			     struct md_op_data *op_data)
 {
 	memcpy(&epoch->handle, &op_data->op_handle, sizeof(epoch->handle));
-	epoch->ioepoch = op_data->op_ioepoch;
-	epoch->flags = op_data->op_flags & MF_SOM_LOCAL_FLAGS;
+	epoch->ioepoch = 0;
+	epoch->flags = 0;
 }
 
 void mdc_setattr_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
-		      void *ea, size_t ealen, void *ea2, size_t ea2len)
+		      void *ea, size_t ealen)
 {
 	struct mdt_rec_setattr *rec;
-	struct mdt_ioepoch *epoch;
 	struct lov_user_md *lum = NULL;
 
 	CLASSERT(sizeof(struct mdt_rec_reint) ==
@@ -318,11 +317,6 @@ void mdc_setattr_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 	rec = req_capsule_client_get(&req->rq_pill, &RMF_REC_REINT);
 	mdc_setattr_pack_rec(rec, op_data);
 
-	if (op_data->op_flags & (MF_SOM_CHANGE | MF_EPOCH_OPEN)) {
-		epoch = req_capsule_client_get(&req->rq_pill, &RMF_MDT_EPOCH);
-		mdc_ioepoch_pack(epoch, op_data);
-	}
-
 	if (ealen == 0)
 		return;
 
@@ -335,12 +329,6 @@ void mdc_setattr_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
 	} else {
 		memcpy(lum, ea, ealen);
 	}
-
-	if (ea2len == 0)
-		return;
-
-	memcpy(req_capsule_client_get(&req->rq_pill, &RMF_LOGCOOKIES), ea2,
-	       ea2len);
 }
 
 void mdc_unlink_pack(struct ptlrpc_request *req, struct md_op_data *op_data)
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_reint.c b/drivers/staging/lustre/lustre/mdc/mdc_reint.c
index c921e47..6f62a95 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_reint.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_reint.c
@@ -99,8 +99,7 @@ int mdc_resource_get_unused(struct obd_export *exp, const struct lu_fid *fid,
 }
 
 int mdc_setattr(struct obd_export *exp, struct md_op_data *op_data,
-		void *ea, size_t ealen, void *ea2, size_t ea2len,
-		struct ptlrpc_request **request, struct md_open_data **mod)
+		void *ea, size_t ealen, struct ptlrpc_request **request)
 {
 	LIST_HEAD(cancels);
 	struct ptlrpc_request *req;
@@ -122,12 +121,9 @@ int mdc_setattr(struct obd_export *exp, struct md_op_data *op_data,
 		ldlm_lock_list_put(&cancels, l_bl_ast, count);
 		return -ENOMEM;
 	}
-	if ((op_data->op_flags & (MF_SOM_CHANGE | MF_EPOCH_OPEN)) == 0)
-		req_capsule_set_size(&req->rq_pill, &RMF_MDT_EPOCH, RCL_CLIENT,
-				     0);
+	req_capsule_set_size(&req->rq_pill, &RMF_MDT_EPOCH, RCL_CLIENT, 0);
 	req_capsule_set_size(&req->rq_pill, &RMF_EADATA, RCL_CLIENT, ealen);
-	req_capsule_set_size(&req->rq_pill, &RMF_LOGCOOKIES, RCL_CLIENT,
-			     ea2len);
+	req_capsule_set_size(&req->rq_pill, &RMF_LOGCOOKIES, RCL_CLIENT, 0);
 
 	rc = mdc_prep_elc_req(exp, req, MDS_REINT, &cancels, count);
 	if (rc) {
@@ -141,57 +137,17 @@ int mdc_setattr(struct obd_export *exp, struct md_op_data *op_data,
 		CDEBUG(D_INODE, "setting mtime %ld, ctime %ld\n",
 		       LTIME_S(op_data->op_attr.ia_mtime),
 		       LTIME_S(op_data->op_attr.ia_ctime));
-	mdc_setattr_pack(req, op_data, ea, ealen, ea2, ea2len);
+	mdc_setattr_pack(req, op_data, ea, ealen);
 
 	ptlrpc_request_set_replen(req);
-	if (mod && (op_data->op_flags & MF_EPOCH_OPEN) &&
-	    req->rq_import->imp_replayable) {
-		LASSERT(!*mod);
-
-		*mod = obd_mod_alloc();
-		if (!*mod) {
-			DEBUG_REQ(D_ERROR, req, "Can't allocate md_open_data");
-		} else {
-			req->rq_replay = 1;
-			req->rq_cb_data = *mod;
-			(*mod)->mod_open_req = req;
-			req->rq_commit_cb = mdc_commit_open;
-			(*mod)->mod_is_create = true;
-			/**
-			 * Take an extra reference on \var mod, it protects \var
-			 * mod from being freed on eviction (commit callback is
-			 * called despite rq_replay flag).
-			 * Will be put on mdc_done_writing().
-			 */
-			obd_mod_get(*mod);
-		}
-	}
 
 	rc = mdc_reint(req, rpc_lock, LUSTRE_IMP_FULL);
 
-	/* Save the obtained info in the original RPC for the replay case. */
-	if (rc == 0 && (op_data->op_flags & MF_EPOCH_OPEN)) {
-		struct mdt_ioepoch *epoch;
-		struct mdt_body  *body;
-
-		epoch = req_capsule_client_get(&req->rq_pill, &RMF_MDT_EPOCH);
-		body = req_capsule_server_get(&req->rq_pill, &RMF_MDT_BODY);
-		epoch->handle = body->mbo_handle;
-		epoch->ioepoch = body->mbo_ioepoch;
-		req->rq_replay_cb = mdc_replay_open;
-	/** bug 3633, open may be committed and estale answer is not error */
-	} else if (rc == -ESTALE && (op_data->op_flags & MF_SOM_CHANGE)) {
-		rc = 0;
-	} else if (rc == -ERESTARTSYS) {
+	if (rc == -ERESTARTSYS)
 		rc = 0;
-	}
+
 	*request = req;
-	if (rc && req->rq_commit_cb) {
-		/* Put an extra reference on \var mod on error case. */
-		if (mod && *mod)
-			obd_mod_put(*mod);
-		req->rq_commit_cb(req);
-	}
+
 	return rc;
 }
 
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index f56ea64..3ef1bae 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -528,10 +528,6 @@ static int mdc_free_lustre_md(struct obd_export *exp, struct lustre_md *md)
 	return 0;
 }
 
-/**
- * Handles both OPEN and SETATTR RPCs for OPEN-CLOSE and SETATTR-DONE_WRITING
- * RPC chains.
- */
 void mdc_replay_open(struct ptlrpc_request *req)
 {
 	struct md_open_data *mod = req->rq_cb_data;
@@ -565,7 +561,7 @@ void mdc_replay_open(struct ptlrpc_request *req)
 		__u32 opc = lustre_msg_get_opc(close_req->rq_reqmsg);
 		struct mdt_ioepoch *epoch;
 
-		LASSERT(opc == MDS_CLOSE || opc == MDS_DONE_WRITING);
+		LASSERT(opc == MDS_CLOSE);
 		epoch = req_capsule_client_get(&close_req->rq_pill,
 					       &RMF_MDT_EPOCH);
 		LASSERT(epoch);
@@ -715,22 +711,6 @@ static int mdc_clear_open_replay_data(struct obd_export *exp,
 	return 0;
 }
 
-/* Prepares the request for the replay by the given reply */
-static void mdc_close_handle_reply(struct ptlrpc_request *req,
-				   struct md_op_data *op_data, int rc) {
-	struct mdt_body  *repbody;
-	struct mdt_ioepoch *epoch;
-
-	if (req && rc == -EAGAIN) {
-		repbody = req_capsule_server_get(&req->rq_pill, &RMF_MDT_BODY);
-		epoch = req_capsule_client_get(&req->rq_pill, &RMF_MDT_EPOCH);
-
-		epoch->flags |= MF_SOM_AU;
-		if (repbody->mbo_valid & OBD_MD_FLGETATTRLOCK)
-			op_data->op_flags |= MF_GETATTR_LOCK;
-	}
-}
-
 static int mdc_close(struct obd_export *exp, struct md_op_data *op_data,
 		     struct md_open_data *mod, struct ptlrpc_request **request)
 {
@@ -857,79 +837,9 @@ out:
 		obd_mod_put(mod);
 	}
 	*request = req;
-	mdc_close_handle_reply(req, op_data, rc);
 	return rc < 0 ? rc : saved_rc;
 }
 
-static int mdc_done_writing(struct obd_export *exp, struct md_op_data *op_data,
-			    struct md_open_data *mod)
-{
-	struct obd_device     *obd = class_exp2obd(exp);
-	struct ptlrpc_request *req;
-	int		    rc;
-
-	req = ptlrpc_request_alloc(class_exp2cliimp(exp),
-				   &RQF_MDS_DONE_WRITING);
-	if (!req)
-		return -ENOMEM;
-
-	rc = ptlrpc_request_pack(req, LUSTRE_MDS_VERSION, MDS_DONE_WRITING);
-	if (rc) {
-		ptlrpc_request_free(req);
-		return rc;
-	}
-
-	if (mod) {
-		LASSERTF(mod->mod_open_req &&
-			 mod->mod_open_req->rq_type != LI_POISON,
-			 "POISONED setattr %p!\n", mod->mod_open_req);
-
-		mod->mod_close_req = req;
-		DEBUG_REQ(D_HA, mod->mod_open_req, "matched setattr");
-		/* We no longer want to preserve this setattr for replay even
-		 * though the open was committed. b=3632, b=3633
-		 */
-		spin_lock(&mod->mod_open_req->rq_lock);
-		mod->mod_open_req->rq_replay = 0;
-		spin_unlock(&mod->mod_open_req->rq_lock);
-	}
-
-	mdc_close_pack(req, op_data);
-	ptlrpc_request_set_replen(req);
-
-	mdc_get_rpc_lock(obd->u.cli.cl_close_lock, NULL);
-	rc = ptlrpc_queue_wait(req);
-	mdc_put_rpc_lock(obd->u.cli.cl_close_lock, NULL);
-
-	if (rc == -ESTALE) {
-		/**
-		 * it can be allowed error after 3633 if open or setattr were
-		 * committed and server failed before close was sent.
-		 * Let's check if mod exists and return no error in that case
-		 */
-		if (mod) {
-			if (mod->mod_open_req->rq_committed)
-				rc = 0;
-		}
-	}
-
-	if (mod) {
-		if (rc != 0)
-			mod->mod_close_req = NULL;
-		LASSERT(mod->mod_open_req);
-		mdc_free_open(mod);
-
-		/* Since now, mod is accessed through setattr req only,
-		 * thus DW req does not keep a reference on mod anymore.
-		 */
-		obd_mod_put(mod);
-	}
-
-	mdc_close_handle_reply(req, op_data, rc);
-	ptlrpc_req_finished(req);
-	return rc;
-}
-
 static int mdc_getpage(struct obd_export *exp, const struct lu_fid *fid,
 		       u64 offset, struct page **pages, int npages,
 		       struct ptlrpc_request **request)
@@ -2889,7 +2799,6 @@ static struct md_ops mdc_md_ops = {
 	.null_inode		= mdc_null_inode,
 	.close			= mdc_close,
 	.create			= mdc_create,
-	.done_writing		= mdc_done_writing,
 	.enqueue		= mdc_enqueue,
 	.getattr		= mdc_getattr,
 	.getattr_name		= mdc_getattr_name,
diff --git a/drivers/staging/lustre/lustre/obdclass/Makefile b/drivers/staging/lustre/lustre/obdclass/Makefile
index b42e109..af570c0 100644
--- a/drivers/staging/lustre/lustre/obdclass/Makefile
+++ b/drivers/staging/lustre/lustre/obdclass/Makefile
@@ -1,6 +1,6 @@
 obj-$(CONFIG_LUSTRE_FS) += obdclass.o
 
-obdclass-y := linux/linux-module.o linux/linux-obdo.o linux/linux-sysctl.o \
+obdclass-y := linux/linux-module.o linux/linux-sysctl.o \
 	      llog.o llog_cat.o llog_obd.o llog_swab.o class_obd.o debug.o \
 	      genops.o uuid.o lprocfs_status.o lprocfs_counters.o \
 	      lustre_handles.o lustre_peer.o statfs_pack.o linkea.o \
diff --git a/drivers/staging/lustre/lustre/obdclass/linux/linux-obdo.c b/drivers/staging/lustre/lustre/obdclass/linux/linux-obdo.c
deleted file mode 100644
index 41b77a3..0000000
--- a/drivers/staging/lustre/lustre/obdclass/linux/linux-obdo.c
+++ /dev/null
@@ -1,80 +0,0 @@
-/*
- * GPL HEADER START
- *
- * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 only,
- * as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License version 2 for more details (a copy is included
- * in the LICENSE file that accompanied this code).
- *
- * You should have received a copy of the GNU General Public License
- * version 2 along with this program; If not, see
- * http://www.gnu.org/licenses/gpl-2.0.html
- *
- * GPL HEADER END
- */
-/*
- * Copyright (c) 2007, 2010, Oracle and/or its affiliates. All rights reserved.
- * Use is subject to license terms.
- *
- * Copyright (c) 2011, 2012, Intel Corporation.
- */
-/*
- * This file is part of Lustre, http://www.lustre.org/
- * Lustre is a trademark of Sun Microsystems, Inc.
- *
- * lustre/obdclass/linux/linux-obdo.c
- *
- * Object Devices Class Driver
- * These are the only exported functions, they provide some generic
- * infrastructure for managing object devices
- */
-
-#define DEBUG_SUBSYSTEM S_CLASS
-
-#include <linux/module.h>
-#include "../../include/obd_class.h"
-#include "../../include/lustre/lustre_idl.h"
-
-#include <linux/fs.h>
-
-void obdo_refresh_inode(struct inode *dst, const struct obdo *src, u32 valid)
-{
-	valid &= src->o_valid;
-
-	if (valid & (OBD_MD_FLCTIME | OBD_MD_FLMTIME))
-		CDEBUG(D_INODE,
-		       "valid %#llx, cur time %lu/%lu, new %llu/%llu\n",
-		       src->o_valid, LTIME_S(dst->i_mtime),
-		       LTIME_S(dst->i_ctime), src->o_mtime, src->o_ctime);
-
-	if (valid & OBD_MD_FLATIME && src->o_atime > LTIME_S(dst->i_atime))
-		LTIME_S(dst->i_atime) = src->o_atime;
-	if (valid & OBD_MD_FLMTIME && src->o_mtime > LTIME_S(dst->i_mtime))
-		LTIME_S(dst->i_mtime) = src->o_mtime;
-	if (valid & OBD_MD_FLCTIME && src->o_ctime > LTIME_S(dst->i_ctime))
-		LTIME_S(dst->i_ctime) = src->o_ctime;
-	if (valid & OBD_MD_FLSIZE)
-		i_size_write(dst, src->o_size);
-	/* optimum IO size */
-	if (valid & OBD_MD_FLBLKSZ && src->o_blksize > (1 << dst->i_blkbits))
-		dst->i_blkbits = ffs(src->o_blksize) - 1;
-
-	if (dst->i_blkbits < PAGE_SHIFT)
-		dst->i_blkbits = PAGE_SHIFT;
-
-	/* allocation of space */
-	if (valid & OBD_MD_FLBLOCKS && src->o_blocks > dst->i_blocks)
-		/*
-		 * XXX shouldn't overflow be checked here like in
-		 * obdo_to_inode().
-		 */
-		dst->i_blocks = src->o_blocks;
-}
-EXPORT_SYMBOL(obdo_refresh_inode);
diff --git a/drivers/staging/lustre/lustre/obdclass/obdo.c b/drivers/staging/lustre/lustre/obdclass/obdo.c
index 79104a6..c52b9e0 100644
--- a/drivers/staging/lustre/lustre/obdclass/obdo.c
+++ b/drivers/staging/lustre/lustre/obdclass/obdo.c
@@ -124,68 +124,3 @@ void obdo_to_ioobj(const struct obdo *oa, struct obd_ioobj *ioobj)
 	ioobj->ioo_max_brw = 0;
 }
 EXPORT_SYMBOL(obdo_to_ioobj);
-
-static void iattr_from_obdo(struct iattr *attr, const struct obdo *oa,
-			    u32 valid)
-{
-	valid &= oa->o_valid;
-
-	if (valid & (OBD_MD_FLCTIME | OBD_MD_FLMTIME))
-		CDEBUG(D_INODE, "valid %#llx, new time %llu/%llu\n",
-		       oa->o_valid, oa->o_mtime, oa->o_ctime);
-
-	attr->ia_valid = 0;
-	if (valid & OBD_MD_FLATIME) {
-		LTIME_S(attr->ia_atime) = oa->o_atime;
-		attr->ia_valid |= ATTR_ATIME;
-	}
-	if (valid & OBD_MD_FLMTIME) {
-		LTIME_S(attr->ia_mtime) = oa->o_mtime;
-		attr->ia_valid |= ATTR_MTIME;
-	}
-	if (valid & OBD_MD_FLCTIME) {
-		LTIME_S(attr->ia_ctime) = oa->o_ctime;
-		attr->ia_valid |= ATTR_CTIME;
-	}
-	if (valid & OBD_MD_FLSIZE) {
-		attr->ia_size = oa->o_size;
-		attr->ia_valid |= ATTR_SIZE;
-	}
-#if 0   /* you shouldn't be able to change a file's type with setattr */
-	if (valid & OBD_MD_FLTYPE) {
-		attr->ia_mode = (attr->ia_mode & ~S_IFMT) |
-				(oa->o_mode & S_IFMT);
-		attr->ia_valid |= ATTR_MODE;
-	}
-#endif
-	if (valid & OBD_MD_FLMODE) {
-		attr->ia_mode = (attr->ia_mode & S_IFMT) |
-				(oa->o_mode & ~S_IFMT);
-		attr->ia_valid |= ATTR_MODE;
-		if (!in_group_p(make_kgid(&init_user_ns, oa->o_gid)) &&
-		    !capable(CFS_CAP_FSETID))
-			attr->ia_mode &= ~S_ISGID;
-	}
-	if (valid & OBD_MD_FLUID) {
-		attr->ia_uid = make_kuid(&init_user_ns, oa->o_uid);
-		attr->ia_valid |= ATTR_UID;
-	}
-	if (valid & OBD_MD_FLGID) {
-		attr->ia_gid = make_kgid(&init_user_ns, oa->o_gid);
-		attr->ia_valid |= ATTR_GID;
-	}
-}
-
-void md_from_obdo(struct md_op_data *op_data, const struct obdo *oa, u32 valid)
-{
-	iattr_from_obdo(&op_data->op_attr, oa, valid);
-	if (valid & OBD_MD_FLBLOCKS) {
-		op_data->op_attr_blocks = oa->o_blocks;
-		op_data->op_attr.ia_valid |= ATTR_BLOCKS;
-	}
-	if (valid & OBD_MD_FLFLAGS) {
-		op_data->op_attr_flags = oa->o_flags;
-		op_data->op_attr.ia_valid |= ATTR_ATTR_FLAG;
-	}
-}
-EXPORT_SYMBOL(md_from_obdo);
-- 
1.7.1

* [PATCH 05/41] staging: lustre: clio: Revise read ahead implementation
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Jinshan Xiong, James Simmons

From: Jinshan Xiong <jinshan.xiong@intel.com>

In this implementation, read ahead holds the underlying DLM lock
to add read ahead pages. A new cl_io operation, cio_read_ahead(), is
added for this purpose. It takes a cl_read_ahead{} parameter so that
each layer can adjust it according to its own requirements. For
example, the OSC layer makes sure the read ahead region is covered by
an LDLM lock; the LOV layer makes sure that the region does not
cross a stripe boundary.

Legacy callback cpo_is_under_lock() is removed.
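
The per-layer adjustment described above can be sketched in user space
as follows. This is a minimal illustration under stated assumptions:
struct ra_window and its single field stand in for cl_read_ahead{},
whose real layout is not shown in this message, and the *_like_clamp()
helpers are hypothetical names rather than functions from this patch.

```c
#include <assert.h>

/* Hypothetical stand-in for cl_read_ahead{}; the real fields are not shown here. */
struct ra_window {
	unsigned long end;	/* last page index read ahead may reach */
};

/* LOV-like step (hypothetical): never cross the stripe containing 'start'. */
static void lov_like_clamp(struct ra_window *ra, unsigned long start,
			   unsigned long stripe_pages)
{
	unsigned long stripe_end;

	assert(stripe_pages > 0);
	stripe_end = (start / stripe_pages + 1) * stripe_pages - 1;
	if (ra->end > stripe_end)
		ra->end = stripe_end;
}

/* OSC-like step (hypothetical): stay inside the extent the lock covers. */
static void osc_like_clamp(struct ra_window *ra, unsigned long lock_end)
{
	if (ra->end > lock_end)
		ra->end = lock_end;
}

/* Pass one window down through both layers and return the final end index. */
static unsigned long ra_window_end(unsigned long start, unsigned long want_end,
				   unsigned long stripe_pages,
				   unsigned long lock_end)
{
	struct ra_window ra = { .end = want_end };

	lov_like_clamp(&ra, start, stripe_pages);
	osc_like_clamp(&ra, lock_end);
	return ra.end;
}
```

With a 256-page stripe, a request that starts at page 300 and asks for
pages up to 1000 is first cut to page 511 by the stripe step, and then
to page 400 if the lock only covers pages up to 400.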

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3259
Reviewed-on: http://review.whamcloud.com/10859
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/cl_object.h  |   63 +++---
 .../staging/lustre/lustre/llite/llite_internal.h   |    7 +-
 drivers/staging/lustre/lustre/llite/rw.c           |  218 ++++++++++++--------
 drivers/staging/lustre/lustre/llite/vvp_io.c       |   47 ++---
 drivers/staging/lustre/lustre/llite/vvp_page.c     |   16 --
 drivers/staging/lustre/lustre/lov/lov_io.c         |   58 +++++
 drivers/staging/lustre/lustre/lov/lov_page.c       |   46 ----
 drivers/staging/lustre/lustre/obdclass/cl_io.c     |   57 +----
 drivers/staging/lustre/lustre/obdclass/cl_page.c   |   47 -----
 drivers/staging/lustre/lustre/osc/osc_cache.c      |    3 +-
 drivers/staging/lustre/lustre/osc/osc_internal.h   |   17 ++-
 drivers/staging/lustre/lustre/osc/osc_io.c         |   41 ++++-
 drivers/staging/lustre/lustre/osc/osc_lock.c       |   12 +-
 drivers/staging/lustre/lustre/osc/osc_page.c       |   20 --
 14 files changed, 312 insertions(+), 340 deletions(-)
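
Note for readers: the stripe clamp that lov_io_read_ahead() applies,
start + pps - start % pps - 1, may be easier to follow with concrete
numbers. The helpers below are a hypothetical standalone sketch
(page_index_t, stripe_chunk_last() and clamp_ra_end() are names made up
for illustration, and 256 pages per stripe is just an example value,
which would correspond to a 1 MiB stripe with 4 KiB pages).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's pgoff_t. */
typedef uint64_t page_index_t;

/*
 * Last page index of the stripe-sized chunk that contains 'start',
 * where 'pps' is the number of pages per stripe.
 */
static page_index_t stripe_chunk_last(page_index_t start, page_index_t pps)
{
	assert(pps != 0);
	return start + pps - start % pps - 1;
}

/*
 * Clamp a lock-derived window end so that the read-ahead window never
 * extends past the chunk containing 'start'.
 */
static page_index_t clamp_ra_end(page_index_t lock_end, page_index_t start,
				 page_index_t pps)
{
	page_index_t chunk_end = stripe_chunk_last(start, pps);

	return lock_end < chunk_end ? lock_end : chunk_end;
}
```

With pps = 256, a request starting at page 300 sits in the chunk
covering pages 256..511, so a lock that reaches page 1000 is clamped to
511, while a lock that ends at page 400 is left at 400.
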

diff --git a/drivers/staging/lustre/lustre/include/cl_object.h b/drivers/staging/lustre/lustre/include/cl_object.h
index 89292c9..bf93c1e 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -884,26 +884,6 @@ struct cl_page_operations {
 	/** Destructor. Frees resources and slice itself. */
 	void (*cpo_fini)(const struct lu_env *env,
 			 struct cl_page_slice *slice);
-
-	/**
-	 * Checks whether the page is protected by a cl_lock. This is a
-	 * per-layer method, because certain layers have ways to check for the
-	 * lock much more efficiently than through the generic locks scan, or
-	 * implement locking mechanisms separate from cl_lock, e.g.,
-	 * LL_FILE_GROUP_LOCKED in vvp. If \a pending is true, check for locks
-	 * being canceled, or scheduled for cancellation as soon as the last
-	 * user goes away, too.
-	 *
-	 * \retval    -EBUSY: page is protected by a lock of a given mode;
-	 * \retval  -ENODATA: page is not protected by a lock;
-	 * \retval	 0: this layer cannot decide.
-	 *
-	 * \see cl_page_is_under_lock()
-	 */
-	int (*cpo_is_under_lock)(const struct lu_env *env,
-				 const struct cl_page_slice *slice,
-				 struct cl_io *io, pgoff_t *max);
-
 	/**
 	 * Optional debugging helper. Prints given page slice.
 	 *
@@ -1365,7 +1345,6 @@ struct cl_2queue {
  *     (3) sort all locks to avoid dead-locks, and acquire them
  *
  *     (4) process the chunk: call per-page methods
- *	 (cl_io_operations::cio_read_page() for read,
  *	 cl_io_operations::cio_prepare_write(),
  *	 cl_io_operations::cio_commit_write() for write)
  *
@@ -1467,6 +1446,31 @@ struct cl_io_slice {
 
 typedef void (*cl_commit_cbt)(const struct lu_env *, struct cl_io *,
 			      struct cl_page *);
+
+struct cl_read_ahead {
+	/*
+	 * Maximum page index at which the readahead window may end.
+	 * This is determined by DLM lock coverage, RPC and stripe boundary.
+	 * cra_end is inclusive.
+	 */
+	pgoff_t cra_end;
+	/*
+	 * Release routine. If readahead holds resources underneath, this
+	 * function should be called to release it.
+	 */
+	void (*cra_release)(const struct lu_env *env, void *cbdata);
+	/* Callback data for cra_release routine */
+	void *cra_cbdata;
+};
+
+static inline void cl_read_ahead_release(const struct lu_env *env,
+					 struct cl_read_ahead *ra)
+{
+	if (ra->cra_release)
+		ra->cra_release(env, ra->cra_cbdata);
+	memset(ra, 0, sizeof(*ra));
+}
+
 /**
  * Per-layer io operations.
  * \see vvp_io_ops, lov_io_ops, lovsub_io_ops, osc_io_ops
@@ -1573,16 +1577,13 @@ struct cl_io_operations {
 				 struct cl_page_list *queue, int from, int to,
 				 cl_commit_cbt cb);
 	/**
-	 * Read missing page.
-	 *
-	 * Called by a top-level cl_io_operations::op[CIT_READ]::cio_start()
-	 * method, when it hits not-up-to-date page in the range. Optional.
+	 * Decide maximum read ahead extent
 	 *
 	 * \pre io->ci_type == CIT_READ
 	 */
-	int (*cio_read_page)(const struct lu_env *env,
-			     const struct cl_io_slice *slice,
-			     const struct cl_page_slice *page);
+	int (*cio_read_ahead)(const struct lu_env *env,
+			      const struct cl_io_slice *slice,
+			      pgoff_t start, struct cl_read_ahead *ra);
 	/**
 	 * Optional debugging helper. Print given io slice.
 	 */
@@ -2302,8 +2303,6 @@ void cl_page_discard(const struct lu_env *env, struct cl_io *io,
 void cl_page_delete(const struct lu_env *env, struct cl_page *pg);
 int cl_page_is_vmlocked(const struct lu_env *env, const struct cl_page *pg);
 void cl_page_export(const struct lu_env *env, struct cl_page *pg, int uptodate);
-int cl_page_is_under_lock(const struct lu_env *env, struct cl_io *io,
-			  struct cl_page *page, pgoff_t *max_index);
 loff_t cl_offset(const struct cl_object *obj, pgoff_t idx);
 pgoff_t cl_index(const struct cl_object *obj, loff_t offset);
 size_t cl_page_size(const struct cl_object *obj);
@@ -2414,8 +2413,6 @@ int cl_io_lock_add(const struct lu_env *env, struct cl_io *io,
 		   struct cl_io_lock_link *link);
 int cl_io_lock_alloc_add(const struct lu_env *env, struct cl_io *io,
 			 struct cl_lock_descr *descr);
-int cl_io_read_page(const struct lu_env *env, struct cl_io *io,
-		    struct cl_page *page);
 int cl_io_submit_rw(const struct lu_env *env, struct cl_io *io,
 		    enum cl_req_type iot, struct cl_2queue *queue);
 int cl_io_submit_sync(const struct lu_env *env, struct cl_io *io,
@@ -2424,6 +2421,8 @@ int cl_io_submit_sync(const struct lu_env *env, struct cl_io *io,
 int cl_io_commit_async(const struct lu_env *env, struct cl_io *io,
 		       struct cl_page_list *queue, int from, int to,
 		       cl_commit_cbt cb);
+int cl_io_read_ahead(const struct lu_env *env, struct cl_io *io,
+		     pgoff_t start, struct cl_read_ahead *ra);
 int cl_io_is_going(const struct lu_env *env);
 
 /**
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index cd89926..b06cd3c 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -722,9 +722,7 @@ int ll_writepage(struct page *page, struct writeback_control *wbc);
 int ll_writepages(struct address_space *, struct writeback_control *wbc);
 int ll_readpage(struct file *file, struct page *page);
 void ll_readahead_init(struct inode *inode, struct ll_readahead_state *ras);
-int ll_readahead(const struct lu_env *env, struct cl_io *io,
-		 struct cl_page_list *queue, struct ll_readahead_state *ras,
-		 bool hit);
+int vvp_io_write_commit(const struct lu_env *env, struct cl_io *io);
 struct ll_cl_context *ll_cl_find(struct file *file);
 void ll_cl_add(struct file *file, const struct lu_env *env, struct cl_io *io);
 void ll_cl_remove(struct file *file, const struct lu_env *env);
@@ -1020,9 +1018,6 @@ int cl_sb_init(struct super_block *sb);
 int cl_sb_fini(struct super_block *sb);
 void ll_io_init(struct cl_io *io, const struct file *file, int write);
 
-void ras_update(struct ll_sb_info *sbi, struct inode *inode,
-		struct ll_readahead_state *ras, unsigned long index,
-		unsigned hit);
 void ll_ra_count_put(struct ll_sb_info *sbi, unsigned long len);
 void ll_ra_stats_inc(struct inode *inode, enum ra_stat which);
 
diff --git a/drivers/staging/lustre/lustre/llite/rw.c b/drivers/staging/lustre/lustre/llite/rw.c
index 50c0152..80cb8e0 100644
--- a/drivers/staging/lustre/lustre/llite/rw.c
+++ b/drivers/staging/lustre/lustre/llite/rw.c
@@ -180,90 +180,73 @@ void ll_ras_enter(struct file *f)
 	spin_unlock(&ras->ras_lock);
 }
 
-static int cl_read_ahead_page(const struct lu_env *env, struct cl_io *io,
-			      struct cl_page_list *queue, struct cl_page *page,
-			      struct cl_object *clob, pgoff_t *max_index)
+/**
+ * Initiates read-ahead of a page with given index.
+ *
+ * \retval +ve:	page was already uptodate so it will be skipped
+ *		from being added;
+ * \retval -ve:	page wasn't added to \a queue for error;
+ * \retval   0:	page was added into \a queue for read ahead.
+ */
+static int ll_read_ahead_page(const struct lu_env *env, struct cl_io *io,
+			      struct cl_page_list *queue, pgoff_t index)
 {
-	struct page *vmpage = page->cp_vmpage;
+	enum ra_stat which = _NR_RA_STAT; /* keep gcc happy */
+	struct cl_object *clob = io->ci_obj;
+	struct inode *inode = vvp_object_inode(clob);
+	const char *msg = NULL;
+	struct cl_page *page;
 	struct vvp_page *vpg;
-	int	      rc;
+	struct page *vmpage;
+	int rc = 0;
+
+	vmpage = grab_cache_page_nowait(inode->i_mapping, index);
+	if (!vmpage) {
+		which = RA_STAT_FAILED_GRAB_PAGE;
+		msg = "g_c_p_n failed";
+		rc = -EBUSY;
+		goto out;
+	}
+
+	/* Check if vmpage was truncated or reclaimed */
+	if (vmpage->mapping != inode->i_mapping) {
+		which = RA_STAT_WRONG_GRAB_PAGE;
+		msg = "g_c_p_n returned invalid page";
+		rc = -EBUSY;
+		goto out;
+	}
+
+	page = cl_page_find(env, clob, vmpage->index, vmpage, CPT_CACHEABLE);
+	if (IS_ERR(page)) {
+		which = RA_STAT_FAILED_GRAB_PAGE;
+		msg = "cl_page_find failed";
+		rc = PTR_ERR(page);
+		goto out;
+	}
 
-	rc = 0;
-	cl_page_assume(env, io, page);
 	lu_ref_add(&page->cp_reference, "ra", current);
+	cl_page_assume(env, io, page);
 	vpg = cl2vvp_page(cl_object_page_slice(clob, page));
 	if (!vpg->vpg_defer_uptodate && !PageUptodate(vmpage)) {
-		CDEBUG(D_READA, "page index %lu, max_index: %lu\n",
-		       vvp_index(vpg), *max_index);
-		if (*max_index == 0 || vvp_index(vpg) > *max_index)
-			rc = cl_page_is_under_lock(env, io, page, max_index);
-		if (rc == 0) {
-			vpg->vpg_defer_uptodate = 1;
-			vpg->vpg_ra_used = 0;
-			cl_page_list_add(queue, page);
-			rc = 1;
-		} else {
-			cl_page_discard(env, io, page);
-			rc = -ENOLCK;
-		}
+		vpg->vpg_defer_uptodate = 1;
+		vpg->vpg_ra_used = 0;
+		cl_page_list_add(queue, page);
 	} else {
 		/* skip completed pages */
 		cl_page_unassume(env, io, page);
+		/* This page is already uptodate, returning a positive number
+		 * to tell the callers about this
+		 */
+		rc = 1;
 	}
+
 	lu_ref_del(&page->cp_reference, "ra", current);
 	cl_page_put(env, page);
-	return rc;
-}
-
-/**
- * Initiates read-ahead of a page with given index.
- *
- * \retval     +ve: page was added to \a queue.
- *
- * \retval -ENOLCK: there is no extent lock for this part of a file, stop
- *		  read-ahead.
- *
- * \retval  -ve, 0: page wasn't added to \a queue for other reason.
- */
-static int ll_read_ahead_page(const struct lu_env *env, struct cl_io *io,
-			      struct cl_page_list *queue,
-			      pgoff_t index, pgoff_t *max_index)
-{
-	struct cl_object *clob  = io->ci_obj;
-	struct inode     *inode = vvp_object_inode(clob);
-	struct page      *vmpage;
-	struct cl_page   *page;
-	enum ra_stat      which = _NR_RA_STAT; /* keep gcc happy */
-	int	       rc    = 0;
-	const char       *msg   = NULL;
-
-	vmpage = grab_cache_page_nowait(inode->i_mapping, index);
+out:
 	if (vmpage) {
-		/* Check if vmpage was truncated or reclaimed */
-		if (vmpage->mapping == inode->i_mapping) {
-			page = cl_page_find(env, clob, vmpage->index,
-					    vmpage, CPT_CACHEABLE);
-			if (!IS_ERR(page)) {
-				rc = cl_read_ahead_page(env, io, queue,
-							page, clob, max_index);
-				if (rc == -ENOLCK) {
-					which = RA_STAT_FAILED_MATCH;
-					msg   = "lock match failed";
-				}
-			} else {
-				which = RA_STAT_FAILED_GRAB_PAGE;
-				msg   = "cl_page_find failed";
-			}
-		} else {
-			which = RA_STAT_WRONG_GRAB_PAGE;
-			msg   = "g_c_p_n returned invalid page";
-		}
-		if (rc != 1)
+		if (rc)
 			unlock_page(vmpage);
 		put_page(vmpage);
-	} else {
-		which = RA_STAT_FAILED_GRAB_PAGE;
-		msg   = "g_c_p_n failed";
 	}
 	if (msg) {
 		ll_ra_stats_inc(inode, which);
@@ -378,12 +361,12 @@ static int ll_read_ahead_pages(const struct lu_env *env,
 			       struct cl_io *io, struct cl_page_list *queue,
 			       struct ra_io_arg *ria,
 			       unsigned long *reserved_pages,
-			       unsigned long *ra_end)
+			       pgoff_t *ra_end)
 {
+	struct cl_read_ahead ra = { 0 };
 	int rc, count = 0;
 	bool stride_ria;
 	pgoff_t page_idx;
-	pgoff_t max_index = 0;
 
 	LASSERT(ria);
 	RIA_DEBUG(ria);
@@ -392,14 +375,23 @@ static int ll_read_ahead_pages(const struct lu_env *env,
 	for (page_idx = ria->ria_start;
 	     page_idx <= ria->ria_end && *reserved_pages > 0; page_idx++) {
 		if (ras_inside_ra_window(page_idx, ria)) {
+			if (!ra.cra_end || ra.cra_end < page_idx) {
+				cl_read_ahead_release(env, &ra);
+
+				rc = cl_io_read_ahead(env, io, page_idx, &ra);
+				if (rc < 0)
+					break;
+
+				LASSERTF(ra.cra_end >= page_idx,
+					 "object: %p, indices %lu / %lu\n",
+					 io->ci_obj, ra.cra_end, page_idx);
+			}
+
 			/* If the page is inside the read-ahead window*/
-			rc = ll_read_ahead_page(env, io, queue,
-						page_idx, &max_index);
-			if (rc == 1) {
+			rc = ll_read_ahead_page(env, io, queue, page_idx);
+			if (!rc) {
 				(*reserved_pages)--;
 				count++;
-			} else if (rc == -ENOLCK) {
-				break;
 			}
 		} else if (stride_ria) {
 			/* If it is not in the read-ahead window, and it is
@@ -425,19 +417,21 @@ static int ll_read_ahead_pages(const struct lu_env *env,
 			}
 		}
 	}
+	cl_read_ahead_release(env, &ra);
+
 	*ra_end = page_idx;
 	return count;
 }
 
-int ll_readahead(const struct lu_env *env, struct cl_io *io,
-		 struct cl_page_list *queue, struct ll_readahead_state *ras,
-		 bool hit)
+static int ll_readahead(const struct lu_env *env, struct cl_io *io,
+			struct cl_page_list *queue,
+			struct ll_readahead_state *ras, bool hit)
 {
 	struct vvp_io *vio = vvp_env_io(env);
 	struct ll_thread_info *lti = ll_env_info(env);
 	struct cl_attr *attr = vvp_env_thread_attr(env);
-	unsigned long start = 0, end = 0, reserved;
-	unsigned long ra_end, len, mlen = 0;
+	unsigned long len, mlen = 0, reserved;
+	pgoff_t ra_end, start = 0, end = 0;
 	struct inode *inode;
 	struct ra_io_arg *ria = &lti->lti_ria;
 	struct cl_object *clob;
@@ -575,8 +569,8 @@ int ll_readahead(const struct lu_env *env, struct cl_io *io,
 	 * if the region we failed to issue read-ahead on is still ahead
 	 * of the app and behind the next index to start read-ahead from
 	 */
-	CDEBUG(D_READA, "ra_end %lu end %lu stride end %lu\n",
-	       ra_end, end, ria->ria_end);
+	CDEBUG(D_READA, "ra_end = %lu end = %lu stride end = %lu pages = %d\n",
+	       ra_end, end, ria->ria_end, ret);
 
 	if (ra_end != end + 1) {
 		ll_ra_stats_inc(inode, RA_STAT_FAILED_REACH_END);
@@ -737,9 +731,9 @@ static void ras_increase_window(struct inode *inode,
 					  ra->ra_max_pages_per_file);
 }
 
-void ras_update(struct ll_sb_info *sbi, struct inode *inode,
-		struct ll_readahead_state *ras, unsigned long index,
-		unsigned hit)
+static void ras_update(struct ll_sb_info *sbi, struct inode *inode,
+		       struct ll_readahead_state *ras, unsigned long index,
+		       unsigned int hit)
 {
 	struct ll_ra_info *ra = &sbi->ll_ra_info;
 	int zero = 0, stride_detect = 0, ra_miss = 0;
@@ -1087,6 +1081,56 @@ void ll_cl_remove(struct file *file, const struct lu_env *env)
 	write_unlock(&fd->fd_lock);
 }
 
+static int ll_io_read_page(const struct lu_env *env, struct cl_io *io,
+			   struct cl_page *page)
+{
+	struct inode *inode = vvp_object_inode(page->cp_obj);
+	struct ll_file_data *fd = vvp_env_io(env)->vui_fd;
+	struct ll_readahead_state *ras = &fd->fd_ras;
+	struct cl_2queue *queue  = &io->ci_queue;
+	struct ll_sb_info *sbi = ll_i2sbi(inode);
+	struct vvp_page *vpg;
+	int rc = 0;
+
+	vpg = cl2vvp_page(cl_object_page_slice(page->cp_obj, page));
+	if (sbi->ll_ra_info.ra_max_pages_per_file > 0 &&
+	    sbi->ll_ra_info.ra_max_pages > 0)
+		ras_update(sbi, inode, ras, vvp_index(vpg),
+			   vpg->vpg_defer_uptodate);
+
+	if (vpg->vpg_defer_uptodate) {
+		vpg->vpg_ra_used = 1;
+		cl_page_export(env, page, 1);
+	}
+
+	cl_2queue_init(queue);
+	/*
+	 * Add page into the queue even when it is marked uptodate above.
+	 * this will unlock it automatically as part of cl_page_list_disown().
+	 */
+	cl_page_list_add(&queue->c2_qin, page);
+	if (sbi->ll_ra_info.ra_max_pages_per_file > 0 &&
+	    sbi->ll_ra_info.ra_max_pages > 0) {
+		int rc2;
+
+		rc2 = ll_readahead(env, io, &queue->c2_qin, ras,
+				   vpg->vpg_defer_uptodate);
+		CDEBUG(D_READA, DFID "%d pages read ahead at %lu\n",
+		       PFID(ll_inode2fid(inode)), rc2, vvp_index(vpg));
+	}
+
+	if (queue->c2_qin.pl_nr > 0)
+		rc = cl_io_submit_rw(env, io, CRT_READ, queue);
+
+	/*
+	 * Unlock unsent pages in case of error.
+	 */
+	cl_page_list_disown(env, io, &queue->c2_qin);
+	cl_2queue_fini(env, queue);
+
+	return rc;
+}
+
 int ll_readpage(struct file *file, struct page *vmpage)
 {
 	struct cl_object *clob = ll_i2info(file_inode(file))->lli_clob;
@@ -1110,7 +1154,7 @@ int ll_readpage(struct file *file, struct page *vmpage)
 		LASSERT(page->cp_type == CPT_CACHEABLE);
 		if (likely(!PageUptodate(vmpage))) {
 			cl_page_assume(env, io, page);
-			result = cl_io_read_page(env, io, page);
+			result = ll_io_read_page(env, io, page);
 		} else {
 			/* Page from a non-object file. */
 			unlock_page(vmpage);
diff --git a/drivers/staging/lustre/lustre/llite/vvp_io.c b/drivers/staging/lustre/lustre/llite/vvp_io.c
index dbc4c26..8187fa3 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_io.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_io.c
@@ -1228,40 +1228,23 @@ static int vvp_io_fsync_start(const struct lu_env *env,
 	return 0;
 }
 
-static int vvp_io_read_page(const struct lu_env *env,
-			    const struct cl_io_slice *ios,
-			    const struct cl_page_slice *slice)
+static int vvp_io_read_ahead(const struct lu_env *env,
+			     const struct cl_io_slice *ios,
+			     pgoff_t start, struct cl_read_ahead *ra)
 {
-	struct cl_io	      *io     = ios->cis_io;
-	struct vvp_page           *vpg    = cl2vvp_page(slice);
-	struct cl_page	    *page   = slice->cpl_page;
-	struct inode              *inode  = vvp_object_inode(slice->cpl_obj);
-	struct ll_sb_info	 *sbi    = ll_i2sbi(inode);
-	struct ll_file_data       *fd     = cl2vvp_io(env, ios)->vui_fd;
-	struct ll_readahead_state *ras    = &fd->fd_ras;
-	struct cl_2queue	  *queue  = &io->ci_queue;
-
-	if (sbi->ll_ra_info.ra_max_pages_per_file &&
-	    sbi->ll_ra_info.ra_max_pages)
-		ras_update(sbi, inode, ras, vvp_index(vpg),
-			   vpg->vpg_defer_uptodate);
-
-	if (vpg->vpg_defer_uptodate) {
-		vpg->vpg_ra_used = 1;
-		cl_page_export(env, page, 1);
-	}
-	/*
-	 * Add page into the queue even when it is marked uptodate above.
-	 * this will unlock it automatically as part of cl_page_list_disown().
-	 */
+	int result = 0;
 
-	cl_page_list_add(&queue->c2_qin, page);
-	if (sbi->ll_ra_info.ra_max_pages_per_file &&
-	    sbi->ll_ra_info.ra_max_pages)
-		ll_readahead(env, io, &queue->c2_qin, ras,
-			     vpg->vpg_defer_uptodate);
+	if (ios->cis_io->ci_type == CIT_READ ||
+	    ios->cis_io->ci_type == CIT_FAULT) {
+		struct vvp_io *vio = cl2vvp_io(env, ios);
 
-	return 0;
+		if (unlikely(vio->vui_fd->fd_flags & LL_FILE_GROUP_LOCKED)) {
+			ra->cra_end = CL_PAGE_EOF;
+			result = 1; /* no need to call down */
+		}
+	}
+
+	return result;
 }
 
 static void vvp_io_end(const struct lu_env *env, const struct cl_io_slice *ios)
@@ -1308,7 +1291,7 @@ static const struct cl_io_operations vvp_io_ops = {
 			.cio_fini   = vvp_io_fini
 		}
 	},
-	.cio_read_page     = vvp_io_read_page,
+	.cio_read_ahead	= vvp_io_read_ahead,
 };
 
 int vvp_io_init(const struct lu_env *env, struct cl_object *obj,
diff --git a/drivers/staging/lustre/lustre/llite/vvp_page.c b/drivers/staging/lustre/lustre/llite/vvp_page.c
index 68f8990..75cec23 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_page.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_page.c
@@ -342,20 +342,6 @@ static int vvp_page_make_ready(const struct lu_env *env,
 	return result;
 }
 
-static int vvp_page_is_under_lock(const struct lu_env *env,
-				  const struct cl_page_slice *slice,
-				  struct cl_io *io, pgoff_t *max_index)
-{
-	if (io->ci_type == CIT_READ || io->ci_type == CIT_WRITE ||
-	    io->ci_type == CIT_FAULT) {
-		struct vvp_io *vio = vvp_env_io(env);
-
-		if (unlikely(vio->vui_fd->fd_flags & LL_FILE_GROUP_LOCKED))
-			*max_index = CL_PAGE_EOF;
-	}
-	return 0;
-}
-
 static int vvp_page_print(const struct lu_env *env,
 			  const struct cl_page_slice *slice,
 			  void *cookie, lu_printer_t printer)
@@ -400,7 +386,6 @@ static const struct cl_page_operations vvp_page_ops = {
 	.cpo_is_vmlocked   = vvp_page_is_vmlocked,
 	.cpo_fini	  = vvp_page_fini,
 	.cpo_print	 = vvp_page_print,
-	.cpo_is_under_lock = vvp_page_is_under_lock,
 	.io = {
 		[CRT_READ] = {
 			.cpo_prep	= vvp_page_prep_read,
@@ -499,7 +484,6 @@ static const struct cl_page_operations vvp_transient_page_ops = {
 	.cpo_fini	  = vvp_transient_page_fini,
 	.cpo_is_vmlocked   = vvp_transient_page_is_vmlocked,
 	.cpo_print	 = vvp_page_print,
-	.cpo_is_under_lock	= vvp_page_is_under_lock,
 	.io = {
 		[CRT_READ] = {
 			.cpo_prep	= vvp_transient_page_prep,
diff --git a/drivers/staging/lustre/lustre/lov/lov_io.c b/drivers/staging/lustre/lustre/lov/lov_io.c
index d101579..e75e5d2 100644
--- a/drivers/staging/lustre/lustre/lov/lov_io.c
+++ b/drivers/staging/lustre/lustre/lov/lov_io.c
@@ -555,6 +555,63 @@ static void lov_io_unlock(const struct lu_env *env,
 	LASSERT(rc == 0);
 }
 
+static int lov_io_read_ahead(const struct lu_env *env,
+			     const struct cl_io_slice *ios,
+			     pgoff_t start, struct cl_read_ahead *ra)
+{
+	struct lov_io *lio = cl2lov_io(env, ios);
+	struct lov_object *loo = lio->lis_object;
+	struct cl_object *obj = lov2cl(loo);
+	struct lov_layout_raid0 *r0 = lov_r0(loo);
+	unsigned int pps; /* pages per stripe */
+	struct lov_io_sub *sub;
+	pgoff_t ra_end;
+	loff_t suboff;
+	int stripe;
+	int rc;
+
+	stripe = lov_stripe_number(loo->lo_lsm, cl_offset(obj, start));
+	if (unlikely(!r0->lo_sub[stripe]))
+		return -EIO;
+
+	sub = lov_sub_get(env, lio, stripe);
+
+	lov_stripe_offset(loo->lo_lsm, cl_offset(obj, start), stripe, &suboff);
+	rc = cl_io_read_ahead(sub->sub_env, sub->sub_io,
+			      cl_index(lovsub2cl(r0->lo_sub[stripe]), suboff),
+			      ra);
+	lov_sub_put(sub);
+
+	CDEBUG(D_READA, DFID " cra_end = %lu, stripes = %d, rc = %d\n",
+	       PFID(lu_object_fid(lov2lu(loo))), ra->cra_end, r0->lo_nr, rc);
+	if (rc)
+		return rc;
+
+	/**
+	 * Adjust the stripe index by layout of raid0. ra->cra_end is
+	 * the maximum page index covered by an underlying DLM lock.
+	 * This function converts cra_end from stripe level to file
+	 * level, and makes sure it's not beyond the stripe boundary.
+	 */
+	if (r0->lo_nr == 1)	/* single stripe file */
+		return 0;
+
+	/* cra_end is stripe level, convert it into file level */
+	ra_end = ra->cra_end;
+	if (ra_end != CL_PAGE_EOF)
+		ra_end = lov_stripe_pgoff(loo->lo_lsm, ra_end, stripe);
+
+	pps = loo->lo_lsm->lsm_stripe_size >> PAGE_SHIFT;
+
+	CDEBUG(D_READA, DFID " max_index = %lu, pps = %u, stripe_size = %u, stripe no = %u, start index = %lu\n",
+	       PFID(lu_object_fid(lov2lu(loo))), ra_end, pps,
+	       loo->lo_lsm->lsm_stripe_size, stripe, start);
+
+	/* never exceed the end of the stripe */
+	ra->cra_end = min_t(pgoff_t, ra_end, start + pps - start % pps - 1);
+	return 0;
+}
+
 /**
  * lov implementation of cl_operations::cio_submit() method. It takes a list
  * of pages in \a queue, splits it into per-stripe sub-lists, invokes
@@ -801,6 +858,7 @@ static const struct cl_io_operations lov_io_ops = {
 			.cio_fini   = lov_io_fini
 		}
 	},
+	.cio_read_ahead			= lov_io_read_ahead,
 	.cio_submit                    = lov_io_submit,
 	.cio_commit_async              = lov_io_commit_async,
 };
diff --git a/drivers/staging/lustre/lustre/lov/lov_page.c b/drivers/staging/lustre/lustre/lov/lov_page.c
index 00bfaba..62ceb6d 100644
--- a/drivers/staging/lustre/lustre/lov/lov_page.c
+++ b/drivers/staging/lustre/lustre/lov/lov_page.c
@@ -49,51 +49,6 @@
  *
  */
 
-/**
- * Adjust the stripe index by layout of raid0. @max_index is the maximum
- * page index covered by an underlying DLM lock.
- * This function converts max_index from stripe level to file level, and make
- * sure it's not beyond one stripe.
- */
-static int lov_raid0_page_is_under_lock(const struct lu_env *env,
-					const struct cl_page_slice *slice,
-					struct cl_io *unused,
-					pgoff_t *max_index)
-{
-	struct lov_object *loo = cl2lov(slice->cpl_obj);
-	struct lov_layout_raid0 *r0 = lov_r0(loo);
-	pgoff_t index = *max_index;
-	unsigned int pps; /* pages per stripe */
-
-	CDEBUG(D_READA, DFID "*max_index = %lu, nr = %d\n",
-	       PFID(lu_object_fid(lov2lu(loo))), index, r0->lo_nr);
-
-	if (index == 0) /* the page is not covered by any lock */
-		return 0;
-
-	if (r0->lo_nr == 1) /* single stripe file */
-		return 0;
-
-	/* max_index is stripe level, convert it into file level */
-	if (index != CL_PAGE_EOF) {
-		int stripeno = lov_page_stripe(slice->cpl_page);
-		*max_index = lov_stripe_pgoff(loo->lo_lsm, index, stripeno);
-	}
-
-	/* calculate the end of current stripe */
-	pps = loo->lo_lsm->lsm_stripe_size >> PAGE_SHIFT;
-	index = slice->cpl_index + pps - slice->cpl_index % pps - 1;
-
-	CDEBUG(D_READA, DFID "*max_index = %lu, index = %lu, pps = %u, stripe_size = %u, stripe no = %u, page index = %lu\n",
-	       PFID(lu_object_fid(lov2lu(loo))), *max_index, index, pps,
-	       loo->lo_lsm->lsm_stripe_size, lov_page_stripe(slice->cpl_page),
-	       slice->cpl_index);
-
-	/* never exceed the end of the stripe */
-	*max_index = min_t(pgoff_t, *max_index, index);
-	return 0;
-}
-
 static int lov_raid0_page_print(const struct lu_env *env,
 				const struct cl_page_slice *slice,
 				void *cookie, lu_printer_t printer)
@@ -104,7 +59,6 @@ static int lov_raid0_page_print(const struct lu_env *env,
 }
 
 static const struct cl_page_operations lov_raid0_page_ops = {
-	.cpo_is_under_lock = lov_raid0_page_is_under_lock,
 	.cpo_print  = lov_raid0_page_print
 };
 
diff --git a/drivers/staging/lustre/lustre/obdclass/cl_io.c b/drivers/staging/lustre/lustre/obdclass/cl_io.c
index bc4b7b6..577f76e 100644
--- a/drivers/staging/lustre/lustre/obdclass/cl_io.c
+++ b/drivers/staging/lustre/lustre/obdclass/cl_io.c
@@ -586,67 +586,32 @@ void cl_io_end(const struct lu_env *env, struct cl_io *io)
 }
 EXPORT_SYMBOL(cl_io_end);
 
-static const struct cl_page_slice *
-cl_io_slice_page(const struct cl_io_slice *ios, struct cl_page *page)
-{
-	const struct cl_page_slice *slice;
-
-	slice = cl_page_at(page, ios->cis_obj->co_lu.lo_dev->ld_type);
-	LINVRNT(slice);
-	return slice;
-}
-
 /**
- * Called by read io, when page has to be read from the server.
+ * Called by read io, to decide the readahead extent
  *
- * \see cl_io_operations::cio_read_page()
+ * \see cl_io_operations::cio_read_ahead()
  */
-int cl_io_read_page(const struct lu_env *env, struct cl_io *io,
-		    struct cl_page *page)
+int cl_io_read_ahead(const struct lu_env *env, struct cl_io *io,
+		     pgoff_t start, struct cl_read_ahead *ra)
 {
 	const struct cl_io_slice *scan;
-	struct cl_2queue	 *queue;
 	int		       result = 0;
 
 	LINVRNT(io->ci_type == CIT_READ || io->ci_type == CIT_FAULT);
-	LINVRNT(cl_page_is_owned(page, io));
 	LINVRNT(io->ci_state == CIS_IO_GOING || io->ci_state == CIS_LOCKED);
 	LINVRNT(cl_io_invariant(io));
 
-	queue = &io->ci_queue;
-
-	cl_2queue_init(queue);
-	/*
-	 * ->cio_read_page() methods called in the loop below are supposed to
-	 * never block waiting for network (the only subtle point is the
-	 * creation of new pages for read-ahead that might result in cache
-	 * shrinking, but currently only clean pages are shrunk and this
-	 * requires no network io).
-	 *
-	 * Should this ever starts blocking, retry loop would be needed for
-	 * "parallel io" (see CLO_REPEAT loops in cl_lock.c).
-	 */
 	cl_io_for_each(scan, io) {
-		if (scan->cis_iop->cio_read_page) {
-			const struct cl_page_slice *slice;
+		if (!scan->cis_iop->cio_read_ahead)
+			continue;
 
-			slice = cl_io_slice_page(scan, page);
-			LINVRNT(slice);
-			result = scan->cis_iop->cio_read_page(env, scan, slice);
-			if (result != 0)
-				break;
-		}
+		result = scan->cis_iop->cio_read_ahead(env, scan, start, ra);
+		if (result)
+			break;
 	}
-	if (result == 0 && queue->c2_qin.pl_nr > 0)
-		result = cl_io_submit_rw(env, io, CRT_READ, queue);
-	/*
-	 * Unlock unsent pages in case of error.
-	 */
-	cl_page_list_disown(env, io, &queue->c2_qin);
-	cl_2queue_fini(env, queue);
-	return result;
+	return result > 0 ? 0 : result;
 }
-EXPORT_SYMBOL(cl_io_read_page);
+EXPORT_SYMBOL(cl_io_read_ahead);
 
 /**
  * Commit a list of contiguous pages into writeback cache.
diff --git a/drivers/staging/lustre/lustre/obdclass/cl_page.c b/drivers/staging/lustre/lustre/obdclass/cl_page.c
index 63973ba..40b7bee 100644
--- a/drivers/staging/lustre/lustre/obdclass/cl_page.c
+++ b/drivers/staging/lustre/lustre/obdclass/cl_page.c
@@ -390,30 +390,6 @@ EXPORT_SYMBOL(cl_page_at);
 	__result;						       \
 })
 
-#define CL_PAGE_INVOKE_REVERSE(_env, _page, _op, _proto, ...)		\
-({									\
-	const struct lu_env        *__env  = (_env);			\
-	struct cl_page             *__page = (_page);			\
-	const struct cl_page_slice *__scan;				\
-	int                         __result;				\
-	ptrdiff_t                   __op   = (_op);			\
-	int                       (*__method)_proto;			\
-									\
-	__result = 0;							\
-	list_for_each_entry_reverse(__scan, &__page->cp_layers,		\
-					cpl_linkage) {			\
-		__method = *(void **)((char *)__scan->cpl_ops +  __op);	\
-		if (__method) {						\
-			__result = (*__method)(__env, __scan, ## __VA_ARGS__); \
-			if (__result != 0)				\
-				break;					\
-		}							\
-	}								\
-	if (__result > 0)						\
-		__result = 0;						\
-	__result;							\
-})
-
 #define CL_PAGE_INVOID(_env, _page, _op, _proto, ...)		   \
 do {								    \
 	const struct lu_env	*__env  = (_env);		    \
@@ -927,29 +903,6 @@ int cl_page_flush(const struct lu_env *env, struct cl_io *io,
 EXPORT_SYMBOL(cl_page_flush);
 
 /**
- * Checks whether page is protected by any extent lock is at least required
- * mode.
- *
- * \return the same as in cl_page_operations::cpo_is_under_lock() method.
- * \see cl_page_operations::cpo_is_under_lock()
- */
-int cl_page_is_under_lock(const struct lu_env *env, struct cl_io *io,
-			  struct cl_page *page, pgoff_t *max_index)
-{
-	int rc;
-
-	PINVRNT(env, page, cl_page_invariant(page));
-
-	rc = CL_PAGE_INVOKE_REVERSE(env, page, CL_PAGE_OP(cpo_is_under_lock),
-				    (const struct lu_env *,
-				     const struct cl_page_slice *,
-				      struct cl_io *, pgoff_t *),
-				    io, max_index);
-	return rc;
-}
-EXPORT_SYMBOL(cl_page_is_under_lock);
-
-/**
  * Tells transfer engine that only part of a page is to be transmitted.
  *
  * \see cl_page_operations::cpo_clip()
diff --git a/drivers/staging/lustre/lustre/osc/osc_cache.c b/drivers/staging/lustre/lustre/osc/osc_cache.c
index 4bbe219..b645957 100644
--- a/drivers/staging/lustre/lustre/osc/osc_cache.c
+++ b/drivers/staging/lustre/lustre/osc/osc_cache.c
@@ -3158,7 +3158,8 @@ static int check_and_discard_cb(const struct lu_env *env, struct cl_io *io,
 		struct cl_page *page = ops->ops_cl.cpl_page;
 
 		/* refresh non-overlapped index */
-		tmp = osc_dlmlock_at_pgoff(env, osc, index, 0, 0);
+		tmp = osc_dlmlock_at_pgoff(env, osc, index,
+					   OSC_DAP_FL_TEST_LOCK);
 		if (tmp) {
 			__u64 end = tmp->l_policy_data.l_extent.end;
 			/* Cache the first-non-overlapped index so as to skip
diff --git a/drivers/staging/lustre/lustre/osc/osc_internal.h b/drivers/staging/lustre/lustre/osc/osc_internal.h
index 67fe0a2..9a61c9b 100644
--- a/drivers/staging/lustre/lustre/osc/osc_internal.h
+++ b/drivers/staging/lustre/lustre/osc/osc_internal.h
@@ -199,8 +199,23 @@ void osc_inc_unstable_pages(struct ptlrpc_request *req);
 void osc_dec_unstable_pages(struct ptlrpc_request *req);
 bool osc_over_unstable_soft_limit(struct client_obd *cli);
 
+/**
+ * Bit flags for osc_dlmlock_at_pgoff().
+ */
+enum osc_dap_flags {
+	/**
+	 * Just check if the desired lock exists, it won't hold reference
+	 * count on lock.
+	 */
+	OSC_DAP_FL_TEST_LOCK	= BIT(0),
+	/**
+	 * Return the lock even if it is being canceled.
+	 */
+	OSC_DAP_FL_CANCELING	= BIT(1),
+};
+
 struct ldlm_lock *osc_dlmlock_at_pgoff(const struct lu_env *env,
 				       struct osc_object *obj, pgoff_t index,
-				       int pending, int canceling);
+				       enum osc_dap_flags flags);
 
 #endif /* OSC_INTERNAL_H */
diff --git a/drivers/staging/lustre/lustre/osc/osc_io.c b/drivers/staging/lustre/lustre/osc/osc_io.c
index 8a559cb..47c6371 100644
--- a/drivers/staging/lustre/lustre/osc/osc_io.c
+++ b/drivers/staging/lustre/lustre/osc/osc_io.c
@@ -88,6 +88,44 @@ static void osc_io_fini(const struct lu_env *env, const struct cl_io_slice *io)
 {
 }
 
+static void osc_read_ahead_release(const struct lu_env *env, void *cbdata)
+{
+	struct ldlm_lock *dlmlock = cbdata;
+	struct lustre_handle lockh;
+
+	ldlm_lock2handle(dlmlock, &lockh);
+	ldlm_lock_decref(&lockh, LCK_PR);
+	LDLM_LOCK_PUT(dlmlock);
+}
+
+static int osc_io_read_ahead(const struct lu_env *env,
+			     const struct cl_io_slice *ios,
+			     pgoff_t start, struct cl_read_ahead *ra)
+{
+	struct osc_object *osc = cl2osc(ios->cis_obj);
+	struct ldlm_lock *dlmlock;
+	int result = -ENODATA;
+
+	dlmlock = osc_dlmlock_at_pgoff(env, osc, start, 0);
+	if (dlmlock) {
+		if (dlmlock->l_req_mode != LCK_PR) {
+			struct lustre_handle lockh;
+
+			ldlm_lock2handle(dlmlock, &lockh);
+			ldlm_lock_addref(&lockh, LCK_PR);
+			ldlm_lock_decref(&lockh, dlmlock->l_req_mode);
+		}
+
+		ra->cra_end = cl_index(osc2cl(osc),
+				       dlmlock->l_policy_data.l_extent.end);
+		ra->cra_release = osc_read_ahead_release;
+		ra->cra_cbdata = dlmlock;
+		result = 0;
+	}
+
+	return result;
+}
+
 /**
  * An implementation of cl_io_operations::cio_io_submit() method for osc
  * layer. Iterates over pages in the in-queue, prepares each for io by calling
@@ -724,6 +762,7 @@ static const struct cl_io_operations osc_io_ops = {
 			.cio_fini   = osc_io_fini
 		}
 	},
+	.cio_read_ahead			= osc_io_read_ahead,
 	.cio_submit                 = osc_io_submit,
 	.cio_commit_async           = osc_io_commit_async
 };
@@ -798,7 +837,7 @@ static void osc_req_attr_set(const struct lu_env *env,
 				     struct cl_page, cp_flight);
 		opg = osc_cl_page_osc(apage, NULL);
 		lock = osc_dlmlock_at_pgoff(env, cl2osc(obj), osc_index(opg),
-					    1, 1);
+					    OSC_DAP_FL_TEST_LOCK | OSC_DAP_FL_CANCELING);
 		if (!lock && !opg->ops_srvlock) {
 			struct ldlm_resource *res;
 			struct ldlm_res_id *resname;
diff --git a/drivers/staging/lustre/lustre/osc/osc_lock.c b/drivers/staging/lustre/lustre/osc/osc_lock.c
index 39a8a58..a42cb98 100644
--- a/drivers/staging/lustre/lustre/osc/osc_lock.c
+++ b/drivers/staging/lustre/lustre/osc/osc_lock.c
@@ -1180,7 +1180,7 @@ int osc_lock_init(const struct lu_env *env,
  */
 struct ldlm_lock *osc_dlmlock_at_pgoff(const struct lu_env *env,
 				       struct osc_object *obj, pgoff_t index,
-				       int pending, int canceling)
+				       enum osc_dap_flags dap_flags)
 {
 	struct osc_thread_info *info = osc_env_info(env);
 	struct ldlm_res_id *resname = &info->oti_resname;
@@ -1194,9 +1194,10 @@ struct ldlm_lock *osc_dlmlock_at_pgoff(const struct lu_env *env,
 	osc_index2policy(policy, osc2cl(obj), index, index);
 	policy->l_extent.gid = LDLM_GID_ANY;
 
-	flags = LDLM_FL_BLOCK_GRANTED | LDLM_FL_TEST_LOCK;
-	if (pending)
-		flags |= LDLM_FL_CBPENDING;
+	flags = LDLM_FL_BLOCK_GRANTED | LDLM_FL_CBPENDING;
+	if (dap_flags & OSC_DAP_FL_TEST_LOCK)
+		flags |= LDLM_FL_TEST_LOCK;
+
 	/*
 	 * It is fine to match any group lock since there could be only one
 	 * with a uniq gid and it conflicts with all other lock modes too
@@ -1204,7 +1205,8 @@ struct ldlm_lock *osc_dlmlock_at_pgoff(const struct lu_env *env,
 again:
 	mode = ldlm_lock_match(osc_export(obj)->exp_obd->obd_namespace,
 			       flags, resname, LDLM_EXTENT, policy,
-			       LCK_PR | LCK_PW | LCK_GROUP, &lockh, canceling);
+			       LCK_PR | LCK_PW | LCK_GROUP, &lockh,
+			       dap_flags & OSC_DAP_FL_CANCELING);
 	if (mode != 0) {
 		lock = ldlm_handle2lock(&lockh);
 		/* RACE: the lock is cancelled so let's try again */
diff --git a/drivers/staging/lustre/lustre/osc/osc_page.c b/drivers/staging/lustre/lustre/osc/osc_page.c
index 2a7a70a..399d36b 100644
--- a/drivers/staging/lustre/lustre/osc/osc_page.c
+++ b/drivers/staging/lustre/lustre/osc/osc_page.c
@@ -117,25 +117,6 @@ void osc_index2policy(ldlm_policy_data_t *policy, const struct cl_object *obj,
 	policy->l_extent.end = cl_offset(obj, end + 1) - 1;
 }
 
-static int osc_page_is_under_lock(const struct lu_env *env,
-				  const struct cl_page_slice *slice,
-				  struct cl_io *unused, pgoff_t *max_index)
-{
-	struct osc_page *opg = cl2osc_page(slice);
-	struct ldlm_lock *dlmlock;
-	int result = -ENODATA;
-
-	dlmlock = osc_dlmlock_at_pgoff(env, cl2osc(slice->cpl_obj),
-				       osc_index(opg), 1, 0);
-	if (dlmlock) {
-		*max_index = cl_index(slice->cpl_obj,
-				      dlmlock->l_policy_data.l_extent.end);
-		LDLM_LOCK_PUT(dlmlock);
-		result = 0;
-	}
-	return result;
-}
-
 static const char *osc_list(struct list_head *head)
 {
 	return list_empty(head) ? "-" : "+";
@@ -276,7 +257,6 @@ static int osc_page_flush(const struct lu_env *env,
 static const struct cl_page_operations osc_page_ops = {
 	.cpo_print	 = osc_page_print,
 	.cpo_delete	= osc_page_delete,
-	.cpo_is_under_lock = osc_page_is_under_lock,
 	.cpo_clip	   = osc_page_clip,
 	.cpo_cancel	 = osc_page_cancel,
 	.cpo_flush	  = osc_page_flush
-- 
1.7.1
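
For readers following the osc_dlmlock_at_pgoff() change above, here is a
minimal user-space sketch of how the new bit flags appear to map onto the
arguments handed to the lock match. It is only an illustration: the
MATCH_* values are placeholders rather than the real LDLM_FL_* constants,
and dap_to_match_args() and struct match_args are hypothetical names that
do not exist in the patch.

```c
#include <assert.h>

#define BIT(n) (1U << (n))

/* Mirrors the shape of enum osc_dap_flags from the patch. */
enum dap_flags_sketch {
	DAP_TEST_LOCK = BIT(0),	/* only test for the lock, take no reference */
	DAP_CANCELING = BIT(1),	/* accept a lock that is being canceled */
};

/* Placeholder bits; NOT the real LDLM_FL_* values. */
#define MATCH_BLOCK_GRANTED	BIT(0)
#define MATCH_CBPENDING		BIT(1)
#define MATCH_TEST_LOCK		BIT(2)

/* Hypothetical bundle of what would be passed on to the match call. */
struct match_args {
	unsigned int flags;	/* match flags */
	int unref;		/* nonzero: also match locks being canceled */
};

static struct match_args dap_to_match_args(unsigned int dap_flags)
{
	struct match_args args;

	/* After the patch, CBPENDING is always part of the match flags. */
	args.flags = MATCH_BLOCK_GRANTED | MATCH_CBPENDING;
	if (dap_flags & DAP_TEST_LOCK)
		args.flags |= MATCH_TEST_LOCK;

	/* The patch passes the masked bit directly; normalized here to 0/1. */
	args.unref = !!(dap_flags & DAP_CANCELING);

	assert(args.flags & MATCH_BLOCK_GRANTED);
	return args;
}
```

Under this reading, the old (1, 1) call in osc_req_attr_set() becomes
TEST_LOCK | CANCELING, while the read ahead path passes 0 so that the
matched lock keeps its reference until cra_release() runs.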

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [lustre-devel] [PATCH 05/41] staging: lustre: clio: Revise read ahead implementation
@ 2016-10-03  2:28   ` James Simmons
  0 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Jinshan Xiong, James Simmons

From: Jinshan Xiong <jinshan.xiong@intel.com>

In this implementation, read ahead holds the underlying DLM lock
while it adds read ahead pages. A new cl_io operation, cio_read_ahead(),
is added for this purpose. It takes a cl_read_ahead{} parameter that
each layer can adjust to its own requirements. For example, the OSC
layer makes sure the read ahead region is covered by an LDLM lock,
and the LOV layer makes sure the region does not cross a stripe
boundary.

The legacy callback cpo_is_under_lock() is removed.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3259
Reviewed-on: http://review.whamcloud.com/10859
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/cl_object.h  |   63 +++---
 .../staging/lustre/lustre/llite/llite_internal.h   |    7 +-
 drivers/staging/lustre/lustre/llite/rw.c           |  218 ++++++++++++--------
 drivers/staging/lustre/lustre/llite/vvp_io.c       |   47 ++---
 drivers/staging/lustre/lustre/llite/vvp_page.c     |   16 --
 drivers/staging/lustre/lustre/lov/lov_io.c         |   58 +++++
 drivers/staging/lustre/lustre/lov/lov_page.c       |   46 ----
 drivers/staging/lustre/lustre/obdclass/cl_io.c     |   57 +----
 drivers/staging/lustre/lustre/obdclass/cl_page.c   |   47 -----
 drivers/staging/lustre/lustre/osc/osc_cache.c      |    3 +-
 drivers/staging/lustre/lustre/osc/osc_internal.h   |   17 ++-
 drivers/staging/lustre/lustre/osc/osc_io.c         |   41 ++++-
 drivers/staging/lustre/lustre/osc/osc_lock.c       |   12 +-
 drivers/staging/lustre/lustre/osc/osc_page.c       |   20 --
 14 files changed, 312 insertions(+), 340 deletions(-)
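
As a reading aid, the layering described in the commit message can be
sketched in plain user-space C. This is a simplified model under stated
assumptions, not the kernel code: there is no lu_env, the file and the
stripe object share one index space (the real LOV layer converts between
stripe-level and file-level offsets), and ra_sketch, fake_lock,
lower_read_ahead(), upper_read_ahead(), walk_window() and
stripe_chunk_end() are hypothetical names. The lower layer pins a lock
and reports its extent, the upper layer clamps that extent to the stripe
boundary, and the caller asks again only once the page index passes
cra_end.

```c
#include <assert.h>
#include <string.h>

typedef unsigned long pgoff_t;

/* Simplified stand-in for struct cl_read_ahead. */
struct ra_sketch {
	pgoff_t cra_end;			/* last usable page index, inclusive */
	void (*cra_release)(void *cbdata);	/* drops whatever was pinned */
	void *cra_cbdata;
};

static void ra_sketch_release(struct ra_sketch *ra)
{
	if (ra->cra_release)
		ra->cra_release(ra->cra_cbdata);
	memset(ra, 0, sizeof(*ra));
}

/* Last page index of the stripe-sized chunk that contains 'start'. */
static pgoff_t stripe_chunk_end(pgoff_t start, unsigned int pps)
{
	assert(pps > 0);
	return start + pps - start % pps - 1;
}

/* Hypothetical lock model: one lock covering pages [0, end]. */
struct fake_lock {
	pgoff_t end;
	int refs;
};

static void fake_lock_put(void *cbdata)
{
	((struct fake_lock *)cbdata)->refs--;
}

/* OSC-like layer: pin the covering lock and report its extent. */
static int lower_read_ahead(struct fake_lock *lk, pgoff_t start,
			    struct ra_sketch *ra)
{
	if (start > lk->end)
		return -1;	/* no covering lock */
	lk->refs++;
	ra->cra_end = lk->end;
	ra->cra_release = fake_lock_put;
	ra->cra_cbdata = lk;
	return 0;
}

/* LOV-like layer: call down, then never exceed the end of the stripe. */
static int upper_read_ahead(struct fake_lock *lk, pgoff_t start,
			    unsigned int pps, struct ra_sketch *ra)
{
	pgoff_t chunk_end;
	int rc = lower_read_ahead(lk, start, ra);

	if (rc)
		return rc;
	chunk_end = stripe_chunk_end(start, pps);
	if (ra->cra_end > chunk_end)
		ra->cra_end = chunk_end;
	return 0;
}

/* Walk [first, last]; returns how many times the layers were queried. */
static int walk_window(struct fake_lock *lk, pgoff_t first, pgoff_t last,
		       unsigned int pps)
{
	struct ra_sketch ra = { 0 };
	int queries = 0;
	pgoff_t idx;

	for (idx = first; idx <= last; idx++) {
		if (!ra.cra_end || ra.cra_end < idx) {
			ra_sketch_release(&ra);
			if (upper_read_ahead(lk, idx, pps, &ra) < 0)
				break;
			queries++;
		}
		/* page 'idx' would be added to the read ahead queue here */
	}
	ra_sketch_release(&ra);
	return queries;
}
```

With a lock covering pages 0..1000 and 256 pages per stripe, a walk over
0..1023 queries the layers once per stripe chunk and stops at page 1001,
where no lock covers the index. Every reference taken along the way is
dropped through the release callback, which is the property the
cra_release/cra_cbdata pair is there to provide.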

diff --git a/drivers/staging/lustre/lustre/include/cl_object.h b/drivers/staging/lustre/lustre/include/cl_object.h
index 89292c9..bf93c1e 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -884,26 +884,6 @@ struct cl_page_operations {
 	/** Destructor. Frees resources and slice itself. */
 	void (*cpo_fini)(const struct lu_env *env,
 			 struct cl_page_slice *slice);
-
-	/**
-	 * Checks whether the page is protected by a cl_lock. This is a
-	 * per-layer method, because certain layers have ways to check for the
-	 * lock much more efficiently than through the generic locks scan, or
-	 * implement locking mechanisms separate from cl_lock, e.g.,
-	 * LL_FILE_GROUP_LOCKED in vvp. If \a pending is true, check for locks
-	 * being canceled, or scheduled for cancellation as soon as the last
-	 * user goes away, too.
-	 *
-	 * \retval    -EBUSY: page is protected by a lock of a given mode;
-	 * \retval  -ENODATA: page is not protected by a lock;
-	 * \retval	 0: this layer cannot decide.
-	 *
-	 * \see cl_page_is_under_lock()
-	 */
-	int (*cpo_is_under_lock)(const struct lu_env *env,
-				 const struct cl_page_slice *slice,
-				 struct cl_io *io, pgoff_t *max);
-
 	/**
 	 * Optional debugging helper. Prints given page slice.
 	 *
@@ -1365,7 +1345,6 @@ struct cl_2queue {
  *     (3) sort all locks to avoid dead-locks, and acquire them
  *
  *     (4) process the chunk: call per-page methods
- *	 (cl_io_operations::cio_read_page() for read,
  *	 cl_io_operations::cio_prepare_write(),
  *	 cl_io_operations::cio_commit_write() for write)
  *
@@ -1467,6 +1446,31 @@ struct cl_io_slice {
 
 typedef void (*cl_commit_cbt)(const struct lu_env *, struct cl_io *,
 			      struct cl_page *);
+
+struct cl_read_ahead {
+	/*
+	 * Maximum page index at which the readahead window ends.
+	 * This is determined by DLM lock coverage, RPC and stripe boundary.
+	 * cra_end is inclusive.
+	 */
+	pgoff_t cra_end;
+	/*
+	 * Release routine. If readahead holds resources underneath, this
+	 * function should be called to release it.
+	 */
+	void (*cra_release)(const struct lu_env *env, void *cbdata);
+	/* Callback data for cra_release routine */
+	void *cra_cbdata;
+};
+
+static inline void cl_read_ahead_release(const struct lu_env *env,
+					 struct cl_read_ahead *ra)
+{
+	if (ra->cra_release)
+		ra->cra_release(env, ra->cra_cbdata);
+	memset(ra, 0, sizeof(*ra));
+}
+
 /**
  * Per-layer io operations.
  * \see vvp_io_ops, lov_io_ops, lovsub_io_ops, osc_io_ops
@@ -1573,16 +1577,13 @@ struct cl_io_operations {
 				 struct cl_page_list *queue, int from, int to,
 				 cl_commit_cbt cb);
 	/**
-	 * Read missing page.
-	 *
-	 * Called by a top-level cl_io_operations::op[CIT_READ]::cio_start()
-	 * method, when it hits not-up-to-date page in the range. Optional.
+	 * Decide maximum read ahead extent
 	 *
 	 * \pre io->ci_type == CIT_READ
 	 */
-	int (*cio_read_page)(const struct lu_env *env,
-			     const struct cl_io_slice *slice,
-			     const struct cl_page_slice *page);
+	int (*cio_read_ahead)(const struct lu_env *env,
+			      const struct cl_io_slice *slice,
+			      pgoff_t start, struct cl_read_ahead *ra);
 	/**
 	 * Optional debugging helper. Print given io slice.
 	 */
@@ -2302,8 +2303,6 @@ void cl_page_discard(const struct lu_env *env, struct cl_io *io,
 void cl_page_delete(const struct lu_env *env, struct cl_page *pg);
 int cl_page_is_vmlocked(const struct lu_env *env, const struct cl_page *pg);
 void cl_page_export(const struct lu_env *env, struct cl_page *pg, int uptodate);
-int cl_page_is_under_lock(const struct lu_env *env, struct cl_io *io,
-			  struct cl_page *page, pgoff_t *max_index);
 loff_t cl_offset(const struct cl_object *obj, pgoff_t idx);
 pgoff_t cl_index(const struct cl_object *obj, loff_t offset);
 size_t cl_page_size(const struct cl_object *obj);
@@ -2414,8 +2413,6 @@ int cl_io_lock_add(const struct lu_env *env, struct cl_io *io,
 		   struct cl_io_lock_link *link);
 int cl_io_lock_alloc_add(const struct lu_env *env, struct cl_io *io,
 			 struct cl_lock_descr *descr);
-int cl_io_read_page(const struct lu_env *env, struct cl_io *io,
-		    struct cl_page *page);
 int cl_io_submit_rw(const struct lu_env *env, struct cl_io *io,
 		    enum cl_req_type iot, struct cl_2queue *queue);
 int cl_io_submit_sync(const struct lu_env *env, struct cl_io *io,
@@ -2424,6 +2421,8 @@ int cl_io_submit_sync(const struct lu_env *env, struct cl_io *io,
 int cl_io_commit_async(const struct lu_env *env, struct cl_io *io,
 		       struct cl_page_list *queue, int from, int to,
 		       cl_commit_cbt cb);
+int cl_io_read_ahead(const struct lu_env *env, struct cl_io *io,
+		     pgoff_t start, struct cl_read_ahead *ra);
 int cl_io_is_going(const struct lu_env *env);
 
 /**
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index cd89926..b06cd3c 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -722,9 +722,7 @@ int ll_writepage(struct page *page, struct writeback_control *wbc);
 int ll_writepages(struct address_space *, struct writeback_control *wbc);
 int ll_readpage(struct file *file, struct page *page);
 void ll_readahead_init(struct inode *inode, struct ll_readahead_state *ras);
-int ll_readahead(const struct lu_env *env, struct cl_io *io,
-		 struct cl_page_list *queue, struct ll_readahead_state *ras,
-		 bool hit);
+int vvp_io_write_commit(const struct lu_env *env, struct cl_io *io);
 struct ll_cl_context *ll_cl_find(struct file *file);
 void ll_cl_add(struct file *file, const struct lu_env *env, struct cl_io *io);
 void ll_cl_remove(struct file *file, const struct lu_env *env);
@@ -1020,9 +1018,6 @@ int cl_sb_init(struct super_block *sb);
 int cl_sb_fini(struct super_block *sb);
 void ll_io_init(struct cl_io *io, const struct file *file, int write);
 
-void ras_update(struct ll_sb_info *sbi, struct inode *inode,
-		struct ll_readahead_state *ras, unsigned long index,
-		unsigned hit);
 void ll_ra_count_put(struct ll_sb_info *sbi, unsigned long len);
 void ll_ra_stats_inc(struct inode *inode, enum ra_stat which);
 
diff --git a/drivers/staging/lustre/lustre/llite/rw.c b/drivers/staging/lustre/lustre/llite/rw.c
index 50c0152..80cb8e0 100644
--- a/drivers/staging/lustre/lustre/llite/rw.c
+++ b/drivers/staging/lustre/lustre/llite/rw.c
@@ -180,90 +180,73 @@ void ll_ras_enter(struct file *f)
 	spin_unlock(&ras->ras_lock);
 }
 
-static int cl_read_ahead_page(const struct lu_env *env, struct cl_io *io,
-			      struct cl_page_list *queue, struct cl_page *page,
-			      struct cl_object *clob, pgoff_t *max_index)
+/**
+ * Initiates read-ahead of a page with given index.
+ *
+ * \retval +ve:	page was already uptodate, so it was not added to
+ *		\a queue;
+ * \retval -ve:	page wasn't added to \a queue because of an error;
+ * \retval   0:	page was added to \a queue for read ahead.
+ */
+static int ll_read_ahead_page(const struct lu_env *env, struct cl_io *io,
+			      struct cl_page_list *queue, pgoff_t index)
 {
-	struct page *vmpage = page->cp_vmpage;
+	enum ra_stat which = _NR_RA_STAT; /* keep gcc happy */
+	struct cl_object *clob = io->ci_obj;
+	struct inode *inode = vvp_object_inode(clob);
+	const char *msg = NULL;
+	struct cl_page *page;
 	struct vvp_page *vpg;
-	int	      rc;
+	struct page *vmpage;
+	int rc = 0;
+
+	vmpage = grab_cache_page_nowait(inode->i_mapping, index);
+	if (!vmpage) {
+		which = RA_STAT_FAILED_GRAB_PAGE;
+		msg = "g_c_p_n failed";
+		rc = -EBUSY;
+		goto out;
+	}
+
+	/* Check if vmpage was truncated or reclaimed */
+	if (vmpage->mapping != inode->i_mapping) {
+		which = RA_STAT_WRONG_GRAB_PAGE;
+		msg = "g_c_p_n returned invalid page";
+		rc = -EBUSY;
+		goto out;
+	}
+
+	page = cl_page_find(env, clob, vmpage->index, vmpage, CPT_CACHEABLE);
+	if (IS_ERR(page)) {
+		which = RA_STAT_FAILED_GRAB_PAGE;
+		msg = "cl_page_find failed";
+		rc = PTR_ERR(page);
+		goto out;
+	}
 
-	rc = 0;
-	cl_page_assume(env, io, page);
 	lu_ref_add(&page->cp_reference, "ra", current);
+	cl_page_assume(env, io, page);
 	vpg = cl2vvp_page(cl_object_page_slice(clob, page));
 	if (!vpg->vpg_defer_uptodate && !PageUptodate(vmpage)) {
-		CDEBUG(D_READA, "page index %lu, max_index: %lu\n",
-		       vvp_index(vpg), *max_index);
-		if (*max_index == 0 || vvp_index(vpg) > *max_index)
-			rc = cl_page_is_under_lock(env, io, page, max_index);
-		if (rc == 0) {
-			vpg->vpg_defer_uptodate = 1;
-			vpg->vpg_ra_used = 0;
-			cl_page_list_add(queue, page);
-			rc = 1;
-		} else {
-			cl_page_discard(env, io, page);
-			rc = -ENOLCK;
-		}
+		vpg->vpg_defer_uptodate = 1;
+		vpg->vpg_ra_used = 0;
+		cl_page_list_add(queue, page);
 	} else {
 		/* skip completed pages */
 		cl_page_unassume(env, io, page);
+		/* This page is already uptodate, returning a positive number
+		 * to tell the callers about this
+		 */
+		rc = 1;
 	}
+
 	lu_ref_del(&page->cp_reference, "ra", current);
 	cl_page_put(env, page);
-	return rc;
-}
-
-/**
- * Initiates read-ahead of a page with given index.
- *
- * \retval     +ve: page was added to \a queue.
- *
- * \retval -ENOLCK: there is no extent lock for this part of a file, stop
- *		  read-ahead.
- *
- * \retval  -ve, 0: page wasn't added to \a queue for other reason.
- */
-static int ll_read_ahead_page(const struct lu_env *env, struct cl_io *io,
-			      struct cl_page_list *queue,
-			      pgoff_t index, pgoff_t *max_index)
-{
-	struct cl_object *clob  = io->ci_obj;
-	struct inode     *inode = vvp_object_inode(clob);
-	struct page      *vmpage;
-	struct cl_page   *page;
-	enum ra_stat      which = _NR_RA_STAT; /* keep gcc happy */
-	int	       rc    = 0;
-	const char       *msg   = NULL;
-
-	vmpage = grab_cache_page_nowait(inode->i_mapping, index);
+out:
 	if (vmpage) {
-		/* Check if vmpage was truncated or reclaimed */
-		if (vmpage->mapping == inode->i_mapping) {
-			page = cl_page_find(env, clob, vmpage->index,
-					    vmpage, CPT_CACHEABLE);
-			if (!IS_ERR(page)) {
-				rc = cl_read_ahead_page(env, io, queue,
-							page, clob, max_index);
-				if (rc == -ENOLCK) {
-					which = RA_STAT_FAILED_MATCH;
-					msg   = "lock match failed";
-				}
-			} else {
-				which = RA_STAT_FAILED_GRAB_PAGE;
-				msg   = "cl_page_find failed";
-			}
-		} else {
-			which = RA_STAT_WRONG_GRAB_PAGE;
-			msg   = "g_c_p_n returned invalid page";
-		}
-		if (rc != 1)
+		if (rc)
 			unlock_page(vmpage);
 		put_page(vmpage);
-	} else {
-		which = RA_STAT_FAILED_GRAB_PAGE;
-		msg   = "g_c_p_n failed";
 	}
 	if (msg) {
 		ll_ra_stats_inc(inode, which);
@@ -378,12 +361,12 @@ static int ll_read_ahead_pages(const struct lu_env *env,
 			       struct cl_io *io, struct cl_page_list *queue,
 			       struct ra_io_arg *ria,
 			       unsigned long *reserved_pages,
-			       unsigned long *ra_end)
+			       pgoff_t *ra_end)
 {
+	struct cl_read_ahead ra = { 0 };
 	int rc, count = 0;
 	bool stride_ria;
 	pgoff_t page_idx;
-	pgoff_t max_index = 0;
 
 	LASSERT(ria);
 	RIA_DEBUG(ria);
@@ -392,14 +375,23 @@ static int ll_read_ahead_pages(const struct lu_env *env,
 	for (page_idx = ria->ria_start;
 	     page_idx <= ria->ria_end && *reserved_pages > 0; page_idx++) {
 		if (ras_inside_ra_window(page_idx, ria)) {
+			if (!ra.cra_end || ra.cra_end < page_idx) {
+				cl_read_ahead_release(env, &ra);
+
+				rc = cl_io_read_ahead(env, io, page_idx, &ra);
+				if (rc < 0)
+					break;
+
+				LASSERTF(ra.cra_end >= page_idx,
+					 "object: %p, indices %lu / %lu\n",
+					 io->ci_obj, ra.cra_end, page_idx);
+			}
+
 			/* If the page is inside the read-ahead window*/
-			rc = ll_read_ahead_page(env, io, queue,
-						page_idx, &max_index);
-			if (rc == 1) {
+			rc = ll_read_ahead_page(env, io, queue, page_idx);
+			if (!rc) {
 				(*reserved_pages)--;
 				count++;
-			} else if (rc == -ENOLCK) {
-				break;
 			}
 		} else if (stride_ria) {
 			/* If it is not in the read-ahead window, and it is
@@ -425,19 +417,21 @@ static int ll_read_ahead_pages(const struct lu_env *env,
 			}
 		}
 	}
+	cl_read_ahead_release(env, &ra);
+
 	*ra_end = page_idx;
 	return count;
 }
 
-int ll_readahead(const struct lu_env *env, struct cl_io *io,
-		 struct cl_page_list *queue, struct ll_readahead_state *ras,
-		 bool hit)
+static int ll_readahead(const struct lu_env *env, struct cl_io *io,
+			struct cl_page_list *queue,
+			struct ll_readahead_state *ras, bool hit)
 {
 	struct vvp_io *vio = vvp_env_io(env);
 	struct ll_thread_info *lti = ll_env_info(env);
 	struct cl_attr *attr = vvp_env_thread_attr(env);
-	unsigned long start = 0, end = 0, reserved;
-	unsigned long ra_end, len, mlen = 0;
+	unsigned long len, mlen = 0, reserved;
+	pgoff_t ra_end, start = 0, end = 0;
 	struct inode *inode;
 	struct ra_io_arg *ria = &lti->lti_ria;
 	struct cl_object *clob;
@@ -575,8 +569,8 @@ int ll_readahead(const struct lu_env *env, struct cl_io *io,
 	 * if the region we failed to issue read-ahead on is still ahead
 	 * of the app and behind the next index to start read-ahead from
 	 */
-	CDEBUG(D_READA, "ra_end %lu end %lu stride end %lu\n",
-	       ra_end, end, ria->ria_end);
+	CDEBUG(D_READA, "ra_end = %lu end = %lu stride end = %lu pages = %d\n",
+	       ra_end, end, ria->ria_end, ret);
 
 	if (ra_end != end + 1) {
 		ll_ra_stats_inc(inode, RA_STAT_FAILED_REACH_END);
@@ -737,9 +731,9 @@ static void ras_increase_window(struct inode *inode,
 					  ra->ra_max_pages_per_file);
 }
 
-void ras_update(struct ll_sb_info *sbi, struct inode *inode,
-		struct ll_readahead_state *ras, unsigned long index,
-		unsigned hit)
+static void ras_update(struct ll_sb_info *sbi, struct inode *inode,
+		       struct ll_readahead_state *ras, unsigned long index,
+		       unsigned int hit)
 {
 	struct ll_ra_info *ra = &sbi->ll_ra_info;
 	int zero = 0, stride_detect = 0, ra_miss = 0;
@@ -1087,6 +1081,56 @@ void ll_cl_remove(struct file *file, const struct lu_env *env)
 	write_unlock(&fd->fd_lock);
 }
 
+static int ll_io_read_page(const struct lu_env *env, struct cl_io *io,
+			   struct cl_page *page)
+{
+	struct inode *inode = vvp_object_inode(page->cp_obj);
+	struct ll_file_data *fd = vvp_env_io(env)->vui_fd;
+	struct ll_readahead_state *ras = &fd->fd_ras;
+	struct cl_2queue *queue  = &io->ci_queue;
+	struct ll_sb_info *sbi = ll_i2sbi(inode);
+	struct vvp_page *vpg;
+	int rc = 0;
+
+	vpg = cl2vvp_page(cl_object_page_slice(page->cp_obj, page));
+	if (sbi->ll_ra_info.ra_max_pages_per_file > 0 &&
+	    sbi->ll_ra_info.ra_max_pages > 0)
+		ras_update(sbi, inode, ras, vvp_index(vpg),
+			   vpg->vpg_defer_uptodate);
+
+	if (vpg->vpg_defer_uptodate) {
+		vpg->vpg_ra_used = 1;
+		cl_page_export(env, page, 1);
+	}
+
+	cl_2queue_init(queue);
+	/*
+	 * Add page into the queue even when it is marked uptodate above.
+	 * this will unlock it automatically as part of cl_page_list_disown().
+	 */
+	cl_page_list_add(&queue->c2_qin, page);
+	if (sbi->ll_ra_info.ra_max_pages_per_file > 0 &&
+	    sbi->ll_ra_info.ra_max_pages > 0) {
+		int rc2;
+
+		rc2 = ll_readahead(env, io, &queue->c2_qin, ras,
+				   vpg->vpg_defer_uptodate);
+		CDEBUG(D_READA, DFID "%d pages read ahead at %lu\n",
+		       PFID(ll_inode2fid(inode)), rc2, vvp_index(vpg));
+	}
+
+	if (queue->c2_qin.pl_nr > 0)
+		rc = cl_io_submit_rw(env, io, CRT_READ, queue);
+
+	/*
+	 * Unlock unsent pages in case of error.
+	 */
+	cl_page_list_disown(env, io, &queue->c2_qin);
+	cl_2queue_fini(env, queue);
+
+	return rc;
+}
+
 int ll_readpage(struct file *file, struct page *vmpage)
 {
 	struct cl_object *clob = ll_i2info(file_inode(file))->lli_clob;
@@ -1110,7 +1154,7 @@ int ll_readpage(struct file *file, struct page *vmpage)
 		LASSERT(page->cp_type == CPT_CACHEABLE);
 		if (likely(!PageUptodate(vmpage))) {
 			cl_page_assume(env, io, page);
-			result = cl_io_read_page(env, io, page);
+			result = ll_io_read_page(env, io, page);
 		} else {
 			/* Page from a non-object file. */
 			unlock_page(vmpage);
diff --git a/drivers/staging/lustre/lustre/llite/vvp_io.c b/drivers/staging/lustre/lustre/llite/vvp_io.c
index dbc4c26..8187fa3 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_io.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_io.c
@@ -1228,40 +1228,23 @@ static int vvp_io_fsync_start(const struct lu_env *env,
 	return 0;
 }
 
-static int vvp_io_read_page(const struct lu_env *env,
-			    const struct cl_io_slice *ios,
-			    const struct cl_page_slice *slice)
+static int vvp_io_read_ahead(const struct lu_env *env,
+			     const struct cl_io_slice *ios,
+			     pgoff_t start, struct cl_read_ahead *ra)
 {
-	struct cl_io	      *io     = ios->cis_io;
-	struct vvp_page           *vpg    = cl2vvp_page(slice);
-	struct cl_page	    *page   = slice->cpl_page;
-	struct inode              *inode  = vvp_object_inode(slice->cpl_obj);
-	struct ll_sb_info	 *sbi    = ll_i2sbi(inode);
-	struct ll_file_data       *fd     = cl2vvp_io(env, ios)->vui_fd;
-	struct ll_readahead_state *ras    = &fd->fd_ras;
-	struct cl_2queue	  *queue  = &io->ci_queue;
-
-	if (sbi->ll_ra_info.ra_max_pages_per_file &&
-	    sbi->ll_ra_info.ra_max_pages)
-		ras_update(sbi, inode, ras, vvp_index(vpg),
-			   vpg->vpg_defer_uptodate);
-
-	if (vpg->vpg_defer_uptodate) {
-		vpg->vpg_ra_used = 1;
-		cl_page_export(env, page, 1);
-	}
-	/*
-	 * Add page into the queue even when it is marked uptodate above.
-	 * this will unlock it automatically as part of cl_page_list_disown().
-	 */
+	int result = 0;
 
-	cl_page_list_add(&queue->c2_qin, page);
-	if (sbi->ll_ra_info.ra_max_pages_per_file &&
-	    sbi->ll_ra_info.ra_max_pages)
-		ll_readahead(env, io, &queue->c2_qin, ras,
-			     vpg->vpg_defer_uptodate);
+	if (ios->cis_io->ci_type == CIT_READ ||
+	    ios->cis_io->ci_type == CIT_FAULT) {
+		struct vvp_io *vio = cl2vvp_io(env, ios);
 
-	return 0;
+		if (unlikely(vio->vui_fd->fd_flags & LL_FILE_GROUP_LOCKED)) {
+			ra->cra_end = CL_PAGE_EOF;
+			result = 1; /* no need to call down */
+		}
+	}
+
+	return result;
 }
 
 static void vvp_io_end(const struct lu_env *env, const struct cl_io_slice *ios)
@@ -1308,7 +1291,7 @@ static const struct cl_io_operations vvp_io_ops = {
 			.cio_fini   = vvp_io_fini
 		}
 	},
-	.cio_read_page     = vvp_io_read_page,
+	.cio_read_ahead	= vvp_io_read_ahead,
 };
 
 int vvp_io_init(const struct lu_env *env, struct cl_object *obj,
diff --git a/drivers/staging/lustre/lustre/llite/vvp_page.c b/drivers/staging/lustre/lustre/llite/vvp_page.c
index 68f8990..75cec23 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_page.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_page.c
@@ -342,20 +342,6 @@ static int vvp_page_make_ready(const struct lu_env *env,
 	return result;
 }
 
-static int vvp_page_is_under_lock(const struct lu_env *env,
-				  const struct cl_page_slice *slice,
-				  struct cl_io *io, pgoff_t *max_index)
-{
-	if (io->ci_type == CIT_READ || io->ci_type == CIT_WRITE ||
-	    io->ci_type == CIT_FAULT) {
-		struct vvp_io *vio = vvp_env_io(env);
-
-		if (unlikely(vio->vui_fd->fd_flags & LL_FILE_GROUP_LOCKED))
-			*max_index = CL_PAGE_EOF;
-	}
-	return 0;
-}
-
 static int vvp_page_print(const struct lu_env *env,
 			  const struct cl_page_slice *slice,
 			  void *cookie, lu_printer_t printer)
@@ -400,7 +386,6 @@ static const struct cl_page_operations vvp_page_ops = {
 	.cpo_is_vmlocked   = vvp_page_is_vmlocked,
 	.cpo_fini	  = vvp_page_fini,
 	.cpo_print	 = vvp_page_print,
-	.cpo_is_under_lock = vvp_page_is_under_lock,
 	.io = {
 		[CRT_READ] = {
 			.cpo_prep	= vvp_page_prep_read,
@@ -499,7 +484,6 @@ static const struct cl_page_operations vvp_transient_page_ops = {
 	.cpo_fini	  = vvp_transient_page_fini,
 	.cpo_is_vmlocked   = vvp_transient_page_is_vmlocked,
 	.cpo_print	 = vvp_page_print,
-	.cpo_is_under_lock	= vvp_page_is_under_lock,
 	.io = {
 		[CRT_READ] = {
 			.cpo_prep	= vvp_transient_page_prep,
diff --git a/drivers/staging/lustre/lustre/lov/lov_io.c b/drivers/staging/lustre/lustre/lov/lov_io.c
index d101579..e75e5d2 100644
--- a/drivers/staging/lustre/lustre/lov/lov_io.c
+++ b/drivers/staging/lustre/lustre/lov/lov_io.c
@@ -555,6 +555,63 @@ static void lov_io_unlock(const struct lu_env *env,
 	LASSERT(rc == 0);
 }
 
+static int lov_io_read_ahead(const struct lu_env *env,
+			     const struct cl_io_slice *ios,
+			     pgoff_t start, struct cl_read_ahead *ra)
+{
+	struct lov_io *lio = cl2lov_io(env, ios);
+	struct lov_object *loo = lio->lis_object;
+	struct cl_object *obj = lov2cl(loo);
+	struct lov_layout_raid0 *r0 = lov_r0(loo);
+	unsigned int pps; /* pages per stripe */
+	struct lov_io_sub *sub;
+	pgoff_t ra_end;
+	loff_t suboff;
+	int stripe;
+	int rc;
+
+	stripe = lov_stripe_number(loo->lo_lsm, cl_offset(obj, start));
+	if (unlikely(!r0->lo_sub[stripe]))
+		return -EIO;
+
+	sub = lov_sub_get(env, lio, stripe);
+
+	lov_stripe_offset(loo->lo_lsm, cl_offset(obj, start), stripe, &suboff);
+	rc = cl_io_read_ahead(sub->sub_env, sub->sub_io,
+			      cl_index(lovsub2cl(r0->lo_sub[stripe]), suboff),
+			      ra);
+	lov_sub_put(sub);
+
+	CDEBUG(D_READA, DFID " cra_end = %lu, stripes = %d, rc = %d\n",
+	       PFID(lu_object_fid(lov2lu(loo))), ra->cra_end, r0->lo_nr, rc);
+	if (rc)
+		return rc;
+
+	/**
+	 * Adjust the stripe index by layout of raid0. ra->cra_end is
+	 * the maximum page index covered by an underlying DLM lock.
+	 * This function converts cra_end from stripe level to file
+	 * level, and makes sure it's not beyond the stripe boundary.
+	 */
+	if (r0->lo_nr == 1)	/* single stripe file */
+		return 0;
+
+	/* cra_end is stripe level, convert it into file level */
+	ra_end = ra->cra_end;
+	if (ra_end != CL_PAGE_EOF)
+		ra_end = lov_stripe_pgoff(loo->lo_lsm, ra_end, stripe);
+
+	pps = loo->lo_lsm->lsm_stripe_size >> PAGE_SHIFT;
+
+	CDEBUG(D_READA, DFID " max_index = %lu, pps = %u, stripe_size = %u, stripe no = %u, start index = %lu\n",
+	       PFID(lu_object_fid(lov2lu(loo))), ra_end, pps,
+	       loo->lo_lsm->lsm_stripe_size, stripe, start);
+
+	/* never exceed the end of the stripe */
+	ra->cra_end = min_t(pgoff_t, ra_end, start + pps - start % pps - 1);
+	return 0;
+}
+
 /**
  * lov implementation of cl_operations::cio_submit() method. It takes a list
  * of pages in \a queue, splits it into per-stripe sub-lists, invokes
@@ -801,6 +858,7 @@ static const struct cl_io_operations lov_io_ops = {
 			.cio_fini   = lov_io_fini
 		}
 	},
+	.cio_read_ahead			= lov_io_read_ahead,
 	.cio_submit                    = lov_io_submit,
 	.cio_commit_async              = lov_io_commit_async,
 };
diff --git a/drivers/staging/lustre/lustre/lov/lov_page.c b/drivers/staging/lustre/lustre/lov/lov_page.c
index 00bfaba..62ceb6d 100644
--- a/drivers/staging/lustre/lustre/lov/lov_page.c
+++ b/drivers/staging/lustre/lustre/lov/lov_page.c
@@ -49,51 +49,6 @@
  *
  */
 
-/**
- * Adjust the stripe index by layout of raid0. @max_index is the maximum
- * page index covered by an underlying DLM lock.
- * This function converts max_index from stripe level to file level, and make
- * sure it's not beyond one stripe.
- */
-static int lov_raid0_page_is_under_lock(const struct lu_env *env,
-					const struct cl_page_slice *slice,
-					struct cl_io *unused,
-					pgoff_t *max_index)
-{
-	struct lov_object *loo = cl2lov(slice->cpl_obj);
-	struct lov_layout_raid0 *r0 = lov_r0(loo);
-	pgoff_t index = *max_index;
-	unsigned int pps; /* pages per stripe */
-
-	CDEBUG(D_READA, DFID "*max_index = %lu, nr = %d\n",
-	       PFID(lu_object_fid(lov2lu(loo))), index, r0->lo_nr);
-
-	if (index == 0) /* the page is not covered by any lock */
-		return 0;
-
-	if (r0->lo_nr == 1) /* single stripe file */
-		return 0;
-
-	/* max_index is stripe level, convert it into file level */
-	if (index != CL_PAGE_EOF) {
-		int stripeno = lov_page_stripe(slice->cpl_page);
-		*max_index = lov_stripe_pgoff(loo->lo_lsm, index, stripeno);
-	}
-
-	/* calculate the end of current stripe */
-	pps = loo->lo_lsm->lsm_stripe_size >> PAGE_SHIFT;
-	index = slice->cpl_index + pps - slice->cpl_index % pps - 1;
-
-	CDEBUG(D_READA, DFID "*max_index = %lu, index = %lu, pps = %u, stripe_size = %u, stripe no = %u, page index = %lu\n",
-	       PFID(lu_object_fid(lov2lu(loo))), *max_index, index, pps,
-	       loo->lo_lsm->lsm_stripe_size, lov_page_stripe(slice->cpl_page),
-	       slice->cpl_index);
-
-	/* never exceed the end of the stripe */
-	*max_index = min_t(pgoff_t, *max_index, index);
-	return 0;
-}
-
 static int lov_raid0_page_print(const struct lu_env *env,
 				const struct cl_page_slice *slice,
 				void *cookie, lu_printer_t printer)
@@ -104,7 +59,6 @@ static int lov_raid0_page_print(const struct lu_env *env,
 }
 
 static const struct cl_page_operations lov_raid0_page_ops = {
-	.cpo_is_under_lock = lov_raid0_page_is_under_lock,
 	.cpo_print  = lov_raid0_page_print
 };
 
diff --git a/drivers/staging/lustre/lustre/obdclass/cl_io.c b/drivers/staging/lustre/lustre/obdclass/cl_io.c
index bc4b7b6..577f76e 100644
--- a/drivers/staging/lustre/lustre/obdclass/cl_io.c
+++ b/drivers/staging/lustre/lustre/obdclass/cl_io.c
@@ -586,67 +586,32 @@ void cl_io_end(const struct lu_env *env, struct cl_io *io)
 }
 EXPORT_SYMBOL(cl_io_end);
 
-static const struct cl_page_slice *
-cl_io_slice_page(const struct cl_io_slice *ios, struct cl_page *page)
-{
-	const struct cl_page_slice *slice;
-
-	slice = cl_page_at(page, ios->cis_obj->co_lu.lo_dev->ld_type);
-	LINVRNT(slice);
-	return slice;
-}
-
 /**
- * Called by read io, when page has to be read from the server.
+ * Called by read io, to decide the readahead extent
  *
- * \see cl_io_operations::cio_read_page()
+ * \see cl_io_operations::cio_read_ahead()
  */
-int cl_io_read_page(const struct lu_env *env, struct cl_io *io,
-		    struct cl_page *page)
+int cl_io_read_ahead(const struct lu_env *env, struct cl_io *io,
+		     pgoff_t start, struct cl_read_ahead *ra)
 {
 	const struct cl_io_slice *scan;
-	struct cl_2queue	 *queue;
 	int		       result = 0;
 
 	LINVRNT(io->ci_type == CIT_READ || io->ci_type == CIT_FAULT);
-	LINVRNT(cl_page_is_owned(page, io));
 	LINVRNT(io->ci_state == CIS_IO_GOING || io->ci_state == CIS_LOCKED);
 	LINVRNT(cl_io_invariant(io));
 
-	queue = &io->ci_queue;
-
-	cl_2queue_init(queue);
-	/*
-	 * ->cio_read_page() methods called in the loop below are supposed to
-	 * never block waiting for network (the only subtle point is the
-	 * creation of new pages for read-ahead that might result in cache
-	 * shrinking, but currently only clean pages are shrunk and this
-	 * requires no network io).
-	 *
-	 * Should this ever starts blocking, retry loop would be needed for
-	 * "parallel io" (see CLO_REPEAT loops in cl_lock.c).
-	 */
 	cl_io_for_each(scan, io) {
-		if (scan->cis_iop->cio_read_page) {
-			const struct cl_page_slice *slice;
+		if (!scan->cis_iop->cio_read_ahead)
+			continue;
 
-			slice = cl_io_slice_page(scan, page);
-			LINVRNT(slice);
-			result = scan->cis_iop->cio_read_page(env, scan, slice);
-			if (result != 0)
-				break;
-		}
+		result = scan->cis_iop->cio_read_ahead(env, scan, start, ra);
+		if (result)
+			break;
 	}
-	if (result == 0 && queue->c2_qin.pl_nr > 0)
-		result = cl_io_submit_rw(env, io, CRT_READ, queue);
-	/*
-	 * Unlock unsent pages in case of error.
-	 */
-	cl_page_list_disown(env, io, &queue->c2_qin);
-	cl_2queue_fini(env, queue);
-	return result;
+	return result > 0 ? 0 : result;
 }
-EXPORT_SYMBOL(cl_io_read_page);
+EXPORT_SYMBOL(cl_io_read_ahead);
 
 /**
  * Commit a list of contiguous pages into writeback cache.
diff --git a/drivers/staging/lustre/lustre/obdclass/cl_page.c b/drivers/staging/lustre/lustre/obdclass/cl_page.c
index 63973ba..40b7bee 100644
--- a/drivers/staging/lustre/lustre/obdclass/cl_page.c
+++ b/drivers/staging/lustre/lustre/obdclass/cl_page.c
@@ -390,30 +390,6 @@ EXPORT_SYMBOL(cl_page_at);
 	__result;						       \
 })
 
-#define CL_PAGE_INVOKE_REVERSE(_env, _page, _op, _proto, ...)		\
-({									\
-	const struct lu_env        *__env  = (_env);			\
-	struct cl_page             *__page = (_page);			\
-	const struct cl_page_slice *__scan;				\
-	int                         __result;				\
-	ptrdiff_t                   __op   = (_op);			\
-	int                       (*__method)_proto;			\
-									\
-	__result = 0;							\
-	list_for_each_entry_reverse(__scan, &__page->cp_layers,		\
-					cpl_linkage) {			\
-		__method = *(void **)((char *)__scan->cpl_ops +  __op);	\
-		if (__method) {						\
-			__result = (*__method)(__env, __scan, ## __VA_ARGS__); \
-			if (__result != 0)				\
-				break;					\
-		}							\
-	}								\
-	if (__result > 0)						\
-		__result = 0;						\
-	__result;							\
-})
-
 #define CL_PAGE_INVOID(_env, _page, _op, _proto, ...)		   \
 do {								    \
 	const struct lu_env	*__env  = (_env);		    \
@@ -927,29 +903,6 @@ int cl_page_flush(const struct lu_env *env, struct cl_io *io,
 EXPORT_SYMBOL(cl_page_flush);
 
 /**
- * Checks whether page is protected by any extent lock is at least required
- * mode.
- *
- * \return the same as in cl_page_operations::cpo_is_under_lock() method.
- * \see cl_page_operations::cpo_is_under_lock()
- */
-int cl_page_is_under_lock(const struct lu_env *env, struct cl_io *io,
-			  struct cl_page *page, pgoff_t *max_index)
-{
-	int rc;
-
-	PINVRNT(env, page, cl_page_invariant(page));
-
-	rc = CL_PAGE_INVOKE_REVERSE(env, page, CL_PAGE_OP(cpo_is_under_lock),
-				    (const struct lu_env *,
-				     const struct cl_page_slice *,
-				      struct cl_io *, pgoff_t *),
-				    io, max_index);
-	return rc;
-}
-EXPORT_SYMBOL(cl_page_is_under_lock);
-
-/**
  * Tells transfer engine that only part of a page is to be transmitted.
  *
  * \see cl_page_operations::cpo_clip()
diff --git a/drivers/staging/lustre/lustre/osc/osc_cache.c b/drivers/staging/lustre/lustre/osc/osc_cache.c
index 4bbe219..b645957 100644
--- a/drivers/staging/lustre/lustre/osc/osc_cache.c
+++ b/drivers/staging/lustre/lustre/osc/osc_cache.c
@@ -3158,7 +3158,8 @@ static int check_and_discard_cb(const struct lu_env *env, struct cl_io *io,
 		struct cl_page *page = ops->ops_cl.cpl_page;
 
 		/* refresh non-overlapped index */
-		tmp = osc_dlmlock_at_pgoff(env, osc, index, 0, 0);
+		tmp = osc_dlmlock_at_pgoff(env, osc, index,
+					   OSC_DAP_FL_TEST_LOCK);
 		if (tmp) {
 			__u64 end = tmp->l_policy_data.l_extent.end;
 			/* Cache the first-non-overlapped index so as to skip
diff --git a/drivers/staging/lustre/lustre/osc/osc_internal.h b/drivers/staging/lustre/lustre/osc/osc_internal.h
index 67fe0a2..9a61c9b 100644
--- a/drivers/staging/lustre/lustre/osc/osc_internal.h
+++ b/drivers/staging/lustre/lustre/osc/osc_internal.h
@@ -199,8 +199,23 @@ void osc_inc_unstable_pages(struct ptlrpc_request *req);
 void osc_dec_unstable_pages(struct ptlrpc_request *req);
 bool osc_over_unstable_soft_limit(struct client_obd *cli);
 
+/**
+ * Bit flags for osc_dlmlock_at_pgoff().
+ */
+enum osc_dap_flags {
+	/**
+	 * Just check if the desired lock exists; do not hold a reference
+	 * count on the lock.
+	 */
+	OSC_DAP_FL_TEST_LOCK	= BIT(0),
+	/**
+	 * Return the lock even if it is being canceled.
+	 */
+	OSC_DAP_FL_CANCELING	= BIT(1),
+};
+
 struct ldlm_lock *osc_dlmlock_at_pgoff(const struct lu_env *env,
 				       struct osc_object *obj, pgoff_t index,
-				       int pending, int canceling);
+				       enum osc_dap_flags flags);
 
 #endif /* OSC_INTERNAL_H */
diff --git a/drivers/staging/lustre/lustre/osc/osc_io.c b/drivers/staging/lustre/lustre/osc/osc_io.c
index 8a559cb..47c6371 100644
--- a/drivers/staging/lustre/lustre/osc/osc_io.c
+++ b/drivers/staging/lustre/lustre/osc/osc_io.c
@@ -88,6 +88,44 @@ static void osc_io_fini(const struct lu_env *env, const struct cl_io_slice *io)
 {
 }
 
+static void osc_read_ahead_release(const struct lu_env *env, void *cbdata)
+{
+	struct ldlm_lock *dlmlock = cbdata;
+	struct lustre_handle lockh;
+
+	ldlm_lock2handle(dlmlock, &lockh);
+	ldlm_lock_decref(&lockh, LCK_PR);
+	LDLM_LOCK_PUT(dlmlock);
+}
+
+static int osc_io_read_ahead(const struct lu_env *env,
+			     const struct cl_io_slice *ios,
+			     pgoff_t start, struct cl_read_ahead *ra)
+{
+	struct osc_object *osc = cl2osc(ios->cis_obj);
+	struct ldlm_lock *dlmlock;
+	int result = -ENODATA;
+
+	dlmlock = osc_dlmlock_at_pgoff(env, osc, start, 0);
+	if (dlmlock) {
+		if (dlmlock->l_req_mode != LCK_PR) {
+			struct lustre_handle lockh;
+
+			ldlm_lock2handle(dlmlock, &lockh);
+			ldlm_lock_addref(&lockh, LCK_PR);
+			ldlm_lock_decref(&lockh, dlmlock->l_req_mode);
+		}
+
+		ra->cra_end = cl_index(osc2cl(osc),
+				       dlmlock->l_policy_data.l_extent.end);
+		ra->cra_release = osc_read_ahead_release;
+		ra->cra_cbdata = dlmlock;
+		result = 0;
+	}
+
+	return result;
+}
+
 /**
  * An implementation of cl_io_operations::cio_io_submit() method for osc
  * layer. Iterates over pages in the in-queue, prepares each for io by calling
@@ -724,6 +762,7 @@ static const struct cl_io_operations osc_io_ops = {
 			.cio_fini   = osc_io_fini
 		}
 	},
+	.cio_read_ahead			= osc_io_read_ahead,
 	.cio_submit                 = osc_io_submit,
 	.cio_commit_async           = osc_io_commit_async
 };
@@ -798,7 +837,7 @@ static void osc_req_attr_set(const struct lu_env *env,
 				     struct cl_page, cp_flight);
 		opg = osc_cl_page_osc(apage, NULL);
 		lock = osc_dlmlock_at_pgoff(env, cl2osc(obj), osc_index(opg),
-					    1, 1);
+					    OSC_DAP_FL_TEST_LOCK | OSC_DAP_FL_CANCELING);
 		if (!lock && !opg->ops_srvlock) {
 			struct ldlm_resource *res;
 			struct ldlm_res_id *resname;
diff --git a/drivers/staging/lustre/lustre/osc/osc_lock.c b/drivers/staging/lustre/lustre/osc/osc_lock.c
index 39a8a58..a42cb98 100644
--- a/drivers/staging/lustre/lustre/osc/osc_lock.c
+++ b/drivers/staging/lustre/lustre/osc/osc_lock.c
@@ -1180,7 +1180,7 @@ int osc_lock_init(const struct lu_env *env,
  */
 struct ldlm_lock *osc_dlmlock_at_pgoff(const struct lu_env *env,
 				       struct osc_object *obj, pgoff_t index,
-				       int pending, int canceling)
+				       enum osc_dap_flags dap_flags)
 {
 	struct osc_thread_info *info = osc_env_info(env);
 	struct ldlm_res_id *resname = &info->oti_resname;
@@ -1194,9 +1194,10 @@ struct ldlm_lock *osc_dlmlock_at_pgoff(const struct lu_env *env,
 	osc_index2policy(policy, osc2cl(obj), index, index);
 	policy->l_extent.gid = LDLM_GID_ANY;
 
-	flags = LDLM_FL_BLOCK_GRANTED | LDLM_FL_TEST_LOCK;
-	if (pending)
-		flags |= LDLM_FL_CBPENDING;
+	flags = LDLM_FL_BLOCK_GRANTED | LDLM_FL_CBPENDING;
+	if (dap_flags & OSC_DAP_FL_TEST_LOCK)
+		flags |= LDLM_FL_TEST_LOCK;
+
 	/*
 	 * It is fine to match any group lock since there could be only one
 	 * with a uniq gid and it conflicts with all other lock modes too
@@ -1204,7 +1205,8 @@ struct ldlm_lock *osc_dlmlock_at_pgoff(const struct lu_env *env,
 again:
 	mode = ldlm_lock_match(osc_export(obj)->exp_obd->obd_namespace,
 			       flags, resname, LDLM_EXTENT, policy,
-			       LCK_PR | LCK_PW | LCK_GROUP, &lockh, canceling);
+			       LCK_PR | LCK_PW | LCK_GROUP, &lockh,
+			       dap_flags & OSC_DAP_FL_CANCELING);
 	if (mode != 0) {
 		lock = ldlm_handle2lock(&lockh);
 		/* RACE: the lock is cancelled so let's try again */
diff --git a/drivers/staging/lustre/lustre/osc/osc_page.c b/drivers/staging/lustre/lustre/osc/osc_page.c
index 2a7a70a..399d36b 100644
--- a/drivers/staging/lustre/lustre/osc/osc_page.c
+++ b/drivers/staging/lustre/lustre/osc/osc_page.c
@@ -117,25 +117,6 @@ void osc_index2policy(ldlm_policy_data_t *policy, const struct cl_object *obj,
 	policy->l_extent.end = cl_offset(obj, end + 1) - 1;
 }
 
-static int osc_page_is_under_lock(const struct lu_env *env,
-				  const struct cl_page_slice *slice,
-				  struct cl_io *unused, pgoff_t *max_index)
-{
-	struct osc_page *opg = cl2osc_page(slice);
-	struct ldlm_lock *dlmlock;
-	int result = -ENODATA;
-
-	dlmlock = osc_dlmlock_at_pgoff(env, cl2osc(slice->cpl_obj),
-				       osc_index(opg), 1, 0);
-	if (dlmlock) {
-		*max_index = cl_index(slice->cpl_obj,
-				      dlmlock->l_policy_data.l_extent.end);
-		LDLM_LOCK_PUT(dlmlock);
-		result = 0;
-	}
-	return result;
-}
-
 static const char *osc_list(struct list_head *head)
 {
 	return list_empty(head) ? "-" : "+";
@@ -276,7 +257,6 @@ static int osc_page_flush(const struct lu_env *env,
 static const struct cl_page_operations osc_page_ops = {
 	.cpo_print	 = osc_page_print,
 	.cpo_delete	= osc_page_delete,
-	.cpo_is_under_lock = osc_page_is_under_lock,
 	.cpo_clip	   = osc_page_clip,
 	.cpo_cancel	 = osc_page_cancel,
 	.cpo_flush	  = osc_page_flush
-- 
1.7.1
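
The flag handling that this patch introduces in osc_dlmlock_at_pgoff() can be sketched in isolation as follows. This is only an illustration, not code from the series: the SK_FL_* values are placeholder bits (not the real LDLM_FL_* constants), and sk_match_flags(), sk_match_canceling() and sk_self_check() are hypothetical helper names. The enum mirrors the one added to osc_internal.h, with BIT(n) spelled out.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bits for illustration only; NOT the real LDLM_FL_* values. */
#define SK_FL_BLOCK_GRANTED	(1ULL << 0)
#define SK_FL_CBPENDING		(1ULL << 1)
#define SK_FL_TEST_LOCK		(1ULL << 2)

/* Mirrors enum osc_dap_flags from the patch, with BIT(n) written out. */
enum osc_dap_flags {
	OSC_DAP_FL_TEST_LOCK	= 1 << 0,
	OSC_DAP_FL_CANCELING	= 1 << 1,
};

/* Match flags as the patched osc_dlmlock_at_pgoff() builds them. */
static inline uint64_t sk_match_flags(enum osc_dap_flags dap_flags)
{
	/* CBPENDING is now always set; TEST_LOCK only on request. */
	uint64_t flags = SK_FL_BLOCK_GRANTED | SK_FL_CBPENDING;

	if (dap_flags & OSC_DAP_FL_TEST_LOCK)
		flags |= SK_FL_TEST_LOCK;
	return flags;
}

/* Value forwarded as the last argument of ldlm_lock_match(). */
static inline int sk_match_canceling(enum osc_dap_flags dap_flags)
{
	return dap_flags & OSC_DAP_FL_CANCELING;
}

/* The three call-site argument values that appear in the diff. */
void sk_self_check(void)
{
	/* osc_io_read_ahead(): 0, so the match is not a test-only match */
	assert(!(sk_match_flags(0) & SK_FL_TEST_LOCK));
	/* check_and_discard_cb(): test only, no canceling locks */
	assert(sk_match_flags(OSC_DAP_FL_TEST_LOCK) & SK_FL_TEST_LOCK);
	assert(sk_match_canceling(OSC_DAP_FL_TEST_LOCK) == 0);
	/* osc_req_attr_set(): test only, canceling locks accepted */
	assert(sk_match_canceling(OSC_DAP_FL_TEST_LOCK |
				  OSC_DAP_FL_CANCELING) != 0);
}
```

Compared with the old (pending, canceling) integer pair, a call site now states its intent by name, and a caller that passes 0 (as osc_io_read_ahead() does) gets a match without the test-lock flag.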


* [PATCH 06/41] staging: lustre: ldlm: remove unnecessary EXPORT_SYMBOL
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, frank zago,
	James Simmons

From: frank zago <fzago@cray.com>

A lot of symbols don't need to be exported at all because they are
only used in the module they belong to.

Signed-off-by: frank zago <fzago@cray.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5829
Reviewed-on: http://review.whamcloud.com/13324
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/ldlm/ldlm_lock.c     |    3 ---
 drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c    |    2 --
 drivers/staging/lustre/lustre/ldlm/ldlm_pool.c     |    6 ------
 drivers/staging/lustre/lustre/ldlm/ldlm_request.c  |    3 ---
 drivers/staging/lustre/lustre/ldlm/ldlm_resource.c |    4 ----
 5 files changed, 0 insertions(+), 18 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
index 3c48b4f..ace8cb2 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
@@ -512,7 +512,6 @@ int ldlm_lock_change_resource(struct ldlm_namespace *ns, struct ldlm_lock *lock,
 
 	return 0;
 }
-EXPORT_SYMBOL(ldlm_lock_change_resource);
 
 /** \defgroup ldlm_handles LDLM HANDLES
  * Ways to get hold of locks without any addresses.
@@ -595,7 +594,6 @@ void ldlm_lock2desc(struct ldlm_lock *lock, struct ldlm_lock_desc *desc)
 				    &lock->l_policy_data,
 				    &desc->l_policy_data);
 }
-EXPORT_SYMBOL(ldlm_lock2desc);
 
 /**
  * Add a lock to list of conflicting locks to send AST to.
@@ -1147,7 +1145,6 @@ void ldlm_lock_fail_match_locked(struct ldlm_lock *lock)
 		wake_up_all(&lock->l_waitq);
 	}
 }
-EXPORT_SYMBOL(ldlm_lock_fail_match_locked);
 
 /**
  * Mark lock as "matchable" by OST.
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
index fde697e..c32b414 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
@@ -858,7 +858,6 @@ int ldlm_get_ref(void)
 
 	return rc;
 }
-EXPORT_SYMBOL(ldlm_get_ref);
 
 void ldlm_put_ref(void)
 {
@@ -875,7 +874,6 @@ void ldlm_put_ref(void)
 	}
 	mutex_unlock(&ldlm_ref_mutex);
 }
-EXPORT_SYMBOL(ldlm_put_ref);
 
 static ssize_t cancel_unused_locks_before_replay_show(struct kobject *kobj,
 						      struct attribute *attr,
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_pool.c b/drivers/staging/lustre/lustre/ldlm/ldlm_pool.c
index 9a1136e..b29c956 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_pool.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_pool.c
@@ -684,7 +684,6 @@ int ldlm_pool_init(struct ldlm_pool *pl, struct ldlm_namespace *ns,
 
 	return rc;
 }
-EXPORT_SYMBOL(ldlm_pool_init);
 
 void ldlm_pool_fini(struct ldlm_pool *pl)
 {
@@ -698,7 +697,6 @@ void ldlm_pool_fini(struct ldlm_pool *pl)
 	 */
 	POISON(pl, 0x5a, sizeof(*pl));
 }
-EXPORT_SYMBOL(ldlm_pool_fini);
 
 /**
  * Add new taken ldlm lock \a lock into pool \a pl accounting.
@@ -724,7 +722,6 @@ void ldlm_pool_add(struct ldlm_pool *pl, struct ldlm_lock *lock)
 	 * with too long call paths.
 	 */
 }
-EXPORT_SYMBOL(ldlm_pool_add);
 
 /**
  * Remove ldlm lock \a lock from pool \a pl accounting.
@@ -743,7 +740,6 @@ void ldlm_pool_del(struct ldlm_pool *pl, struct ldlm_lock *lock)
 
 	lprocfs_counter_incr(pl->pl_stats, LDLM_POOL_CANCEL_STAT);
 }
-EXPORT_SYMBOL(ldlm_pool_del);
 
 /**
  * Returns current \a pl SLV.
@@ -1095,7 +1091,6 @@ int ldlm_pools_init(void)
 
 	return rc;
 }
-EXPORT_SYMBOL(ldlm_pools_init);
 
 void ldlm_pools_fini(void)
 {
@@ -1104,4 +1099,3 @@ void ldlm_pools_fini(void)
 
 	ldlm_pools_thread_stop();
 }
-EXPORT_SYMBOL(ldlm_pools_fini);
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
index 35ba6f1..98730a3 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
@@ -1022,7 +1022,6 @@ int ldlm_cli_update_pool(struct ptlrpc_request *req)
 
 	return 0;
 }
-EXPORT_SYMBOL(ldlm_cli_update_pool);
 
 /**
  * Client side lock cancel.
@@ -1125,7 +1124,6 @@ int ldlm_cli_cancel_list_local(struct list_head *cancels, int count,
 
 	return count;
 }
-EXPORT_SYMBOL(ldlm_cli_cancel_list_local);
 
 /**
  * Cancel as many locks as possible w/o sending any RPCs (e.g. to write back
@@ -2048,4 +2046,3 @@ int ldlm_replay_locks(struct obd_import *imp)
 
 	return rc;
 }
-EXPORT_SYMBOL(ldlm_replay_locks);
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c b/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
index a09c25a..07cb955 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_resource.c
@@ -999,7 +999,6 @@ void ldlm_namespace_get(struct ldlm_namespace *ns)
 {
 	atomic_inc(&ns->ns_bref);
 }
-EXPORT_SYMBOL(ldlm_namespace_get);
 
 /* This is only for callers that care about refcount */
 static int ldlm_namespace_get_return(struct ldlm_namespace *ns)
@@ -1014,7 +1013,6 @@ void ldlm_namespace_put(struct ldlm_namespace *ns)
 		spin_unlock(&ns->ns_lock);
 	}
 }
-EXPORT_SYMBOL(ldlm_namespace_put);
 
 /** Should be called with ldlm_namespace_lock(client) taken. */
 void ldlm_namespace_move_to_active_locked(struct ldlm_namespace *ns,
@@ -1323,7 +1321,6 @@ void ldlm_dump_all_namespaces(ldlm_side_t client, int level)
 
 	mutex_unlock(ldlm_namespace_lock(client));
 }
-EXPORT_SYMBOL(ldlm_dump_all_namespaces);
 
 static int ldlm_res_hash_dump(struct cfs_hash *hs, struct cfs_hash_bd *bd,
 			      struct hlist_node *hnode, void *arg)
@@ -1360,7 +1357,6 @@ void ldlm_namespace_dump(int level, struct ldlm_namespace *ns)
 	ns->ns_next_dump = cfs_time_shift(10);
 	spin_unlock(&ns->ns_lock);
 }
-EXPORT_SYMBOL(ldlm_namespace_dump);
 
 /**
  * Print information about all locks in this resource to debug log.
-- 
1.7.1


* [PATCH 07/41] staging: lustre: llite: remove duplicate fiemap defines
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Bobi Jam,
	James Simmons

From: Bobi Jam <bobijam.xu@intel.com>

 * replace struct ll_user_fiemap with struct fiemap
 * replace struct ll_fiemap_extent with struct fiemap_extent
 * remove kernel defined FIEMAP_EXTENT_* constants
 * remove kernel defined FIEMAP_FLAG_* flags
 * add member prefix for struct ll_fiemap_info_key

 * Add cl_object_operations::coo_fiemap().
 * Add cl_object_fiemap() to get FIEMAP mappings.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5823
Reviewed-on: http://review.whamcloud.com/12535
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6201
Reviewed-on: http://review.whamcloud.com/13608
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
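
The buffer-size helpers that ll_fiemap.h keeps after this change, and the fe_device alias it adds, can be tried out in userspace with the sketch below. It is an illustration under assumptions, not part of the patch: struct sk_fiemap and struct sk_fiemap_extent are stand-ins that copy the field layout of the removed ll_* structures (with fe_device folded into fe_reserved[], which is what the new macro relies on), and the sk_* function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct fiemap_extent; layout copied from the removed
 * struct ll_fiemap_extent, with fe_device merged into fe_reserved[].
 */
struct sk_fiemap_extent {
	uint64_t fe_logical;
	uint64_t fe_physical;
	uint64_t fe_length;
	uint64_t fe_reserved64[2];
	uint32_t fe_flags;
	uint32_t fe_reserved[3];
};

/* Stand-in for struct fiemap: fixed header plus trailing extent array. */
struct sk_fiemap {
	uint64_t fm_start;
	uint64_t fm_length;
	uint32_t fm_flags;
	uint32_t fm_mapped_extents;
	uint32_t fm_extent_count;
	uint32_t fm_reserved;
	struct sk_fiemap_extent fm_extents[];
};

/* Same alias as the patch: the device number lives in fe_reserved[0]. */
#define fe_device	fe_reserved[0]

/* Bytes needed for a header followed by extent_count extents. */
static inline size_t sk_count_to_size(size_t extent_count)
{
	return sizeof(struct sk_fiemap) +
	       extent_count * sizeof(struct sk_fiemap_extent);
}

/* Number of extents that fit in a buffer of array_size bytes. */
static inline unsigned int sk_size_to_count(size_t array_size)
{
	return (array_size - sizeof(struct sk_fiemap)) /
	       sizeof(struct sk_fiemap_extent);
}

void sk_fiemap_self_check(void)
{
	struct sk_fiemap_extent ext = { 0 };

	/* The two helpers are inverses for whole numbers of extents. */
	assert(sk_size_to_count(sk_count_to_size(5)) == 5);
	assert(sk_count_to_size(0) == sizeof(struct sk_fiemap));

	/* Writing through the alias lands in fe_reserved[0]. */
	ext.fe_device = 7;
	assert(ext.fe_reserved[0] == 7);
}
```

One caveat of the size-to-count direction is that the subtraction is unsigned, so a caller is expected to pass a buffer at least as large as the fixed header.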
 drivers/staging/lustre/lustre/include/cl_object.h  |    9 +
 .../lustre/lustre/include/lustre/ll_fiemap.h       |   75 +---
 .../lustre/lustre/include/lustre/lustre_idl.h      |    8 +-
 .../lustre/lustre/include/lustre/lustre_user.h     |    1 -
 drivers/staging/lustre/lustre/llite/file.c         |  126 +----
 drivers/staging/lustre/lustre/lov/lov_obd.c        |  421 ----------------
 drivers/staging/lustre/lustre/lov/lov_object.c     |  504 +++++++++++++++++++-
 drivers/staging/lustre/lustre/obdclass/cl_object.c |   32 ++
 drivers/staging/lustre/lustre/osc/osc_object.c     |   91 ++++-
 drivers/staging/lustre/lustre/osc/osc_request.c    |   98 ----
 .../staging/lustre/lustre/ptlrpc/pack_generic.c    |    4 +-
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c    |  134 +++---
 12 files changed, 742 insertions(+), 761 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/cl_object.h b/drivers/staging/lustre/lustre/include/cl_object.h
index bf93c1e..3af9aa3 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -400,6 +400,12 @@ struct cl_object_operations {
 	 */
 	int (*coo_getstripe)(const struct lu_env *env, struct cl_object *obj,
 			     struct lov_user_md __user *lum);
+	/**
+	 * Get FIEMAP mapping from the object.
+	 */
+	int (*coo_fiemap)(const struct lu_env *env, struct cl_object *obj,
+			  struct ll_fiemap_info_key *fmkey,
+			  struct fiemap *fiemap, size_t *buflen);
 };
 
 /**
@@ -2184,6 +2190,9 @@ int cl_object_prune(const struct lu_env *env, struct cl_object *obj);
 void cl_object_kill(const struct lu_env *env, struct cl_object *obj);
 int  cl_object_getstripe(const struct lu_env *env, struct cl_object *obj,
 			 struct lov_user_md __user *lum);
+int cl_object_fiemap(const struct lu_env *env, struct cl_object *obj,
+		     struct ll_fiemap_info_key *fmkey, struct fiemap *fiemap,
+		     size_t *buflen);
 
 /**
  * Returns true, iff \a o0 and \a o1 are slices of the same object.
diff --git a/drivers/staging/lustre/lustre/include/lustre/ll_fiemap.h b/drivers/staging/lustre/lustre/include/lustre/ll_fiemap.h
index c2340d6..b8ad555 100644
--- a/drivers/staging/lustre/lustre/include/lustre/ll_fiemap.h
+++ b/drivers/staging/lustre/lustre/include/lustre/ll_fiemap.h
@@ -41,79 +41,24 @@
 #ifndef _LUSTRE_FIEMAP_H
 #define _LUSTRE_FIEMAP_H
 
-struct ll_fiemap_extent {
-	__u64 fe_logical;  /* logical offset in bytes for the start of
-			    * the extent from the beginning of the file
-			    */
-	__u64 fe_physical; /* physical offset in bytes for the start
-			    * of the extent from the beginning of the disk
-			    */
-	__u64 fe_length;   /* length in bytes for this extent */
-	__u64 fe_reserved64[2];
-	__u32 fe_flags;    /* FIEMAP_EXTENT_* flags for this extent */
-	__u32 fe_device;   /* device number for this extent */
-	__u32 fe_reserved[2];
-};
-
-struct ll_user_fiemap {
-	__u64 fm_start;  /* logical offset (inclusive) at
-			  * which to start mapping (in)
-			  */
-	__u64 fm_length; /* logical length of mapping which
-			  * userspace wants (in)
-			  */
-	__u32 fm_flags;  /* FIEMAP_FLAG_* flags for request (in/out) */
-	__u32 fm_mapped_extents;/* number of extents that were mapped (out) */
-	__u32 fm_extent_count;  /* size of fm_extents array (in) */
-	__u32 fm_reserved;
-	struct ll_fiemap_extent fm_extents[0]; /* array of mapped extents (out) */
-};
-
-#define FIEMAP_MAX_OFFSET      (~0ULL)
+#ifndef __KERNEL__
+#include <stddef.h>
+#include <fiemap.h>
+#endif
 
-#define FIEMAP_FLAG_SYNC		0x00000001 /* sync file data before
-						    * map
-						    */
-#define FIEMAP_FLAG_XATTR		0x00000002 /* map extended attribute
-						    * tree
-						    */
-#define FIEMAP_EXTENT_LAST		0x00000001 /* Last extent in file. */
-#define FIEMAP_EXTENT_UNKNOWN		0x00000002 /* Data location unknown. */
-#define FIEMAP_EXTENT_DELALLOC		0x00000004 /* Location still pending.
-						    * Sets EXTENT_UNKNOWN.
-						    */
-#define FIEMAP_EXTENT_ENCODED		0x00000008 /* Data can not be read
-						    * while fs is unmounted
-						    */
-#define FIEMAP_EXTENT_DATA_ENCRYPTED	0x00000080 /* Data is encrypted by fs.
-						    * Sets EXTENT_NO_DIRECT.
-						    */
-#define FIEMAP_EXTENT_NOT_ALIGNED       0x00000100 /* Extent offsets may not be
-						    * block aligned.
-						    */
-#define FIEMAP_EXTENT_DATA_INLINE       0x00000200 /* Data mixed with metadata.
-						    * Sets EXTENT_NOT_ALIGNED.*/
-#define FIEMAP_EXTENT_DATA_TAIL		0x00000400 /* Multiple files in block.
-						    * Sets EXTENT_NOT_ALIGNED.
-						    */
-#define FIEMAP_EXTENT_UNWRITTEN		0x00000800 /* Space allocated, but
-						    * no data (i.e. zero).
-						    */
-#define FIEMAP_EXTENT_MERGED		0x00001000 /* File does not natively
-						    * support extents. Result
-						    * merged for efficiency.
-						    */
+/* XXX: We use fiemap_extent::fe_reserved[0] */
+#define fe_device	fe_reserved[0]
 
 static inline size_t fiemap_count_to_size(size_t extent_count)
 {
-	return (sizeof(struct ll_user_fiemap) + extent_count *
-					       sizeof(struct ll_fiemap_extent));
+	return sizeof(struct fiemap) + extent_count *
+				       sizeof(struct fiemap_extent);
 }
 
 static inline unsigned fiemap_size_to_count(size_t array_size)
 {
-	return ((array_size - sizeof(struct ll_user_fiemap)) /
-					       sizeof(struct ll_fiemap_extent));
+	return (array_size - sizeof(struct fiemap)) /
+		sizeof(struct fiemap_extent);
 }
 
 #define FIEMAP_FLAG_DEVICE_ORDER 0x40000000 /* return device ordered mapping */
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index d164545..4210716 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -3331,14 +3331,14 @@ struct ost_body {
 
 /* Key for FIEMAP to be used in get_info calls */
 struct ll_fiemap_info_key {
-	char    name[8];
-	struct  obdo oa;
-	struct  ll_user_fiemap fiemap;
+	char		lfik_name[8];
+	struct obdo	lfik_oa;
+	struct fiemap	lfik_fiemap;
 };
 
 void lustre_swab_ost_body(struct ost_body *b);
 void lustre_swab_ost_last_id(__u64 *id);
-void lustre_swab_fiemap(struct ll_user_fiemap *fiemap);
+void lustre_swab_fiemap(struct fiemap *fiemap);
 
 void lustre_swab_lov_user_md_v1(struct lov_user_md_v1 *lum);
 void lustre_swab_lov_user_md_v3(struct lov_user_md_v3 *lum);
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index 6fc9855..dced31f 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -82,7 +82,6 @@ typedef struct stat     lstat_t;
 #define FSFILT_IOC_SETVERSION	     _IOW('f', 4, long)
 #define FSFILT_IOC_GETVERSION_OLD	 _IOR('v', 1, long)
 #define FSFILT_IOC_SETVERSION_OLD	 _IOW('v', 2, long)
-#define FSFILT_IOC_FIEMAP		 _IOWR('f', 11, struct ll_user_fiemap)
 #endif
 
 /* FIEMAP flags supported by Lustre */
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index b2058c6..9ca933f 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -1545,15 +1545,17 @@ out:
 /**
  * Get size for inode for which FIEMAP mapping is requested.
  * Make the FIEMAP get_info call and returns the result.
+ *
+ * \param fiemap	kernel buffer to hold extents
+ * \param num_bytes	kernel buffer size
  */
-static int ll_do_fiemap(struct inode *inode, struct ll_user_fiemap *fiemap,
+static int ll_do_fiemap(struct inode *inode, struct fiemap *fiemap,
 			size_t num_bytes)
 {
-	struct obd_export *exp = ll_i2dtexp(inode);
-	struct lov_stripe_md *lsm = NULL;
-	struct ll_fiemap_info_key fm_key = { .name = KEY_FIEMAP, };
-	__u32 vallen = num_bytes;
-	int rc;
+	struct ll_fiemap_info_key fmkey = { .lfik_name = KEY_FIEMAP, };
+	struct lu_env *env;
+	int refcheck;
+	int rc = 0;
 
 	/* Checks for fiemap flags */
 	if (fiemap->fm_flags & ~LUSTRE_FIEMAP_FLAGS_COMPAT) {
@@ -1568,21 +1570,9 @@ static int ll_do_fiemap(struct inode *inode, struct ll_user_fiemap *fiemap,
 			return rc;
 	}
 
-	lsm = ccc_inode_lsm_get(inode);
-	if (!lsm)
-		return -ENOENT;
-
-	/* If the stripe_count > 1 and the application does not understand
-	 * DEVICE_ORDER flag, then it cannot interpret the extents correctly.
-	 */
-	if (lsm->lsm_stripe_count > 1 &&
-	    !(fiemap->fm_flags & FIEMAP_FLAG_DEVICE_ORDER)) {
-		rc = -EOPNOTSUPP;
-		goto out;
-	}
-
-	fm_key.oa.o_oi = lsm->lsm_oi;
-	fm_key.oa.o_valid = OBD_MD_FLID | OBD_MD_FLGROUP;
+	env = cl_env_get(&refcheck);
+	if (IS_ERR(env))
+		return PTR_ERR(env);
 
 	if (i_size_read(inode) == 0) {
 		rc = ll_glimpse_size(inode);
@@ -1590,24 +1580,23 @@ static int ll_do_fiemap(struct inode *inode, struct ll_user_fiemap *fiemap,
 			goto out;
 	}
 
-	obdo_from_inode(&fm_key.oa, inode, OBD_MD_FLSIZE);
-	obdo_set_parent_fid(&fm_key.oa, &ll_i2info(inode)->lli_fid);
+	fmkey.lfik_oa.o_valid = OBD_MD_FLID | OBD_MD_FLGROUP;
+	obdo_from_inode(&fmkey.lfik_oa, inode, OBD_MD_FLSIZE);
+	obdo_set_parent_fid(&fmkey.lfik_oa, &ll_i2info(inode)->lli_fid);
+
 	/* If filesize is 0, then there would be no objects for mapping */
-	if (fm_key.oa.o_size == 0) {
+	if (fmkey.lfik_oa.o_size == 0) {
 		fiemap->fm_mapped_extents = 0;
 		rc = 0;
 		goto out;
 	}
 
-	memcpy(&fm_key.fiemap, fiemap, sizeof(*fiemap));
-
-	rc = obd_get_info(NULL, exp, sizeof(fm_key), &fm_key, &vallen,
-			  fiemap, lsm);
-	if (rc)
-		CERROR("obd_get_info failed: rc = %d\n", rc);
+	memcpy(&fmkey.lfik_fiemap, fiemap, sizeof(*fiemap));
 
+	rc = cl_object_fiemap(env, ll_i2info(inode)->lli_clob,
+			      &fmkey, fiemap, &num_bytes);
 out:
-	ccc_inode_lsm_put(inode, lsm);
+	cl_env_put(env, &refcheck);
 	return rc;
 }
 
@@ -1655,68 +1644,6 @@ gf_free:
 	return rc;
 }
 
-static int ll_ioctl_fiemap(struct inode *inode, unsigned long arg)
-{
-	struct ll_user_fiemap *fiemap_s;
-	size_t num_bytes, ret_bytes;
-	unsigned int extent_count;
-	int rc = 0;
-
-	/* Get the extent count so we can calculate the size of
-	 * required fiemap buffer
-	 */
-	if (get_user(extent_count,
-		     &((struct ll_user_fiemap __user *)arg)->fm_extent_count))
-		return -EFAULT;
-
-	if (extent_count >=
-	    (SIZE_MAX - sizeof(*fiemap_s)) / sizeof(struct ll_fiemap_extent))
-		return -EINVAL;
-	num_bytes = sizeof(*fiemap_s) + (extent_count *
-					 sizeof(struct ll_fiemap_extent));
-
-	fiemap_s = libcfs_kvzalloc(num_bytes, GFP_NOFS);
-	if (!fiemap_s)
-		return -ENOMEM;
-
-	/* get the fiemap value */
-	if (copy_from_user(fiemap_s, (struct ll_user_fiemap __user *)arg,
-			   sizeof(*fiemap_s))) {
-		rc = -EFAULT;
-		goto error;
-	}
-
-	/* If fm_extent_count is non-zero, read the first extent since
-	 * it is used to calculate end_offset and device from previous
-	 * fiemap call.
-	 */
-	if (extent_count) {
-		if (copy_from_user(&fiemap_s->fm_extents[0],
-				   (char __user *)arg + sizeof(*fiemap_s),
-				   sizeof(struct ll_fiemap_extent))) {
-			rc = -EFAULT;
-			goto error;
-		}
-	}
-
-	rc = ll_do_fiemap(inode, fiemap_s, num_bytes);
-	if (rc)
-		goto error;
-
-	ret_bytes = sizeof(struct ll_user_fiemap);
-
-	if (extent_count != 0)
-		ret_bytes += (fiemap_s->fm_mapped_extents *
-				 sizeof(struct ll_fiemap_extent));
-
-	if (copy_to_user((void __user *)arg, fiemap_s, ret_bytes))
-		rc = -EFAULT;
-
-error:
-	kvfree(fiemap_s);
-	return rc;
-}
-
 /*
  * Read the data_version for inode.
  *
@@ -2158,8 +2085,6 @@ ll_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 	case LL_IOC_LOV_GETSTRIPE:
 		return ll_file_getstripe(inode,
 					 (struct lov_user_md __user *)arg);
-	case FSFILT_IOC_FIEMAP:
-		return ll_ioctl_fiemap(inode, arg);
 	case FSFILT_IOC_GETFLAGS:
 	case FSFILT_IOC_SETFLAGS:
 		return ll_iocontrol(inode, file, cmd, arg);
@@ -3061,13 +2986,12 @@ static int ll_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 {
 	int rc;
 	size_t num_bytes;
-	struct ll_user_fiemap *fiemap;
+	struct fiemap *fiemap;
 	unsigned int extent_count = fieinfo->fi_extents_max;
 
 	num_bytes = sizeof(*fiemap) + (extent_count *
-				       sizeof(struct ll_fiemap_extent));
+				       sizeof(struct fiemap_extent));
 	fiemap = libcfs_kvzalloc(num_bytes, GFP_NOFS);
-
 	if (!fiemap)
 		return -ENOMEM;
 
@@ -3075,9 +2999,10 @@ static int ll_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 	fiemap->fm_extent_count = fieinfo->fi_extents_max;
 	fiemap->fm_start = start;
 	fiemap->fm_length = len;
+
 	if (extent_count > 0 &&
 	    copy_from_user(&fiemap->fm_extents[0], fieinfo->fi_extents_start,
-			   sizeof(struct ll_fiemap_extent)) != 0) {
+			   sizeof(struct fiemap_extent))) {
 		rc = -EFAULT;
 		goto out;
 	}
@@ -3089,11 +3014,10 @@ static int ll_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 	if (extent_count > 0 &&
 	    copy_to_user(fieinfo->fi_extents_start, &fiemap->fm_extents[0],
 			 fiemap->fm_mapped_extents *
-			 sizeof(struct ll_fiemap_extent)) != 0) {
+			 sizeof(struct fiemap_extent))) {
 		rc = -EFAULT;
 		goto out;
 	}
-
 out:
 	kvfree(fiemap);
 	return rc;
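Reviewer note: the removed ll_ioctl_fiemap() rejected extent counts whose buffer size would overflow size_t, while the num_bytes calculation in ll_fiemap() shown above has no equivalent check in these hunks. Whether the caller already bounds fi_extents_max is not visible in this diff, so the following is only a sketch of the removed guard as a standalone helper. The name sketch_fiemap_buf_size() is hypothetical and not part of the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <linux/fiemap.h>	/* struct fiemap, struct fiemap_extent */

/*
 * Hypothetical helper: compute the buffer size for extent_count extents,
 * refusing counts that would overflow size_t (same bound as the removed
 * ll_ioctl_fiemap() check).
 */
static int sketch_fiemap_buf_size(size_t extent_count, size_t *num_bytes)
{
	if (extent_count >=
	    (SIZE_MAX - sizeof(struct fiemap)) / sizeof(struct fiemap_extent))
		return -EINVAL;

	*num_bytes = sizeof(struct fiemap) +
		     extent_count * sizeof(struct fiemap_extent);
	return 0;
}
```

A caller would check the return value before allocating, so an oversized request fails with -EINVAL instead of producing a short buffer.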
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index b23016f..02c7087 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -51,7 +51,6 @@
 #include "../include/lprocfs_status.h"
 #include "../include/lustre_param.h"
 #include "../include/cl_object.h"
-#include "../include/lustre/ll_fiemap.h"
 #include "../include/lustre_fid.h"
 
 #include "lov_internal.h"
@@ -1391,423 +1390,6 @@ static int lov_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 	return rc;
 }
 
-#define FIEMAP_BUFFER_SIZE 4096
-
-/**
- * Non-zero fe_logical indicates that this is a continuation FIEMAP
- * call. The local end offset and the device are sent in the first
- * fm_extent. This function calculates the stripe number from the index.
- * This function returns a stripe_no on which mapping is to be restarted.
- *
- * This function returns fm_end_offset which is the in-OST offset at which
- * mapping should be restarted. If fm_end_offset=0 is returned then caller
- * will re-calculate proper offset in next stripe.
- * Note that the first extent is passed to lov_get_info via the value field.
- *
- * \param fiemap fiemap request header
- * \param lsm striping information for the file
- * \param fm_start logical start of mapping
- * \param fm_end logical end of mapping
- * \param start_stripe starting stripe will be returned in this
- */
-static u64 fiemap_calc_fm_end_offset(struct ll_user_fiemap *fiemap,
-				     struct lov_stripe_md *lsm, u64 fm_start,
-				     u64 fm_end, int *start_stripe)
-{
-	u64 local_end = fiemap->fm_extents[0].fe_logical;
-	u64 lun_start, lun_end;
-	u64 fm_end_offset;
-	int stripe_no = -1, i;
-
-	if (fiemap->fm_extent_count == 0 ||
-	    fiemap->fm_extents[0].fe_logical == 0)
-		return 0;
-
-	/* Find out stripe_no from ost_index saved in the fe_device */
-	for (i = 0; i < lsm->lsm_stripe_count; i++) {
-		struct lov_oinfo *oinfo = lsm->lsm_oinfo[i];
-
-		if (lov_oinfo_is_dummy(oinfo))
-			continue;
-
-		if (oinfo->loi_ost_idx == fiemap->fm_extents[0].fe_device) {
-			stripe_no = i;
-			break;
-		}
-	}
-	if (stripe_no == -1)
-		return -EINVAL;
-
-	/* If we have finished mapping on previous device, shift logical
-	 * offset to start of next device
-	 */
-	if ((lov_stripe_intersects(lsm, stripe_no, fm_start, fm_end,
-				   &lun_start, &lun_end)) != 0 &&
-				   local_end < lun_end) {
-		fm_end_offset = local_end;
-		*start_stripe = stripe_no;
-	} else {
-		/* This is a special value to indicate that caller should
-		 * calculate offset in next stripe.
-		 */
-		fm_end_offset = 0;
-		*start_stripe = (stripe_no + 1) % lsm->lsm_stripe_count;
-	}
-
-	return fm_end_offset;
-}
-
-/**
- * We calculate on which OST the mapping will end. If the length of mapping
- * is greater than (stripe_size * stripe_count) then the last_stripe will
- * will be one just before start_stripe. Else we check if the mapping
- * intersects each OST and find last_stripe.
- * This function returns the last_stripe and also sets the stripe_count
- * over which the mapping is spread
- *
- * \param lsm striping information for the file
- * \param fm_start logical start of mapping
- * \param fm_end logical end of mapping
- * \param start_stripe starting stripe of the mapping
- * \param stripe_count the number of stripes across which to map is returned
- *
- * \retval last_stripe return the last stripe of the mapping
- */
-static int fiemap_calc_last_stripe(struct lov_stripe_md *lsm, u64 fm_start,
-				   u64 fm_end, int start_stripe,
-				   int *stripe_count)
-{
-	int last_stripe;
-	u64 obd_start, obd_end;
-	int i, j;
-
-	if (fm_end - fm_start > lsm->lsm_stripe_size * lsm->lsm_stripe_count) {
-		last_stripe = start_stripe < 1 ? lsm->lsm_stripe_count - 1 :
-							      start_stripe - 1;
-		*stripe_count = lsm->lsm_stripe_count;
-	} else {
-		for (j = 0, i = start_stripe; j < lsm->lsm_stripe_count;
-		     i = (i + 1) % lsm->lsm_stripe_count, j++) {
-			if ((lov_stripe_intersects(lsm, i, fm_start, fm_end,
-						   &obd_start, &obd_end)) == 0)
-				break;
-		}
-		*stripe_count = j;
-		last_stripe = (start_stripe + j - 1) % lsm->lsm_stripe_count;
-	}
-
-	return last_stripe;
-}
-
-/**
- * Set fe_device and copy extents from local buffer into main return buffer.
- *
- * \param fiemap fiemap request header
- * \param lcl_fm_ext array of local fiemap extents to be copied
- * \param ost_index OST index to be written into the fm_device field for each
-		    extent
- * \param ext_count number of extents to be copied
- * \param current_extent where to start copying in main extent array
- */
-static void fiemap_prepare_and_copy_exts(struct ll_user_fiemap *fiemap,
-					 struct ll_fiemap_extent *lcl_fm_ext,
-					 int ost_index, unsigned int ext_count,
-					 int current_extent)
-{
-	char *to;
-	int ext;
-
-	for (ext = 0; ext < ext_count; ext++) {
-		lcl_fm_ext[ext].fe_device = ost_index;
-		lcl_fm_ext[ext].fe_flags |= FIEMAP_EXTENT_NET;
-	}
-
-	/* Copy fm_extent's from fm_local to return buffer */
-	to = (char *)fiemap + fiemap_count_to_size(current_extent);
-	memcpy(to, lcl_fm_ext, ext_count * sizeof(struct ll_fiemap_extent));
-}
-
-/**
- * Break down the FIEMAP request and send appropriate calls to individual OSTs.
- * This also handles the restarting of FIEMAP calls in case mapping overflows
- * the available number of extents in single call.
- */
-static int lov_fiemap(struct lov_obd *lov, __u32 keylen, void *key,
-		      __u32 *vallen, void *val, struct lov_stripe_md *lsm)
-{
-	struct ll_fiemap_info_key *fm_key = key;
-	struct ll_user_fiemap *fiemap = val;
-	struct ll_user_fiemap *fm_local = NULL;
-	struct ll_fiemap_extent *lcl_fm_ext;
-	int count_local;
-	unsigned int get_num_extents = 0;
-	int ost_index = 0, actual_start_stripe, start_stripe;
-	u64 fm_start, fm_end, fm_length, fm_end_offset;
-	u64 curr_loc;
-	int current_extent = 0, rc = 0, i;
-	/* Whether have we collected enough extents */
-	bool enough = false;
-	int ost_eof = 0; /* EOF for object */
-	int ost_done = 0; /* done with required mapping for this OST? */
-	int last_stripe;
-	int cur_stripe = 0, cur_stripe_wrap = 0, stripe_count;
-	unsigned int buffer_size = FIEMAP_BUFFER_SIZE;
-
-	if (!lsm_has_objects(lsm)) {
-		if (lsm && lsm_is_released(lsm) && (fm_key->fiemap.fm_start <
-		    fm_key->oa.o_size)) {
-			/*
-			 * released file, return a minimal FIEMAP if
-			 * request fits in file-size.
-			 */
-			fiemap->fm_mapped_extents = 1;
-			fiemap->fm_extents[0].fe_logical =
-					fm_key->fiemap.fm_start;
-			if (fm_key->fiemap.fm_start + fm_key->fiemap.fm_length <
-			    fm_key->oa.o_size) {
-				fiemap->fm_extents[0].fe_length =
-					fm_key->fiemap.fm_length;
-			} else {
-				fiemap->fm_extents[0].fe_length =
-					fm_key->oa.o_size - fm_key->fiemap.fm_start;
-				fiemap->fm_extents[0].fe_flags |=
-						(FIEMAP_EXTENT_UNKNOWN |
-						 FIEMAP_EXTENT_LAST);
-			}
-		}
-		rc = 0;
-		goto out;
-	}
-
-	if (fiemap_count_to_size(fm_key->fiemap.fm_extent_count) < buffer_size)
-		buffer_size = fiemap_count_to_size(fm_key->fiemap.fm_extent_count);
-
-	fm_local = libcfs_kvzalloc(buffer_size, GFP_NOFS);
-	if (!fm_local) {
-		rc = -ENOMEM;
-		goto out;
-	}
-	lcl_fm_ext = &fm_local->fm_extents[0];
-
-	count_local = fiemap_size_to_count(buffer_size);
-
-	memcpy(fiemap, &fm_key->fiemap, sizeof(*fiemap));
-	fm_start = fiemap->fm_start;
-	fm_length = fiemap->fm_length;
-	/* Calculate start stripe, last stripe and length of mapping */
-	start_stripe = lov_stripe_number(lsm, fm_start);
-	actual_start_stripe = start_stripe;
-	fm_end = (fm_length == ~0ULL ? fm_key->oa.o_size :
-						fm_start + fm_length - 1);
-	/* If fm_length != ~0ULL but fm_start+fm_length-1 exceeds file size */
-	if (fm_end > fm_key->oa.o_size)
-		fm_end = fm_key->oa.o_size;
-
-	last_stripe = fiemap_calc_last_stripe(lsm, fm_start, fm_end,
-					      actual_start_stripe,
-					      &stripe_count);
-
-	fm_end_offset = fiemap_calc_fm_end_offset(fiemap, lsm, fm_start,
-						  fm_end, &start_stripe);
-	if (fm_end_offset == -EINVAL) {
-		rc = -EINVAL;
-		goto out;
-	}
-
-	if (fiemap_count_to_size(fiemap->fm_extent_count) > *vallen)
-		fiemap->fm_extent_count = fiemap_size_to_count(*vallen);
-	if (fiemap->fm_extent_count == 0) {
-		get_num_extents = 1;
-		count_local = 0;
-	}
-	/* Check each stripe */
-	for (cur_stripe = start_stripe, i = 0; i < stripe_count;
-	     i++, cur_stripe = (cur_stripe + 1) % lsm->lsm_stripe_count) {
-		u64 req_fm_len; /* Stores length of required mapping */
-		u64 len_mapped_single_call;
-		u64 lun_start, lun_end, obd_object_end;
-		unsigned int ext_count;
-
-		cur_stripe_wrap = cur_stripe;
-
-		/* Find out range of mapping on this stripe */
-		if ((lov_stripe_intersects(lsm, cur_stripe, fm_start, fm_end,
-					   &lun_start, &obd_object_end)) == 0)
-			continue;
-
-		if (lov_oinfo_is_dummy(lsm->lsm_oinfo[cur_stripe])) {
-			rc = -EIO;
-			goto out;
-		}
-
-		/* If this is a continuation FIEMAP call and we are on
-		 * starting stripe then lun_start needs to be set to
-		 * fm_end_offset
-		 */
-		if (fm_end_offset != 0 && cur_stripe == start_stripe)
-			lun_start = fm_end_offset;
-
-		if (fm_length != ~0ULL) {
-			/* Handle fm_start + fm_length overflow */
-			if (fm_start + fm_length < fm_start)
-				fm_length = ~0ULL - fm_start;
-			lun_end = lov_size_to_stripe(lsm, fm_start + fm_length,
-						     cur_stripe);
-		} else {
-			lun_end = ~0ULL;
-		}
-
-		if (lun_start == lun_end)
-			continue;
-
-		req_fm_len = obd_object_end - lun_start;
-		fm_local->fm_length = 0;
-		len_mapped_single_call = 0;
-
-		/* If the output buffer is very large and the objects have many
-		 * extents we may need to loop on a single OST repeatedly
-		 */
-		ost_eof = 0;
-		ost_done = 0;
-		do {
-			if (get_num_extents == 0) {
-				/* Don't get too many extents. */
-				if (current_extent + count_local >
-				    fiemap->fm_extent_count)
-					count_local = fiemap->fm_extent_count -
-								 current_extent;
-			}
-
-			lun_start += len_mapped_single_call;
-			fm_local->fm_length = req_fm_len - len_mapped_single_call;
-			req_fm_len = fm_local->fm_length;
-			fm_local->fm_extent_count = enough ? 1 : count_local;
-			fm_local->fm_mapped_extents = 0;
-			fm_local->fm_flags = fiemap->fm_flags;
-
-			fm_key->oa.o_oi = lsm->lsm_oinfo[cur_stripe]->loi_oi;
-			ost_index = lsm->lsm_oinfo[cur_stripe]->loi_ost_idx;
-
-			if (ost_index < 0 ||
-			    ost_index >= lov->desc.ld_tgt_count) {
-				rc = -EINVAL;
-				goto out;
-			}
-
-			/* If OST is inactive, return extent with UNKNOWN flag */
-			if (!lov->lov_tgts[ost_index]->ltd_active) {
-				fm_local->fm_flags |= FIEMAP_EXTENT_LAST;
-				fm_local->fm_mapped_extents = 1;
-
-				lcl_fm_ext[0].fe_logical = lun_start;
-				lcl_fm_ext[0].fe_length = obd_object_end -
-								      lun_start;
-				lcl_fm_ext[0].fe_flags |= FIEMAP_EXTENT_UNKNOWN;
-
-				goto inactive_tgt;
-			}
-
-			fm_local->fm_start = lun_start;
-			fm_local->fm_flags &= ~FIEMAP_FLAG_DEVICE_ORDER;
-			memcpy(&fm_key->fiemap, fm_local, sizeof(*fm_local));
-			*vallen = fiemap_count_to_size(fm_local->fm_extent_count);
-			rc = obd_get_info(NULL,
-					  lov->lov_tgts[ost_index]->ltd_exp,
-					  keylen, key, vallen, fm_local, lsm);
-			if (rc != 0)
-				goto out;
-
-inactive_tgt:
-			ext_count = fm_local->fm_mapped_extents;
-			if (ext_count == 0) {
-				ost_done = 1;
-				/* If last stripe has hole at the end,
-				 * then we need to return
-				 */
-				if (cur_stripe_wrap == last_stripe) {
-					fiemap->fm_mapped_extents = 0;
-					goto finish;
-				}
-				break;
-			} else if (enough) {
-				/*
-				 * We've collected enough extents and there are
-				 * more extents after it.
-				 */
-				goto finish;
-			}
-
-			/* If we just need num of extents then go to next device */
-			if (get_num_extents) {
-				current_extent += ext_count;
-				break;
-			}
-
-			len_mapped_single_call =
-				lcl_fm_ext[ext_count - 1].fe_logical -
-				lun_start + lcl_fm_ext[ext_count - 1].fe_length;
-
-			/* Have we finished mapping on this device? */
-			if (req_fm_len <= len_mapped_single_call)
-				ost_done = 1;
-
-			/* Clear the EXTENT_LAST flag which can be present on
-			 * last extent
-			 */
-			if (lcl_fm_ext[ext_count - 1].fe_flags &
-			    FIEMAP_EXTENT_LAST)
-				lcl_fm_ext[ext_count - 1].fe_flags &=
-							    ~FIEMAP_EXTENT_LAST;
-
-			curr_loc = lov_stripe_size(lsm,
-					lcl_fm_ext[ext_count - 1].fe_logical +
-					lcl_fm_ext[ext_count - 1].fe_length,
-					cur_stripe);
-			if (curr_loc >= fm_key->oa.o_size)
-				ost_eof = 1;
-
-			fiemap_prepare_and_copy_exts(fiemap, lcl_fm_ext,
-						     ost_index, ext_count,
-						     current_extent);
-
-			current_extent += ext_count;
-
-			/* Ran out of available extents? */
-			if (current_extent >= fiemap->fm_extent_count)
-				enough = true;
-		} while (ost_done == 0 && ost_eof == 0);
-
-		if (cur_stripe_wrap == last_stripe)
-			goto finish;
-	}
-
-finish:
-	/* Indicate that we are returning device offsets unless file just has
-	 * single stripe
-	 */
-	if (lsm->lsm_stripe_count > 1)
-		fiemap->fm_flags |= FIEMAP_FLAG_DEVICE_ORDER;
-
-	if (get_num_extents)
-		goto skip_last_device_calc;
-
-	/* Check if we have reached the last stripe and whether mapping for that
-	 * stripe is done.
-	 */
-	if (cur_stripe_wrap == last_stripe) {
-		if (ost_done || ost_eof)
-			fiemap->fm_extents[current_extent - 1].fe_flags |=
-							     FIEMAP_EXTENT_LAST;
-	}
-
-skip_last_device_calc:
-	fiemap->fm_mapped_extents = current_extent;
-
-out:
-	kvfree(fm_local);
-	return rc;
-}
-
 static int lov_get_info(const struct lu_env *env, struct obd_export *exp,
 			__u32 keylen, void *key, __u32 *vallen, void *val,
 			struct lov_stripe_md *lsm)
@@ -1827,9 +1409,6 @@ static int lov_get_info(const struct lu_env *env, struct obd_export *exp,
 
 		rc = 0;
 		goto out;
-	} else if (KEY_IS(KEY_FIEMAP)) {
-		rc = lov_fiemap(lov, keylen, key, vallen, val, lsm);
-		goto out;
 	} else if (KEY_IS(KEY_TGT_COUNT)) {
 		*((int *)val) = lov->desc.ld_tgt_count;
 		rc = 0;
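Reviewer note: the stripe helpers that this patch moves out of lov_obd.c walk the stripes round-robin, starting at start_stripe and wrapping modulo the stripe count. The sketch below isolates just the index arithmetic of fiemap_calc_last_stripe(). The function names are hypothetical, and the count of intersecting stripes is passed in directly because lov_stripe_intersects() is not reproduced here; the sketch also assumes at least one stripe intersects the range.

```c
#include <assert.h>

/* Mapping covers every stripe: the walk ends just before start_stripe. */
static int sketch_last_stripe_full(int start_stripe, int stripe_count)
{
	assert(stripe_count > 0);
	assert(start_stripe >= 0 && start_stripe < stripe_count);
	return start_stripe < 1 ? stripe_count - 1 : start_stripe - 1;
}

/* Only nr_hit consecutive stripes, counted from start_stripe, intersect. */
static int sketch_last_stripe_partial(int start_stripe, int nr_hit,
				      int stripe_count)
{
	/* Sketch-only assumption: at least one stripe intersects. */
	assert(nr_hit >= 1 && nr_hit <= stripe_count);
	return (start_stripe + nr_hit - 1) % stripe_count;
}
```

When nr_hit equals stripe_count the two forms agree, which is why the helper can use the cheaper first branch for mappings longer than stripe_size * stripe_count.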
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index 52f7363..07bef44 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -313,6 +313,40 @@ static int lov_init_released(const struct lu_env *env,
 	return 0;
 }
 
+static struct cl_object *lov_find_subobj(const struct lu_env *env,
+					 struct lov_object *lov,
+					 struct lov_stripe_md *lsm,
+					 int stripe_idx)
+{
+	struct lov_device *dev = lu2lov_dev(lov2lu(lov)->lo_dev);
+	struct lov_oinfo *oinfo = lsm->lsm_oinfo[stripe_idx];
+	struct lov_thread_info *lti = lov_env_info(env);
+	struct lu_fid *ofid = &lti->lti_fid;
+	struct cl_device *subdev;
+	struct cl_object *result;
+	int ost_idx;
+	int rc;
+
+	if (lov->lo_type != LLT_RAID0) {
+		result = NULL;
+		goto out;
+	}
+
+	ost_idx = oinfo->loi_ost_idx;
+	rc = ostid_to_fid(ofid, &oinfo->loi_oi, ost_idx);
+	if (rc) {
+		result = NULL;
+		goto out;
+	}
+
+	subdev = lovsub2cl_dev(dev->ld_target[ost_idx]);
+	result = lov_sub_find(env, subdev, ofid, NULL);
+out:
+	if (!result)
+		result = ERR_PTR(-EINVAL);
+	return result;
+}
+
 static int lov_delete_empty(const struct lu_env *env, struct lov_object *lov,
 			    union lov_layout_state *state)
 {
@@ -911,6 +945,473 @@ int lov_lock_init(const struct lu_env *env, struct cl_object *obj,
 				    io);
 }
 
+/**
+ * We calculate on which OST the mapping will end. If the length of mapping
+ * is greater than (stripe_size * stripe_count) then the last_stripe will
+ * be one just before start_stripe. Else we check if the mapping
+ * intersects each OST and find last_stripe.
+ * This function returns the last_stripe and also sets the stripe_count
+ * over which the mapping is spread.
+ *
+ * \param lsm [in]		striping information for the file
+ * \param fm_start [in]		logical start of mapping
+ * \param fm_end [in]		logical end of mapping
+ * \param start_stripe [in]	starting stripe of the mapping
+ * \param stripe_count [out]	the number of stripes across which to map is
+ *				returned
+ *
+ * \retval last_stripe		return the last stripe of the mapping
+ */
+static int fiemap_calc_last_stripe(struct lov_stripe_md *lsm,
+				   loff_t fm_start, loff_t fm_end,
+				   int start_stripe, int *stripe_count)
+{
+	int last_stripe;
+	loff_t obd_start;
+	loff_t obd_end;
+	int i, j;
+
+	if (fm_end - fm_start > lsm->lsm_stripe_size * lsm->lsm_stripe_count) {
+		last_stripe = (start_stripe < 1 ? lsm->lsm_stripe_count - 1 :
+			       start_stripe - 1);
+		*stripe_count = lsm->lsm_stripe_count;
+	} else {
+		for (j = 0, i = start_stripe; j < lsm->lsm_stripe_count;
+		     i = (i + 1) % lsm->lsm_stripe_count, j++) {
+			if (!(lov_stripe_intersects(lsm, i, fm_start, fm_end,
+						    &obd_start, &obd_end)))
+				break;
+		}
+		*stripe_count = j;
+		last_stripe = (start_stripe + j - 1) % lsm->lsm_stripe_count;
+	}
+
+	return last_stripe;
+}
+
+/**
+ * Set fe_device and copy extents from local buffer into main return buffer.
+ *
+ * \param fiemap [out]		fiemap to hold all extents
+ * \param lcl_fm_ext [in]	array of fiemap extents obtained from OSC layer
+ * \param ost_index [in]	OST index to be written into the fe_device
+ *				field for each extent
+ * \param ext_count [in]	number of extents to be copied
+ * \param current_extent [in]	where to start copying in the extent array
+ */
+static void fiemap_prepare_and_copy_exts(struct fiemap *fiemap,
+					 struct fiemap_extent *lcl_fm_ext,
+					 int ost_index, unsigned int ext_count,
+					 int current_extent)
+{
+	unsigned int ext;
+	char *to;
+
+	for (ext = 0; ext < ext_count; ext++) {
+		lcl_fm_ext[ext].fe_device = ost_index;
+		lcl_fm_ext[ext].fe_flags |= FIEMAP_EXTENT_NET;
+	}
+
+	/* Copy the extents from the local buffer to the return buffer */
+	to = (char *)fiemap + fiemap_count_to_size(current_extent);
+	memcpy(to, lcl_fm_ext, ext_count * sizeof(struct fiemap_extent));
+}
+
+#define FIEMAP_BUFFER_SIZE 4096
+
+/**
+ * Non-zero fe_logical indicates that this is a continuation FIEMAP
+ * call. The local end offset and the device are sent in the first
+ * fm_extent. This function calculates the stripe number from the index.
+ * This function returns a stripe_no on which mapping is to be restarted.
+ *
+ * This function returns fm_end_offset which is the in-OST offset at which
+ * mapping should be restarted. If fm_end_offset=0 is returned then caller
+ * will re-calculate proper offset in next stripe.
+ * Note that the first extent is passed in through fiemap->fm_extents[0].
+ *
+ * \param fiemap [in]		fiemap request header
+ * \param lsm [in]		striping information for the file
+ * \param fm_start [in]		logical start of mapping
+ * \param fm_end [in]		logical end of mapping
+ * \param start_stripe [out]	starting stripe will be returned in this
+ */
+static loff_t fiemap_calc_fm_end_offset(struct fiemap *fiemap,
+					struct lov_stripe_md *lsm,
+					loff_t fm_start, loff_t fm_end,
+					int *start_stripe)
+{
+	loff_t local_end = fiemap->fm_extents[0].fe_logical;
+	loff_t lun_start, lun_end;
+	loff_t fm_end_offset;
+	int stripe_no = -1;
+	int i;
+
+	if (!fiemap->fm_extent_count || !fiemap->fm_extents[0].fe_logical)
+		return 0;
+
+	/* Find out stripe_no from ost_index saved in the fe_device */
+	for (i = 0; i < lsm->lsm_stripe_count; i++) {
+		struct lov_oinfo *oinfo = lsm->lsm_oinfo[i];
+
+		if (lov_oinfo_is_dummy(oinfo))
+			continue;
+
+		if (oinfo->loi_ost_idx == fiemap->fm_extents[0].fe_device) {
+			stripe_no = i;
+			break;
+		}
+	}
+
+	if (stripe_no == -1)
+		return -EINVAL;
+
+	/*
+	 * If we have finished mapping on previous device, shift logical
+	 * offset to start of next device
+	 */
+	if (lov_stripe_intersects(lsm, stripe_no, fm_start, fm_end,
+				  &lun_start, &lun_end) &&
+	    local_end < lun_end) {
+		fm_end_offset = local_end;
+		*start_stripe = stripe_no;
+	} else {
+		/* This is a special value to indicate that caller should
+		 * calculate offset in next stripe.
+		 */
+		fm_end_offset = 0;
+		*start_stripe = (stripe_no + 1) % lsm->lsm_stripe_count;
+	}
+
+	return fm_end_offset;
+}
+
+/**
+ * Break down the FIEMAP request and send appropriate calls to individual OSTs.
+ * This also handles the restarting of FIEMAP calls in case mapping overflows
+ * the available number of extents in single call.
+ *
+ * \param env [in]		lustre environment
+ * \param obj [in]		file object
+ * \param fmkey [in]		fiemap request header and other info
+ * \param fiemap [out]		fiemap buffer holding retrieved map extents
+ * \param buflen [in/out]	max buffer length of @fiemap; when iterating
+ *				over each OST, it limits the extents requested
+ * \retval 0	success
+ * \retval < 0	error
+ */
+static int lov_object_fiemap(const struct lu_env *env, struct cl_object *obj,
+			     struct ll_fiemap_info_key *fmkey,
+			     struct fiemap *fiemap, size_t *buflen)
+{
+	struct lov_obd *lov = lu2lov_dev(obj->co_lu.lo_dev)->ld_lov;
+	unsigned int buffer_size = FIEMAP_BUFFER_SIZE;
+	struct fiemap_extent *lcl_fm_ext;
+	struct cl_object *subobj = NULL;
+	struct fiemap *fm_local = NULL;
+	struct lov_stripe_md *lsm;
+	loff_t fm_start;
+	loff_t fm_end;
+	loff_t fm_length;
+	loff_t fm_end_offset;
+	int count_local;
+	int ost_index = 0;
+	int start_stripe;
+	int current_extent = 0;
+	int rc = 0;
+	int last_stripe;
+	int cur_stripe = 0;
+	int cur_stripe_wrap = 0;
+	int stripe_count;
+	/* Whether we have collected enough extents */
+	bool enough = false;
+	/* EOF for object */
+	bool ost_eof = false;
+	/* done with required mapping for this OST? */
+	bool ost_done = false;
+
+	lsm = lov_lsm_addref(cl2lov(obj));
+	if (!lsm)
+		return -ENODATA;
+
+	/*
+	 * If the stripe_count > 1 and the application does not understand
+	 * DEVICE_ORDER flag, it cannot interpret the extents correctly.
+	 */
+	if (lsm->lsm_stripe_count > 1 &&
+	    !(fiemap->fm_flags & FIEMAP_FLAG_DEVICE_ORDER)) {
+		rc = -EOPNOTSUPP;
+		goto out;
+	}
+
+	if (lsm_is_released(lsm)) {
+		if (fiemap->fm_start < fmkey->lfik_oa.o_size) {
+			/*
+			 * released file, return a minimal FIEMAP if
+			 * request fits in file-size.
+			 */
+			fiemap->fm_mapped_extents = 1;
+			fiemap->fm_extents[0].fe_logical = fiemap->fm_start;
+			if (fiemap->fm_start + fiemap->fm_length <
+			    fmkey->lfik_oa.o_size)
+				fiemap->fm_extents[0].fe_length =
+					 fiemap->fm_length;
+			else
+				fiemap->fm_extents[0].fe_length =
+					fmkey->lfik_oa.o_size -
+					fiemap->fm_start;
+			fiemap->fm_extents[0].fe_flags |=
+				FIEMAP_EXTENT_UNKNOWN | FIEMAP_EXTENT_LAST;
+		}
+		rc = 0;
+		goto out;
+	}
+
+	if (fiemap_count_to_size(fiemap->fm_extent_count) < buffer_size)
+		buffer_size = fiemap_count_to_size(fiemap->fm_extent_count);
+
+	fm_local = libcfs_kvzalloc(buffer_size, GFP_NOFS);
+	if (!fm_local) {
+		rc = -ENOMEM;
+		goto out;
+	}
+	lcl_fm_ext = &fm_local->fm_extents[0];
+	count_local = fiemap_size_to_count(buffer_size);
+
+	fm_start = fiemap->fm_start;
+	fm_length = fiemap->fm_length;
+	/* Calculate start stripe, last stripe and length of mapping */
+	start_stripe = lov_stripe_number(lsm, fm_start);
+	fm_end = (fm_length == ~0ULL) ? fmkey->lfik_oa.o_size :
+					fm_start + fm_length - 1;
+	/* If fm_length != ~0ULL but fm_start+fm_length-1 exceeds file size */
+	if (fm_end > fmkey->lfik_oa.o_size)
+		fm_end = fmkey->lfik_oa.o_size;
+
+	last_stripe = fiemap_calc_last_stripe(lsm, fm_start, fm_end,
+					      start_stripe, &stripe_count);
+	fm_end_offset = fiemap_calc_fm_end_offset(fiemap, lsm, fm_start, fm_end,
+						  &start_stripe);
+	if (fm_end_offset == -EINVAL) {
+		rc = -EINVAL;
+		goto out;
+	}
+
+	/*
+	 * Requested extent count exceeds the fiemap buffer size, shrink our
+	 * ambition.
+	 */
+	if (fiemap_count_to_size(fiemap->fm_extent_count) > *buflen)
+		fiemap->fm_extent_count = fiemap_size_to_count(*buflen);
+	if (!fiemap->fm_extent_count)
+		count_local = 0;
+
+	/* Check each stripe */
+	for (cur_stripe = start_stripe; stripe_count > 0;
+	     --stripe_count,
+	     cur_stripe = (cur_stripe + 1) % lsm->lsm_stripe_count) {
+		loff_t req_fm_len; /* Stores length of required mapping */
+		loff_t len_mapped_single_call;
+		loff_t lun_start;
+		loff_t lun_end;
+		loff_t obd_object_end;
+		unsigned int ext_count;
+
+		cur_stripe_wrap = cur_stripe;
+
+		/* Find out range of mapping on this stripe */
+		if (!(lov_stripe_intersects(lsm, cur_stripe, fm_start, fm_end,
+					    &lun_start, &obd_object_end)))
+			continue;
+
+		if (lov_oinfo_is_dummy(lsm->lsm_oinfo[cur_stripe])) {
+			rc = -EIO;
+			goto out;
+		}
+
+		/*
+		 * If this is a continuation FIEMAP call and we are on
+		 * starting stripe then lun_start needs to be set to
+		 * fm_end_offset
+		 */
+		if (fm_end_offset && cur_stripe == start_stripe)
+			lun_start = fm_end_offset;
+
+		if (fm_length != ~0ULL) {
+			/* Handle fm_start + fm_length overflow */
+			if (fm_start + fm_length < fm_start)
+				fm_length = ~0ULL - fm_start;
+			lun_end = lov_size_to_stripe(lsm, fm_start + fm_length,
+						     cur_stripe);
+		} else {
+			lun_end = ~0ULL;
+		}
+
+		if (lun_start == lun_end)
+			continue;
+
+		req_fm_len = obd_object_end - lun_start;
+		fm_local->fm_length = 0;
+		len_mapped_single_call = 0;
+
+		/* find lovsub object */
+		subobj = lov_find_subobj(env, cl2lov(obj), lsm,
+					 cur_stripe);
+		if (IS_ERR(subobj)) {
+			rc = PTR_ERR(subobj);
+			goto out;
+		}
+		/*
+		 * If the output buffer is very large and the objects have many
+		 * extents we may need to loop on a single OST repeatedly
+		 */
+		ost_eof = false;
+		ost_done = false;
+		do {
+			if (fiemap->fm_extent_count > 0) {
+				/* Don't get too many extents. */
+				if (current_extent + count_local >
+				    fiemap->fm_extent_count)
+					count_local = fiemap->fm_extent_count -
+						      current_extent;
+			}
+
+			lun_start += len_mapped_single_call;
+			fm_local->fm_length = req_fm_len -
+					      len_mapped_single_call;
+			req_fm_len = fm_local->fm_length;
+			fm_local->fm_extent_count = enough ? 1 : count_local;
+			fm_local->fm_mapped_extents = 0;
+			fm_local->fm_flags = fiemap->fm_flags;
+
+			ost_index = lsm->lsm_oinfo[cur_stripe]->loi_ost_idx;
+
+			if (ost_index < 0 ||
+			    ost_index >= lov->desc.ld_tgt_count) {
+				rc = -EINVAL;
+				goto obj_put;
+			}
+			/*
+			 * If OST is inactive, return extent with UNKNOWN
+			 * flag.
+			 */
+			if (!lov->lov_tgts[ost_index]->ltd_active) {
+				fm_local->fm_flags |= FIEMAP_EXTENT_LAST;
+				fm_local->fm_mapped_extents = 1;
+
+				lcl_fm_ext[0].fe_logical = lun_start;
+				lcl_fm_ext[0].fe_length = obd_object_end -
+							  lun_start;
+				lcl_fm_ext[0].fe_flags |= FIEMAP_EXTENT_UNKNOWN;
+
+				goto inactive_tgt;
+			}
+
+			fm_local->fm_start = lun_start;
+			fm_local->fm_flags &= ~FIEMAP_FLAG_DEVICE_ORDER;
+			memcpy(&fmkey->lfik_fiemap, fm_local, sizeof(*fm_local));
+			*buflen = fiemap_count_to_size(fm_local->fm_extent_count);
+
+			rc = cl_object_fiemap(env, subobj, fmkey, fm_local,
+					      buflen);
+			if (rc)
+				goto obj_put;
+inactive_tgt:
+			ext_count = fm_local->fm_mapped_extents;
+			if (!ext_count) {
+				ost_done = true;
+				/*
+				 * If last stripe has hole at the end,
+				 * we need to return
+				 */
+				if (cur_stripe_wrap == last_stripe) {
+					fiemap->fm_mapped_extents = 0;
+					goto finish;
+				}
+				break;
+			} else if (enough) {
+				/*
+				 * We've collected enough extents and there are
+				 * more extents after it.
+				 */
+				goto finish;
+			}
+
+			/* If we just need num of extents, go to next device */
+			if (!fiemap->fm_extent_count) {
+				current_extent += ext_count;
+				break;
+			}
+
+			/* prepare to copy retrieved map extents */
+			len_mapped_single_call =
+				lcl_fm_ext[ext_count - 1].fe_logical -
+				lun_start + lcl_fm_ext[ext_count - 1].fe_length;
+
+			/* Have we finished mapping on this device? */
+			if (req_fm_len <= len_mapped_single_call)
+				ost_done = true;
+
+			/*
+			 * Clear the EXTENT_LAST flag which can be present on
+			 * the last extent
+			 */
+			if (lcl_fm_ext[ext_count - 1].fe_flags &
+			    FIEMAP_EXTENT_LAST)
+				lcl_fm_ext[ext_count - 1].fe_flags &=
+					~FIEMAP_EXTENT_LAST;
+
+			if (lov_stripe_size(lsm,
+					    lcl_fm_ext[ext_count - 1].fe_logical +
+					    lcl_fm_ext[ext_count - 1].fe_length,
+					    cur_stripe) >= fmkey->lfik_oa.o_size)
+				ost_eof = true;
+
+			fiemap_prepare_and_copy_exts(fiemap, lcl_fm_ext,
+						     ost_index, ext_count,
+						     current_extent);
+			current_extent += ext_count;
+
+			/* Ran out of available extents? */
+			if (current_extent >= fiemap->fm_extent_count)
+				enough = true;
+		} while (!ost_done && !ost_eof);
+
+		cl_object_put(env, subobj);
+		subobj = NULL;
+
+		if (cur_stripe_wrap == last_stripe)
+			goto finish;
+	} /* for each stripe */
+finish:
+	/*
+	 * Indicate that we are returning device offsets unless file just has
+	 * single stripe
+	 */
+	if (lsm->lsm_stripe_count > 1)
+		fiemap->fm_flags |= FIEMAP_FLAG_DEVICE_ORDER;
+
+	if (!fiemap->fm_extent_count)
+		goto skip_last_device_calc;
+
+	/*
+	 * Check if we have reached the last stripe and whether mapping for that
+	 * stripe is done.
+	 */
+	if ((cur_stripe_wrap == last_stripe) && (ost_done || ost_eof))
+		fiemap->fm_extents[current_extent - 1].fe_flags |=
+							FIEMAP_EXTENT_LAST;
+skip_last_device_calc:
+	fiemap->fm_mapped_extents = current_extent;
+obj_put:
+	if (subobj)
+		cl_object_put(env, subobj);
+out:
+	kvfree(fm_local);
+	lov_lsm_put(obj, lsm);
+	return rc;
+}
+
 static int lov_object_getstripe(const struct lu_env *env, struct cl_object *obj,
 				struct lov_user_md __user *lum)
 {
@@ -934,7 +1435,8 @@ static const struct cl_object_operations lov_ops = {
 	.coo_attr_get  = lov_attr_get,
 	.coo_attr_update = lov_attr_update,
 	.coo_conf_set  = lov_conf_set,
-	.coo_getstripe = lov_object_getstripe
+	.coo_getstripe = lov_object_getstripe,
+	.coo_fiemap	 = lov_object_fiemap,
 };
 
 static const struct lu_object_operations lov_lu_obj_ops = {
diff --git a/drivers/staging/lustre/lustre/obdclass/cl_object.c b/drivers/staging/lustre/lustre/obdclass/cl_object.c
index 3199dd4..4ad2ee5 100644
--- a/drivers/staging/lustre/lustre/obdclass/cl_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/cl_object.c
@@ -343,6 +343,38 @@ int cl_object_getstripe(const struct lu_env *env, struct cl_object *obj,
 EXPORT_SYMBOL(cl_object_getstripe);
 
 /**
+ * Get fiemap extents from file object.
+ *
+ * \param env [in]	lustre environment
+ * \param obj [in]	file object
+ * \param key [in]	fiemap request argument
+ * \param fiemap [out]	fiemap extents mapping retrieved
+ * \param buflen [in]	max buffer length of @fiemap
+ *
+ * \retval 0	success
+ * \retval < 0	error
+ */
+int cl_object_fiemap(const struct lu_env *env, struct cl_object *obj,
+		     struct ll_fiemap_info_key *key,
+		     struct fiemap *fiemap, size_t *buflen)
+{
+	struct lu_object_header *top;
+	int result = 0;
+
+	top = obj->co_lu.lo_header;
+	list_for_each_entry(obj, &top->loh_layers, co_lu.lo_linkage) {
+		if (obj->co_ops->coo_fiemap) {
+			result = obj->co_ops->coo_fiemap(env, obj, key, fiemap,
+							 buflen);
+			if (result)
+				break;
+		}
+	}
+	return result;
+}
+EXPORT_SYMBOL(cl_object_fiemap);
+
+/**
  * Helper function removing all object locks, and marking object for
  * deletion. All object pages must have been deleted at this point.
  *
diff --git a/drivers/staging/lustre/lustre/osc/osc_object.c b/drivers/staging/lustre/lustre/osc/osc_object.c
index aae3a2d..dc0c173 100644
--- a/drivers/staging/lustre/lustre/osc/osc_object.c
+++ b/drivers/staging/lustre/lustre/osc/osc_object.c
@@ -218,6 +218,94 @@ static int osc_object_prune(const struct lu_env *env, struct cl_object *obj)
 	return 0;
 }
 
+static int osc_object_fiemap(const struct lu_env *env, struct cl_object *obj,
+			     struct ll_fiemap_info_key *fmkey,
+			     struct fiemap *fiemap, size_t *buflen)
+{
+	struct obd_export *exp = osc_export(cl2osc(obj));
+	ldlm_policy_data_t policy;
+	struct ptlrpc_request *req;
+	struct lustre_handle lockh;
+	struct ldlm_res_id resid;
+	enum ldlm_mode mode = 0;
+	struct fiemap *reply;
+	char *tmp;
+	int rc;
+
+	fmkey->lfik_oa.o_oi = cl2osc(obj)->oo_oinfo->loi_oi;
+	if (!(fmkey->lfik_fiemap.fm_flags & FIEMAP_FLAG_SYNC))
+		goto skip_locking;
+
+	policy.l_extent.start = fmkey->lfik_fiemap.fm_start & PAGE_MASK;
+
+	if (OBD_OBJECT_EOF - fmkey->lfik_fiemap.fm_length <=
+	    fmkey->lfik_fiemap.fm_start + PAGE_SIZE - 1)
+		policy.l_extent.end = OBD_OBJECT_EOF;
+	else
+		policy.l_extent.end = (fmkey->lfik_fiemap.fm_start +
+				       fmkey->lfik_fiemap.fm_length +
+				       PAGE_SIZE - 1) & PAGE_MASK;
+
+	ostid_build_res_name(&fmkey->lfik_oa.o_oi, &resid);
+	mode = ldlm_lock_match(exp->exp_obd->obd_namespace,
+			       LDLM_FL_BLOCK_GRANTED | LDLM_FL_LVB_READY,
+			       &resid, LDLM_EXTENT, &policy,
+			       LCK_PR | LCK_PW, &lockh, 0);
+	if (mode) { /* lock is cached on client */
+		if (mode != LCK_PR) {
+			ldlm_lock_addref(&lockh, LCK_PR);
+			ldlm_lock_decref(&lockh, LCK_PW);
+		}
+	} else { /* no cached lock, need to acquire lock on server side */
+		fmkey->lfik_oa.o_valid |= OBD_MD_FLFLAGS;
+		fmkey->lfik_oa.o_flags |= OBD_FL_SRVLOCK;
+	}
+
+skip_locking:
+	req = ptlrpc_request_alloc(class_exp2cliimp(exp),
+				   &RQF_OST_GET_INFO_FIEMAP);
+	if (!req) {
+		rc = -ENOMEM;
+		goto drop_lock;
+	}
+
+	req_capsule_set_size(&req->rq_pill, &RMF_FIEMAP_KEY, RCL_CLIENT,
+			     sizeof(*fmkey));
+	req_capsule_set_size(&req->rq_pill, &RMF_FIEMAP_VAL, RCL_CLIENT,
+			     *buflen);
+	req_capsule_set_size(&req->rq_pill, &RMF_FIEMAP_VAL, RCL_SERVER,
+			     *buflen);
+
+	rc = ptlrpc_request_pack(req, LUSTRE_OST_VERSION, OST_GET_INFO);
+	if (rc) {
+		ptlrpc_request_free(req);
+		goto drop_lock;
+	}
+	tmp = req_capsule_client_get(&req->rq_pill, &RMF_FIEMAP_KEY);
+	memcpy(tmp, fmkey, sizeof(*fmkey));
+	tmp = req_capsule_client_get(&req->rq_pill, &RMF_FIEMAP_VAL);
+	memcpy(tmp, fiemap, *buflen);
+	ptlrpc_request_set_replen(req);
+
+	rc = ptlrpc_queue_wait(req);
+	if (rc)
+		goto fini_req;
+
+	reply = req_capsule_server_get(&req->rq_pill, &RMF_FIEMAP_VAL);
+	if (!reply) {
+		rc = -EPROTO;
+		goto fini_req;
+	}
+
+	memcpy(fiemap, reply, *buflen);
+fini_req:
+	ptlrpc_req_finished(req);
+drop_lock:
+	if (mode)
+		ldlm_lock_decref(&lockh, LCK_PR);
+	return rc;
+}
+
 void osc_object_set_contended(struct osc_object *obj)
 {
 	obj->oo_contention_time = cfs_time_current();
@@ -263,7 +351,8 @@ static const struct cl_object_operations osc_ops = {
 	.coo_attr_get  = osc_attr_get,
 	.coo_attr_update = osc_attr_update,
 	.coo_glimpse   = osc_object_glimpse,
-	.coo_prune     = osc_object_prune
+	.coo_prune	 = osc_object_prune,
+	.coo_fiemap	 = osc_object_fiemap,
 };
 
 static const struct lu_object_operations osc_lu_obj_ops = {
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index 749781f..963a485 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -2543,103 +2543,6 @@ out:
 	return err;
 }
 
-static int osc_get_info(const struct lu_env *env, struct obd_export *exp,
-			u32 keylen, void *key, __u32 *vallen, void *val,
-			struct lov_stripe_md *lsm)
-{
-	if (!vallen || !val)
-		return -EFAULT;
-
-	if (KEY_IS(KEY_FIEMAP)) {
-		struct ll_fiemap_info_key *fm_key = key;
-		struct ldlm_res_id res_id;
-		ldlm_policy_data_t policy;
-		struct lustre_handle lockh;
-		enum ldlm_mode mode = 0;
-		struct ptlrpc_request *req;
-		struct ll_user_fiemap *reply;
-		char *tmp;
-		int rc;
-
-		if (!(fm_key->fiemap.fm_flags & FIEMAP_FLAG_SYNC))
-			goto skip_locking;
-
-		policy.l_extent.start = fm_key->fiemap.fm_start &
-						PAGE_MASK;
-
-		if (OBD_OBJECT_EOF - fm_key->fiemap.fm_length <=
-		    fm_key->fiemap.fm_start + PAGE_SIZE - 1)
-			policy.l_extent.end = OBD_OBJECT_EOF;
-		else
-			policy.l_extent.end = (fm_key->fiemap.fm_start +
-				fm_key->fiemap.fm_length +
-				PAGE_SIZE - 1) & PAGE_MASK;
-
-		ostid_build_res_name(&fm_key->oa.o_oi, &res_id);
-		mode = ldlm_lock_match(exp->exp_obd->obd_namespace,
-				       LDLM_FL_BLOCK_GRANTED |
-				       LDLM_FL_LVB_READY,
-				       &res_id, LDLM_EXTENT, &policy,
-				       LCK_PR | LCK_PW, &lockh, 0);
-		if (mode) { /* lock is cached on client */
-			if (mode != LCK_PR) {
-				ldlm_lock_addref(&lockh, LCK_PR);
-				ldlm_lock_decref(&lockh, LCK_PW);
-			}
-		} else { /* no cached lock, needs acquire lock on server side */
-			fm_key->oa.o_valid |= OBD_MD_FLFLAGS;
-			fm_key->oa.o_flags |= OBD_FL_SRVLOCK;
-		}
-
-skip_locking:
-		req = ptlrpc_request_alloc(class_exp2cliimp(exp),
-					   &RQF_OST_GET_INFO_FIEMAP);
-		if (!req) {
-			rc = -ENOMEM;
-			goto drop_lock;
-		}
-
-		req_capsule_set_size(&req->rq_pill, &RMF_FIEMAP_KEY,
-				     RCL_CLIENT, keylen);
-		req_capsule_set_size(&req->rq_pill, &RMF_FIEMAP_VAL,
-				     RCL_CLIENT, *vallen);
-		req_capsule_set_size(&req->rq_pill, &RMF_FIEMAP_VAL,
-				     RCL_SERVER, *vallen);
-
-		rc = ptlrpc_request_pack(req, LUSTRE_OST_VERSION, OST_GET_INFO);
-		if (rc) {
-			ptlrpc_request_free(req);
-			goto drop_lock;
-		}
-
-		tmp = req_capsule_client_get(&req->rq_pill, &RMF_FIEMAP_KEY);
-		memcpy(tmp, key, keylen);
-		tmp = req_capsule_client_get(&req->rq_pill, &RMF_FIEMAP_VAL);
-		memcpy(tmp, val, *vallen);
-
-		ptlrpc_request_set_replen(req);
-		rc = ptlrpc_queue_wait(req);
-		if (rc)
-			goto fini_req;
-
-		reply = req_capsule_server_get(&req->rq_pill, &RMF_FIEMAP_VAL);
-		if (!reply) {
-			rc = -EPROTO;
-			goto fini_req;
-		}
-
-		memcpy(val, reply, *vallen);
-fini_req:
-		ptlrpc_req_finished(req);
-drop_lock:
-		if (mode)
-			ldlm_lock_decref(&lockh, LCK_PR);
-		return rc;
-	}
-
-	return -EINVAL;
-}
-
 static int osc_set_info_async(const struct lu_env *env, struct obd_export *exp,
 			      u32 keylen, void *key, u32 vallen,
 			      void *val, struct ptlrpc_request_set *set)
@@ -3112,7 +3015,6 @@ static struct obd_ops osc_obd_ops = {
 	.setattr        = osc_setattr,
 	.setattr_async  = osc_setattr_async,
 	.iocontrol      = osc_iocontrol,
-	.get_info       = osc_get_info,
 	.set_info_async = osc_set_info_async,
 	.import_event   = osc_import_event,
 	.process_config = osc_process_config,
diff --git a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
index 8717685..36b86ae 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
@@ -1772,7 +1772,7 @@ void lustre_swab_fid2path(struct getinfo_fid2path *gf)
 }
 EXPORT_SYMBOL(lustre_swab_fid2path);
 
-static void lustre_swab_fiemap_extent(struct ll_fiemap_extent *fm_extent)
+static void lustre_swab_fiemap_extent(struct fiemap_extent *fm_extent)
 {
 	__swab64s(&fm_extent->fe_logical);
 	__swab64s(&fm_extent->fe_physical);
@@ -1781,7 +1781,7 @@ static void lustre_swab_fiemap_extent(struct ll_fiemap_extent *fm_extent)
 	__swab32s(&fm_extent->fe_device);
 }
 
-void lustre_swab_fiemap(struct ll_user_fiemap *fiemap)
+void lustre_swab_fiemap(struct fiemap *fiemap)
 {
 	__u32 i;
 
diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
index e5945e2..cb89bf2 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
@@ -3520,21 +3520,21 @@ void lustre_assert_wire_constants(void)
 	LASSERTF((int)sizeof(((struct llogd_conn_body *)0)->lgdc_ctxt_idx) == 4, "found %lld\n",
 		 (long long)(int)sizeof(((struct llogd_conn_body *)0)->lgdc_ctxt_idx));
 
-	/* Checks for struct ll_fiemap_info_key */
+	/* Checks for struct fiemap_info_key */
 	LASSERTF((int)sizeof(struct ll_fiemap_info_key) == 248, "found %lld\n",
 		 (long long)(int)sizeof(struct ll_fiemap_info_key));
-	LASSERTF((int)offsetof(struct ll_fiemap_info_key, name[8]) == 8, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_fiemap_info_key, name[8]));
-	LASSERTF((int)sizeof(((struct ll_fiemap_info_key *)0)->name[8]) == 1, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_fiemap_info_key *)0)->name[8]));
-	LASSERTF((int)offsetof(struct ll_fiemap_info_key, oa) == 8, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_fiemap_info_key, oa));
-	LASSERTF((int)sizeof(((struct ll_fiemap_info_key *)0)->oa) == 208, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_fiemap_info_key *)0)->oa));
-	LASSERTF((int)offsetof(struct ll_fiemap_info_key, fiemap) == 216, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_fiemap_info_key, fiemap));
-	LASSERTF((int)sizeof(((struct ll_fiemap_info_key *)0)->fiemap) == 32, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_fiemap_info_key *)0)->fiemap));
+	LASSERTF((int)offsetof(struct ll_fiemap_info_key, lfik_name[8]) == 8, "found %lld\n",
+		 (long long)(int)offsetof(struct ll_fiemap_info_key, lfik_name[8]));
+	LASSERTF((int)sizeof(((struct ll_fiemap_info_key *)0)->lfik_name[8]) == 1, "found %lld\n",
+		 (long long)(int)sizeof(((struct ll_fiemap_info_key *)0)->lfik_name[8]));
+	LASSERTF((int)offsetof(struct ll_fiemap_info_key, lfik_oa) == 8, "found %lld\n",
+		 (long long)(int)offsetof(struct ll_fiemap_info_key, lfik_oa));
+	LASSERTF((int)sizeof(((struct ll_fiemap_info_key *)0)->lfik_oa) == 208, "found %lld\n",
+		 (long long)(int)sizeof(((struct ll_fiemap_info_key *)0)->lfik_oa));
+	LASSERTF((int)offsetof(struct ll_fiemap_info_key, lfik_fiemap) == 216, "found %lld\n",
+		 (long long)(int)offsetof(struct ll_fiemap_info_key, lfik_fiemap));
+	LASSERTF((int)sizeof(((struct ll_fiemap_info_key *)0)->lfik_fiemap) == 32, "found %lld\n",
+		 (long long)(int)sizeof(((struct ll_fiemap_info_key *)0)->lfik_fiemap));
 
 	/* Checks for struct mgs_target_info */
 	LASSERTF((int)sizeof(struct mgs_target_info) == 4544, "found %lld\n",
@@ -3670,64 +3670,64 @@ void lustre_assert_wire_constants(void)
 	LASSERTF((int)sizeof(((struct getinfo_fid2path *)0)->gf_path[0]) == 1, "found %lld\n",
 		 (long long)(int)sizeof(((struct getinfo_fid2path *)0)->gf_path[0]));
 
-	/* Checks for struct ll_user_fiemap */
-	LASSERTF((int)sizeof(struct ll_user_fiemap) == 32, "found %lld\n",
-		 (long long)(int)sizeof(struct ll_user_fiemap));
-	LASSERTF((int)offsetof(struct ll_user_fiemap, fm_start) == 0, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_user_fiemap, fm_start));
-	LASSERTF((int)sizeof(((struct ll_user_fiemap *)0)->fm_start) == 8, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_user_fiemap *)0)->fm_start));
-	LASSERTF((int)offsetof(struct ll_user_fiemap, fm_length) == 8, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_user_fiemap, fm_length));
-	LASSERTF((int)sizeof(((struct ll_user_fiemap *)0)->fm_length) == 8, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_user_fiemap *)0)->fm_length));
-	LASSERTF((int)offsetof(struct ll_user_fiemap, fm_flags) == 16, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_user_fiemap, fm_flags));
-	LASSERTF((int)sizeof(((struct ll_user_fiemap *)0)->fm_flags) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_user_fiemap *)0)->fm_flags));
-	LASSERTF((int)offsetof(struct ll_user_fiemap, fm_mapped_extents) == 20, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_user_fiemap, fm_mapped_extents));
-	LASSERTF((int)sizeof(((struct ll_user_fiemap *)0)->fm_mapped_extents) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_user_fiemap *)0)->fm_mapped_extents));
-	LASSERTF((int)offsetof(struct ll_user_fiemap, fm_extent_count) == 24, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_user_fiemap, fm_extent_count));
-	LASSERTF((int)sizeof(((struct ll_user_fiemap *)0)->fm_extent_count) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_user_fiemap *)0)->fm_extent_count));
-	LASSERTF((int)offsetof(struct ll_user_fiemap, fm_reserved) == 28, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_user_fiemap, fm_reserved));
-	LASSERTF((int)sizeof(((struct ll_user_fiemap *)0)->fm_reserved) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_user_fiemap *)0)->fm_reserved));
-	LASSERTF((int)offsetof(struct ll_user_fiemap, fm_extents) == 32, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_user_fiemap, fm_extents));
-	LASSERTF((int)sizeof(((struct ll_user_fiemap *)0)->fm_extents) == 0, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_user_fiemap *)0)->fm_extents));
+	/* Checks for struct fiemap */
+	LASSERTF((int)sizeof(struct fiemap) == 32, "found %lld\n",
+		 (long long)(int)sizeof(struct fiemap));
+	LASSERTF((int)offsetof(struct fiemap, fm_start) == 0, "found %lld\n",
+		 (long long)(int)offsetof(struct fiemap, fm_start));
+	LASSERTF((int)sizeof(((struct fiemap *)0)->fm_start) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct fiemap *)0)->fm_start));
+	LASSERTF((int)offsetof(struct fiemap, fm_length) == 8, "found %lld\n",
+		 (long long)(int)offsetof(struct fiemap, fm_length));
+	LASSERTF((int)sizeof(((struct fiemap *)0)->fm_length) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct fiemap *)0)->fm_length));
+	LASSERTF((int)offsetof(struct fiemap, fm_flags) == 16, "found %lld\n",
+		 (long long)(int)offsetof(struct fiemap, fm_flags));
+	LASSERTF((int)sizeof(((struct fiemap *)0)->fm_flags) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct fiemap *)0)->fm_flags));
+	LASSERTF((int)offsetof(struct fiemap, fm_mapped_extents) == 20, "found %lld\n",
+		 (long long)(int)offsetof(struct fiemap, fm_mapped_extents));
+	LASSERTF((int)sizeof(((struct fiemap *)0)->fm_mapped_extents) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct fiemap *)0)->fm_mapped_extents));
+	LASSERTF((int)offsetof(struct fiemap, fm_extent_count) == 24, "found %lld\n",
+		 (long long)(int)offsetof(struct fiemap, fm_extent_count));
+	LASSERTF((int)sizeof(((struct fiemap *)0)->fm_extent_count) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct fiemap *)0)->fm_extent_count));
+	LASSERTF((int)offsetof(struct fiemap, fm_reserved) == 28, "found %lld\n",
+		 (long long)(int)offsetof(struct fiemap, fm_reserved));
+	LASSERTF((int)sizeof(((struct fiemap *)0)->fm_reserved) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct fiemap *)0)->fm_reserved));
+	LASSERTF((int)offsetof(struct fiemap, fm_extents) == 32, "found %lld\n",
+		 (long long)(int)offsetof(struct fiemap, fm_extents));
+	LASSERTF((int)sizeof(((struct fiemap *)0)->fm_extents) == 0, "found %lld\n",
+		 (long long)(int)sizeof(((struct fiemap *)0)->fm_extents));
 	CLASSERT(FIEMAP_FLAG_SYNC == 0x00000001);
 	CLASSERT(FIEMAP_FLAG_XATTR == 0x00000002);
 	CLASSERT(FIEMAP_FLAG_DEVICE_ORDER == 0x40000000);
 
-	/* Checks for struct ll_fiemap_extent */
-	LASSERTF((int)sizeof(struct ll_fiemap_extent) == 56, "found %lld\n",
-		 (long long)(int)sizeof(struct ll_fiemap_extent));
-	LASSERTF((int)offsetof(struct ll_fiemap_extent, fe_logical) == 0, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_fiemap_extent, fe_logical));
-	LASSERTF((int)sizeof(((struct ll_fiemap_extent *)0)->fe_logical) == 8, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_fiemap_extent *)0)->fe_logical));
-	LASSERTF((int)offsetof(struct ll_fiemap_extent, fe_physical) == 8, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_fiemap_extent, fe_physical));
-	LASSERTF((int)sizeof(((struct ll_fiemap_extent *)0)->fe_physical) == 8, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_fiemap_extent *)0)->fe_physical));
-	LASSERTF((int)offsetof(struct ll_fiemap_extent, fe_length) == 16, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_fiemap_extent, fe_length));
-	LASSERTF((int)sizeof(((struct ll_fiemap_extent *)0)->fe_length) == 8, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_fiemap_extent *)0)->fe_length));
-	LASSERTF((int)offsetof(struct ll_fiemap_extent, fe_flags) == 40, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_fiemap_extent, fe_flags));
-	LASSERTF((int)sizeof(((struct ll_fiemap_extent *)0)->fe_flags) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_fiemap_extent *)0)->fe_flags));
-	LASSERTF((int)offsetof(struct ll_fiemap_extent, fe_device) == 44, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_fiemap_extent, fe_device));
-	LASSERTF((int)sizeof(((struct ll_fiemap_extent *)0)->fe_device) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_fiemap_extent *)0)->fe_device));
+	/* Checks for struct fiemap_extent */
+	LASSERTF((int)sizeof(struct fiemap_extent) == 56, "found %lld\n",
+		 (long long)(int)sizeof(struct fiemap_extent));
+	LASSERTF((int)offsetof(struct fiemap_extent, fe_logical) == 0, "found %lld\n",
+		 (long long)(int)offsetof(struct fiemap_extent, fe_logical));
+	LASSERTF((int)sizeof(((struct fiemap_extent *)0)->fe_logical) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct fiemap_extent *)0)->fe_logical));
+	LASSERTF((int)offsetof(struct fiemap_extent, fe_physical) == 8, "found %lld\n",
+		 (long long)(int)offsetof(struct fiemap_extent, fe_physical));
+	LASSERTF((int)sizeof(((struct fiemap_extent *)0)->fe_physical) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct fiemap_extent *)0)->fe_physical));
+	LASSERTF((int)offsetof(struct fiemap_extent, fe_length) == 16, "found %lld\n",
+		 (long long)(int)offsetof(struct fiemap_extent, fe_length));
+	LASSERTF((int)sizeof(((struct fiemap_extent *)0)->fe_length) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct fiemap_extent *)0)->fe_length));
+	LASSERTF((int)offsetof(struct fiemap_extent, fe_flags) == 40, "found %lld\n",
+		 (long long)(int)offsetof(struct fiemap_extent, fe_flags));
+	LASSERTF((int)sizeof(((struct fiemap_extent *)0)->fe_flags) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct fiemap_extent *)0)->fe_flags));
+	LASSERTF((int)offsetof(struct fiemap_extent, fe_reserved[0]) == 44, "found %lld\n",
+		 (long long)(int)offsetof(struct fiemap_extent, fe_reserved[0]));
+	LASSERTF((int)sizeof(((struct fiemap_extent *)0)->fe_reserved[0]) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct fiemap_extent *)0)->fe_reserved[0]));
 	CLASSERT(FIEMAP_EXTENT_LAST == 0x00000001);
 	CLASSERT(FIEMAP_EXTENT_UNKNOWN == 0x00000002);
 	CLASSERT(FIEMAP_EXTENT_DELALLOC == 0x00000004);
-- 
1.7.1


* [lustre-devel] [PATCH 07/41] staging: lustre: llite: remove duplicate fiemap defines
@ 2016-10-03  2:28   ` James Simmons
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Bobi Jam,
	James Simmons

From: Bobi Jam <bobijam.xu@intel.com>

 * replace struct ll_user_fiemap with struct fiemap
 * replace struct ll_fiemap_extent with struct fiemap_extent
 * remove the FIEMAP_EXTENT_* constants that the kernel already defines
 * remove the FIEMAP_FLAG_* flags that the kernel already defines
 * add member prefix for struct ll_fiemap_info_key
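
For reviewers who want to sanity-check the struct swap, here is a rough
userspace sketch (not part of the patch). It mirrors the layouts of the
kernel's struct fiemap and struct fiemap_extent and reproduces the sizes and
offsets that wiretest.c asserts: 32 and 56 bytes, fe_flags at offset 40, and
fe_reserved[0] (aliased as fe_device) at offset 44. Every demo_ name is a
hypothetical stand-in invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the kernel's struct fiemap_extent layout. */
struct demo_fiemap_extent {
	uint64_t fe_logical;       /* logical offset of the extent, in bytes */
	uint64_t fe_physical;      /* physical offset of the extent, in bytes */
	uint64_t fe_length;        /* extent length, in bytes */
	uint64_t fe_reserved64[2];
	uint32_t fe_flags;         /* FIEMAP_EXTENT_* flags */
	uint32_t fe_reserved[3];   /* the patch aliases [0] as fe_device */
};

/* Hypothetical mirror of the kernel's struct fiemap layout. */
struct demo_fiemap {
	uint64_t fm_start;
	uint64_t fm_length;
	uint32_t fm_flags;
	uint32_t fm_mapped_extents;
	uint32_t fm_extent_count;
	uint32_t fm_reserved;
	struct demo_fiemap_extent fm_extents[]; /* trailing extent array */
};

/* Same arithmetic as fiemap_count_to_size() in ll_fiemap.h. */
static size_t demo_count_to_size(size_t extent_count)
{
	return sizeof(struct demo_fiemap) +
	       extent_count * sizeof(struct demo_fiemap_extent);
}

/*
 * Same arithmetic as fiemap_size_to_count() in ll_fiemap.h. Like the
 * original, it assumes array_size >= sizeof(struct demo_fiemap).
 */
static unsigned int demo_size_to_count(size_t array_size)
{
	return (array_size - sizeof(struct demo_fiemap)) /
	       sizeof(struct demo_fiemap_extent);
}
```

With these layouts a buffer for 4 extents is 32 + 4 * 56 = 256 bytes, and a
buffer whose tail is not a whole extent rounds down to the number of complete
extents that fit.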

 * Add cl_object_operations::coo_fiemap().
 * Add cl_object_fiemap() to get FIEMAP mappings.
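
The dispatch that cl_object_fiemap() performs can be pictured with the
simplified sketch below (again, not part of the patch). It assumes a plain
array of layers instead of the loh_layers list, and all demo_ names are
hypothetical. What it shows is the control flow only: layers without a fiemap
method are skipped, and the walk stops at the first non-zero result.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_layer;

/* Hypothetical per-layer method table; fiemap may be NULL. */
struct demo_layer_ops {
	int (*fiemap)(struct demo_layer *layer, int *calls);
};

struct demo_layer {
	const struct demo_layer_ops *ops;
};

/* A layer method that succeeds and records that it ran. */
static int demo_fiemap_ok(struct demo_layer *layer, int *calls)
{
	(void)layer;
	++*calls;
	return 0;
}

/* A layer method that fails with -EIO and records that it ran. */
static int demo_fiemap_fail(struct demo_layer *layer, int *calls)
{
	(void)layer;
	++*calls;
	return -EIO;
}

static const struct demo_layer_ops demo_ops_ok = { .fiemap = demo_fiemap_ok };
static const struct demo_layer_ops demo_ops_fail = { .fiemap = demo_fiemap_fail };
static const struct demo_layer_ops demo_ops_none = { .fiemap = NULL };

/* Walk the layers top to bottom, stopping at the first error. */
static int demo_object_fiemap(struct demo_layer *layers, size_t nr, int *calls)
{
	int result = 0;

	for (size_t i = 0; i < nr; i++) {
		if (layers[i].ops->fiemap) {
			result = layers[i].ops->fiemap(&layers[i], calls);
			if (result)
				break;
		}
	}
	return result;
}
```

In this sketch a stack of { none, fail, ok } returns -EIO after a single
method call, because the first layer is skipped and the failure ends the walk
before the last layer runs.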

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5823
Reviewed-on: http://review.whamcloud.com/12535
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6201
Reviewed-on: http://review.whamcloud.com/13608
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/cl_object.h  |    9 +
 .../lustre/lustre/include/lustre/ll_fiemap.h       |   75 +---
 .../lustre/lustre/include/lustre/lustre_idl.h      |    8 +-
 .../lustre/lustre/include/lustre/lustre_user.h     |    1 -
 drivers/staging/lustre/lustre/llite/file.c         |  126 +----
 drivers/staging/lustre/lustre/lov/lov_obd.c        |  421 ----------------
 drivers/staging/lustre/lustre/lov/lov_object.c     |  504 +++++++++++++++++++-
 drivers/staging/lustre/lustre/obdclass/cl_object.c |   32 ++
 drivers/staging/lustre/lustre/osc/osc_object.c     |   91 ++++-
 drivers/staging/lustre/lustre/osc/osc_request.c    |   98 ----
 .../staging/lustre/lustre/ptlrpc/pack_generic.c    |    4 +-
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c    |  134 +++---
 12 files changed, 742 insertions(+), 761 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/cl_object.h b/drivers/staging/lustre/lustre/include/cl_object.h
index bf93c1e..3af9aa3 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -400,6 +400,12 @@ struct cl_object_operations {
 	 */
 	int (*coo_getstripe)(const struct lu_env *env, struct cl_object *obj,
 			     struct lov_user_md __user *lum);
+	/**
+	 * Get FIEMAP mapping from the object.
+	 */
+	int (*coo_fiemap)(const struct lu_env *env, struct cl_object *obj,
+			  struct ll_fiemap_info_key *fmkey,
+			  struct fiemap *fiemap, size_t *buflen);
 };
 
 /**
@@ -2184,6 +2190,9 @@ int cl_object_prune(const struct lu_env *env, struct cl_object *obj);
 void cl_object_kill(const struct lu_env *env, struct cl_object *obj);
 int  cl_object_getstripe(const struct lu_env *env, struct cl_object *obj,
 			 struct lov_user_md __user *lum);
+int cl_object_fiemap(const struct lu_env *env, struct cl_object *obj,
+		     struct ll_fiemap_info_key *fmkey, struct fiemap *fiemap,
+		     size_t *buflen);
 
 /**
  * Returns true, iff \a o0 and \a o1 are slices of the same object.
diff --git a/drivers/staging/lustre/lustre/include/lustre/ll_fiemap.h b/drivers/staging/lustre/lustre/include/lustre/ll_fiemap.h
index c2340d6..b8ad555 100644
--- a/drivers/staging/lustre/lustre/include/lustre/ll_fiemap.h
+++ b/drivers/staging/lustre/lustre/include/lustre/ll_fiemap.h
@@ -41,79 +41,24 @@
 #ifndef _LUSTRE_FIEMAP_H
 #define _LUSTRE_FIEMAP_H
 
-struct ll_fiemap_extent {
-	__u64 fe_logical;  /* logical offset in bytes for the start of
-			    * the extent from the beginning of the file
-			    */
-	__u64 fe_physical; /* physical offset in bytes for the start
-			    * of the extent from the beginning of the disk
-			    */
-	__u64 fe_length;   /* length in bytes for this extent */
-	__u64 fe_reserved64[2];
-	__u32 fe_flags;    /* FIEMAP_EXTENT_* flags for this extent */
-	__u32 fe_device;   /* device number for this extent */
-	__u32 fe_reserved[2];
-};
-
-struct ll_user_fiemap {
-	__u64 fm_start;  /* logical offset (inclusive) at
-			  * which to start mapping (in)
-			  */
-	__u64 fm_length; /* logical length of mapping which
-			  * userspace wants (in)
-			  */
-	__u32 fm_flags;  /* FIEMAP_FLAG_* flags for request (in/out) */
-	__u32 fm_mapped_extents;/* number of extents that were mapped (out) */
-	__u32 fm_extent_count;  /* size of fm_extents array (in) */
-	__u32 fm_reserved;
-	struct ll_fiemap_extent fm_extents[0]; /* array of mapped extents (out) */
-};
-
-#define FIEMAP_MAX_OFFSET      (~0ULL)
+#ifndef __KERNEL__
+#include <stddef.h>
+#include <fiemap.h>
+#endif
 
-#define FIEMAP_FLAG_SYNC		0x00000001 /* sync file data before
-						    * map
-						    */
-#define FIEMAP_FLAG_XATTR		0x00000002 /* map extended attribute
-						    * tree
-						    */
-#define FIEMAP_EXTENT_LAST		0x00000001 /* Last extent in file. */
-#define FIEMAP_EXTENT_UNKNOWN		0x00000002 /* Data location unknown. */
-#define FIEMAP_EXTENT_DELALLOC		0x00000004 /* Location still pending.
-						    * Sets EXTENT_UNKNOWN.
-						    */
-#define FIEMAP_EXTENT_ENCODED		0x00000008 /* Data can not be read
-						    * while fs is unmounted
-						    */
-#define FIEMAP_EXTENT_DATA_ENCRYPTED	0x00000080 /* Data is encrypted by fs.
-						    * Sets EXTENT_NO_DIRECT.
-						    */
-#define FIEMAP_EXTENT_NOT_ALIGNED       0x00000100 /* Extent offsets may not be
-						    * block aligned.
-						    */
-#define FIEMAP_EXTENT_DATA_INLINE       0x00000200 /* Data mixed with metadata.
-						    * Sets EXTENT_NOT_ALIGNED.*/
-#define FIEMAP_EXTENT_DATA_TAIL		0x00000400 /* Multiple files in block.
-						    * Sets EXTENT_NOT_ALIGNED.
-						    */
-#define FIEMAP_EXTENT_UNWRITTEN		0x00000800 /* Space allocated, but
-						    * no data (i.e. zero).
-						    */
-#define FIEMAP_EXTENT_MERGED		0x00001000 /* File does not natively
-						    * support extents. Result
-						    * merged for efficiency.
-						    */
+/* XXX: We use fiemap_extent::fe_reserved[0] */
+#define fe_device	fe_reserved[0]
 
 static inline size_t fiemap_count_to_size(size_t extent_count)
 {
-	return (sizeof(struct ll_user_fiemap) + extent_count *
-					       sizeof(struct ll_fiemap_extent));
+	return sizeof(struct fiemap) + extent_count *
+				       sizeof(struct fiemap_extent);
 }
 
 static inline unsigned fiemap_size_to_count(size_t array_size)
 {
-	return ((array_size - sizeof(struct ll_user_fiemap)) /
-					       sizeof(struct ll_fiemap_extent));
+	return (array_size - sizeof(struct fiemap)) /
+		sizeof(struct fiemap_extent);
 }
 
 #define FIEMAP_FLAG_DEVICE_ORDER 0x40000000 /* return device ordered mapping */
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index d164545..4210716 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -3331,14 +3331,14 @@ struct ost_body {
 
 /* Key for FIEMAP to be used in get_info calls */
 struct ll_fiemap_info_key {
-	char    name[8];
-	struct  obdo oa;
-	struct  ll_user_fiemap fiemap;
+	char		lfik_name[8];
+	struct obdo	lfik_oa;
+	struct fiemap	lfik_fiemap;
 };
 
 void lustre_swab_ost_body(struct ost_body *b);
 void lustre_swab_ost_last_id(__u64 *id);
-void lustre_swab_fiemap(struct ll_user_fiemap *fiemap);
+void lustre_swab_fiemap(struct fiemap *fiemap);
 
 void lustre_swab_lov_user_md_v1(struct lov_user_md_v1 *lum);
 void lustre_swab_lov_user_md_v3(struct lov_user_md_v3 *lum);
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index 6fc9855..dced31f 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -82,7 +82,6 @@ typedef struct stat     lstat_t;
 #define FSFILT_IOC_SETVERSION	     _IOW('f', 4, long)
 #define FSFILT_IOC_GETVERSION_OLD	 _IOR('v', 1, long)
 #define FSFILT_IOC_SETVERSION_OLD	 _IOW('v', 2, long)
-#define FSFILT_IOC_FIEMAP		 _IOWR('f', 11, struct ll_user_fiemap)
 #endif
 
 /* FIEMAP flags supported by Lustre */
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index b2058c6..9ca933f 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -1545,15 +1545,17 @@ out:
 /**
  * Get size for inode for which FIEMAP mapping is requested.
  * Make the FIEMAP get_info call and returns the result.
+ *
+ * \param fiemap	kernel buffer to hold extents
+ * \param num_bytes	kernel buffer size
  */
-static int ll_do_fiemap(struct inode *inode, struct ll_user_fiemap *fiemap,
+static int ll_do_fiemap(struct inode *inode, struct fiemap *fiemap,
 			size_t num_bytes)
 {
-	struct obd_export *exp = ll_i2dtexp(inode);
-	struct lov_stripe_md *lsm = NULL;
-	struct ll_fiemap_info_key fm_key = { .name = KEY_FIEMAP, };
-	__u32 vallen = num_bytes;
-	int rc;
+	struct ll_fiemap_info_key fmkey = { .lfik_name = KEY_FIEMAP, };
+	struct lu_env *env;
+	int refcheck;
+	int rc = 0;
 
 	/* Checks for fiemap flags */
 	if (fiemap->fm_flags & ~LUSTRE_FIEMAP_FLAGS_COMPAT) {
@@ -1568,21 +1570,9 @@ static int ll_do_fiemap(struct inode *inode, struct ll_user_fiemap *fiemap,
 			return rc;
 	}
 
-	lsm = ccc_inode_lsm_get(inode);
-	if (!lsm)
-		return -ENOENT;
-
-	/* If the stripe_count > 1 and the application does not understand
-	 * DEVICE_ORDER flag, then it cannot interpret the extents correctly.
-	 */
-	if (lsm->lsm_stripe_count > 1 &&
-	    !(fiemap->fm_flags & FIEMAP_FLAG_DEVICE_ORDER)) {
-		rc = -EOPNOTSUPP;
-		goto out;
-	}
-
-	fm_key.oa.o_oi = lsm->lsm_oi;
-	fm_key.oa.o_valid = OBD_MD_FLID | OBD_MD_FLGROUP;
+	env = cl_env_get(&refcheck);
+	if (IS_ERR(env))
+		return PTR_ERR(env);
 
 	if (i_size_read(inode) == 0) {
 		rc = ll_glimpse_size(inode);
@@ -1590,24 +1580,23 @@ static int ll_do_fiemap(struct inode *inode, struct ll_user_fiemap *fiemap,
 			goto out;
 	}
 
-	obdo_from_inode(&fm_key.oa, inode, OBD_MD_FLSIZE);
-	obdo_set_parent_fid(&fm_key.oa, &ll_i2info(inode)->lli_fid);
+	fmkey.lfik_oa.o_valid = OBD_MD_FLID | OBD_MD_FLGROUP;
+	obdo_from_inode(&fmkey.lfik_oa, inode, OBD_MD_FLSIZE);
+	obdo_set_parent_fid(&fmkey.lfik_oa, &ll_i2info(inode)->lli_fid);
+
 	/* If filesize is 0, then there would be no objects for mapping */
-	if (fm_key.oa.o_size == 0) {
+	if (fmkey.lfik_oa.o_size == 0) {
 		fiemap->fm_mapped_extents = 0;
 		rc = 0;
 		goto out;
 	}
 
-	memcpy(&fm_key.fiemap, fiemap, sizeof(*fiemap));
-
-	rc = obd_get_info(NULL, exp, sizeof(fm_key), &fm_key, &vallen,
-			  fiemap, lsm);
-	if (rc)
-		CERROR("obd_get_info failed: rc = %d\n", rc);
+	memcpy(&fmkey.lfik_fiemap, fiemap, sizeof(*fiemap));
 
+	rc = cl_object_fiemap(env, ll_i2info(inode)->lli_clob,
+			      &fmkey, fiemap, &num_bytes);
 out:
-	ccc_inode_lsm_put(inode, lsm);
+	cl_env_put(env, &refcheck);
 	return rc;
 }
 
@@ -1655,68 +1644,6 @@ gf_free:
 	return rc;
 }
 
-static int ll_ioctl_fiemap(struct inode *inode, unsigned long arg)
-{
-	struct ll_user_fiemap *fiemap_s;
-	size_t num_bytes, ret_bytes;
-	unsigned int extent_count;
-	int rc = 0;
-
-	/* Get the extent count so we can calculate the size of
-	 * required fiemap buffer
-	 */
-	if (get_user(extent_count,
-		     &((struct ll_user_fiemap __user *)arg)->fm_extent_count))
-		return -EFAULT;
-
-	if (extent_count >=
-	    (SIZE_MAX - sizeof(*fiemap_s)) / sizeof(struct ll_fiemap_extent))
-		return -EINVAL;
-	num_bytes = sizeof(*fiemap_s) + (extent_count *
-					 sizeof(struct ll_fiemap_extent));
-
-	fiemap_s = libcfs_kvzalloc(num_bytes, GFP_NOFS);
-	if (!fiemap_s)
-		return -ENOMEM;
-
-	/* get the fiemap value */
-	if (copy_from_user(fiemap_s, (struct ll_user_fiemap __user *)arg,
-			   sizeof(*fiemap_s))) {
-		rc = -EFAULT;
-		goto error;
-	}
-
-	/* If fm_extent_count is non-zero, read the first extent since
-	 * it is used to calculate end_offset and device from previous
-	 * fiemap call.
-	 */
-	if (extent_count) {
-		if (copy_from_user(&fiemap_s->fm_extents[0],
-				   (char __user *)arg + sizeof(*fiemap_s),
-				   sizeof(struct ll_fiemap_extent))) {
-			rc = -EFAULT;
-			goto error;
-		}
-	}
-
-	rc = ll_do_fiemap(inode, fiemap_s, num_bytes);
-	if (rc)
-		goto error;
-
-	ret_bytes = sizeof(struct ll_user_fiemap);
-
-	if (extent_count != 0)
-		ret_bytes += (fiemap_s->fm_mapped_extents *
-				 sizeof(struct ll_fiemap_extent));
-
-	if (copy_to_user((void __user *)arg, fiemap_s, ret_bytes))
-		rc = -EFAULT;
-
-error:
-	kvfree(fiemap_s);
-	return rc;
-}
-
 /*
  * Read the data_version for inode.
  *
@@ -2158,8 +2085,6 @@ ll_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 	case LL_IOC_LOV_GETSTRIPE:
 		return ll_file_getstripe(inode,
 					 (struct lov_user_md __user *)arg);
-	case FSFILT_IOC_FIEMAP:
-		return ll_ioctl_fiemap(inode, arg);
 	case FSFILT_IOC_GETFLAGS:
 	case FSFILT_IOC_SETFLAGS:
 		return ll_iocontrol(inode, file, cmd, arg);
@@ -3061,13 +2986,12 @@ static int ll_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 {
 	int rc;
 	size_t num_bytes;
-	struct ll_user_fiemap *fiemap;
+	struct fiemap *fiemap;
 	unsigned int extent_count = fieinfo->fi_extents_max;
 
 	num_bytes = sizeof(*fiemap) + (extent_count *
-				       sizeof(struct ll_fiemap_extent));
+				       sizeof(struct fiemap_extent));
 	fiemap = libcfs_kvzalloc(num_bytes, GFP_NOFS);
-
 	if (!fiemap)
 		return -ENOMEM;
 
@@ -3075,9 +2999,10 @@ static int ll_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 	fiemap->fm_extent_count = fieinfo->fi_extents_max;
 	fiemap->fm_start = start;
 	fiemap->fm_length = len;
+
 	if (extent_count > 0 &&
 	    copy_from_user(&fiemap->fm_extents[0], fieinfo->fi_extents_start,
-			   sizeof(struct ll_fiemap_extent)) != 0) {
+			   sizeof(struct fiemap_extent))) {
 		rc = -EFAULT;
 		goto out;
 	}
@@ -3089,11 +3014,10 @@ static int ll_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 	if (extent_count > 0 &&
 	    copy_to_user(fieinfo->fi_extents_start, &fiemap->fm_extents[0],
 			 fiemap->fm_mapped_extents *
-			 sizeof(struct ll_fiemap_extent)) != 0) {
+			 sizeof(struct fiemap_extent))) {
 		rc = -EFAULT;
 		goto out;
 	}
-
 out:
 	kvfree(fiemap);
 	return rc;
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index b23016f..02c7087 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -51,7 +51,6 @@
 #include "../include/lprocfs_status.h"
 #include "../include/lustre_param.h"
 #include "../include/cl_object.h"
-#include "../include/lustre/ll_fiemap.h"
 #include "../include/lustre_fid.h"
 
 #include "lov_internal.h"
@@ -1391,423 +1390,6 @@ static int lov_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 	return rc;
 }
 
-#define FIEMAP_BUFFER_SIZE 4096
-
-/**
- * Non-zero fe_logical indicates that this is a continuation FIEMAP
- * call. The local end offset and the device are sent in the first
- * fm_extent. This function calculates the stripe number from the index.
- * This function returns a stripe_no on which mapping is to be restarted.
- *
- * This function returns fm_end_offset which is the in-OST offset at which
- * mapping should be restarted. If fm_end_offset=0 is returned then caller
- * will re-calculate proper offset in next stripe.
- * Note that the first extent is passed to lov_get_info via the value field.
- *
- * \param fiemap fiemap request header
- * \param lsm striping information for the file
- * \param fm_start logical start of mapping
- * \param fm_end logical end of mapping
- * \param start_stripe starting stripe will be returned in this
- */
-static u64 fiemap_calc_fm_end_offset(struct ll_user_fiemap *fiemap,
-				     struct lov_stripe_md *lsm, u64 fm_start,
-				     u64 fm_end, int *start_stripe)
-{
-	u64 local_end = fiemap->fm_extents[0].fe_logical;
-	u64 lun_start, lun_end;
-	u64 fm_end_offset;
-	int stripe_no = -1, i;
-
-	if (fiemap->fm_extent_count == 0 ||
-	    fiemap->fm_extents[0].fe_logical == 0)
-		return 0;
-
-	/* Find out stripe_no from ost_index saved in the fe_device */
-	for (i = 0; i < lsm->lsm_stripe_count; i++) {
-		struct lov_oinfo *oinfo = lsm->lsm_oinfo[i];
-
-		if (lov_oinfo_is_dummy(oinfo))
-			continue;
-
-		if (oinfo->loi_ost_idx == fiemap->fm_extents[0].fe_device) {
-			stripe_no = i;
-			break;
-		}
-	}
-	if (stripe_no == -1)
-		return -EINVAL;
-
-	/* If we have finished mapping on previous device, shift logical
-	 * offset to start of next device
-	 */
-	if ((lov_stripe_intersects(lsm, stripe_no, fm_start, fm_end,
-				   &lun_start, &lun_end)) != 0 &&
-				   local_end < lun_end) {
-		fm_end_offset = local_end;
-		*start_stripe = stripe_no;
-	} else {
-		/* This is a special value to indicate that caller should
-		 * calculate offset in next stripe.
-		 */
-		fm_end_offset = 0;
-		*start_stripe = (stripe_no + 1) % lsm->lsm_stripe_count;
-	}
-
-	return fm_end_offset;
-}
-
-/**
- * We calculate on which OST the mapping will end. If the length of mapping
- * is greater than (stripe_size * stripe_count) then the last_stripe will
- * will be one just before start_stripe. Else we check if the mapping
- * intersects each OST and find last_stripe.
- * This function returns the last_stripe and also sets the stripe_count
- * over which the mapping is spread
- *
- * \param lsm striping information for the file
- * \param fm_start logical start of mapping
- * \param fm_end logical end of mapping
- * \param start_stripe starting stripe of the mapping
- * \param stripe_count the number of stripes across which to map is returned
- *
- * \retval last_stripe return the last stripe of the mapping
- */
-static int fiemap_calc_last_stripe(struct lov_stripe_md *lsm, u64 fm_start,
-				   u64 fm_end, int start_stripe,
-				   int *stripe_count)
-{
-	int last_stripe;
-	u64 obd_start, obd_end;
-	int i, j;
-
-	if (fm_end - fm_start > lsm->lsm_stripe_size * lsm->lsm_stripe_count) {
-		last_stripe = start_stripe < 1 ? lsm->lsm_stripe_count - 1 :
-							      start_stripe - 1;
-		*stripe_count = lsm->lsm_stripe_count;
-	} else {
-		for (j = 0, i = start_stripe; j < lsm->lsm_stripe_count;
-		     i = (i + 1) % lsm->lsm_stripe_count, j++) {
-			if ((lov_stripe_intersects(lsm, i, fm_start, fm_end,
-						   &obd_start, &obd_end)) == 0)
-				break;
-		}
-		*stripe_count = j;
-		last_stripe = (start_stripe + j - 1) % lsm->lsm_stripe_count;
-	}
-
-	return last_stripe;
-}
-
-/**
- * Set fe_device and copy extents from local buffer into main return buffer.
- *
- * \param fiemap fiemap request header
- * \param lcl_fm_ext array of local fiemap extents to be copied
- * \param ost_index OST index to be written into the fm_device field for each
-		    extent
- * \param ext_count number of extents to be copied
- * \param current_extent where to start copying in main extent array
- */
-static void fiemap_prepare_and_copy_exts(struct ll_user_fiemap *fiemap,
-					 struct ll_fiemap_extent *lcl_fm_ext,
-					 int ost_index, unsigned int ext_count,
-					 int current_extent)
-{
-	char *to;
-	int ext;
-
-	for (ext = 0; ext < ext_count; ext++) {
-		lcl_fm_ext[ext].fe_device = ost_index;
-		lcl_fm_ext[ext].fe_flags |= FIEMAP_EXTENT_NET;
-	}
-
-	/* Copy fm_extent's from fm_local to return buffer */
-	to = (char *)fiemap + fiemap_count_to_size(current_extent);
-	memcpy(to, lcl_fm_ext, ext_count * sizeof(struct ll_fiemap_extent));
-}
-
-/**
- * Break down the FIEMAP request and send appropriate calls to individual OSTs.
- * This also handles the restarting of FIEMAP calls in case mapping overflows
- * the available number of extents in single call.
- */
-static int lov_fiemap(struct lov_obd *lov, __u32 keylen, void *key,
-		      __u32 *vallen, void *val, struct lov_stripe_md *lsm)
-{
-	struct ll_fiemap_info_key *fm_key = key;
-	struct ll_user_fiemap *fiemap = val;
-	struct ll_user_fiemap *fm_local = NULL;
-	struct ll_fiemap_extent *lcl_fm_ext;
-	int count_local;
-	unsigned int get_num_extents = 0;
-	int ost_index = 0, actual_start_stripe, start_stripe;
-	u64 fm_start, fm_end, fm_length, fm_end_offset;
-	u64 curr_loc;
-	int current_extent = 0, rc = 0, i;
-	/* Whether have we collected enough extents */
-	bool enough = false;
-	int ost_eof = 0; /* EOF for object */
-	int ost_done = 0; /* done with required mapping for this OST? */
-	int last_stripe;
-	int cur_stripe = 0, cur_stripe_wrap = 0, stripe_count;
-	unsigned int buffer_size = FIEMAP_BUFFER_SIZE;
-
-	if (!lsm_has_objects(lsm)) {
-		if (lsm && lsm_is_released(lsm) && (fm_key->fiemap.fm_start <
-		    fm_key->oa.o_size)) {
-			/*
-			 * released file, return a minimal FIEMAP if
-			 * request fits in file-size.
-			 */
-			fiemap->fm_mapped_extents = 1;
-			fiemap->fm_extents[0].fe_logical =
-					fm_key->fiemap.fm_start;
-			if (fm_key->fiemap.fm_start + fm_key->fiemap.fm_length <
-			    fm_key->oa.o_size) {
-				fiemap->fm_extents[0].fe_length =
-					fm_key->fiemap.fm_length;
-			} else {
-				fiemap->fm_extents[0].fe_length =
-					fm_key->oa.o_size - fm_key->fiemap.fm_start;
-				fiemap->fm_extents[0].fe_flags |=
-						(FIEMAP_EXTENT_UNKNOWN |
-						 FIEMAP_EXTENT_LAST);
-			}
-		}
-		rc = 0;
-		goto out;
-	}
-
-	if (fiemap_count_to_size(fm_key->fiemap.fm_extent_count) < buffer_size)
-		buffer_size = fiemap_count_to_size(fm_key->fiemap.fm_extent_count);
-
-	fm_local = libcfs_kvzalloc(buffer_size, GFP_NOFS);
-	if (!fm_local) {
-		rc = -ENOMEM;
-		goto out;
-	}
-	lcl_fm_ext = &fm_local->fm_extents[0];
-
-	count_local = fiemap_size_to_count(buffer_size);
-
-	memcpy(fiemap, &fm_key->fiemap, sizeof(*fiemap));
-	fm_start = fiemap->fm_start;
-	fm_length = fiemap->fm_length;
-	/* Calculate start stripe, last stripe and length of mapping */
-	start_stripe = lov_stripe_number(lsm, fm_start);
-	actual_start_stripe = start_stripe;
-	fm_end = (fm_length == ~0ULL ? fm_key->oa.o_size :
-						fm_start + fm_length - 1);
-	/* If fm_length != ~0ULL but fm_start+fm_length-1 exceeds file size */
-	if (fm_end > fm_key->oa.o_size)
-		fm_end = fm_key->oa.o_size;
-
-	last_stripe = fiemap_calc_last_stripe(lsm, fm_start, fm_end,
-					      actual_start_stripe,
-					      &stripe_count);
-
-	fm_end_offset = fiemap_calc_fm_end_offset(fiemap, lsm, fm_start,
-						  fm_end, &start_stripe);
-	if (fm_end_offset == -EINVAL) {
-		rc = -EINVAL;
-		goto out;
-	}
-
-	if (fiemap_count_to_size(fiemap->fm_extent_count) > *vallen)
-		fiemap->fm_extent_count = fiemap_size_to_count(*vallen);
-	if (fiemap->fm_extent_count == 0) {
-		get_num_extents = 1;
-		count_local = 0;
-	}
-	/* Check each stripe */
-	for (cur_stripe = start_stripe, i = 0; i < stripe_count;
-	     i++, cur_stripe = (cur_stripe + 1) % lsm->lsm_stripe_count) {
-		u64 req_fm_len; /* Stores length of required mapping */
-		u64 len_mapped_single_call;
-		u64 lun_start, lun_end, obd_object_end;
-		unsigned int ext_count;
-
-		cur_stripe_wrap = cur_stripe;
-
-		/* Find out range of mapping on this stripe */
-		if ((lov_stripe_intersects(lsm, cur_stripe, fm_start, fm_end,
-					   &lun_start, &obd_object_end)) == 0)
-			continue;
-
-		if (lov_oinfo_is_dummy(lsm->lsm_oinfo[cur_stripe])) {
-			rc = -EIO;
-			goto out;
-		}
-
-		/* If this is a continuation FIEMAP call and we are on
-		 * starting stripe then lun_start needs to be set to
-		 * fm_end_offset
-		 */
-		if (fm_end_offset != 0 && cur_stripe == start_stripe)
-			lun_start = fm_end_offset;
-
-		if (fm_length != ~0ULL) {
-			/* Handle fm_start + fm_length overflow */
-			if (fm_start + fm_length < fm_start)
-				fm_length = ~0ULL - fm_start;
-			lun_end = lov_size_to_stripe(lsm, fm_start + fm_length,
-						     cur_stripe);
-		} else {
-			lun_end = ~0ULL;
-		}
-
-		if (lun_start == lun_end)
-			continue;
-
-		req_fm_len = obd_object_end - lun_start;
-		fm_local->fm_length = 0;
-		len_mapped_single_call = 0;
-
-		/* If the output buffer is very large and the objects have many
-		 * extents we may need to loop on a single OST repeatedly
-		 */
-		ost_eof = 0;
-		ost_done = 0;
-		do {
-			if (get_num_extents == 0) {
-				/* Don't get too many extents. */
-				if (current_extent + count_local >
-				    fiemap->fm_extent_count)
-					count_local = fiemap->fm_extent_count -
-								 current_extent;
-			}
-
-			lun_start += len_mapped_single_call;
-			fm_local->fm_length = req_fm_len - len_mapped_single_call;
-			req_fm_len = fm_local->fm_length;
-			fm_local->fm_extent_count = enough ? 1 : count_local;
-			fm_local->fm_mapped_extents = 0;
-			fm_local->fm_flags = fiemap->fm_flags;
-
-			fm_key->oa.o_oi = lsm->lsm_oinfo[cur_stripe]->loi_oi;
-			ost_index = lsm->lsm_oinfo[cur_stripe]->loi_ost_idx;
-
-			if (ost_index < 0 ||
-			    ost_index >= lov->desc.ld_tgt_count) {
-				rc = -EINVAL;
-				goto out;
-			}
-
-			/* If OST is inactive, return extent with UNKNOWN flag */
-			if (!lov->lov_tgts[ost_index]->ltd_active) {
-				fm_local->fm_flags |= FIEMAP_EXTENT_LAST;
-				fm_local->fm_mapped_extents = 1;
-
-				lcl_fm_ext[0].fe_logical = lun_start;
-				lcl_fm_ext[0].fe_length = obd_object_end -
-								      lun_start;
-				lcl_fm_ext[0].fe_flags |= FIEMAP_EXTENT_UNKNOWN;
-
-				goto inactive_tgt;
-			}
-
-			fm_local->fm_start = lun_start;
-			fm_local->fm_flags &= ~FIEMAP_FLAG_DEVICE_ORDER;
-			memcpy(&fm_key->fiemap, fm_local, sizeof(*fm_local));
-			*vallen = fiemap_count_to_size(fm_local->fm_extent_count);
-			rc = obd_get_info(NULL,
-					  lov->lov_tgts[ost_index]->ltd_exp,
-					  keylen, key, vallen, fm_local, lsm);
-			if (rc != 0)
-				goto out;
-
-inactive_tgt:
-			ext_count = fm_local->fm_mapped_extents;
-			if (ext_count == 0) {
-				ost_done = 1;
-				/* If last stripe has hole at the end,
-				 * then we need to return
-				 */
-				if (cur_stripe_wrap == last_stripe) {
-					fiemap->fm_mapped_extents = 0;
-					goto finish;
-				}
-				break;
-			} else if (enough) {
-				/*
-				 * We've collected enough extents and there are
-				 * more extents after it.
-				 */
-				goto finish;
-			}
-
-			/* If we just need num of extents then go to next device */
-			if (get_num_extents) {
-				current_extent += ext_count;
-				break;
-			}
-
-			len_mapped_single_call =
-				lcl_fm_ext[ext_count - 1].fe_logical -
-				lun_start + lcl_fm_ext[ext_count - 1].fe_length;
-
-			/* Have we finished mapping on this device? */
-			if (req_fm_len <= len_mapped_single_call)
-				ost_done = 1;
-
-			/* Clear the EXTENT_LAST flag which can be present on
-			 * last extent
-			 */
-			if (lcl_fm_ext[ext_count - 1].fe_flags &
-			    FIEMAP_EXTENT_LAST)
-				lcl_fm_ext[ext_count - 1].fe_flags &=
-							    ~FIEMAP_EXTENT_LAST;
-
-			curr_loc = lov_stripe_size(lsm,
-					lcl_fm_ext[ext_count - 1].fe_logical +
-					lcl_fm_ext[ext_count - 1].fe_length,
-					cur_stripe);
-			if (curr_loc >= fm_key->oa.o_size)
-				ost_eof = 1;
-
-			fiemap_prepare_and_copy_exts(fiemap, lcl_fm_ext,
-						     ost_index, ext_count,
-						     current_extent);
-
-			current_extent += ext_count;
-
-			/* Ran out of available extents? */
-			if (current_extent >= fiemap->fm_extent_count)
-				enough = true;
-		} while (ost_done == 0 && ost_eof == 0);
-
-		if (cur_stripe_wrap == last_stripe)
-			goto finish;
-	}
-
-finish:
-	/* Indicate that we are returning device offsets unless file just has
-	 * single stripe
-	 */
-	if (lsm->lsm_stripe_count > 1)
-		fiemap->fm_flags |= FIEMAP_FLAG_DEVICE_ORDER;
-
-	if (get_num_extents)
-		goto skip_last_device_calc;
-
-	/* Check if we have reached the last stripe and whether mapping for that
-	 * stripe is done.
-	 */
-	if (cur_stripe_wrap == last_stripe) {
-		if (ost_done || ost_eof)
-			fiemap->fm_extents[current_extent - 1].fe_flags |=
-							     FIEMAP_EXTENT_LAST;
-	}
-
-skip_last_device_calc:
-	fiemap->fm_mapped_extents = current_extent;
-
-out:
-	kvfree(fm_local);
-	return rc;
-}
-
 static int lov_get_info(const struct lu_env *env, struct obd_export *exp,
 			__u32 keylen, void *key, __u32 *vallen, void *val,
 			struct lov_stripe_md *lsm)
@@ -1827,9 +1409,6 @@ static int lov_get_info(const struct lu_env *env, struct obd_export *exp,
 
 		rc = 0;
 		goto out;
-	} else if (KEY_IS(KEY_FIEMAP)) {
-		rc = lov_fiemap(lov, keylen, key, vallen, val, lsm);
-		goto out;
 	} else if (KEY_IS(KEY_TGT_COUNT)) {
 		*((int *)val) = lov->desc.ld_tgt_count;
 		rc = 0;
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index 52f7363..07bef44 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -313,6 +313,40 @@ static int lov_init_released(const struct lu_env *env,
 	return 0;
 }
 
+static struct cl_object *lov_find_subobj(const struct lu_env *env,
+					 struct lov_object *lov,
+					 struct lov_stripe_md *lsm,
+					 int stripe_idx)
+{
+	struct lov_device *dev = lu2lov_dev(lov2lu(lov)->lo_dev);
+	struct lov_oinfo *oinfo = lsm->lsm_oinfo[stripe_idx];
+	struct lov_thread_info *lti = lov_env_info(env);
+	struct lu_fid *ofid = &lti->lti_fid;
+	struct cl_device *subdev;
+	struct cl_object *result;
+	int ost_idx;
+	int rc;
+
+	if (lov->lo_type != LLT_RAID0) {
+		result = NULL;
+		goto out;
+	}
+
+	ost_idx = oinfo->loi_ost_idx;
+	rc = ostid_to_fid(ofid, &oinfo->loi_oi, ost_idx);
+	if (rc) {
+		result = NULL;
+		goto out;
+	}
+
+	subdev = lovsub2cl_dev(dev->ld_target[ost_idx]);
+	result = lov_sub_find(env, subdev, ofid, NULL);
+out:
+	if (!result)
+		result = ERR_PTR(-EINVAL);
+	return result;
+}
+
 static int lov_delete_empty(const struct lu_env *env, struct lov_object *lov,
 			    union lov_layout_state *state)
 {
@@ -911,6 +945,473 @@ int lov_lock_init(const struct lu_env *env, struct cl_object *obj,
 				    io);
 }
 
+/**
+ * We calculate on which OST the mapping will end. If the length of mapping
+ * is greater than (stripe_size * stripe_count) then the last_stripe will
+ * be one just before start_stripe. Else we check if the mapping
+ * intersects each OST and find last_stripe.
+ * This function returns the last_stripe and also sets the stripe_count
+ * over which the mapping is spread
+ *
+ * \param lsm [in]		striping information for the file
+ * \param fm_start [in]		logical start of mapping
+ * \param fm_end [in]		logical end of mapping
+ * \param start_stripe [in]	starting stripe of the mapping
+ * \param stripe_count [out]	the number of stripes across which to map is
+ *				returned
+ *
+ * \retval last_stripe		return the last stripe of the mapping
+ */
+static int fiemap_calc_last_stripe(struct lov_stripe_md *lsm,
+				   loff_t fm_start, loff_t fm_end,
+				   int start_stripe, int *stripe_count)
+{
+	int last_stripe;
+	loff_t obd_start;
+	loff_t obd_end;
+	int i, j;
+
+	if (fm_end - fm_start > lsm->lsm_stripe_size * lsm->lsm_stripe_count) {
+		last_stripe = (start_stripe < 1 ? lsm->lsm_stripe_count - 1 :
+			       start_stripe - 1);
+		*stripe_count = lsm->lsm_stripe_count;
+	} else {
+		for (j = 0, i = start_stripe; j < lsm->lsm_stripe_count;
+		     i = (i + 1) % lsm->lsm_stripe_count, j++) {
+			if (!(lov_stripe_intersects(lsm, i, fm_start, fm_end,
+						    &obd_start, &obd_end)))
+				break;
+		}
+		*stripe_count = j;
+		last_stripe = (start_stripe + j - 1) % lsm->lsm_stripe_count;
+	}
+
+	return last_stripe;
+}
+
+/**
+ * Set fe_device and copy extents from local buffer into main return buffer.
+ *
+ * \param fiemap [out]		fiemap to hold all extents
+ * \param lcl_fm_ext [in]	array of fiemap extents obtained from OSC layer
+ * \param ost_index [in]	OST index to be written into the fe_device
+ *				field for each extent
+ * \param ext_count [in]	number of extents to be copied
+ * \param current_extent [in]	where to start copying in the extent array
+ */
+static void fiemap_prepare_and_copy_exts(struct fiemap *fiemap,
+					 struct fiemap_extent *lcl_fm_ext,
+					 int ost_index, unsigned int ext_count,
+					 int current_extent)
+{
+	unsigned int ext;
+	char *to;
+
+	for (ext = 0; ext < ext_count; ext++) {
+		lcl_fm_ext[ext].fe_device = ost_index;
+		lcl_fm_ext[ext].fe_flags |= FIEMAP_EXTENT_NET;
+	}
+
+	/* Copy fm_extent's from fm_local to return buffer */
+	to = (char *)fiemap + fiemap_count_to_size(current_extent);
+	memcpy(to, lcl_fm_ext, ext_count * sizeof(struct fiemap_extent));
+}
+
+#define FIEMAP_BUFFER_SIZE 4096
+
+/**
+ * Non-zero fe_logical indicates that this is a continuation FIEMAP
+ * call. The local end offset and the device are sent in the first
+ * fm_extent. This function calculates the stripe number from the index.
+ * This function returns a stripe_no on which mapping is to be restarted.
+ *
+ * This function returns fm_end_offset which is the in-OST offset at which
+ * mapping should be restarted. If fm_end_offset=0 is returned then caller
+ * will re-calculate proper offset in next stripe.
+ * Note that the first extent is passed to lov_get_info via the value field.
+ *
+ * \param fiemap [in]		fiemap request header
+ * \param lsm [in]		striping information for the file
+ * \param fm_start [in]		logical start of mapping
+ * \param fm_end [in]		logical end of mapping
+ * \param start_stripe [out]	starting stripe will be returned in this
+ */
+static loff_t fiemap_calc_fm_end_offset(struct fiemap *fiemap,
+					struct lov_stripe_md *lsm,
+					loff_t fm_start, loff_t fm_end,
+					int *start_stripe)
+{
+	loff_t local_end = fiemap->fm_extents[0].fe_logical;
+	loff_t lun_start, lun_end;
+	loff_t fm_end_offset;
+	int stripe_no = -1;
+	int i;
+
+	if (!fiemap->fm_extent_count || !fiemap->fm_extents[0].fe_logical)
+		return 0;
+
+	/* Find out stripe_no from ost_index saved in the fe_device */
+	for (i = 0; i < lsm->lsm_stripe_count; i++) {
+		struct lov_oinfo *oinfo = lsm->lsm_oinfo[i];
+
+		if (lov_oinfo_is_dummy(oinfo))
+			continue;
+
+		if (oinfo->loi_ost_idx == fiemap->fm_extents[0].fe_device) {
+			stripe_no = i;
+			break;
+		}
+	}
+
+	if (stripe_no == -1)
+		return -EINVAL;
+
+	/*
+	 * If we have finished mapping on previous device, shift logical
+	 * offset to start of next device
+	 */
+	if (lov_stripe_intersects(lsm, stripe_no, fm_start, fm_end,
+				  &lun_start, &lun_end) &&
+	    local_end < lun_end) {
+		fm_end_offset = local_end;
+		*start_stripe = stripe_no;
+	} else {
+		/* This is a special value to indicate that caller should
+		 * calculate offset in next stripe.
+		 */
+		fm_end_offset = 0;
+		*start_stripe = (stripe_no + 1) % lsm->lsm_stripe_count;
+	}
+
+	return fm_end_offset;
+}
+
+/**
+ * Break down the FIEMAP request and send appropriate calls to individual OSTs.
+ * This also handles the restarting of FIEMAP calls in case mapping overflows
+ * the available number of extents in single call.
+ *
+ * \param env [in]		lustre environment
+ * \param obj [in]		file object
+ * \param fmkey [in]		fiemap request header and other info
+ * \param fiemap [out]		fiemap buffer holding retrieved map extents
+ * \param buflen [in/out]	max buffer length of @fiemap; when iterating
+ *				over each OST, it limits the extents requested
+ * \retval 0	success
+ * \retval < 0	error
+ */
+static int lov_object_fiemap(const struct lu_env *env, struct cl_object *obj,
+			     struct ll_fiemap_info_key *fmkey,
+			     struct fiemap *fiemap, size_t *buflen)
+{
+	struct lov_obd *lov = lu2lov_dev(obj->co_lu.lo_dev)->ld_lov;
+	unsigned int buffer_size = FIEMAP_BUFFER_SIZE;
+	struct fiemap_extent *lcl_fm_ext;
+	struct cl_object *subobj = NULL;
+	struct fiemap *fm_local = NULL;
+	struct lov_stripe_md *lsm;
+	loff_t fm_start;
+	loff_t fm_end;
+	loff_t fm_length;
+	loff_t fm_end_offset;
+	int count_local;
+	int ost_index = 0;
+	int start_stripe;
+	int current_extent = 0;
+	int rc = 0;
+	int last_stripe;
+	int cur_stripe = 0;
+	int cur_stripe_wrap = 0;
+	int stripe_count;
+	/* Whether we have collected enough extents */
+	bool enough = false;
+	/* EOF for object */
+	bool ost_eof = false;
+	/* done with required mapping for this OST? */
+	bool ost_done = false;
+
+	lsm = lov_lsm_addref(cl2lov(obj));
+	if (!lsm)
+		return -ENODATA;
+
+	/**
+	 * If the stripe_count > 1 and the application does not understand
+	 * DEVICE_ORDER flag, it cannot interpret the extents correctly.
+	 */
+	if (lsm->lsm_stripe_count > 1 &&
+	    !(fiemap->fm_flags & FIEMAP_FLAG_DEVICE_ORDER)) {
+		rc = -EOPNOTSUPP;
+		goto out;
+	}
+
+	if (lsm_is_released(lsm)) {
+		if (fiemap->fm_start < fmkey->lfik_oa.o_size) {
+			/**
+			 * released file, return a minimal FIEMAP if
+			 * request fits in file-size.
+			 */
+			fiemap->fm_mapped_extents = 1;
+			fiemap->fm_extents[0].fe_logical = fiemap->fm_start;
+			if (fiemap->fm_start + fiemap->fm_length <
+			    fmkey->lfik_oa.o_size)
+				fiemap->fm_extents[0].fe_length =
+					 fiemap->fm_length;
+			else
+				fiemap->fm_extents[0].fe_length =
+					fmkey->lfik_oa.o_size -
+					fiemap->fm_start;
+			fiemap->fm_extents[0].fe_flags |=
+				FIEMAP_EXTENT_UNKNOWN | FIEMAP_EXTENT_LAST;
+		}
+		rc = 0;
+		goto out;
+	}
+
+	if (fiemap_count_to_size(fiemap->fm_extent_count) < buffer_size)
+		buffer_size = fiemap_count_to_size(fiemap->fm_extent_count);
+
+	fm_local = libcfs_kvzalloc(buffer_size, GFP_NOFS);
+	if (!fm_local) {
+		rc = -ENOMEM;
+		goto out;
+	}
+	lcl_fm_ext = &fm_local->fm_extents[0];
+	count_local = fiemap_size_to_count(buffer_size);
+
+	fm_start = fiemap->fm_start;
+	fm_length = fiemap->fm_length;
+	/* Calculate start stripe, last stripe and length of mapping */
+	start_stripe = lov_stripe_number(lsm, fm_start);
+	fm_end = (fm_length == ~0ULL) ? fmkey->lfik_oa.o_size :
+					fm_start + fm_length - 1;
+	/* If fm_length != ~0ULL but fm_start+fm_length-1 exceeds file size */
+	if (fm_end > fmkey->lfik_oa.o_size)
+		fm_end = fmkey->lfik_oa.o_size;
+
+	last_stripe = fiemap_calc_last_stripe(lsm, fm_start, fm_end,
+					      start_stripe, &stripe_count);
+	fm_end_offset = fiemap_calc_fm_end_offset(fiemap, lsm, fm_start, fm_end,
+						  &start_stripe);
+	if (fm_end_offset == -EINVAL) {
+		rc = -EINVAL;
+		goto out;
+	}
+
+	/**
+	 * Requested extent count exceeds the fiemap buffer size, shrink our
+	 * ambition.
+	 */
+	if (fiemap_count_to_size(fiemap->fm_extent_count) > *buflen)
+		fiemap->fm_extent_count = fiemap_size_to_count(*buflen);
+	if (!fiemap->fm_extent_count)
+		count_local = 0;
+
+	/* Check each stripe */
+	for (cur_stripe = start_stripe; stripe_count > 0;
+	     --stripe_count,
+	     cur_stripe = (cur_stripe + 1) % lsm->lsm_stripe_count) {
+		loff_t req_fm_len; /* Stores length of required mapping */
+		loff_t len_mapped_single_call;
+		loff_t lun_start;
+		loff_t lun_end;
+		loff_t obd_object_end;
+		unsigned int ext_count;
+
+		cur_stripe_wrap = cur_stripe;
+
+		/* Find out range of mapping on this stripe */
+		if (!(lov_stripe_intersects(lsm, cur_stripe, fm_start, fm_end,
+					    &lun_start, &obd_object_end)))
+			continue;
+
+		if (lov_oinfo_is_dummy(lsm->lsm_oinfo[cur_stripe])) {
+			rc = -EIO;
+			goto out;
+		}
+
+		/*
+		 * If this is a continuation FIEMAP call and we are on
+		 * starting stripe then lun_start needs to be set to
+		 * fm_end_offset
+		 */
+		if (fm_end_offset && cur_stripe == start_stripe)
+			lun_start = fm_end_offset;
+
+		if (fm_length != ~0ULL) {
+			/* Handle fm_start + fm_length overflow */
+			if (fm_start + fm_length < fm_start)
+				fm_length = ~0ULL - fm_start;
+			lun_end = lov_size_to_stripe(lsm, fm_start + fm_length,
+						     cur_stripe);
+		} else {
+			lun_end = ~0ULL;
+		}
+
+		if (lun_start == lun_end)
+			continue;
+
+		req_fm_len = obd_object_end - lun_start;
+		fm_local->fm_length = 0;
+		len_mapped_single_call = 0;
+
+		/* find lovsub object */
+		subobj = lov_find_subobj(env, cl2lov(obj), lsm,
+					 cur_stripe);
+		if (IS_ERR(subobj)) {
+			rc = PTR_ERR(subobj);
+			goto out;
+		}
+		/*
+		 * If the output buffer is very large and the objects have many
+		 * extents we may need to loop on a single OST repeatedly
+		 */
+		ost_eof = false;
+		ost_done = false;
+		do {
+			if (fiemap->fm_extent_count > 0) {
+				/* Don't get too many extents. */
+				if (current_extent + count_local >
+				    fiemap->fm_extent_count)
+					count_local = fiemap->fm_extent_count -
+						      current_extent;
+			}
+
+			lun_start += len_mapped_single_call;
+			fm_local->fm_length = req_fm_len -
+					      len_mapped_single_call;
+			req_fm_len = fm_local->fm_length;
+			fm_local->fm_extent_count = enough ? 1 : count_local;
+			fm_local->fm_mapped_extents = 0;
+			fm_local->fm_flags = fiemap->fm_flags;
+
+			ost_index = lsm->lsm_oinfo[cur_stripe]->loi_ost_idx;
+
+			if (ost_index < 0 ||
+			    ost_index >= lov->desc.ld_tgt_count) {
+				rc = -EINVAL;
+				goto obj_put;
+			}
+			/*
+			 * If OST is inactive, return extent with UNKNOWN
+			 * flag.
+			 */
+			if (!lov->lov_tgts[ost_index]->ltd_active) {
+				fm_local->fm_flags |= FIEMAP_EXTENT_LAST;
+				fm_local->fm_mapped_extents = 1;
+
+				lcl_fm_ext[0].fe_logical = lun_start;
+				lcl_fm_ext[0].fe_length = obd_object_end -
+							  lun_start;
+				lcl_fm_ext[0].fe_flags |= FIEMAP_EXTENT_UNKNOWN;
+
+				goto inactive_tgt;
+			}
+
+			fm_local->fm_start = lun_start;
+			fm_local->fm_flags &= ~FIEMAP_FLAG_DEVICE_ORDER;
+			memcpy(&fmkey->lfik_fiemap, fm_local, sizeof(*fm_local));
+			*buflen = fiemap_count_to_size(fm_local->fm_extent_count);
+
+			rc = cl_object_fiemap(env, subobj, fmkey, fm_local,
+					      buflen);
+			if (rc)
+				goto obj_put;
+inactive_tgt:
+			ext_count = fm_local->fm_mapped_extents;
+			if (!ext_count) {
+				ost_done = true;
+				/*
+				 * If last stripe has hole at the end,
+				 * we need to return
+				 */
+				if (cur_stripe_wrap == last_stripe) {
+					fiemap->fm_mapped_extents = 0;
+					goto finish;
+				}
+				break;
+			} else if (enough) {
+				/*
+				 * We've collected enough extents and there are
+				 * more extents after it.
+				 */
+				goto finish;
+			}
+
+			/* If we just need num of extents, go to next device */
+			if (!fiemap->fm_extent_count) {
+				current_extent += ext_count;
+				break;
+			}
+
+			/* prepare to copy retrieved map extents */
+			len_mapped_single_call =
+				lcl_fm_ext[ext_count - 1].fe_logical -
+				lun_start + lcl_fm_ext[ext_count - 1].fe_length;
+
+			/* Have we finished mapping on this device? */
+			if (req_fm_len <= len_mapped_single_call)
+				ost_done = true;
+
+			/*
+			 * Clear the EXTENT_LAST flag which can be present on
+			 * the last extent
+			 */
+			if (lcl_fm_ext[ext_count - 1].fe_flags &
+			    FIEMAP_EXTENT_LAST)
+				lcl_fm_ext[ext_count - 1].fe_flags &=
+					~FIEMAP_EXTENT_LAST;
+
+			if (lov_stripe_size(lsm,
+					    lcl_fm_ext[ext_count - 1].fe_logical +
+					    lcl_fm_ext[ext_count - 1].fe_length,
+					    cur_stripe) >= fmkey->lfik_oa.o_size)
+				ost_eof = true;
+
+			fiemap_prepare_and_copy_exts(fiemap, lcl_fm_ext,
+						     ost_index, ext_count,
+						     current_extent);
+			current_extent += ext_count;
+
+			/* Ran out of available extents? */
+			if (current_extent >= fiemap->fm_extent_count)
+				enough = true;
+		} while (!ost_done && !ost_eof);
+
+		cl_object_put(env, subobj);
+		subobj = NULL;
+
+		if (cur_stripe_wrap == last_stripe)
+			goto finish;
+	} /* for each stripe */
+finish:
+	/*
+	 * Indicate that we are returning device offsets unless file just has
+	 * single stripe
+	 */
+	if (lsm->lsm_stripe_count > 1)
+		fiemap->fm_flags |= FIEMAP_FLAG_DEVICE_ORDER;
+
+	if (!fiemap->fm_extent_count)
+		goto skip_last_device_calc;
+
+	/*
+	 * Check if we have reached the last stripe and whether mapping for that
+	 * stripe is done.
+	 */
+	if ((cur_stripe_wrap == last_stripe) && (ost_done || ost_eof))
+		fiemap->fm_extents[current_extent - 1].fe_flags |=
+							FIEMAP_EXTENT_LAST;
+skip_last_device_calc:
+	fiemap->fm_mapped_extents = current_extent;
+obj_put:
+	if (subobj)
+		cl_object_put(env, subobj);
+out:
+	kvfree(fm_local);
+	lov_lsm_put(obj, lsm);
+	return rc;
+}
+
 static int lov_object_getstripe(const struct lu_env *env, struct cl_object *obj,
 				struct lov_user_md __user *lum)
 {
@@ -934,7 +1435,8 @@ static const struct cl_object_operations lov_ops = {
 	.coo_attr_get  = lov_attr_get,
 	.coo_attr_update = lov_attr_update,
 	.coo_conf_set  = lov_conf_set,
-	.coo_getstripe = lov_object_getstripe
+	.coo_getstripe = lov_object_getstripe,
+	.coo_fiemap	 = lov_object_fiemap,
 };
 
 static const struct lu_object_operations lov_lu_obj_ops = {
diff --git a/drivers/staging/lustre/lustre/obdclass/cl_object.c b/drivers/staging/lustre/lustre/obdclass/cl_object.c
index 3199dd4..4ad2ee5 100644
--- a/drivers/staging/lustre/lustre/obdclass/cl_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/cl_object.c
@@ -343,6 +343,38 @@ int cl_object_getstripe(const struct lu_env *env, struct cl_object *obj,
 EXPORT_SYMBOL(cl_object_getstripe);
 
 /**
+ * Get fiemap extents from file object.
+ *
+ * \param env [in]	lustre environment
+ * \param obj [in]	file object
+ * \param key [in]	fiemap request argument
+ * \param fiemap [out]	fiemap extents mapping retrieved
+ * \param buflen [in]	max buffer length of @fiemap
+ *
+ * \retval 0	success
+ * \retval < 0	error
+ */
+int cl_object_fiemap(const struct lu_env *env, struct cl_object *obj,
+		     struct ll_fiemap_info_key *key,
+		     struct fiemap *fiemap, size_t *buflen)
+{
+	struct lu_object_header *top;
+	int result = 0;
+
+	top = obj->co_lu.lo_header;
+	list_for_each_entry(obj, &top->loh_layers, co_lu.lo_linkage) {
+		if (obj->co_ops->coo_fiemap) {
+			result = obj->co_ops->coo_fiemap(env, obj, key, fiemap,
+							 buflen);
+			if (result)
+				break;
+		}
+	}
+	return result;
+}
+EXPORT_SYMBOL(cl_object_fiemap);
+
+/**
  * Helper function removing all object locks, and marking object for
  * deletion. All object pages must have been deleted at this point.
  *
diff --git a/drivers/staging/lustre/lustre/osc/osc_object.c b/drivers/staging/lustre/lustre/osc/osc_object.c
index aae3a2d..dc0c173 100644
--- a/drivers/staging/lustre/lustre/osc/osc_object.c
+++ b/drivers/staging/lustre/lustre/osc/osc_object.c
@@ -218,6 +218,94 @@ static int osc_object_prune(const struct lu_env *env, struct cl_object *obj)
 	return 0;
 }
 
+static int osc_object_fiemap(const struct lu_env *env, struct cl_object *obj,
+			     struct ll_fiemap_info_key *fmkey,
+			     struct fiemap *fiemap, size_t *buflen)
+{
+	struct obd_export *exp = osc_export(cl2osc(obj));
+	ldlm_policy_data_t policy;
+	struct ptlrpc_request *req;
+	struct lustre_handle lockh;
+	struct ldlm_res_id resid;
+	enum ldlm_mode mode = 0;
+	struct fiemap *reply;
+	char *tmp;
+	int rc;
+
+	fmkey->lfik_oa.o_oi = cl2osc(obj)->oo_oinfo->loi_oi;
+	if (!(fmkey->lfik_fiemap.fm_flags & FIEMAP_FLAG_SYNC))
+		goto skip_locking;
+
+	policy.l_extent.start = fmkey->lfik_fiemap.fm_start & PAGE_MASK;
+
+	if (OBD_OBJECT_EOF - fmkey->lfik_fiemap.fm_length <=
+	    fmkey->lfik_fiemap.fm_start + PAGE_SIZE - 1)
+		policy.l_extent.end = OBD_OBJECT_EOF;
+	else
+		policy.l_extent.end = (fmkey->lfik_fiemap.fm_start +
+				       fmkey->lfik_fiemap.fm_length +
+				       PAGE_SIZE - 1) & PAGE_MASK;
+
+	ostid_build_res_name(&fmkey->lfik_oa.o_oi, &resid);
+	mode = ldlm_lock_match(exp->exp_obd->obd_namespace,
+			       LDLM_FL_BLOCK_GRANTED | LDLM_FL_LVB_READY,
+			       &resid, LDLM_EXTENT, &policy,
+			       LCK_PR | LCK_PW, &lockh, 0);
+	if (mode) { /* lock is cached on client */
+		if (mode != LCK_PR) {
+			ldlm_lock_addref(&lockh, LCK_PR);
+			ldlm_lock_decref(&lockh, LCK_PW);
+		}
+	} else { /* no cached lock, need to acquire lock on server side */
+		fmkey->lfik_oa.o_valid |= OBD_MD_FLFLAGS;
+		fmkey->lfik_oa.o_flags |= OBD_FL_SRVLOCK;
+	}
+
+skip_locking:
+	req = ptlrpc_request_alloc(class_exp2cliimp(exp),
+				   &RQF_OST_GET_INFO_FIEMAP);
+	if (!req) {
+		rc = -ENOMEM;
+		goto drop_lock;
+	}
+
+	req_capsule_set_size(&req->rq_pill, &RMF_FIEMAP_KEY, RCL_CLIENT,
+			     sizeof(*fmkey));
+	req_capsule_set_size(&req->rq_pill, &RMF_FIEMAP_VAL, RCL_CLIENT,
+			     *buflen);
+	req_capsule_set_size(&req->rq_pill, &RMF_FIEMAP_VAL, RCL_SERVER,
+			     *buflen);
+
+	rc = ptlrpc_request_pack(req, LUSTRE_OST_VERSION, OST_GET_INFO);
+	if (rc) {
+		ptlrpc_request_free(req);
+		goto drop_lock;
+	}
+	tmp = req_capsule_client_get(&req->rq_pill, &RMF_FIEMAP_KEY);
+	memcpy(tmp, fmkey, sizeof(*fmkey));
+	tmp = req_capsule_client_get(&req->rq_pill, &RMF_FIEMAP_VAL);
+	memcpy(tmp, fiemap, *buflen);
+	ptlrpc_request_set_replen(req);
+
+	rc = ptlrpc_queue_wait(req);
+	if (rc)
+		goto fini_req;
+
+	reply = req_capsule_server_get(&req->rq_pill, &RMF_FIEMAP_VAL);
+	if (!reply) {
+		rc = -EPROTO;
+		goto fini_req;
+	}
+
+	memcpy(fiemap, reply, *buflen);
+fini_req:
+	ptlrpc_req_finished(req);
+drop_lock:
+	if (mode)
+		ldlm_lock_decref(&lockh, LCK_PR);
+	return rc;
+}
+
 void osc_object_set_contended(struct osc_object *obj)
 {
 	obj->oo_contention_time = cfs_time_current();
@@ -263,7 +351,8 @@ static const struct cl_object_operations osc_ops = {
 	.coo_attr_get  = osc_attr_get,
 	.coo_attr_update = osc_attr_update,
 	.coo_glimpse   = osc_object_glimpse,
-	.coo_prune     = osc_object_prune
+	.coo_prune	 = osc_object_prune,
+	.coo_fiemap	 = osc_object_fiemap,
 };
 
 static const struct lu_object_operations osc_lu_obj_ops = {
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index 749781f..963a485 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -2543,103 +2543,6 @@ out:
 	return err;
 }
 
-static int osc_get_info(const struct lu_env *env, struct obd_export *exp,
-			u32 keylen, void *key, __u32 *vallen, void *val,
-			struct lov_stripe_md *lsm)
-{
-	if (!vallen || !val)
-		return -EFAULT;
-
-	if (KEY_IS(KEY_FIEMAP)) {
-		struct ll_fiemap_info_key *fm_key = key;
-		struct ldlm_res_id res_id;
-		ldlm_policy_data_t policy;
-		struct lustre_handle lockh;
-		enum ldlm_mode mode = 0;
-		struct ptlrpc_request *req;
-		struct ll_user_fiemap *reply;
-		char *tmp;
-		int rc;
-
-		if (!(fm_key->fiemap.fm_flags & FIEMAP_FLAG_SYNC))
-			goto skip_locking;
-
-		policy.l_extent.start = fm_key->fiemap.fm_start &
-						PAGE_MASK;
-
-		if (OBD_OBJECT_EOF - fm_key->fiemap.fm_length <=
-		    fm_key->fiemap.fm_start + PAGE_SIZE - 1)
-			policy.l_extent.end = OBD_OBJECT_EOF;
-		else
-			policy.l_extent.end = (fm_key->fiemap.fm_start +
-				fm_key->fiemap.fm_length +
-				PAGE_SIZE - 1) & PAGE_MASK;
-
-		ostid_build_res_name(&fm_key->oa.o_oi, &res_id);
-		mode = ldlm_lock_match(exp->exp_obd->obd_namespace,
-				       LDLM_FL_BLOCK_GRANTED |
-				       LDLM_FL_LVB_READY,
-				       &res_id, LDLM_EXTENT, &policy,
-				       LCK_PR | LCK_PW, &lockh, 0);
-		if (mode) { /* lock is cached on client */
-			if (mode != LCK_PR) {
-				ldlm_lock_addref(&lockh, LCK_PR);
-				ldlm_lock_decref(&lockh, LCK_PW);
-			}
-		} else { /* no cached lock, needs acquire lock on server side */
-			fm_key->oa.o_valid |= OBD_MD_FLFLAGS;
-			fm_key->oa.o_flags |= OBD_FL_SRVLOCK;
-		}
-
-skip_locking:
-		req = ptlrpc_request_alloc(class_exp2cliimp(exp),
-					   &RQF_OST_GET_INFO_FIEMAP);
-		if (!req) {
-			rc = -ENOMEM;
-			goto drop_lock;
-		}
-
-		req_capsule_set_size(&req->rq_pill, &RMF_FIEMAP_KEY,
-				     RCL_CLIENT, keylen);
-		req_capsule_set_size(&req->rq_pill, &RMF_FIEMAP_VAL,
-				     RCL_CLIENT, *vallen);
-		req_capsule_set_size(&req->rq_pill, &RMF_FIEMAP_VAL,
-				     RCL_SERVER, *vallen);
-
-		rc = ptlrpc_request_pack(req, LUSTRE_OST_VERSION, OST_GET_INFO);
-		if (rc) {
-			ptlrpc_request_free(req);
-			goto drop_lock;
-		}
-
-		tmp = req_capsule_client_get(&req->rq_pill, &RMF_FIEMAP_KEY);
-		memcpy(tmp, key, keylen);
-		tmp = req_capsule_client_get(&req->rq_pill, &RMF_FIEMAP_VAL);
-		memcpy(tmp, val, *vallen);
-
-		ptlrpc_request_set_replen(req);
-		rc = ptlrpc_queue_wait(req);
-		if (rc)
-			goto fini_req;
-
-		reply = req_capsule_server_get(&req->rq_pill, &RMF_FIEMAP_VAL);
-		if (!reply) {
-			rc = -EPROTO;
-			goto fini_req;
-		}
-
-		memcpy(val, reply, *vallen);
-fini_req:
-		ptlrpc_req_finished(req);
-drop_lock:
-		if (mode)
-			ldlm_lock_decref(&lockh, LCK_PR);
-		return rc;
-	}
-
-	return -EINVAL;
-}
-
 static int osc_set_info_async(const struct lu_env *env, struct obd_export *exp,
 			      u32 keylen, void *key, u32 vallen,
 			      void *val, struct ptlrpc_request_set *set)
@@ -3112,7 +3015,6 @@ static struct obd_ops osc_obd_ops = {
 	.setattr        = osc_setattr,
 	.setattr_async  = osc_setattr_async,
 	.iocontrol      = osc_iocontrol,
-	.get_info       = osc_get_info,
 	.set_info_async = osc_set_info_async,
 	.import_event   = osc_import_event,
 	.process_config = osc_process_config,
diff --git a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
index 8717685..36b86ae 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
@@ -1772,7 +1772,7 @@ void lustre_swab_fid2path(struct getinfo_fid2path *gf)
 }
 EXPORT_SYMBOL(lustre_swab_fid2path);
 
-static void lustre_swab_fiemap_extent(struct ll_fiemap_extent *fm_extent)
+static void lustre_swab_fiemap_extent(struct fiemap_extent *fm_extent)
 {
 	__swab64s(&fm_extent->fe_logical);
 	__swab64s(&fm_extent->fe_physical);
@@ -1781,7 +1781,7 @@ static void lustre_swab_fiemap_extent(struct ll_fiemap_extent *fm_extent)
 	__swab32s(&fm_extent->fe_device);
 }
 
-void lustre_swab_fiemap(struct ll_user_fiemap *fiemap)
+void lustre_swab_fiemap(struct fiemap *fiemap)
 {
 	__u32 i;
 
diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
index e5945e2..cb89bf2 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
@@ -3520,21 +3520,21 @@ void lustre_assert_wire_constants(void)
 	LASSERTF((int)sizeof(((struct llogd_conn_body *)0)->lgdc_ctxt_idx) == 4, "found %lld\n",
 		 (long long)(int)sizeof(((struct llogd_conn_body *)0)->lgdc_ctxt_idx));
 
-	/* Checks for struct ll_fiemap_info_key */
+	/* Checks for struct fiemap_info_key */
 	LASSERTF((int)sizeof(struct ll_fiemap_info_key) == 248, "found %lld\n",
 		 (long long)(int)sizeof(struct ll_fiemap_info_key));
-	LASSERTF((int)offsetof(struct ll_fiemap_info_key, name[8]) == 8, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_fiemap_info_key, name[8]));
-	LASSERTF((int)sizeof(((struct ll_fiemap_info_key *)0)->name[8]) == 1, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_fiemap_info_key *)0)->name[8]));
-	LASSERTF((int)offsetof(struct ll_fiemap_info_key, oa) == 8, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_fiemap_info_key, oa));
-	LASSERTF((int)sizeof(((struct ll_fiemap_info_key *)0)->oa) == 208, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_fiemap_info_key *)0)->oa));
-	LASSERTF((int)offsetof(struct ll_fiemap_info_key, fiemap) == 216, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_fiemap_info_key, fiemap));
-	LASSERTF((int)sizeof(((struct ll_fiemap_info_key *)0)->fiemap) == 32, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_fiemap_info_key *)0)->fiemap));
+	LASSERTF((int)offsetof(struct ll_fiemap_info_key, lfik_name[8]) == 8, "found %lld\n",
+		 (long long)(int)offsetof(struct ll_fiemap_info_key, lfik_name[8]));
+	LASSERTF((int)sizeof(((struct ll_fiemap_info_key *)0)->lfik_name[8]) == 1, "found %lld\n",
+		 (long long)(int)sizeof(((struct ll_fiemap_info_key *)0)->lfik_name[8]));
+	LASSERTF((int)offsetof(struct ll_fiemap_info_key, lfik_oa) == 8, "found %lld\n",
+		 (long long)(int)offsetof(struct ll_fiemap_info_key, lfik_oa));
+	LASSERTF((int)sizeof(((struct ll_fiemap_info_key *)0)->lfik_oa) == 208, "found %lld\n",
+		 (long long)(int)sizeof(((struct ll_fiemap_info_key *)0)->lfik_oa));
+	LASSERTF((int)offsetof(struct ll_fiemap_info_key, lfik_fiemap) == 216, "found %lld\n",
+		 (long long)(int)offsetof(struct ll_fiemap_info_key, lfik_fiemap));
+	LASSERTF((int)sizeof(((struct ll_fiemap_info_key *)0)->lfik_fiemap) == 32, "found %lld\n",
+		 (long long)(int)sizeof(((struct ll_fiemap_info_key *)0)->lfik_fiemap));
 
 	/* Checks for struct mgs_target_info */
 	LASSERTF((int)sizeof(struct mgs_target_info) == 4544, "found %lld\n",
@@ -3670,64 +3670,64 @@ void lustre_assert_wire_constants(void)
 	LASSERTF((int)sizeof(((struct getinfo_fid2path *)0)->gf_path[0]) == 1, "found %lld\n",
 		 (long long)(int)sizeof(((struct getinfo_fid2path *)0)->gf_path[0]));
 
-	/* Checks for struct ll_user_fiemap */
-	LASSERTF((int)sizeof(struct ll_user_fiemap) == 32, "found %lld\n",
-		 (long long)(int)sizeof(struct ll_user_fiemap));
-	LASSERTF((int)offsetof(struct ll_user_fiemap, fm_start) == 0, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_user_fiemap, fm_start));
-	LASSERTF((int)sizeof(((struct ll_user_fiemap *)0)->fm_start) == 8, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_user_fiemap *)0)->fm_start));
-	LASSERTF((int)offsetof(struct ll_user_fiemap, fm_length) == 8, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_user_fiemap, fm_length));
-	LASSERTF((int)sizeof(((struct ll_user_fiemap *)0)->fm_length) == 8, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_user_fiemap *)0)->fm_length));
-	LASSERTF((int)offsetof(struct ll_user_fiemap, fm_flags) == 16, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_user_fiemap, fm_flags));
-	LASSERTF((int)sizeof(((struct ll_user_fiemap *)0)->fm_flags) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_user_fiemap *)0)->fm_flags));
-	LASSERTF((int)offsetof(struct ll_user_fiemap, fm_mapped_extents) == 20, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_user_fiemap, fm_mapped_extents));
-	LASSERTF((int)sizeof(((struct ll_user_fiemap *)0)->fm_mapped_extents) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_user_fiemap *)0)->fm_mapped_extents));
-	LASSERTF((int)offsetof(struct ll_user_fiemap, fm_extent_count) == 24, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_user_fiemap, fm_extent_count));
-	LASSERTF((int)sizeof(((struct ll_user_fiemap *)0)->fm_extent_count) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_user_fiemap *)0)->fm_extent_count));
-	LASSERTF((int)offsetof(struct ll_user_fiemap, fm_reserved) == 28, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_user_fiemap, fm_reserved));
-	LASSERTF((int)sizeof(((struct ll_user_fiemap *)0)->fm_reserved) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_user_fiemap *)0)->fm_reserved));
-	LASSERTF((int)offsetof(struct ll_user_fiemap, fm_extents) == 32, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_user_fiemap, fm_extents));
-	LASSERTF((int)sizeof(((struct ll_user_fiemap *)0)->fm_extents) == 0, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_user_fiemap *)0)->fm_extents));
+	/* Checks for struct fiemap */
+	LASSERTF((int)sizeof(struct fiemap) == 32, "found %lld\n",
+		 (long long)(int)sizeof(struct fiemap));
+	LASSERTF((int)offsetof(struct fiemap, fm_start) == 0, "found %lld\n",
+		 (long long)(int)offsetof(struct fiemap, fm_start));
+	LASSERTF((int)sizeof(((struct fiemap *)0)->fm_start) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct fiemap *)0)->fm_start));
+	LASSERTF((int)offsetof(struct fiemap, fm_length) == 8, "found %lld\n",
+		 (long long)(int)offsetof(struct fiemap, fm_length));
+	LASSERTF((int)sizeof(((struct fiemap *)0)->fm_length) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct fiemap *)0)->fm_length));
+	LASSERTF((int)offsetof(struct fiemap, fm_flags) == 16, "found %lld\n",
+		 (long long)(int)offsetof(struct fiemap, fm_flags));
+	LASSERTF((int)sizeof(((struct fiemap *)0)->fm_flags) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct fiemap *)0)->fm_flags));
+	LASSERTF((int)offsetof(struct fiemap, fm_mapped_extents) == 20, "found %lld\n",
+		 (long long)(int)offsetof(struct fiemap, fm_mapped_extents));
+	LASSERTF((int)sizeof(((struct fiemap *)0)->fm_mapped_extents) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct fiemap *)0)->fm_mapped_extents));
+	LASSERTF((int)offsetof(struct fiemap, fm_extent_count) == 24, "found %lld\n",
+		 (long long)(int)offsetof(struct fiemap, fm_extent_count));
+	LASSERTF((int)sizeof(((struct fiemap *)0)->fm_extent_count) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct fiemap *)0)->fm_extent_count));
+	LASSERTF((int)offsetof(struct fiemap, fm_reserved) == 28, "found %lld\n",
+		 (long long)(int)offsetof(struct fiemap, fm_reserved));
+	LASSERTF((int)sizeof(((struct fiemap *)0)->fm_reserved) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct fiemap *)0)->fm_reserved));
+	LASSERTF((int)offsetof(struct fiemap, fm_extents) == 32, "found %lld\n",
+		 (long long)(int)offsetof(struct fiemap, fm_extents));
+	LASSERTF((int)sizeof(((struct fiemap *)0)->fm_extents) == 0, "found %lld\n",
+		 (long long)(int)sizeof(((struct fiemap *)0)->fm_extents));
 	CLASSERT(FIEMAP_FLAG_SYNC == 0x00000001);
 	CLASSERT(FIEMAP_FLAG_XATTR == 0x00000002);
 	CLASSERT(FIEMAP_FLAG_DEVICE_ORDER == 0x40000000);
 
-	/* Checks for struct ll_fiemap_extent */
-	LASSERTF((int)sizeof(struct ll_fiemap_extent) == 56, "found %lld\n",
-		 (long long)(int)sizeof(struct ll_fiemap_extent));
-	LASSERTF((int)offsetof(struct ll_fiemap_extent, fe_logical) == 0, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_fiemap_extent, fe_logical));
-	LASSERTF((int)sizeof(((struct ll_fiemap_extent *)0)->fe_logical) == 8, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_fiemap_extent *)0)->fe_logical));
-	LASSERTF((int)offsetof(struct ll_fiemap_extent, fe_physical) == 8, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_fiemap_extent, fe_physical));
-	LASSERTF((int)sizeof(((struct ll_fiemap_extent *)0)->fe_physical) == 8, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_fiemap_extent *)0)->fe_physical));
-	LASSERTF((int)offsetof(struct ll_fiemap_extent, fe_length) == 16, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_fiemap_extent, fe_length));
-	LASSERTF((int)sizeof(((struct ll_fiemap_extent *)0)->fe_length) == 8, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_fiemap_extent *)0)->fe_length));
-	LASSERTF((int)offsetof(struct ll_fiemap_extent, fe_flags) == 40, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_fiemap_extent, fe_flags));
-	LASSERTF((int)sizeof(((struct ll_fiemap_extent *)0)->fe_flags) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_fiemap_extent *)0)->fe_flags));
-	LASSERTF((int)offsetof(struct ll_fiemap_extent, fe_device) == 44, "found %lld\n",
-		 (long long)(int)offsetof(struct ll_fiemap_extent, fe_device));
-	LASSERTF((int)sizeof(((struct ll_fiemap_extent *)0)->fe_device) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct ll_fiemap_extent *)0)->fe_device));
+	/* Checks for struct fiemap_extent */
+	LASSERTF((int)sizeof(struct fiemap_extent) == 56, "found %lld\n",
+		 (long long)(int)sizeof(struct fiemap_extent));
+	LASSERTF((int)offsetof(struct fiemap_extent, fe_logical) == 0, "found %lld\n",
+		 (long long)(int)offsetof(struct fiemap_extent, fe_logical));
+	LASSERTF((int)sizeof(((struct fiemap_extent *)0)->fe_logical) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct fiemap_extent *)0)->fe_logical));
+	LASSERTF((int)offsetof(struct fiemap_extent, fe_physical) == 8, "found %lld\n",
+		 (long long)(int)offsetof(struct fiemap_extent, fe_physical));
+	LASSERTF((int)sizeof(((struct fiemap_extent *)0)->fe_physical) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct fiemap_extent *)0)->fe_physical));
+	LASSERTF((int)offsetof(struct fiemap_extent, fe_length) == 16, "found %lld\n",
+		 (long long)(int)offsetof(struct fiemap_extent, fe_length));
+	LASSERTF((int)sizeof(((struct fiemap_extent *)0)->fe_length) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct fiemap_extent *)0)->fe_length));
+	LASSERTF((int)offsetof(struct fiemap_extent, fe_flags) == 40, "found %lld\n",
+		 (long long)(int)offsetof(struct fiemap_extent, fe_flags));
+	LASSERTF((int)sizeof(((struct fiemap_extent *)0)->fe_flags) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct fiemap_extent *)0)->fe_flags));
+	LASSERTF((int)offsetof(struct fiemap_extent, fe_reserved[0]) == 44, "found %lld\n",
+		 (long long)(int)offsetof(struct fiemap_extent, fe_reserved[0]));
+	LASSERTF((int)sizeof(((struct fiemap_extent *)0)->fe_reserved[0]) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct fiemap_extent *)0)->fe_reserved[0]));
 	CLASSERT(FIEMAP_EXTENT_LAST == 0x00000001);
 	CLASSERT(FIEMAP_EXTENT_UNKNOWN == 0x00000002);
 	CLASSERT(FIEMAP_EXTENT_DELALLOC == 0x00000004);
-- 
1.7.1
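
Note on the buffer sizing at the top of lov_object_fiemap(): the patch calls fiemap_count_to_size() and fiemap_size_to_count() without showing them. Below is a minimal userspace sketch of that arithmetic, assuming the 32-byte header and 56-byte extent sizes asserted in wiretest.c. The sketch_* names and struct mirrors are hypothetical and are not the kernel definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the extent layout checked in wiretest.c (56 bytes). */
struct sketch_fiemap_extent {
	uint64_t fe_logical;       /* offset 0 */
	uint64_t fe_physical;      /* offset 8 */
	uint64_t fe_length;        /* offset 16 */
	uint64_t fe_reserved64[2]; /* padding up to fe_flags */
	uint32_t fe_flags;         /* offset 40 */
	uint32_t fe_reserved[3];   /* fe_reserved[0] at offset 44 */
};

/* Hypothetical mirror of the request header (32 bytes, then the extents). */
struct sketch_fiemap {
	uint64_t fm_start;
	uint64_t fm_length;
	uint32_t fm_flags;
	uint32_t fm_mapped_extents;
	uint32_t fm_extent_count;
	uint32_t fm_reserved;
	struct sketch_fiemap_extent fm_extents[];
};

/* Bytes needed for a header plus extent_count extents. */
static size_t sketch_count_to_size(size_t extent_count)
{
	return sizeof(struct sketch_fiemap) +
	       extent_count * sizeof(struct sketch_fiemap_extent);
}

/* Whole extents that fit in buffer_size bytes after the header. */
static size_t sketch_size_to_count(size_t buffer_size)
{
	if (buffer_size < sizeof(struct sketch_fiemap))
		return 0;
	return (buffer_size - sizeof(struct sketch_fiemap)) /
	       sizeof(struct sketch_fiemap_extent);
}

/* Shrink the working buffer when the caller asked for fewer extents. */
static size_t sketch_clamp_buffer(size_t buffer_size, uint32_t extent_count)
{
	size_t wanted = sketch_count_to_size(extent_count);

	return wanted < buffer_size ? wanted : buffer_size;
}
```

With these sizes, a request for two extents needs 32 + 2 * 56 = 144 bytes, and a 143-byte buffer holds only one extent.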

^ permalink raw reply related	[flat|nested] 98+ messages in thread
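
Note on the FIEMAP_FLAG_SYNC path in osc_object_fiemap(): the lock extent is the requested range rounded out to page boundaries, with the end pinned to OBD_OBJECT_EOF when start plus length would run past it. A minimal userspace sketch of that computation follows, assuming 4096-byte pages and an all-ones EOF value. The SKETCH_* constants and sketch_* names are stand-ins, not kernel symbols.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE  4096ULL                    /* assumed page size */
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))
#define SKETCH_OBJECT_EOF UINT64_MAX                 /* stand-in for OBD_OBJECT_EOF */

struct sketch_extent {
	uint64_t start;
	uint64_t end;
};

/* Mirrors the checks in the patch as-is; no extra overflow handling added. */
static struct sketch_extent sketch_lock_extent(uint64_t fm_start,
					       uint64_t fm_length)
{
	struct sketch_extent ext;

	/* Round the start down to a page boundary. */
	ext.start = fm_start & SKETCH_PAGE_MASK;

	/* If start + length (rounded up) would pass EOF, lock through EOF. */
	if (SKETCH_OBJECT_EOF - fm_length <= fm_start + SKETCH_PAGE_SIZE - 1)
		ext.end = SKETCH_OBJECT_EOF;
	else
		ext.end = (fm_start + fm_length + SKETCH_PAGE_SIZE - 1) &
			  SKETCH_PAGE_MASK;

	return ext;
}
```

For example, fm_start = 5000 and fm_length = 100 give the extent [4096, 8192], while fm_length = ~0ULL gives an extent ending at EOF.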

* [PATCH 08/41] staging: lustre: ptlrpc: ret -ECONNREFUSED if not context found in req
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Sebastien Buisson, James Simmons

From: Sebastien Buisson <sebastien.buisson@bull.net>

Return -ECONNREFUSED instead of -ENOMEM in sptlrpc_req_get_ctx()
if no context is found in req. It is more graceful.

Signed-off-by: Sebastien Buisson <sebastien.buisson@bull.net>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6356
Reviewed-on: http://review.whamcloud.com/14043
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/ptlrpc/sec.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/sec.c b/drivers/staging/lustre/lustre/ptlrpc/sec.c
index 5d3995d..1457a6e 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/sec.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/sec.c
@@ -379,7 +379,7 @@ int sptlrpc_req_get_ctx(struct ptlrpc_request *req)
 
 	if (!req->rq_cli_ctx) {
 		CERROR("req %p: fail to get context\n", req);
-		return -ENOMEM;
+		return -ECONNREFUSED;
 	}
 
 	return 0;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread
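
Note on patch 08: one way to read the change is that a caller branching on the return code can now tell a missing security context apart from an allocation failure. The sketch below is a userspace illustration under that assumption. struct sketch_request, sketch_req_get_ctx() and sketch_classify() are hypothetical and do not exist in ptlrpc.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the one field the check looks at. */
struct sketch_request {
	void *cli_ctx;
};

/* Same shape as the patched check: no context means -ECONNREFUSED. */
static int sketch_req_get_ctx(const struct sketch_request *req)
{
	if (!req->cli_ctx)
		return -ECONNREFUSED;
	return 0;
}

/* Hypothetical caller-side view of the two error codes. */
static const char *sketch_classify(int rc)
{
	switch (rc) {
	case 0:
		return "ok";
	case -ENOMEM:
		return "allocation failure";
	case -ECONNREFUSED:
		return "no security context";
	default:
		return "other";
	}
}
```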

* [PATCH 09/41] staging: lustre: llite: default dir stripe index only for mkdir
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, wang di,
	James Simmons

From: wang di <di.wang@intel.com>

The default dir stripe index should only apply during mkdir;
otherwise other open/create requests will be sent to the
wrong MDT.

Signed-off-by: wang di <di.wang@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6373
Reviewed-on: http://review.whamcloud.com/14096
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/llite_lib.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index e75ab2f..c7aab3f 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -2269,8 +2269,9 @@ struct md_op_data *ll_prep_md_op_data(struct md_op_data *op_data,
 	op_data->op_default_stripe_offset = -1;
 	if (S_ISDIR(i1->i_mode)) {
 		op_data->op_mea1 = ll_i2info(i1)->lli_lsm_md;
-		op_data->op_default_stripe_offset =
-			ll_i2info(i1)->lli_def_stripe_offset;
+		if (opc == LUSTRE_OPC_MKDIR)
+			op_data->op_default_stripe_offset =
+				ll_i2info(i1)->lli_def_stripe_offset;
 	}
 
 	if (i2) {
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread
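
Note on patch 09: the hunk leaves op_default_stripe_offset at the -1 it starts from unless the parent is a directory and the operation is mkdir. A minimal userspace sketch of that decision follows. The enum and function names are hypothetical; only the condition mirrors the diff.

```c
/* Hypothetical operation codes; only the mkdir case matters here. */
enum sketch_opc {
	SKETCH_OPC_MKDIR,
	SKETCH_OPC_CREATE,
	SKETCH_OPC_OPEN,
};

/*
 * Return the stripe offset to send with the request: the directory's
 * default for mkdir under a directory, otherwise the -1 sentinel.
 */
static int sketch_default_stripe_offset(int parent_is_dir,
					enum sketch_opc opc,
					int dir_default_offset)
{
	int offset = -1;

	if (parent_is_dir && opc == SKETCH_OPC_MKDIR)
		offset = dir_default_offset;

	return offset;
}
```

With a directory default of 3, mkdir gets 3 while create and open under the same parent get -1.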

* [PATCH 10/41] staging: lustre: libcfs: shortcut to create CPT from NUMA topology
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Liang Zhen,
	James Simmons

From: Liang Zhen <liang.zhen@intel.com>

If a user wants to create a CPT table that matches the NUMA topology,
she has to query the CPU and NUMA topology and then provide a pattern
string describing it, which is inconvenient.

To improve this, add support for the shortcut expression "N" or "n",
which creates the CPT table directly from the NUMA and CPU topology.

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6325
Reviewed-on: http://review.whamcloud.com/14049
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
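
As a rough illustration of the pattern handling described above, here is
a small userspace sketch of how a cpu_pattern string could be classified.
It is a model under stated assumptions, not the libcfs code:
classify_pattern(), enum pattern_mode and fake_num_online_nodes() are
hypothetical names, the node count of 2 is made up, and trailing
whitespace is not trimmed as the kernel helper may do.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's num_online_nodes(). */
static int fake_num_online_nodes(void)
{
	return 2;
}

enum pattern_mode {
	MODE_CPU = 0,		/* numbers in brackets are CPU IDs */
	MODE_NODE = 1,		/* numbers in brackets are NUMA node IDs */
	MODE_NODE_AUTO = -1,	/* bare "N": one partition per online node */
};

/*
 * Classify a pattern and return the number of partitions it describes
 * (0 means the pattern describes none and would be rejected).
 */
static int classify_pattern(const char *pattern, enum pattern_mode *mode)
{
	const char *str = pattern;
	int ncpt = 0;

	/* Skip leading whitespace only (a simplification). */
	while (isspace((unsigned char)*str))
		str++;

	*mode = MODE_CPU;
	if (*str == 'n' || *str == 'N') {
		str++;
		if (*str != '\0') {
			*mode = MODE_NODE;
		} else {
			/* Shortcut: take the partition count from the topology. */
			*mode = MODE_NODE_AUTO;
			return fake_num_online_nodes();
		}
	}

	/* Each '[' marks one partition in an explicit pattern. */
	while ((str = strchr(str, '[')) != NULL) {
		ncpt++;
		str++;
	}
	return ncpt;
}

/* Usage: the three pattern forms mentioned in the module comment. */
void classify_pattern_selftest(void)
{
	enum pattern_mode mode;

	assert(classify_pattern("N", &mode) == 2 && mode == MODE_NODE_AUTO);
	assert(classify_pattern("N 0[0,1] 1[2,3]", &mode) == 2 &&
	       mode == MODE_NODE);
	assert(classify_pattern("0[0,1] 1[2,3] 2[4]", &mode) == 3 &&
	       mode == MODE_CPU);
}
```

The negative mode value mirrors the patch's use of node = -1 to mark
the shortcut, which keeps the existing 0/1 meaning of the flag intact.
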
 .../staging/lustre/lnet/libcfs/linux/linux-cpu.c   |   48 ++++++++++++++++---
 1 files changed, 40 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c b/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c
index e8b1a61..464b279 100644
--- a/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c
@@ -55,6 +55,8 @@ MODULE_PARM_DESC(cpu_npartitions, "# of CPU partitions");
  * i.e: "N 0[0,1] 1[2,3]" the first character 'N' means numbers in bracket
  *       are NUMA node ID, number before bracket is CPU partition ID.
  *
+ * i.e: "N", shortcut expression to create CPT from NUMA & CPU topology
+ *
  * NB: If user specified cpu_pattern, cpu_npartitions will be ignored
  */
 static char	*cpu_pattern = "";
@@ -818,11 +820,14 @@ static struct cfs_cpt_table *
 cfs_cpt_table_create_pattern(char *pattern)
 {
 	struct cfs_cpt_table	*cptab;
-	char			*str	= pattern;
+	char *str;
 	int			node	= 0;
 	int			high;
-	int			ncpt;
+	int ncpt = 0;
+	int cpt;
+	int rc;
 	int			c;
+	int i;
 
 	for (ncpt = 0;; ncpt++) { /* quick scan bracket */
 		str = strchr(str, '[');
@@ -834,7 +839,20 @@ cfs_cpt_table_create_pattern(char *pattern)
 	str = cfs_trimwhite(pattern);
 	if (*str == 'n' || *str == 'N') {
 		pattern = str + 1;
-		node = 1;
+		if (*pattern != '\0') {
+			node = 1;
+		} else { /* shortcut to create CPT from NUMA & CPU topology */
+			node = -1;
+			ncpt = num_online_nodes();
+		}
+	}
+
+	if (!ncpt) { /* scanning bracket which is mark of partition */
+		for (str = pattern;; str++, ncpt++) {
+			str = strchr(str, '[');
+			if (!str)
+				break;
+		}
 	}
 
 	if (ncpt == 0 ||
@@ -845,21 +863,35 @@ cfs_cpt_table_create_pattern(char *pattern)
 		return NULL;
 	}
 
-	high = node ? MAX_NUMNODES - 1 : nr_cpu_ids - 1;
-
 	cptab = cfs_cpt_table_alloc(ncpt);
 	if (!cptab) {
 		CERROR("Failed to allocate cpu partition table\n");
 		return NULL;
 	}
 
+	if (node < 0) { /* shortcut to create CPT from NUMA & CPU topology */
+		cpt = 0;
+
+		for_each_online_node(i) {
+			if (cpt >= ncpt) {
+				CERROR("CPU changed while setting CPU partition table, %d/%d\n",
+				       cpt, ncpt);
+				goto failed;
+			}
+
+			rc = cfs_cpt_set_node(cptab, cpt++, i);
+			if (!rc)
+				goto failed;
+		}
+		return cptab;
+	}
+
+	high = node ? MAX_NUMNODES - 1 : nr_cpu_ids - 1;
+
 	for (str = cfs_trimwhite(pattern), c = 0;; c++) {
 		struct cfs_range_expr	*range;
 		struct cfs_expr_list	*el;
 		char			*bracket = strchr(str, '[');
-		int			cpt;
-		int			rc;
-		int			i;
 		int			n;
 
 		if (!bracket) {
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 11/41] staging: lustre: ptlrpc: Add OBD_CONNECT_MULTIMODRPCS flag
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Gregoire Pichon, James Simmons

From: Gregoire Pichon <gregoire.pichon@bull.net>

The new OBD_CONNECT_MULTIMODRPCS connection flag indicates support
for multiple modify RPCs in parallel. It can be set by the client
in the connection request and by the server in the connection
reply. The new ocd_maxmodrpcs connection data field specifies the
maximum number of parallel modify RPCs supported by the server.

To allow the MDS to send the new ocd_maxmodrpcs field, RMF_CONNECT_DATA
had to be modified so that its size includes the new field. This change
makes the obd_connect_data_v1 structure unnecessary, so it is removed.
Note that the client has been allocating an extra 16*sizeof(__u64) for
the obd_connect_data reply since 2.0 (and even in later versions of 1.8),
so there is no problem for the MDS to just send the full reply size.

This patch fixes a bug in __req_capsule_get(), which was not checking
RMF_F_NO_SIZE_CHECK when receiving the message. The fix allows legacy
clients (with a version lower than this commit) to send a connection
request whose obd_connect_data structure is smaller (its size is
actually that of the obd_connect_data_v1 structure) than the new
server's obd_connect_data structure.

This patch also fixes a bug in the routine that displays the import's
connect data: the 64-bit connect flags were stored in an int.

Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5319
Reviewed-on: http://review.whamcloud.com/#/c/13960
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
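
To make the wire-layout argument above concrete, the following userspace
sketch checks that carving ocd_maxmodrpcs out of the old 64-bit padding1
leaves the following fields where they were, and that the new field is
only read when the flag is set. It is a sketch under assumptions:
struct tail_old and struct tail_new are hypothetical, abbreviated
stand-ins for the tail of obd_connect_data (so the offsets here are 8,
10, 12 and 16 rather than the real 72, 74, 76 and 80), and max_mod_rpcs()
with its fallback argument is an illustrative helper, not a Lustre
function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag value as defined by the patch. */
#define OBD_CONNECT_MULTIMODRPCS 0x200000000000000ULL

/* Hypothetical stand-in: tail of the structure before the patch. */
struct tail_old {
	uint64_t ocd_maxbytes;
	uint64_t padding1;
	uint64_t padding2;
};

/* Hypothetical stand-in: tail of the structure after the patch. */
struct tail_new {
	uint64_t ocd_maxbytes;
	uint16_t ocd_maxmodrpcs;	/* new field, first 2 bytes of old padding1 */
	uint16_t padding0;
	uint32_t padding1;
	uint64_t padding2;
};

/* The split must not move anything that follows it. */
static_assert(sizeof(struct tail_new) == sizeof(struct tail_old),
	      "overall size must be unchanged");
static_assert(offsetof(struct tail_new, ocd_maxmodrpcs) ==
	      offsetof(struct tail_old, padding1),
	      "new field starts where padding1 used to start");
static_assert(offsetof(struct tail_new, padding0) ==
	      offsetof(struct tail_new, ocd_maxmodrpcs) + 2,
	      "padding0 follows the 16-bit field");
static_assert(offsetof(struct tail_new, padding1) ==
	      offsetof(struct tail_new, ocd_maxmodrpcs) + 4,
	      "padding1 shrinks to the upper 32 bits");
static_assert(offsetof(struct tail_new, padding2) ==
	      offsetof(struct tail_old, padding2),
	      "padding2 stays in place");

/*
 * Read the new field only when the peer set the matching flag;
 * otherwise return a caller-supplied fallback (an assumption here).
 */
static unsigned int max_mod_rpcs(uint64_t connect_flags,
				 const struct tail_new *tail,
				 unsigned int fallback)
{
	if (connect_flags & OBD_CONNECT_MULTIMODRPCS)
		return tail->ocd_maxmodrpcs;
	return fallback;
}
```

For example, with tail.ocd_maxmodrpcs set to 8, calling
max_mod_rpcs(OBD_CONNECT_MULTIMODRPCS, &tail, 1) yields 8, while
max_mod_rpcs(0, &tail, 1) yields the fallback of 1.
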
 .../lustre/lustre/include/lustre/lustre_idl.h      |   26 ++++---------------
 .../lustre/lustre/obdclass/lprocfs_status.c        |    7 ++++-
 drivers/staging/lustre/lustre/ptlrpc/layout.c      |    5 ++-
 .../staging/lustre/lustre/ptlrpc/pack_generic.c    |    3 ++
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c    |   14 +++++++++-
 5 files changed, 29 insertions(+), 26 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index 4210716..b88807f 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -1282,6 +1282,9 @@ void lustre_swab_ptlrpc_body(struct ptlrpc_body *pb);
 							 */
 #define OBD_CONNECT_LFSCK	0x40000000000000ULL/* support online LFSCK */
 #define OBD_CONNECT_UNLINK_CLOSE 0x100000000000000ULL/* close file in unlink */
+#define OBD_CONNECT_MULTIMODRPCS 0x200000000000000ULL /* support multiple modify
+						       *  RPCs in parallel
+						       */
 #define OBD_CONNECT_DIR_STRIPE	 0x400000000000000ULL/* striped DNE dir */
 
 /* XXX README XXX:
@@ -1313,25 +1316,6 @@ void lustre_swab_ptlrpc_body(struct ptlrpc_body *pb);
  * If we eventually have separate connect data for different types, which we
  * almost certainly will, then perhaps we stick a union in here.
  */
-struct obd_connect_data_v1 {
-	__u64 ocd_connect_flags; /* OBD_CONNECT_* per above */
-	__u32 ocd_version;	 /* lustre release version number */
-	__u32 ocd_grant;	 /* initial cache grant amount (bytes) */
-	__u32 ocd_index;	 /* LOV index to connect to */
-	__u32 ocd_brw_size;	 /* Maximum BRW size in bytes, must be 2^n */
-	__u64 ocd_ibits_known;   /* inode bits this client understands */
-	__u8  ocd_blocksize;     /* log2 of the backend filesystem blocksize */
-	__u8  ocd_inodespace;    /* log2 of the per-inode space consumption */
-	__u16 ocd_grant_extent;  /* per-extent grant overhead, in 1K blocks */
-	__u32 ocd_unused;	/* also fix lustre_swab_connect */
-	__u64 ocd_transno;       /* first transno from client to be replayed */
-	__u32 ocd_group;	 /* MDS group on OST */
-	__u32 ocd_cksum_types;   /* supported checksum algorithms */
-	__u32 ocd_max_easize;    /* How big LOV EA can be on MDS */
-	__u32 ocd_instance;      /* also fix lustre_swab_connect */
-	__u64 ocd_maxbytes;      /* Maximum stripe size in bytes */
-};
-
 struct obd_connect_data {
 	__u64 ocd_connect_flags; /* OBD_CONNECT_* per above */
 	__u32 ocd_version;	 /* lustre release version number */
@@ -1354,7 +1338,9 @@ struct obd_connect_data {
 	 * any field after ocd_maxbytes on the receiver without a valid flag
 	 * may result in out-of-bound memory access and kernel oops.
 	 */
-	__u64 padding1;	  /* added 2.1.0. also fix lustre_swab_connect */
+	__u16 ocd_maxmodrpcs;	/* Maximum modify RPCs in parallel */
+	__u16 padding0;		/* added 2.1.0. also fix lustre_swab_connect */
+	__u32 padding1;		/* added 2.1.0. also fix lustre_swab_connect */
 	__u64 padding2;	  /* added 2.1.0. also fix lustre_swab_connect */
 	__u64 padding3;	  /* added 2.1.0. also fix lustre_swab_connect */
 	__u64 padding4;	  /* added 2.1.0. also fix lustre_swab_connect */
diff --git a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
index 852a5ac..b520c96 100644
--- a/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
+++ b/drivers/staging/lustre/lustre/obdclass/lprocfs_status.c
@@ -100,7 +100,7 @@ static const char * const obd_connect_names[] = {
 	"lfsck",
 	"unknown",
 	"unlink_close",
-	"unknown",
+	"multi_mod_rpcs",
 	"dir_stripe",
 	"unknown",
 	NULL
@@ -127,7 +127,7 @@ EXPORT_SYMBOL(obd_connect_flags2str);
 static void obd_connect_data_seqprint(struct seq_file *m,
 				      struct obd_connect_data *ocd)
 {
-	int flags;
+	u64 flags;
 
 	LASSERT(ocd);
 	flags = ocd->ocd_connect_flags;
@@ -172,6 +172,9 @@ static void obd_connect_data_seqprint(struct seq_file *m,
 	if (flags & OBD_CONNECT_MAXBYTES)
 		seq_printf(m, "       max_object_bytes: %llx\n",
 			   ocd->ocd_maxbytes);
+	if (flags & OBD_CONNECT_MULTIMODRPCS)
+		seq_printf(m, "       max_mod_rpcs: %hu\n",
+			   ocd->ocd_maxmodrpcs);
 }
 
 int lprocfs_read_frac_helper(char *buffer, unsigned long count, long val,
diff --git a/drivers/staging/lustre/lustre/ptlrpc/layout.c b/drivers/staging/lustre/lustre/ptlrpc/layout.c
index 839ef3e..4ea8454 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/layout.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/layout.c
@@ -1874,13 +1874,14 @@ static void *__req_capsule_get(struct req_capsule *pill,
 	getter = (field->rmf_flags & RMF_F_STRING) ?
 		(typeof(getter))lustre_msg_string : lustre_msg_buf;
 
-	if (field->rmf_flags & RMF_F_STRUCT_ARRAY) {
+	if (field->rmf_flags & (RMF_F_STRUCT_ARRAY | RMF_F_NO_SIZE_CHECK)) {
 		/*
 		 * We've already asserted that field->rmf_size > 0 in
 		 * req_layout_init().
 		 */
 		len = lustre_msg_buflen(msg, offset);
-		if ((len % field->rmf_size) != 0) {
+		if (!(field->rmf_flags & RMF_F_NO_SIZE_CHECK) &&
+		    (len % field->rmf_size)) {
 			CERROR("%s: array field size mismatch %d modulo %u != 0 (%d)\n",
 			       field->rmf_name, len, field->rmf_size, loc);
 			return NULL;
diff --git a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
index 36b86ae..2dc0b79 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
@@ -1492,6 +1492,9 @@ void lustre_swab_connect(struct obd_connect_data *ocd)
 		__swab32s(&ocd->ocd_max_easize);
 	if (ocd->ocd_connect_flags & OBD_CONNECT_MAXBYTES)
 		__swab64s(&ocd->ocd_maxbytes);
+	if (ocd->ocd_connect_flags & OBD_CONNECT_MULTIMODRPCS)
+		__swab16s(&ocd->ocd_maxmodrpcs);
+	CLASSERT(offsetof(typeof(*ocd), padding0));
 	CLASSERT(offsetof(typeof(*ocd), padding1) != 0);
 	CLASSERT(offsetof(typeof(*ocd), padding2) != 0);
 	CLASSERT(offsetof(typeof(*ocd), padding3) != 0);
diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
index cb89bf2..c299099 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
@@ -905,9 +905,17 @@ void lustre_assert_wire_constants(void)
 		 (long long)(int)offsetof(struct obd_connect_data, ocd_maxbytes));
 	LASSERTF((int)sizeof(((struct obd_connect_data *)0)->ocd_maxbytes) == 8, "found %lld\n",
 		 (long long)(int)sizeof(((struct obd_connect_data *)0)->ocd_maxbytes));
-	LASSERTF((int)offsetof(struct obd_connect_data, padding1) == 72, "found %lld\n",
+	LASSERTF((int)offsetof(struct obd_connect_data, ocd_maxmodrpcs) == 72, "found %lld\n",
+		 (long long)(int)offsetof(struct obd_connect_data, ocd_maxmodrpcs));
+	LASSERTF((int)sizeof(((struct obd_connect_data *)0)->ocd_maxmodrpcs) == 2, "found %lld\n",
+		 (long long)(int)sizeof(((struct obd_connect_data *)0)->ocd_maxmodrpcs));
+	LASSERTF((int)offsetof(struct obd_connect_data, padding0) == 74, "found %lld\n",
+		 (long long)(int)offsetof(struct obd_connect_data, padding0));
+	LASSERTF((int)sizeof(((struct obd_connect_data *)0)->padding0) == 2, "found %lld\n",
+		 (long long)(int)sizeof(((struct obd_connect_data *)0)->padding0));
+	LASSERTF((int)offsetof(struct obd_connect_data, padding1) == 76, "found %lld\n",
 		 (long long)(int)offsetof(struct obd_connect_data, padding1));
-	LASSERTF((int)sizeof(((struct obd_connect_data *)0)->padding1) == 8, "found %lld\n",
+	LASSERTF((int)sizeof(((struct obd_connect_data *)0)->padding1) == 4, "found %lld\n",
 		 (long long)(int)sizeof(((struct obd_connect_data *)0)->padding1));
 	LASSERTF((int)offsetof(struct obd_connect_data, padding2) == 80, "found %lld\n",
 		 (long long)(int)offsetof(struct obd_connect_data, padding2));
@@ -1075,6 +1083,8 @@ void lustre_assert_wire_constants(void)
 		 OBD_CONNECT_LFSCK);
 	LASSERTF(OBD_CONNECT_UNLINK_CLOSE == 0x100000000000000ULL, "found 0x%.16llxULL\n",
 		 OBD_CONNECT_UNLINK_CLOSE);
+	LASSERTF(OBD_CONNECT_MULTIMODRPCS == 0x200000000000000ULL, "found 0x%.16llxULL\n",
+		 OBD_CONNECT_MULTIMODRPCS);
 	LASSERTF(OBD_CONNECT_DIR_STRIPE == 0x400000000000000ULL, "found 0x%.16llxULL\n",
 		 OBD_CONNECT_DIR_STRIPE);
 	LASSERTF(OBD_CKSUM_CRC32 == 0x00000001UL, "found 0x%.8xUL\n",
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 12/41] staging: lustre: clio: get rid of lov_stripe_md reference
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Bobi Jam,
	Jinshan Xiong, James Simmons

From: Bobi Jam <bobijam.xu@intel.com>

Get rid of the lov_stripe_md reference when setting a file's stripe info.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5823
Reviewed-on: http://review.whamcloud.com/12639
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/file.c |   15 ---------------
 1 files changed, 0 insertions(+), 15 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 9ca933f..89a2841 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -1225,37 +1225,22 @@ int ll_lov_setstripe_ea_info(struct inode *inode, struct dentry *dentry,
 			     __u64 flags, struct lov_user_md *lum,
 			     int lum_size)
 {
-	struct lov_stripe_md *lsm = NULL;
 	struct lookup_intent oit = {
 		.it_op = IT_OPEN,
 		.it_flags = flags | MDS_OPEN_BY_FID,
 	};
 	int rc = 0;
 
-	lsm = ccc_inode_lsm_get(inode);
-	if (lsm) {
-		ccc_inode_lsm_put(inode, lsm);
-		CDEBUG(D_IOCTL, "stripe already exists for inode "DFID"\n",
-		       PFID(ll_inode2fid(inode)));
-		rc = -EEXIST;
-		goto out;
-	}
-
 	ll_inode_size_lock(inode);
 	rc = ll_intent_file_open(dentry, lum, lum_size, &oit);
 	if (rc < 0)
 		goto out_unlock;
-	rc = oit.it_status;
-	if (rc < 0)
-		goto out_unlock;
 
 	ll_release_openhandle(inode, &oit);
 
 out_unlock:
 	ll_inode_size_unlock(inode);
 	ll_intent_release(&oit);
-	ccc_inode_lsm_put(inode, lsm);
-out:
 	return rc;
 }
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 13/41] staging: lustre: clio: use CIT_SETATTR for FSFILT_IOC_SETFLAGS
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, Jinshan Xiong, James Simmons

From: John L. Hammond <john.hammond@intel.com>

Add handling of inode flags to the handlers of CIT_SETATTR in lov and
osc. In the FSFILT_IOC_SETFLAGS case of ll_iocontrol() use
cl_setattr_ost() rather than obd_setattr_rqset() to set inode flags on
OST objects. Remove the then unused OBD API methods
obd_setattr_rqset() and obd_setattr_async() along with their
supporting functions.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5823
Reviewed-on: http://review.whamcloud.com/13422
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
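Note for readers: the osc_io_setattr_start() hunk below stops setting
all three time bits unconditionally and instead packs each field only
when the matching ia_valid bit is present, with the inode flags riding
on ATTR_ATTR_FLAG. A minimal userspace sketch of that packing follows.
Every sk_* name and every bit value in it is a stand-in invented for
illustration; the real ATTR_* and OBD_MD_FL* constants and struct obdo
come from the kernel and Lustre headers, and this is not the code from
the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in request bits (hypothetical values, not the kernel's ATTR_*). */
enum {
	SK_ATTR_ATIME     = 1 << 0,
	SK_ATTR_MTIME     = 1 << 1,
	SK_ATTR_CTIME     = 1 << 2,
	SK_ATTR_ATTR_FLAG = 1 << 3,
};

/* Stand-in wire "valid" bits (hypothetical values, not OBD_MD_FL*). */
enum {
	SK_MD_FLID    = 1 << 0,
	SK_MD_FLGROUP = 1 << 1,
	SK_MD_FLATIME = 1 << 2,
	SK_MD_FLMTIME = 1 << 3,
	SK_MD_FLCTIME = 1 << 4,
	SK_MD_FLFLAGS = 1 << 5,
};

/* Simplified stand-in for the object attributes sent to the OST. */
struct sk_obdo {
	uint64_t o_valid;
	uint32_t o_flags;
	int64_t  o_atime, o_mtime, o_ctime;
};

/* Simplified stand-in for the setattr part of the I/O descriptor. */
struct sk_setattr {
	unsigned int sa_valid;
	unsigned int sa_attr_flags;
	int64_t      atime, mtime, ctime;
};

/* Pack only the fields the caller marked valid. */
static void sk_pack_setattr(struct sk_obdo *oa, const struct sk_setattr *sa)
{
	assert(oa && sa);

	/* Object id and group are always sent. */
	oa->o_valid |= SK_MD_FLID | SK_MD_FLGROUP;
	if (sa->sa_valid & SK_ATTR_CTIME) {
		oa->o_valid |= SK_MD_FLCTIME;
		oa->o_ctime = sa->ctime;
	}
	if (sa->sa_valid & SK_ATTR_ATIME) {
		oa->o_valid |= SK_MD_FLATIME;
		oa->o_atime = sa->atime;
	}
	if (sa->sa_valid & SK_ATTR_MTIME) {
		oa->o_valid |= SK_MD_FLMTIME;
		oa->o_mtime = sa->mtime;
	}
	/* A flags-only request sets the flags bit and nothing time related. */
	if (sa->sa_valid & SK_ATTR_ATTR_FLAG) {
		oa->o_flags = sa->sa_attr_flags;
		oa->o_valid |= SK_MD_FLFLAGS;
	}
}
```

With this shape, a request whose sa_valid carries only the flags bit
leaves the time fields untouched, which appears to be what lets the
FSFILT_IOC_SETFLAGS case reuse the CIT_SETATTR path without sending
zeroed timestamps.
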
 drivers/staging/lustre/lustre/include/cl_object.h  |    3 +-
 .../staging/lustre/lustre/include/lustre_compat.h  |    2 +
 drivers/staging/lustre/lustre/include/obd.h        |    3 -
 drivers/staging/lustre/lustre/include/obd_class.h  |   39 ------
 drivers/staging/lustre/lustre/llite/lcommon_cl.c   |    8 +-
 .../staging/lustre/lustre/llite/llite_internal.h   |    3 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c    |   33 ++----
 drivers/staging/lustre/lustre/llite/vvp_io.c       |   15 ++-
 drivers/staging/lustre/lustre/lov/lov_internal.h   |    6 -
 drivers/staging/lustre/lustre/lov/lov_io.c         |    2 +
 drivers/staging/lustre/lustre/lov/lov_obd.c        |   82 ------------
 drivers/staging/lustre/lustre/lov/lov_request.c    |  131 --------------------
 drivers/staging/lustre/lustre/osc/osc_internal.h   |    7 +-
 drivers/staging/lustre/lustre/osc/osc_io.c         |   29 +++--
 drivers/staging/lustre/lustre/osc/osc_request.c    |   19 +---
 15 files changed, 60 insertions(+), 322 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/cl_object.h b/drivers/staging/lustre/lustre/include/cl_object.h
index 3af9aa3..0b66d02 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -1772,9 +1772,10 @@ struct cl_io {
 		struct cl_io_rw_common ci_rw;
 		struct cl_setattr_io {
 			struct ost_lvb   sa_attr;
+			unsigned int		 sa_attr_flags;
 			unsigned int     sa_valid;
 			int		sa_stripe_index;
-			struct lu_fid  *sa_parent_fid;
+			const struct lu_fid	*sa_parent_fid;
 		} ci_setattr;
 		struct cl_fault_io {
 			/** page index within file. */
diff --git a/drivers/staging/lustre/lustre/include/lustre_compat.h b/drivers/staging/lustre/lustre/include/lustre_compat.h
index 567c438..300e96f 100644
--- a/drivers/staging/lustre/lustre/include/lustre_compat.h
+++ b/drivers/staging/lustre/lustre/include/lustre_compat.h
@@ -74,4 +74,6 @@
 # define ext2_find_next_zero_bit  find_next_zero_bit_le
 #endif
 
+#define TIMES_SET_FLAGS (ATTR_MTIME_SET | ATTR_ATIME_SET | ATTR_TIMES_SET)
+
 #endif /* _LUSTRE_COMPAT_H */
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index 51d5487..fe05cc6 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -896,9 +896,6 @@ struct obd_ops {
 		       struct obdo *oa, struct obd_trans_info *oti);
 	int (*setattr)(const struct lu_env *, struct obd_export *exp,
 		       struct obd_info *oinfo, struct obd_trans_info *oti);
-	int (*setattr_async)(struct obd_export *exp, struct obd_info *oinfo,
-			     struct obd_trans_info *oti,
-			     struct ptlrpc_request_set *rqset);
 	int (*getattr)(const struct lu_env *env, struct obd_export *exp,
 		       struct obd_info *oinfo);
 	int (*getattr_async)(struct obd_export *exp, struct obd_info *oinfo,
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index 2ea102d..b2ced8b 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -749,45 +749,6 @@ static inline int obd_setattr(const struct lu_env *env, struct obd_export *exp,
 	return rc;
 }
 
-/* This performs all the requests set init/wait/destroy actions. */
-static inline int obd_setattr_rqset(struct obd_export *exp,
-				    struct obd_info *oinfo,
-				    struct obd_trans_info *oti)
-{
-	struct ptlrpc_request_set *set = NULL;
-	int rc;
-
-	EXP_CHECK_DT_OP(exp, setattr_async);
-	EXP_COUNTER_INCREMENT(exp, setattr_async);
-
-	set =  ptlrpc_prep_set();
-	if (!set)
-		return -ENOMEM;
-
-	rc = OBP(exp->exp_obd, setattr_async)(exp, oinfo, oti, set);
-	if (rc == 0)
-		rc = ptlrpc_set_wait(set);
-	ptlrpc_set_destroy(set);
-	return rc;
-}
-
-/* This adds all the requests into @set if @set != NULL, otherwise
- * all requests are sent asynchronously without waiting for response.
- */
-static inline int obd_setattr_async(struct obd_export *exp,
-				    struct obd_info *oinfo,
-				    struct obd_trans_info *oti,
-				    struct ptlrpc_request_set *set)
-{
-	int rc;
-
-	EXP_CHECK_DT_OP(exp, setattr_async);
-	EXP_COUNTER_INCREMENT(exp, setattr_async);
-
-	rc = OBP(exp->exp_obd, setattr_async)(exp, oinfo, oti, set);
-	return rc;
-}
-
 static inline int obd_add_conn(struct obd_import *imp, struct obd_uuid *uuid,
 			       int priority)
 {
diff --git a/drivers/staging/lustre/lustre/llite/lcommon_cl.c b/drivers/staging/lustre/lustre/llite/lcommon_cl.c
index 084330d..64f4aed 100644
--- a/drivers/staging/lustre/lustre/llite/lcommon_cl.c
+++ b/drivers/staging/lustre/lustre/llite/lcommon_cl.c
@@ -80,7 +80,8 @@ int cl_inode_fini_refcheck;
  */
 static DEFINE_MUTEX(cl_inode_fini_guard);
 
-int cl_setattr_ost(struct inode *inode, const struct iattr *attr)
+int cl_setattr_ost(struct cl_object *obj, const struct iattr *attr,
+		   unsigned int attr_flags)
 {
 	struct lu_env *env;
 	struct cl_io  *io;
@@ -92,14 +93,15 @@ int cl_setattr_ost(struct inode *inode, const struct iattr *attr)
 		return PTR_ERR(env);
 
 	io = vvp_env_thread_io(env);
-	io->ci_obj = ll_i2info(inode)->lli_clob;
+	io->ci_obj = obj;
 
 	io->u.ci_setattr.sa_attr.lvb_atime = LTIME_S(attr->ia_atime);
 	io->u.ci_setattr.sa_attr.lvb_mtime = LTIME_S(attr->ia_mtime);
 	io->u.ci_setattr.sa_attr.lvb_ctime = LTIME_S(attr->ia_ctime);
 	io->u.ci_setattr.sa_attr.lvb_size = attr->ia_size;
+	io->u.ci_setattr.sa_attr_flags = attr_flags;
 	io->u.ci_setattr.sa_valid = attr->ia_valid;
-	io->u.ci_setattr.sa_parent_fid = ll_inode2fid(inode);
+	io->u.ci_setattr.sa_parent_fid = lu_object_fid(&obj->co_lu);
 
 again:
 	if (cl_io_init(env, io, CIT_SETATTR, io->ci_obj) == 0) {
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index b06cd3c..02541b1 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -1347,7 +1347,8 @@ int ll_page_sync_io(const struct lu_env *env, struct cl_io *io,
 int ll_getparent(struct file *file, struct getparent __user *arg);
 
 /* lcommon_cl.c */
-int cl_setattr_ost(struct inode *inode, const struct iattr *attr);
+int cl_setattr_ost(struct cl_object *obj, const struct iattr *attr,
+		   unsigned int attr_flags);
 
 extern struct lu_env *cl_inode_fini_env;
 extern int cl_inode_fini_refcheck;
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index c7aab3f..9112a52 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -1502,7 +1502,7 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import)
 		 */
 		if (attr->ia_valid & ATTR_SIZE)
 			down_write(&lli->lli_trunc_sem);
-		rc = cl_setattr_ost(inode, attr);
+		rc = cl_setattr_ost(ll_i2info(inode)->lli_clob, attr, 0);
 		if (attr->ia_valid & ATTR_SIZE)
 			up_write(&lli->lli_trunc_sem);
 	}
@@ -1879,9 +1879,9 @@ int ll_iocontrol(struct inode *inode, struct file *file,
 		return put_user(flags, (int __user *)arg);
 	}
 	case FSFILT_IOC_SETFLAGS: {
-		struct lov_stripe_md *lsm;
-		struct obd_info oinfo = { };
 		struct md_op_data *op_data;
+		struct cl_object *obj;
+		struct iattr *attr;
 
 		if (get_user(flags, (int __user *)arg))
 			return -EFAULT;
@@ -1901,30 +1901,17 @@ int ll_iocontrol(struct inode *inode, struct file *file,
 
 		inode->i_flags = ll_ext_to_inode_flags(flags);
 
-		lsm = ccc_inode_lsm_get(inode);
-		if (!lsm_has_objects(lsm)) {
-			ccc_inode_lsm_put(inode, lsm);
+		obj = ll_i2info(inode)->lli_clob;
+		if (!obj)
 			return 0;
-		}
 
-		oinfo.oi_oa = kmem_cache_zalloc(obdo_cachep, GFP_NOFS);
-		if (!oinfo.oi_oa) {
-			ccc_inode_lsm_put(inode, lsm);
+		attr = kzalloc(sizeof(*attr), GFP_NOFS);
+		if (!attr)
 			return -ENOMEM;
-		}
-		oinfo.oi_md = lsm;
-		oinfo.oi_oa->o_oi = lsm->lsm_oi;
-		oinfo.oi_oa->o_flags = flags;
-		oinfo.oi_oa->o_valid = OBD_MD_FLID | OBD_MD_FLFLAGS |
-				       OBD_MD_FLGROUP;
-		obdo_set_parent_fid(oinfo.oi_oa, &ll_i2info(inode)->lli_fid);
-		rc = obd_setattr_rqset(sbi->ll_dt_exp, &oinfo, NULL);
-		kmem_cache_free(obdo_cachep, oinfo.oi_oa);
-		ccc_inode_lsm_put(inode, lsm);
-
-		if (rc && rc != -EPERM && rc != -EACCES)
-			CERROR("osc_setattr_async fails: rc = %d\n", rc);
 
+		attr->ia_valid = ATTR_ATTR_FLAG;
+		rc = cl_setattr_ost(obj, attr, flags);
+		kfree(attr);
 		return rc;
 	}
 	default:
diff --git a/drivers/staging/lustre/lustre/llite/vvp_io.c b/drivers/staging/lustre/lustre/llite/vvp_io.c
index 8187fa3..8f1964f 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_io.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_io.c
@@ -571,9 +571,16 @@ static int vvp_io_setattr_lock(const struct lu_env *env,
 		if (new_size == 0)
 			enqflags = CEF_DISCARD_DATA;
 	} else {
-		if ((io->u.ci_setattr.sa_attr.lvb_mtime >=
-		     io->u.ci_setattr.sa_attr.lvb_ctime) ||
-		    (io->u.ci_setattr.sa_attr.lvb_atime >=
+		unsigned int valid = io->u.ci_setattr.sa_valid;
+
+		if (!(valid & TIMES_SET_FLAGS))
+			return 0;
+
+		if ((!(valid & ATTR_MTIME) ||
+		     io->u.ci_setattr.sa_attr.lvb_mtime >=
+		     io->u.ci_setattr.sa_attr.lvb_ctime) &&
+		    (!(valid & ATTR_ATIME) ||
+		     io->u.ci_setattr.sa_attr.lvb_atime >=
 		     io->u.ci_setattr.sa_attr.lvb_ctime))
 			return 0;
 		new_size = 0;
@@ -644,7 +651,7 @@ static int vvp_io_setattr_start(const struct lu_env *env,
 	if (cl_io_is_trunc(io))
 		result = vvp_io_setattr_trunc(env, ios, inode,
 					io->u.ci_setattr.sa_attr.lvb_size);
-	if (result == 0)
+	if (!result && io->u.ci_setattr.sa_valid & TIMES_SET_FLAGS)
 		result = vvp_io_setattr_time(env, ios);
 	return result;
 }
diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index 07e5ede..4743c65 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -155,12 +155,6 @@ int lov_update_common_set(struct lov_request_set *set,
 int lov_prep_getattr_set(struct obd_export *exp, struct obd_info *oinfo,
 			 struct lov_request_set **reqset);
 int lov_fini_getattr_set(struct lov_request_set *set);
-int lov_prep_setattr_set(struct obd_export *exp, struct obd_info *oinfo,
-			 struct obd_trans_info *oti,
-			 struct lov_request_set **reqset);
-int lov_update_setattr_set(struct lov_request_set *set,
-			   struct lov_request *req, int rc);
-int lov_fini_setattr_set(struct lov_request_set *set);
 int lov_prep_statfs_set(struct obd_device *obd, struct obd_info *oinfo,
 			struct lov_request_set **reqset);
 int lov_fini_statfs(struct obd_device *obd, struct obd_statfs *osfs,
diff --git a/drivers/staging/lustre/lustre/lov/lov_io.c b/drivers/staging/lustre/lustre/lov/lov_io.c
index e75e5d2..369718e 100644
--- a/drivers/staging/lustre/lustre/lov/lov_io.c
+++ b/drivers/staging/lustre/lustre/lov/lov_io.c
@@ -86,6 +86,8 @@ static void lov_io_sub_inherit(struct cl_io *io, struct lov_io *lio,
 	switch (io->ci_type) {
 	case CIT_SETATTR: {
 		io->u.ci_setattr.sa_attr = parent->u.ci_setattr.sa_attr;
+		io->u.ci_setattr.sa_attr_flags =
+					parent->u.ci_setattr.sa_attr_flags;
 		io->u.ci_setattr.sa_valid = parent->u.ci_setattr.sa_valid;
 		io->u.ci_setattr.sa_stripe_index = stripe;
 		io->u.ci_setattr.sa_parent_fid =
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index 02c7087..30903fc 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -1047,87 +1047,6 @@ out:
 	return rc ? rc : err;
 }
 
-static int lov_setattr_interpret(struct ptlrpc_request_set *rqset,
-				 void *data, int rc)
-{
-	struct lov_request_set *lovset = (struct lov_request_set *)data;
-	int err;
-
-	if (rc)
-		atomic_set(&lovset->set_completes, 0);
-	err = lov_fini_setattr_set(lovset);
-	return rc ? rc : err;
-}
-
-/* If @oti is given, the request goes from MDS and responses from OSTs are not
- * needed. Otherwise, a client is waiting for responses.
- */
-static int lov_setattr_async(struct obd_export *exp, struct obd_info *oinfo,
-			     struct obd_trans_info *oti,
-			     struct ptlrpc_request_set *rqset)
-{
-	struct lov_request_set *set;
-	struct lov_request *req;
-	struct lov_obd *lov;
-	int rc = 0;
-
-	LASSERT(oinfo);
-	ASSERT_LSM_MAGIC(oinfo->oi_md);
-	if (oinfo->oi_oa->o_valid & OBD_MD_FLCOOKIE) {
-		LASSERT(oti);
-		LASSERT(oti->oti_logcookies);
-	}
-
-	if (!exp || !exp->exp_obd)
-		return -ENODEV;
-
-	lov = &exp->exp_obd->u.lov;
-	rc = lov_prep_setattr_set(exp, oinfo, oti, &set);
-	if (rc)
-		return rc;
-
-	CDEBUG(D_INFO, "objid "DOSTID": %ux%u byte stripes\n",
-	       POSTID(&oinfo->oi_md->lsm_oi),
-	       oinfo->oi_md->lsm_stripe_count,
-	       oinfo->oi_md->lsm_stripe_size);
-
-	list_for_each_entry(req, &set->set_list, rq_link) {
-		if (oinfo->oi_oa->o_valid & OBD_MD_FLCOOKIE)
-			oti->oti_logcookies = set->set_cookies + req->rq_stripe;
-
-		CDEBUG(D_INFO, "objid " DOSTID "[%d] has subobj " DOSTID " at idx%u\n",
-		       POSTID(&oinfo->oi_oa->o_oi), req->rq_stripe,
-		       POSTID(&req->rq_oi.oi_oa->o_oi), req->rq_idx);
-
-		rc = obd_setattr_async(lov->lov_tgts[req->rq_idx]->ltd_exp,
-				       &req->rq_oi, oti, rqset);
-		if (rc) {
-			CERROR("error: setattr objid "DOSTID" subobj"
-			       DOSTID" on OST idx %d: rc = %d\n",
-			       POSTID(&set->set_oi->oi_oa->o_oi),
-			       POSTID(&req->rq_oi.oi_oa->o_oi),
-			       req->rq_idx, rc);
-			break;
-		}
-	}
-
-	/* If we are not waiting for responses on async requests, return. */
-	if (rc || !rqset || list_empty(&rqset->set_requests)) {
-		int err;
-
-		if (rc)
-			atomic_set(&set->set_completes, 0);
-		err = lov_fini_setattr_set(set);
-		return rc ? rc : err;
-	}
-
-	LASSERT(!rqset->set_interpret);
-	rqset->set_interpret = lov_setattr_interpret;
-	rqset->set_arg = (void *)set;
-
-	return 0;
-}
-
 int lov_statfs_interpret(struct ptlrpc_request_set *rqset, void *data, int rc)
 {
 	struct lov_request_set *lovset = (struct lov_request_set *)data;
@@ -1613,7 +1532,6 @@ static struct obd_ops lov_obd_ops = {
 	.packmd         = lov_packmd,
 	.unpackmd       = lov_unpackmd,
 	.getattr_async  = lov_getattr_async,
-	.setattr_async  = lov_setattr_async,
 	.iocontrol      = lov_iocontrol,
 	.get_info       = lov_get_info,
 	.set_info_async = lov_set_info_async,
diff --git a/drivers/staging/lustre/lustre/lov/lov_request.c b/drivers/staging/lustre/lustre/lov/lov_request.c
index 8e40702..048e597 100644
--- a/drivers/staging/lustre/lustre/lov/lov_request.c
+++ b/drivers/staging/lustre/lustre/lov/lov_request.c
@@ -311,137 +311,6 @@ out_set:
 	return rc;
 }
 
-int lov_fini_setattr_set(struct lov_request_set *set)
-{
-	int rc = 0;
-
-	if (!set)
-		return 0;
-	LASSERT(set->set_exp);
-	if (atomic_read(&set->set_completes)) {
-		rc = common_attr_done(set);
-		/* FIXME update qos data here */
-	}
-
-	lov_put_reqset(set);
-	return rc;
-}
-
-int lov_update_setattr_set(struct lov_request_set *set,
-			   struct lov_request *req, int rc)
-{
-	struct lov_obd *lov = &req->rq_rqset->set_exp->exp_obd->u.lov;
-	struct lov_stripe_md *lsm = req->rq_rqset->set_oi->oi_md;
-
-	lov_update_set(set, req, rc);
-
-	/* grace error on inactive ost */
-	if (rc && !(lov->lov_tgts[req->rq_idx] &&
-		    lov->lov_tgts[req->rq_idx]->ltd_active))
-		rc = 0;
-
-	if (rc == 0) {
-		if (req->rq_oi.oi_oa->o_valid & OBD_MD_FLCTIME)
-			lsm->lsm_oinfo[req->rq_stripe]->loi_lvb.lvb_ctime =
-				req->rq_oi.oi_oa->o_ctime;
-		if (req->rq_oi.oi_oa->o_valid & OBD_MD_FLMTIME)
-			lsm->lsm_oinfo[req->rq_stripe]->loi_lvb.lvb_mtime =
-				req->rq_oi.oi_oa->o_mtime;
-		if (req->rq_oi.oi_oa->o_valid & OBD_MD_FLATIME)
-			lsm->lsm_oinfo[req->rq_stripe]->loi_lvb.lvb_atime =
-				req->rq_oi.oi_oa->o_atime;
-	}
-
-	return rc;
-}
-
-/* The callback for osc_setattr_async that finalizes a request info when a
- * response is received.
- */
-static int cb_setattr_update(void *cookie, int rc)
-{
-	struct obd_info *oinfo = cookie;
-	struct lov_request *lovreq;
-
-	lovreq = container_of(oinfo, struct lov_request, rq_oi);
-	return lov_update_setattr_set(lovreq->rq_rqset, lovreq, rc);
-}
-
-int lov_prep_setattr_set(struct obd_export *exp, struct obd_info *oinfo,
-			 struct obd_trans_info *oti,
-			 struct lov_request_set **reqset)
-{
-	struct lov_request_set *set;
-	struct lov_obd *lov = &exp->exp_obd->u.lov;
-	int rc = 0, i;
-
-	set = kzalloc(sizeof(*set), GFP_NOFS);
-	if (!set)
-		return -ENOMEM;
-	lov_init_set(set);
-
-	set->set_exp = exp;
-	set->set_oi = oinfo;
-	if (oti && oinfo->oi_oa->o_valid & OBD_MD_FLCOOKIE)
-		set->set_cookies = oti->oti_logcookies;
-
-	for (i = 0; i < oinfo->oi_md->lsm_stripe_count; i++) {
-		struct lov_oinfo *loi = oinfo->oi_md->lsm_oinfo[i];
-		struct lov_request *req;
-
-		if (lov_oinfo_is_dummy(loi))
-			continue;
-
-		if (!lov_check_and_wait_active(lov, loi->loi_ost_idx)) {
-			CDEBUG(D_HA, "lov idx %d inactive\n", loi->loi_ost_idx);
-			continue;
-		}
-
-		req = kzalloc(sizeof(*req), GFP_NOFS);
-		if (!req) {
-			rc = -ENOMEM;
-			goto out_set;
-		}
-		req->rq_stripe = i;
-		req->rq_idx = loi->loi_ost_idx;
-
-		req->rq_oi.oi_oa = kmem_cache_zalloc(obdo_cachep, GFP_NOFS);
-		if (!req->rq_oi.oi_oa) {
-			kfree(req);
-			rc = -ENOMEM;
-			goto out_set;
-		}
-		memcpy(req->rq_oi.oi_oa, oinfo->oi_oa,
-		       sizeof(*req->rq_oi.oi_oa));
-		req->rq_oi.oi_oa->o_oi = loi->loi_oi;
-		req->rq_oi.oi_oa->o_stripe_idx = i;
-		req->rq_oi.oi_cb_up = cb_setattr_update;
-
-		if (oinfo->oi_oa->o_valid & OBD_MD_FLSIZE) {
-			int off = lov_stripe_offset(oinfo->oi_md,
-						    oinfo->oi_oa->o_size, i,
-						    &req->rq_oi.oi_oa->o_size);
-
-			if (off < 0 && req->rq_oi.oi_oa->o_size)
-				req->rq_oi.oi_oa->o_size--;
-
-			CDEBUG(D_INODE, "stripe %d has size %llu/%llu\n",
-			       i, req->rq_oi.oi_oa->o_size,
-			       oinfo->oi_oa->o_size);
-		}
-		lov_set_add_req(req, set);
-	}
-	if (!set->set_count) {
-		rc = -EIO;
-		goto out_set;
-	}
-	*reqset = set;
-	return rc;
-out_set:
-	lov_fini_setattr_set(set);
-	return rc;
-}
-
 #define LOV_U64_MAX ((__u64)~0ULL)
 #define LOV_SUM_MAX(tot, add)					   \
 	do {							    \
diff --git a/drivers/staging/lustre/lustre/osc/osc_internal.h b/drivers/staging/lustre/lustre/osc/osc_internal.h
index 9a61c9b..90ea0d1 100644
--- a/drivers/staging/lustre/lustre/osc/osc_internal.h
+++ b/drivers/staging/lustre/lustre/osc/osc_internal.h
@@ -119,10 +119,9 @@ int osc_match_base(struct obd_export *exp, struct ldlm_res_id *res_id,
 		   __u64 *flags, void *data, struct lustre_handle *lockh,
 		   int unref);
 
-int osc_setattr_async_base(struct obd_export *exp, struct obd_info *oinfo,
-			   struct obd_trans_info *oti,
-			   obd_enqueue_update_f upcall, void *cookie,
-			   struct ptlrpc_request_set *rqset);
+int osc_setattr_async(struct obd_export *exp, struct obd_info *oinfo,
+		      obd_enqueue_update_f upcall, void *cookie,
+		      struct ptlrpc_request_set *rqset);
 int osc_punch_base(struct obd_export *exp, struct obd_info *oinfo,
 		   obd_enqueue_update_f upcall, void *cookie,
 		   struct ptlrpc_request_set *rqset);
diff --git a/drivers/staging/lustre/lustre/osc/osc_io.c b/drivers/staging/lustre/lustre/osc/osc_io.c
index 47c6371..a96addf 100644
--- a/drivers/staging/lustre/lustre/osc/osc_io.c
+++ b/drivers/staging/lustre/lustre/osc/osc_io.c
@@ -524,11 +524,19 @@ static int osc_io_setattr_start(const struct lu_env *env,
 		oa->o_oi = loi->loi_oi;
 		obdo_set_parent_fid(oa, io->u.ci_setattr.sa_parent_fid);
 		oa->o_stripe_idx = io->u.ci_setattr.sa_stripe_index;
-		oa->o_mtime = attr->cat_mtime;
-		oa->o_atime = attr->cat_atime;
-		oa->o_ctime = attr->cat_ctime;
-		oa->o_valid |= OBD_MD_FLID | OBD_MD_FLGROUP | OBD_MD_FLATIME |
-			       OBD_MD_FLCTIME | OBD_MD_FLMTIME;
+		oa->o_valid |= OBD_MD_FLID | OBD_MD_FLGROUP;
+		if (ia_valid & ATTR_CTIME) {
+			oa->o_valid |= OBD_MD_FLCTIME;
+			oa->o_ctime = attr->cat_ctime;
+		}
+		if (ia_valid & ATTR_ATIME) {
+			oa->o_valid |= OBD_MD_FLATIME;
+			oa->o_atime = attr->cat_atime;
+		}
+		if (ia_valid & ATTR_MTIME) {
+			oa->o_valid |= OBD_MD_FLMTIME;
+			oa->o_mtime = attr->cat_mtime;
+		}
 		if (ia_valid & ATTR_SIZE) {
 			oa->o_size = size;
 			oa->o_blocks = OBD_OBJECT_EOF;
@@ -541,6 +549,10 @@ static int osc_io_setattr_start(const struct lu_env *env,
 		} else {
 			LASSERT(oio->oi_lockless == 0);
 		}
+		if (ia_valid & ATTR_ATTR_FLAG) {
+			oa->o_flags = io->u.ci_setattr.sa_attr_flags;
+			oa->o_valid |= OBD_MD_FLFLAGS;
+		}
 
 		oinfo.oi_oa = oa;
 		init_completion(&cbargs->opc_sync);
@@ -550,10 +562,9 @@ static int osc_io_setattr_start(const struct lu_env *env,
 						&oinfo, osc_async_upcall,
 						cbargs, PTLRPCD_SET);
 		else
-			result = osc_setattr_async_base(osc_export(cl2osc(obj)),
-							&oinfo, NULL,
-							osc_async_upcall,
-							cbargs, PTLRPCD_SET);
+			result = osc_setattr_async(osc_export(cl2osc(obj)),
+						   &oinfo, osc_async_upcall,
+						   cbargs, PTLRPCD_SET);
 		cbargs->opc_rpc_sent = result == 0;
 	}
 	return result;
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index 963a485..de2f522 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -341,10 +341,9 @@ out:
 	return rc;
 }
 
-int osc_setattr_async_base(struct obd_export *exp, struct obd_info *oinfo,
-			   struct obd_trans_info *oti,
-			   obd_enqueue_update_f upcall, void *cookie,
-			   struct ptlrpc_request_set *rqset)
+int osc_setattr_async(struct obd_export *exp, struct obd_info *oinfo,
+		      obd_enqueue_update_f upcall, void *cookie,
+		      struct ptlrpc_request_set *rqset)
 {
 	struct ptlrpc_request *req;
 	struct osc_setattr_args *sa;
@@ -360,9 +359,6 @@ int osc_setattr_async_base(struct obd_export *exp, struct obd_info *oinfo,
 		return rc;
 	}
 
-	if (oti && oinfo->oi_oa->o_valid & OBD_MD_FLCOOKIE)
-		oinfo->oi_oa->o_lcookie = *oti->oti_logcookies;
-
 	osc_pack_req_body(req, oinfo);
 
 	ptlrpc_request_set_replen(req);
@@ -390,14 +386,6 @@ int osc_setattr_async_base(struct obd_export *exp, struct obd_info *oinfo,
 	return 0;
 }
 
-static int osc_setattr_async(struct obd_export *exp, struct obd_info *oinfo,
-			     struct obd_trans_info *oti,
-			     struct ptlrpc_request_set *rqset)
-{
-	return osc_setattr_async_base(exp, oinfo, oti,
-				      oinfo->oi_cb_up, oinfo, rqset);
-}
-
 static int osc_create(const struct lu_env *env, struct obd_export *exp,
 		      struct obdo *oa, struct obd_trans_info *oti)
 {
@@ -3013,7 +3001,6 @@ static struct obd_ops osc_obd_ops = {
 	.getattr        = osc_getattr,
 	.getattr_async  = osc_getattr_async,
 	.setattr        = osc_setattr,
-	.setattr_async  = osc_setattr_async,
 	.iocontrol      = osc_iocontrol,
 	.set_info_async = osc_set_info_async,
 	.import_event   = osc_import_event,
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [lustre-devel] [PATCH 13/41] staging: lustre: clio: use CIT_SETATTR for FSFILT_IOC_SETFLAGS
@ 2016-10-03  2:28   ` James Simmons
  0 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, Jinshan Xiong, James Simmons

From: John L. Hammond <john.hammond@intel.com>

Add handling of inode flags to the handlers of CIT_SETATTR in lov and
osc. In the FSFILT_IOC_SETFLAGS case of ll_iocontrol() use
cl_setattr_ost() rather than obd_setattr_rqset() to set inode flags on
OST objects. Remove the then unused OBD API methods
obd_setattr_rqset() and obd_setattr_async() along with their
supporting functions.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5823
Reviewed-on: http://review.whamcloud.com/13422
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
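Note for readers: the vvp_io_setattr_lock() hunk below changes when a
non-truncating setattr takes an extent lock. As I read it, no lock is
taken unless one of the "times set" bits is present, and then only if
a requested mtime or atime is older than the ctime. A small userspace
sketch of that predicate follows. The sk_* names and bit values are
stand-ins made up for this note (the real ATTR_* bits and
TIMES_SET_FLAGS are defined in the kernel and in lustre_compat.h), so
treat it as an illustration of the boolean logic only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in bits (hypothetical values, not the kernel's ATTR_*). */
enum {
	SK_ATTR_ATIME     = 1 << 0,
	SK_ATTR_MTIME     = 1 << 1,
	SK_ATTR_ATIME_SET = 1 << 2,
	SK_ATTR_MTIME_SET = 1 << 3,
	SK_ATTR_TIMES_SET = 1 << 4,
};

#define SK_TIMES_SET_FLAGS \
	(SK_ATTR_MTIME_SET | SK_ATTR_ATIME_SET | SK_ATTR_TIMES_SET)

/*
 * Return true when the time update would need a lock: the caller set
 * times explicitly and at least one requested time precedes ctime.
 */
static bool sk_times_need_lock(unsigned int valid, int64_t atime,
			       int64_t mtime, int64_t ctime)
{
	/* Nothing was set explicitly (e.g. a flags-only request). */
	if (!(valid & SK_TIMES_SET_FLAGS))
		return false;

	/* Each time is acceptable if it is not requested or not in the past. */
	bool mtime_ok = !(valid & SK_ATTR_MTIME) || mtime >= ctime;
	bool atime_ok = !(valid & SK_ATTR_ATIME) || atime >= ctime;

	return !(mtime_ok && atime_ok);
}
```

A time that is not marked valid is ignored here, so stale values left
in the unused fields of the descriptor cannot force a lock on their
own.
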
 drivers/staging/lustre/lustre/include/cl_object.h  |    3 +-
 .../staging/lustre/lustre/include/lustre_compat.h  |    2 +
 drivers/staging/lustre/lustre/include/obd.h        |    3 -
 drivers/staging/lustre/lustre/include/obd_class.h  |   39 ------
 drivers/staging/lustre/lustre/llite/lcommon_cl.c   |    8 +-
 .../staging/lustre/lustre/llite/llite_internal.h   |    3 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c    |   33 ++----
 drivers/staging/lustre/lustre/llite/vvp_io.c       |   15 ++-
 drivers/staging/lustre/lustre/lov/lov_internal.h   |    6 -
 drivers/staging/lustre/lustre/lov/lov_io.c         |    2 +
 drivers/staging/lustre/lustre/lov/lov_obd.c        |   82 ------------
 drivers/staging/lustre/lustre/lov/lov_request.c    |  131 --------------------
 drivers/staging/lustre/lustre/osc/osc_internal.h   |    7 +-
 drivers/staging/lustre/lustre/osc/osc_io.c         |   29 +++--
 drivers/staging/lustre/lustre/osc/osc_request.c    |   19 +---
 15 files changed, 60 insertions(+), 322 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/cl_object.h b/drivers/staging/lustre/lustre/include/cl_object.h
index 3af9aa3..0b66d02 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -1772,9 +1772,10 @@ struct cl_io {
 		struct cl_io_rw_common ci_rw;
 		struct cl_setattr_io {
 			struct ost_lvb   sa_attr;
+			unsigned int		 sa_attr_flags;
 			unsigned int     sa_valid;
 			int		sa_stripe_index;
-			struct lu_fid  *sa_parent_fid;
+			const struct lu_fid	*sa_parent_fid;
 		} ci_setattr;
 		struct cl_fault_io {
 			/** page index within file. */
diff --git a/drivers/staging/lustre/lustre/include/lustre_compat.h b/drivers/staging/lustre/lustre/include/lustre_compat.h
index 567c438..300e96f 100644
--- a/drivers/staging/lustre/lustre/include/lustre_compat.h
+++ b/drivers/staging/lustre/lustre/include/lustre_compat.h
@@ -74,4 +74,6 @@
 # define ext2_find_next_zero_bit  find_next_zero_bit_le
 #endif
 
+#define TIMES_SET_FLAGS (ATTR_MTIME_SET | ATTR_ATIME_SET | ATTR_TIMES_SET)
+
 #endif /* _LUSTRE_COMPAT_H */
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index 51d5487..fe05cc6 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -896,9 +896,6 @@ struct obd_ops {
 		       struct obdo *oa, struct obd_trans_info *oti);
 	int (*setattr)(const struct lu_env *, struct obd_export *exp,
 		       struct obd_info *oinfo, struct obd_trans_info *oti);
-	int (*setattr_async)(struct obd_export *exp, struct obd_info *oinfo,
-			     struct obd_trans_info *oti,
-			     struct ptlrpc_request_set *rqset);
 	int (*getattr)(const struct lu_env *env, struct obd_export *exp,
 		       struct obd_info *oinfo);
 	int (*getattr_async)(struct obd_export *exp, struct obd_info *oinfo,
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index 2ea102d..b2ced8b 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -749,45 +749,6 @@ static inline int obd_setattr(const struct lu_env *env, struct obd_export *exp,
 	return rc;
 }
 
-/* This performs all the requests set init/wait/destroy actions. */
-static inline int obd_setattr_rqset(struct obd_export *exp,
-				    struct obd_info *oinfo,
-				    struct obd_trans_info *oti)
-{
-	struct ptlrpc_request_set *set = NULL;
-	int rc;
-
-	EXP_CHECK_DT_OP(exp, setattr_async);
-	EXP_COUNTER_INCREMENT(exp, setattr_async);
-
-	set =  ptlrpc_prep_set();
-	if (!set)
-		return -ENOMEM;
-
-	rc = OBP(exp->exp_obd, setattr_async)(exp, oinfo, oti, set);
-	if (rc == 0)
-		rc = ptlrpc_set_wait(set);
-	ptlrpc_set_destroy(set);
-	return rc;
-}
-
-/* This adds all the requests into @set if @set != NULL, otherwise
- * all requests are sent asynchronously without waiting for response.
- */
-static inline int obd_setattr_async(struct obd_export *exp,
-				    struct obd_info *oinfo,
-				    struct obd_trans_info *oti,
-				    struct ptlrpc_request_set *set)
-{
-	int rc;
-
-	EXP_CHECK_DT_OP(exp, setattr_async);
-	EXP_COUNTER_INCREMENT(exp, setattr_async);
-
-	rc = OBP(exp->exp_obd, setattr_async)(exp, oinfo, oti, set);
-	return rc;
-}
-
 static inline int obd_add_conn(struct obd_import *imp, struct obd_uuid *uuid,
 			       int priority)
 {
diff --git a/drivers/staging/lustre/lustre/llite/lcommon_cl.c b/drivers/staging/lustre/lustre/llite/lcommon_cl.c
index 084330d..64f4aed 100644
--- a/drivers/staging/lustre/lustre/llite/lcommon_cl.c
+++ b/drivers/staging/lustre/lustre/llite/lcommon_cl.c
@@ -80,7 +80,8 @@ int cl_inode_fini_refcheck;
  */
 static DEFINE_MUTEX(cl_inode_fini_guard);
 
-int cl_setattr_ost(struct inode *inode, const struct iattr *attr)
+int cl_setattr_ost(struct cl_object *obj, const struct iattr *attr,
+		   unsigned int attr_flags)
 {
 	struct lu_env *env;
 	struct cl_io  *io;
@@ -92,14 +93,15 @@ int cl_setattr_ost(struct inode *inode, const struct iattr *attr)
 		return PTR_ERR(env);
 
 	io = vvp_env_thread_io(env);
-	io->ci_obj = ll_i2info(inode)->lli_clob;
+	io->ci_obj = obj;
 
 	io->u.ci_setattr.sa_attr.lvb_atime = LTIME_S(attr->ia_atime);
 	io->u.ci_setattr.sa_attr.lvb_mtime = LTIME_S(attr->ia_mtime);
 	io->u.ci_setattr.sa_attr.lvb_ctime = LTIME_S(attr->ia_ctime);
 	io->u.ci_setattr.sa_attr.lvb_size = attr->ia_size;
+	io->u.ci_setattr.sa_attr_flags = attr_flags;
 	io->u.ci_setattr.sa_valid = attr->ia_valid;
-	io->u.ci_setattr.sa_parent_fid = ll_inode2fid(inode);
+	io->u.ci_setattr.sa_parent_fid = lu_object_fid(&obj->co_lu);
 
 again:
 	if (cl_io_init(env, io, CIT_SETATTR, io->ci_obj) == 0) {
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index b06cd3c..02541b1 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -1347,7 +1347,8 @@ int ll_page_sync_io(const struct lu_env *env, struct cl_io *io,
 int ll_getparent(struct file *file, struct getparent __user *arg);
 
 /* lcommon_cl.c */
-int cl_setattr_ost(struct inode *inode, const struct iattr *attr);
+int cl_setattr_ost(struct cl_object *obj, const struct iattr *attr,
+		   unsigned int attr_flags);
 
 extern struct lu_env *cl_inode_fini_env;
 extern int cl_inode_fini_refcheck;
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index c7aab3f..9112a52 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -1502,7 +1502,7 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import)
 		 */
 		if (attr->ia_valid & ATTR_SIZE)
 			down_write(&lli->lli_trunc_sem);
-		rc = cl_setattr_ost(inode, attr);
+		rc = cl_setattr_ost(ll_i2info(inode)->lli_clob, attr, 0);
 		if (attr->ia_valid & ATTR_SIZE)
 			up_write(&lli->lli_trunc_sem);
 	}
@@ -1879,9 +1879,9 @@ int ll_iocontrol(struct inode *inode, struct file *file,
 		return put_user(flags, (int __user *)arg);
 	}
 	case FSFILT_IOC_SETFLAGS: {
-		struct lov_stripe_md *lsm;
-		struct obd_info oinfo = { };
 		struct md_op_data *op_data;
+		struct cl_object *obj;
+		struct iattr *attr;
 
 		if (get_user(flags, (int __user *)arg))
 			return -EFAULT;
@@ -1901,30 +1901,17 @@ int ll_iocontrol(struct inode *inode, struct file *file,
 
 		inode->i_flags = ll_ext_to_inode_flags(flags);
 
-		lsm = ccc_inode_lsm_get(inode);
-		if (!lsm_has_objects(lsm)) {
-			ccc_inode_lsm_put(inode, lsm);
+		obj = ll_i2info(inode)->lli_clob;
+		if (!obj)
 			return 0;
-		}
 
-		oinfo.oi_oa = kmem_cache_zalloc(obdo_cachep, GFP_NOFS);
-		if (!oinfo.oi_oa) {
-			ccc_inode_lsm_put(inode, lsm);
+		attr = kzalloc(sizeof(*attr), GFP_NOFS);
+		if (!attr)
 			return -ENOMEM;
-		}
-		oinfo.oi_md = lsm;
-		oinfo.oi_oa->o_oi = lsm->lsm_oi;
-		oinfo.oi_oa->o_flags = flags;
-		oinfo.oi_oa->o_valid = OBD_MD_FLID | OBD_MD_FLFLAGS |
-				       OBD_MD_FLGROUP;
-		obdo_set_parent_fid(oinfo.oi_oa, &ll_i2info(inode)->lli_fid);
-		rc = obd_setattr_rqset(sbi->ll_dt_exp, &oinfo, NULL);
-		kmem_cache_free(obdo_cachep, oinfo.oi_oa);
-		ccc_inode_lsm_put(inode, lsm);
-
-		if (rc && rc != -EPERM && rc != -EACCES)
-			CERROR("osc_setattr_async fails: rc = %d\n", rc);
 
+		attr->ia_valid = ATTR_ATTR_FLAG;
+		rc = cl_setattr_ost(obj, attr, flags);
+		kfree(attr);
 		return rc;
 	}
 	default:
diff --git a/drivers/staging/lustre/lustre/llite/vvp_io.c b/drivers/staging/lustre/lustre/llite/vvp_io.c
index 8187fa3..8f1964f 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_io.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_io.c
@@ -571,9 +571,16 @@ static int vvp_io_setattr_lock(const struct lu_env *env,
 		if (new_size == 0)
 			enqflags = CEF_DISCARD_DATA;
 	} else {
-		if ((io->u.ci_setattr.sa_attr.lvb_mtime >=
-		     io->u.ci_setattr.sa_attr.lvb_ctime) ||
-		    (io->u.ci_setattr.sa_attr.lvb_atime >=
+		unsigned int valid = io->u.ci_setattr.sa_valid;
+
+		if (!(valid & TIMES_SET_FLAGS))
+			return 0;
+
+		if ((!(valid & ATTR_MTIME) ||
+		     io->u.ci_setattr.sa_attr.lvb_mtime >=
+		     io->u.ci_setattr.sa_attr.lvb_ctime) &&
+		    (!(valid & ATTR_ATIME) ||
+		     io->u.ci_setattr.sa_attr.lvb_atime >=
 		     io->u.ci_setattr.sa_attr.lvb_ctime))
 			return 0;
 		new_size = 0;
@@ -644,7 +651,7 @@ static int vvp_io_setattr_start(const struct lu_env *env,
 	if (cl_io_is_trunc(io))
 		result = vvp_io_setattr_trunc(env, ios, inode,
 					io->u.ci_setattr.sa_attr.lvb_size);
-	if (result == 0)
+	if (!result && io->u.ci_setattr.sa_valid & TIMES_SET_FLAGS)
 		result = vvp_io_setattr_time(env, ios);
 	return result;
 }
diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index 07e5ede..4743c65 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -155,12 +155,6 @@ int lov_update_common_set(struct lov_request_set *set,
 int lov_prep_getattr_set(struct obd_export *exp, struct obd_info *oinfo,
 			 struct lov_request_set **reqset);
 int lov_fini_getattr_set(struct lov_request_set *set);
-int lov_prep_setattr_set(struct obd_export *exp, struct obd_info *oinfo,
-			 struct obd_trans_info *oti,
-			 struct lov_request_set **reqset);
-int lov_update_setattr_set(struct lov_request_set *set,
-			   struct lov_request *req, int rc);
-int lov_fini_setattr_set(struct lov_request_set *set);
 int lov_prep_statfs_set(struct obd_device *obd, struct obd_info *oinfo,
 			struct lov_request_set **reqset);
 int lov_fini_statfs(struct obd_device *obd, struct obd_statfs *osfs,
diff --git a/drivers/staging/lustre/lustre/lov/lov_io.c b/drivers/staging/lustre/lustre/lov/lov_io.c
index e75e5d2..369718e 100644
--- a/drivers/staging/lustre/lustre/lov/lov_io.c
+++ b/drivers/staging/lustre/lustre/lov/lov_io.c
@@ -86,6 +86,8 @@ static void lov_io_sub_inherit(struct cl_io *io, struct lov_io *lio,
 	switch (io->ci_type) {
 	case CIT_SETATTR: {
 		io->u.ci_setattr.sa_attr = parent->u.ci_setattr.sa_attr;
+		io->u.ci_setattr.sa_attr_flags =
+					parent->u.ci_setattr.sa_attr_flags;
 		io->u.ci_setattr.sa_valid = parent->u.ci_setattr.sa_valid;
 		io->u.ci_setattr.sa_stripe_index = stripe;
 		io->u.ci_setattr.sa_parent_fid =
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index 02c7087..30903fc 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -1047,87 +1047,6 @@ out:
 	return rc ? rc : err;
 }
 
-static int lov_setattr_interpret(struct ptlrpc_request_set *rqset,
-				 void *data, int rc)
-{
-	struct lov_request_set *lovset = (struct lov_request_set *)data;
-	int err;
-
-	if (rc)
-		atomic_set(&lovset->set_completes, 0);
-	err = lov_fini_setattr_set(lovset);
-	return rc ? rc : err;
-}
-
-/* If @oti is given, the request goes from MDS and responses from OSTs are not
- * needed. Otherwise, a client is waiting for responses.
- */
-static int lov_setattr_async(struct obd_export *exp, struct obd_info *oinfo,
-			     struct obd_trans_info *oti,
-			     struct ptlrpc_request_set *rqset)
-{
-	struct lov_request_set *set;
-	struct lov_request *req;
-	struct lov_obd *lov;
-	int rc = 0;
-
-	LASSERT(oinfo);
-	ASSERT_LSM_MAGIC(oinfo->oi_md);
-	if (oinfo->oi_oa->o_valid & OBD_MD_FLCOOKIE) {
-		LASSERT(oti);
-		LASSERT(oti->oti_logcookies);
-	}
-
-	if (!exp || !exp->exp_obd)
-		return -ENODEV;
-
-	lov = &exp->exp_obd->u.lov;
-	rc = lov_prep_setattr_set(exp, oinfo, oti, &set);
-	if (rc)
-		return rc;
-
-	CDEBUG(D_INFO, "objid "DOSTID": %ux%u byte stripes\n",
-	       POSTID(&oinfo->oi_md->lsm_oi),
-	       oinfo->oi_md->lsm_stripe_count,
-	       oinfo->oi_md->lsm_stripe_size);
-
-	list_for_each_entry(req, &set->set_list, rq_link) {
-		if (oinfo->oi_oa->o_valid & OBD_MD_FLCOOKIE)
-			oti->oti_logcookies = set->set_cookies + req->rq_stripe;
-
-		CDEBUG(D_INFO, "objid " DOSTID "[%d] has subobj " DOSTID " at idx%u\n",
-		       POSTID(&oinfo->oi_oa->o_oi), req->rq_stripe,
-		       POSTID(&req->rq_oi.oi_oa->o_oi), req->rq_idx);
-
-		rc = obd_setattr_async(lov->lov_tgts[req->rq_idx]->ltd_exp,
-				       &req->rq_oi, oti, rqset);
-		if (rc) {
-			CERROR("error: setattr objid "DOSTID" subobj"
-			       DOSTID" on OST idx %d: rc = %d\n",
-			       POSTID(&set->set_oi->oi_oa->o_oi),
-			       POSTID(&req->rq_oi.oi_oa->o_oi),
-			       req->rq_idx, rc);
-			break;
-		}
-	}
-
-	/* If we are not waiting for responses on async requests, return. */
-	if (rc || !rqset || list_empty(&rqset->set_requests)) {
-		int err;
-
-		if (rc)
-			atomic_set(&set->set_completes, 0);
-		err = lov_fini_setattr_set(set);
-		return rc ? rc : err;
-	}
-
-	LASSERT(!rqset->set_interpret);
-	rqset->set_interpret = lov_setattr_interpret;
-	rqset->set_arg = (void *)set;
-
-	return 0;
-}
-
 int lov_statfs_interpret(struct ptlrpc_request_set *rqset, void *data, int rc)
 {
 	struct lov_request_set *lovset = (struct lov_request_set *)data;
@@ -1613,7 +1532,6 @@ static struct obd_ops lov_obd_ops = {
 	.packmd         = lov_packmd,
 	.unpackmd       = lov_unpackmd,
 	.getattr_async  = lov_getattr_async,
-	.setattr_async  = lov_setattr_async,
 	.iocontrol      = lov_iocontrol,
 	.get_info       = lov_get_info,
 	.set_info_async = lov_set_info_async,
diff --git a/drivers/staging/lustre/lustre/lov/lov_request.c b/drivers/staging/lustre/lustre/lov/lov_request.c
index 8e40702..048e597 100644
--- a/drivers/staging/lustre/lustre/lov/lov_request.c
+++ b/drivers/staging/lustre/lustre/lov/lov_request.c
@@ -311,137 +311,6 @@ out_set:
 	return rc;
 }
 
-int lov_fini_setattr_set(struct lov_request_set *set)
-{
-	int rc = 0;
-
-	if (!set)
-		return 0;
-	LASSERT(set->set_exp);
-	if (atomic_read(&set->set_completes)) {
-		rc = common_attr_done(set);
-		/* FIXME update qos data here */
-	}
-
-	lov_put_reqset(set);
-	return rc;
-}
-
-int lov_update_setattr_set(struct lov_request_set *set,
-			   struct lov_request *req, int rc)
-{
-	struct lov_obd *lov = &req->rq_rqset->set_exp->exp_obd->u.lov;
-	struct lov_stripe_md *lsm = req->rq_rqset->set_oi->oi_md;
-
-	lov_update_set(set, req, rc);
-
-	/* grace error on inactive ost */
-	if (rc && !(lov->lov_tgts[req->rq_idx] &&
-		    lov->lov_tgts[req->rq_idx]->ltd_active))
-		rc = 0;
-
-	if (rc == 0) {
-		if (req->rq_oi.oi_oa->o_valid & OBD_MD_FLCTIME)
-			lsm->lsm_oinfo[req->rq_stripe]->loi_lvb.lvb_ctime =
-				req->rq_oi.oi_oa->o_ctime;
-		if (req->rq_oi.oi_oa->o_valid & OBD_MD_FLMTIME)
-			lsm->lsm_oinfo[req->rq_stripe]->loi_lvb.lvb_mtime =
-				req->rq_oi.oi_oa->o_mtime;
-		if (req->rq_oi.oi_oa->o_valid & OBD_MD_FLATIME)
-			lsm->lsm_oinfo[req->rq_stripe]->loi_lvb.lvb_atime =
-				req->rq_oi.oi_oa->o_atime;
-	}
-
-	return rc;
-}
-
-/* The callback for osc_setattr_async that finalizes a request info when a
- * response is received.
- */
-static int cb_setattr_update(void *cookie, int rc)
-{
-	struct obd_info *oinfo = cookie;
-	struct lov_request *lovreq;
-
-	lovreq = container_of(oinfo, struct lov_request, rq_oi);
-	return lov_update_setattr_set(lovreq->rq_rqset, lovreq, rc);
-}
-
-int lov_prep_setattr_set(struct obd_export *exp, struct obd_info *oinfo,
-			 struct obd_trans_info *oti,
-			 struct lov_request_set **reqset)
-{
-	struct lov_request_set *set;
-	struct lov_obd *lov = &exp->exp_obd->u.lov;
-	int rc = 0, i;
-
-	set = kzalloc(sizeof(*set), GFP_NOFS);
-	if (!set)
-		return -ENOMEM;
-	lov_init_set(set);
-
-	set->set_exp = exp;
-	set->set_oi = oinfo;
-	if (oti && oinfo->oi_oa->o_valid & OBD_MD_FLCOOKIE)
-		set->set_cookies = oti->oti_logcookies;
-
-	for (i = 0; i < oinfo->oi_md->lsm_stripe_count; i++) {
-		struct lov_oinfo *loi = oinfo->oi_md->lsm_oinfo[i];
-		struct lov_request *req;
-
-		if (lov_oinfo_is_dummy(loi))
-			continue;
-
-		if (!lov_check_and_wait_active(lov, loi->loi_ost_idx)) {
-			CDEBUG(D_HA, "lov idx %d inactive\n", loi->loi_ost_idx);
-			continue;
-		}
-
-		req = kzalloc(sizeof(*req), GFP_NOFS);
-		if (!req) {
-			rc = -ENOMEM;
-			goto out_set;
-		}
-		req->rq_stripe = i;
-		req->rq_idx = loi->loi_ost_idx;
-
-		req->rq_oi.oi_oa = kmem_cache_zalloc(obdo_cachep, GFP_NOFS);
-		if (!req->rq_oi.oi_oa) {
-			kfree(req);
-			rc = -ENOMEM;
-			goto out_set;
-		}
-		memcpy(req->rq_oi.oi_oa, oinfo->oi_oa,
-		       sizeof(*req->rq_oi.oi_oa));
-		req->rq_oi.oi_oa->o_oi = loi->loi_oi;
-		req->rq_oi.oi_oa->o_stripe_idx = i;
-		req->rq_oi.oi_cb_up = cb_setattr_update;
-
-		if (oinfo->oi_oa->o_valid & OBD_MD_FLSIZE) {
-			int off = lov_stripe_offset(oinfo->oi_md,
-						    oinfo->oi_oa->o_size, i,
-						    &req->rq_oi.oi_oa->o_size);
-
-			if (off < 0 && req->rq_oi.oi_oa->o_size)
-				req->rq_oi.oi_oa->o_size--;
-
-			CDEBUG(D_INODE, "stripe %d has size %llu/%llu\n",
-			       i, req->rq_oi.oi_oa->o_size,
-			       oinfo->oi_oa->o_size);
-		}
-		lov_set_add_req(req, set);
-	}
-	if (!set->set_count) {
-		rc = -EIO;
-		goto out_set;
-	}
-	*reqset = set;
-	return rc;
-out_set:
-	lov_fini_setattr_set(set);
-	return rc;
-}
-
 #define LOV_U64_MAX ((__u64)~0ULL)
 #define LOV_SUM_MAX(tot, add)					   \
 	do {							    \
diff --git a/drivers/staging/lustre/lustre/osc/osc_internal.h b/drivers/staging/lustre/lustre/osc/osc_internal.h
index 9a61c9b..90ea0d1 100644
--- a/drivers/staging/lustre/lustre/osc/osc_internal.h
+++ b/drivers/staging/lustre/lustre/osc/osc_internal.h
@@ -119,10 +119,9 @@ int osc_match_base(struct obd_export *exp, struct ldlm_res_id *res_id,
 		   __u64 *flags, void *data, struct lustre_handle *lockh,
 		   int unref);
 
-int osc_setattr_async_base(struct obd_export *exp, struct obd_info *oinfo,
-			   struct obd_trans_info *oti,
-			   obd_enqueue_update_f upcall, void *cookie,
-			   struct ptlrpc_request_set *rqset);
+int osc_setattr_async(struct obd_export *exp, struct obd_info *oinfo,
+		      obd_enqueue_update_f upcall, void *cookie,
+		      struct ptlrpc_request_set *rqset);
 int osc_punch_base(struct obd_export *exp, struct obd_info *oinfo,
 		   obd_enqueue_update_f upcall, void *cookie,
 		   struct ptlrpc_request_set *rqset);
diff --git a/drivers/staging/lustre/lustre/osc/osc_io.c b/drivers/staging/lustre/lustre/osc/osc_io.c
index 47c6371..a96addf 100644
--- a/drivers/staging/lustre/lustre/osc/osc_io.c
+++ b/drivers/staging/lustre/lustre/osc/osc_io.c
@@ -524,11 +524,19 @@ static int osc_io_setattr_start(const struct lu_env *env,
 		oa->o_oi = loi->loi_oi;
 		obdo_set_parent_fid(oa, io->u.ci_setattr.sa_parent_fid);
 		oa->o_stripe_idx = io->u.ci_setattr.sa_stripe_index;
-		oa->o_mtime = attr->cat_mtime;
-		oa->o_atime = attr->cat_atime;
-		oa->o_ctime = attr->cat_ctime;
-		oa->o_valid |= OBD_MD_FLID | OBD_MD_FLGROUP | OBD_MD_FLATIME |
-			       OBD_MD_FLCTIME | OBD_MD_FLMTIME;
+		oa->o_valid |= OBD_MD_FLID | OBD_MD_FLGROUP;
+		if (ia_valid & ATTR_CTIME) {
+			oa->o_valid |= OBD_MD_FLCTIME;
+			oa->o_ctime = attr->cat_ctime;
+		}
+		if (ia_valid & ATTR_ATIME) {
+			oa->o_valid |= OBD_MD_FLATIME;
+			oa->o_atime = attr->cat_atime;
+		}
+		if (ia_valid & ATTR_MTIME) {
+			oa->o_valid |= OBD_MD_FLMTIME;
+			oa->o_mtime = attr->cat_mtime;
+		}
 		if (ia_valid & ATTR_SIZE) {
 			oa->o_size = size;
 			oa->o_blocks = OBD_OBJECT_EOF;
@@ -541,6 +549,10 @@ static int osc_io_setattr_start(const struct lu_env *env,
 		} else {
 			LASSERT(oio->oi_lockless == 0);
 		}
+		if (ia_valid & ATTR_ATTR_FLAG) {
+			oa->o_flags = io->u.ci_setattr.sa_attr_flags;
+			oa->o_valid |= OBD_MD_FLFLAGS;
+		}
 
 		oinfo.oi_oa = oa;
 		init_completion(&cbargs->opc_sync);
@@ -550,10 +562,9 @@ static int osc_io_setattr_start(const struct lu_env *env,
 						&oinfo, osc_async_upcall,
 						cbargs, PTLRPCD_SET);
 		else
-			result = osc_setattr_async_base(osc_export(cl2osc(obj)),
-							&oinfo, NULL,
-							osc_async_upcall,
-							cbargs, PTLRPCD_SET);
+			result = osc_setattr_async(osc_export(cl2osc(obj)),
+						   &oinfo, osc_async_upcall,
+						   cbargs, PTLRPCD_SET);
 		cbargs->opc_rpc_sent = result == 0;
 	}
 	return result;
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index 963a485..de2f522 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -341,10 +341,9 @@ out:
 	return rc;
 }
 
-int osc_setattr_async_base(struct obd_export *exp, struct obd_info *oinfo,
-			   struct obd_trans_info *oti,
-			   obd_enqueue_update_f upcall, void *cookie,
-			   struct ptlrpc_request_set *rqset)
+int osc_setattr_async(struct obd_export *exp, struct obd_info *oinfo,
+		      obd_enqueue_update_f upcall, void *cookie,
+		      struct ptlrpc_request_set *rqset)
 {
 	struct ptlrpc_request *req;
 	struct osc_setattr_args *sa;
@@ -360,9 +359,6 @@ int osc_setattr_async_base(struct obd_export *exp, struct obd_info *oinfo,
 		return rc;
 	}
 
-	if (oti && oinfo->oi_oa->o_valid & OBD_MD_FLCOOKIE)
-		oinfo->oi_oa->o_lcookie = *oti->oti_logcookies;
-
 	osc_pack_req_body(req, oinfo);
 
 	ptlrpc_request_set_replen(req);
@@ -390,14 +386,6 @@ int osc_setattr_async_base(struct obd_export *exp, struct obd_info *oinfo,
 	return 0;
 }
 
-static int osc_setattr_async(struct obd_export *exp, struct obd_info *oinfo,
-			     struct obd_trans_info *oti,
-			     struct ptlrpc_request_set *rqset)
-{
-	return osc_setattr_async_base(exp, oinfo, oti,
-				      oinfo->oi_cb_up, oinfo, rqset);
-}
-
 static int osc_create(const struct lu_env *env, struct obd_export *exp,
 		      struct obdo *oa, struct obd_trans_info *oti)
 {
@@ -3013,7 +3001,6 @@ static struct obd_ops osc_obd_ops = {
 	.getattr        = osc_getattr,
 	.getattr_async  = osc_getattr_async,
 	.setattr        = osc_setattr,
-	.setattr_async  = osc_setattr_async,
 	.iocontrol      = osc_iocontrol,
 	.set_info_async = osc_set_info_async,
 	.import_event   = osc_import_event,
-- 
1.7.1
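
Reader's note on the vvp_io_setattr_lock() hunk above: the reworked
timestamp test can be read as a small predicate. The sketch below is a
user-space illustration only, not the kernel code. The SK_* flag values,
struct sk_times and both function names are hypothetical stand-ins; only
the shape of the two conditions is taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch; not the kernel's ATTR_* values. */
#define SK_ATTR_ATIME      (1u << 0)
#define SK_ATTR_MTIME      (1u << 1)
#define SK_TIMES_SET_FLAGS (1u << 2)

/* Hypothetical container for the three timestamps being compared. */
struct sk_times {
	uint64_t atime;
	uint64_t mtime;
	uint64_t ctime;
};

/* Shape of the removed test: return early if either time is >= ctime. */
static bool old_returns_early(const struct sk_times *t)
{
	return t->mtime >= t->ctime || t->atime >= t->ctime;
}

/*
 * Shape of the added test: return early when no time is being set
 * explicitly, or when every time that is actually being changed is
 * already >= ctime. Times that are not in 'valid' are ignored.
 */
static bool new_returns_early(unsigned int valid, const struct sk_times *t)
{
	if (!(valid & SK_TIMES_SET_FLAGS))
		return true;

	return (!(valid & SK_ATTR_MTIME) || t->mtime >= t->ctime) &&
	       (!(valid & SK_ATTR_ATIME) || t->atime >= t->ctime);
}
```

With atime older than ctime and mtime newer, the old form returns early
because one comparison is enough, while the new form does not when both
times are being set, since each requested time must pass its own check.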


* [PATCH 14/41] staging: lustre: ptlrpc: Add a tag field to ptlrpc messages
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Gregoire Pichon, James Simmons

From: Gregoire Pichon <gregoire.pichon@bull.net>

The new tag field serves as a virtual slot index for managing multiple
modifying RPCs. It is set by the client and allows the target to
release in-memory reply data when the tag is reused by a new RPC.

The tag field replaces the unused last_seen field of the ptlrpc_body
structure.

Additionally, the last_xid field now carries the highest XID for which
a reply has been received and which has no unreplied lower-numbered
XID.

Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5319
Reviewed-on: http://review.whamcloud.com/14095
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../lustre/lustre/include/lustre/lustre_idl.h      |   12 ++++--
 drivers/staging/lustre/lustre/include/lustre_net.h |    2 +
 .../staging/lustre/lustre/ptlrpc/pack_generic.c    |   39 +++++++++++++++++++-
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c    |   32 ++++++++++++----
 4 files changed, 72 insertions(+), 13 deletions(-)
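
Reader's note, not part of the commit: the layout change described above
keeps the wire format stable because the 8-byte last_seen slot is split
into a 2-byte tag plus 2 + 4 bytes of padding. The sketch below checks
that in user space with a simplified stand-in for the head of
ptlrpc_body. The struct names are hypothetical, and pb_handle and
pb_type are assumptions (they do not appear in this diff); they are sized
so that the offsets match the values asserted in wiretest.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified head of the body before the patch (hypothetical name). */
struct pb_head_old {
	uint64_t pb_handle;		/* assumed 8-byte handle */
	uint32_t pb_type;		/* assumed */
	uint32_t pb_version;
	uint32_t pb_opc;
	uint32_t pb_status;
	uint64_t pb_last_xid;
	uint64_t pb_last_seen;		/* unused slot being replaced */
	uint64_t pb_last_committed;
};

/* Simplified head of the body after the patch (hypothetical name). */
struct pb_head_new {
	uint64_t pb_handle;		/* assumed 8-byte handle */
	uint32_t pb_type;		/* assumed */
	uint32_t pb_version;
	uint32_t pb_opc;
	uint32_t pb_status;
	uint64_t pb_last_xid;
	uint16_t pb_tag;		/* virtual slot index */
	uint16_t pb_padding0;
	uint32_t pb_padding1;
	uint64_t pb_last_committed;
};

/* Returns 1 when the new fields occupy exactly the old last_seen slot. */
static int pb_layout_compatible(void)
{
	return offsetof(struct pb_head_new, pb_tag) ==
	       offsetof(struct pb_head_old, pb_last_seen) &&
	       offsetof(struct pb_head_new, pb_last_committed) ==
	       offsetof(struct pb_head_old, pb_last_committed) &&
	       sizeof(struct pb_head_new) == sizeof(struct pb_head_old);
}
```

The offsets 32, 34, 36 and 40 are the same numbers the wiretest.c hunk
asserts for pb_tag, pb_padding0, pb_padding1 and pb_last_committed.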

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index b88807f..4f6eeec 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -1099,8 +1099,10 @@ struct ptlrpc_body_v3 {
 	__u32 pb_version;
 	__u32 pb_opc;
 	__u32 pb_status;
-	__u64 pb_last_xid;
-	__u64 pb_last_seen;
+	__u64 pb_last_xid; /* highest replied XID without lower unreplied XID */
+	__u16 pb_tag;      /* virtual slot idx for multiple modifying RPCs */
+	__u16 pb_padding0;
+	__u32 pb_padding1;
 	__u64 pb_last_committed;
 	__u64 pb_transno;
 	__u32 pb_flags;
@@ -1125,8 +1127,10 @@ struct ptlrpc_body_v2 {
 	__u32 pb_version;
 	__u32 pb_opc;
 	__u32 pb_status;
-	__u64 pb_last_xid;
-	__u64 pb_last_seen;
+	__u64 pb_last_xid; /* highest replied XID without lower unreplied XID */
+	__u16 pb_tag;      /* virtual slot idx for multiple modifying RPCs */
+	__u16 pb_padding0;
+	__u32 pb_padding1;
 	__u64 pb_last_committed;
 	__u64 pb_transno;
 	__u32 pb_flags;
diff --git a/drivers/staging/lustre/lustre/include/lustre_net.h b/drivers/staging/lustre/lustre/include/lustre_net.h
index e9aba99..ab80330 100644
--- a/drivers/staging/lustre/lustre/include/lustre_net.h
+++ b/drivers/staging/lustre/lustre/include/lustre_net.h
@@ -2652,6 +2652,7 @@ struct lustre_handle *lustre_msg_get_handle(struct lustre_msg *msg);
 __u32 lustre_msg_get_type(struct lustre_msg *msg);
 void lustre_msg_add_version(struct lustre_msg *msg, u32 version);
 __u32 lustre_msg_get_opc(struct lustre_msg *msg);
+__u16 lustre_msg_get_tag(struct lustre_msg *msg);
 __u64 lustre_msg_get_last_committed(struct lustre_msg *msg);
 __u64 *lustre_msg_get_versions(struct lustre_msg *msg);
 __u64 lustre_msg_get_transno(struct lustre_msg *msg);
@@ -2670,6 +2671,7 @@ void lustre_msg_set_handle(struct lustre_msg *msg,
 			   struct lustre_handle *handle);
 void lustre_msg_set_type(struct lustre_msg *msg, __u32 type);
 void lustre_msg_set_opc(struct lustre_msg *msg, __u32 opc);
+void lustre_msg_set_tag(struct lustre_msg *msg, __u16 tag);
 void lustre_msg_set_versions(struct lustre_msg *msg, __u64 *versions);
 void lustre_msg_set_transno(struct lustre_msg *msg, __u64 transno);
 void lustre_msg_set_status(struct lustre_msg *msg, __u32 status);
diff --git a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
index 2dc0b79..3055649 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
@@ -942,6 +942,25 @@ __u32 lustre_msg_get_opc(struct lustre_msg *msg)
 }
 EXPORT_SYMBOL(lustre_msg_get_opc);
 
+__u16 lustre_msg_get_tag(struct lustre_msg *msg)
+{
+	switch (msg->lm_magic) {
+	case LUSTRE_MSG_MAGIC_V2: {
+		struct ptlrpc_body *pb = lustre_msg_ptlrpc_body(msg);
+
+		if (!pb) {
+			CERROR("invalid msg %p: no ptlrpc body!\n", msg);
+			return 0;
+		}
+		return pb->pb_tag;
+	}
+	default:
+		CERROR("incorrect message magic: %08x\n", msg->lm_magic);
+		return 0;
+	}
+}
+EXPORT_SYMBOL(lustre_msg_get_tag);
+
 __u64 lustre_msg_get_last_committed(struct lustre_msg *msg)
 {
 	switch (msg->lm_magic) {
@@ -1236,6 +1255,22 @@ void lustre_msg_set_opc(struct lustre_msg *msg, __u32 opc)
 	}
 }
 
+void lustre_msg_set_tag(struct lustre_msg *msg, __u16 tag)
+{
+	switch (msg->lm_magic) {
+	case LUSTRE_MSG_MAGIC_V2: {
+		struct ptlrpc_body *pb = lustre_msg_ptlrpc_body(msg);
+
+		LASSERTF(pb, "invalid msg %p: no ptlrpc body!\n", msg);
+		pb->pb_tag = tag;
+		return;
+	}
+	default:
+		LASSERTF(0, "incorrect message magic: %08x\n", msg->lm_magic);
+	}
+}
+EXPORT_SYMBOL(lustre_msg_set_tag);
+
 void lustre_msg_set_versions(struct lustre_msg *msg, __u64 *versions)
 {
 	switch (msg->lm_magic) {
@@ -1442,7 +1477,7 @@ void lustre_swab_ptlrpc_body(struct ptlrpc_body *b)
 	__swab32s(&b->pb_opc);
 	__swab32s(&b->pb_status);
 	__swab64s(&b->pb_last_xid);
-	__swab64s(&b->pb_last_seen);
+	__swab16s(&b->pb_tag);
 	__swab64s(&b->pb_last_committed);
 	__swab64s(&b->pb_transno);
 	__swab32s(&b->pb_flags);
@@ -1456,6 +1491,8 @@ void lustre_swab_ptlrpc_body(struct ptlrpc_body *b)
 	__swab64s(&b->pb_pre_versions[1]);
 	__swab64s(&b->pb_pre_versions[2]);
 	__swab64s(&b->pb_pre_versions[3]);
+	CLASSERT(offsetof(typeof(*b), pb_padding0) != 0);
+	CLASSERT(offsetof(typeof(*b), pb_padding1) != 0);
 	CLASSERT(offsetof(typeof(*b), pb_padding) != 0);
 	/* While we need to maintain compatibility between
 	 * clients and servers without ptlrpc_body_v2 (< 2.3)
diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
index c299099..d5af8cd 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
@@ -635,10 +635,18 @@ void lustre_assert_wire_constants(void)
 		 (long long)(int)offsetof(struct ptlrpc_body_v3, pb_last_xid));
 	LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_last_xid) == 8, "found %lld\n",
 		 (long long)(int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_last_xid));
-	LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_last_seen) == 32, "found %lld\n",
-		 (long long)(int)offsetof(struct ptlrpc_body_v3, pb_last_seen));
-	LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_last_seen) == 8, "found %lld\n",
-		 (long long)(int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_last_seen));
+	LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_tag) == 32, "found %lld\n",
+		 (long long)(int)offsetof(struct ptlrpc_body_v3, pb_tag));
+	LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_tag) == 2, "found %lld\n",
+		 (long long)(int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_tag));
+	LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_padding0) == 34, "found %lld\n",
+		 (long long)(int)offsetof(struct ptlrpc_body_v3, pb_padding0));
+	LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_padding0) == 2, "found %lld\n",
+		 (long long)(int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_padding0));
+	LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_padding1) == 36, "found %lld\n",
+		 (long long)(int)offsetof(struct ptlrpc_body_v3, pb_padding1));
+	LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_padding1) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_padding1));
 	LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_last_committed) == 40, "found %lld\n",
 		 (long long)(int)offsetof(struct ptlrpc_body_v3, pb_last_committed));
 	LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_last_committed) == 8, "found %lld\n",
@@ -713,10 +721,18 @@ void lustre_assert_wire_constants(void)
 		 (int)offsetof(struct ptlrpc_body_v3, pb_last_xid), (int)offsetof(struct ptlrpc_body_v2, pb_last_xid));
 	LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_last_xid) == (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_last_xid), "%d != %d\n",
 		 (int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_last_xid), (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_last_xid));
-	LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_last_seen) == (int)offsetof(struct ptlrpc_body_v2, pb_last_seen), "%d != %d\n",
-		 (int)offsetof(struct ptlrpc_body_v3, pb_last_seen), (int)offsetof(struct ptlrpc_body_v2, pb_last_seen));
-	LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_last_seen) == (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_last_seen), "%d != %d\n",
-		 (int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_last_seen), (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_last_seen));
+	LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_tag) == (int)offsetof(struct ptlrpc_body_v2, pb_tag), "%d != %d\n",
+		 (int)offsetof(struct ptlrpc_body_v3, pb_tag), (int)offsetof(struct ptlrpc_body_v2, pb_tag));
+	LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_tag) == (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_tag), "%d != %d\n",
+		 (int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_tag), (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_tag));
+	LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_padding0) == (int)offsetof(struct ptlrpc_body_v2, pb_padding0), "%d != %d\n",
+		 (int)offsetof(struct ptlrpc_body_v3, pb_padding0), (int)offsetof(struct ptlrpc_body_v2, pb_padding0));
+	LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_padding0) == (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_padding0), "%d != %d\n",
+		 (int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_padding0), (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_padding0));
+	LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_padding1) == (int)offsetof(struct ptlrpc_body_v2, pb_padding1), "%d != %d\n",
+		 (int)offsetof(struct ptlrpc_body_v3, pb_padding1), (int)offsetof(struct ptlrpc_body_v2, pb_padding1));
+	LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_padding1) == (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_padding1), "%d != %d\n",
+		 (int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_padding1), (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_padding1));
 	LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_last_committed) == (int)offsetof(struct ptlrpc_body_v2, pb_last_committed), "%d != %d\n",
 		 (int)offsetof(struct ptlrpc_body_v3, pb_last_committed), (int)offsetof(struct ptlrpc_body_v2, pb_last_committed));
 	LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_last_committed) == (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_last_committed), "%d != %d\n",
-- 
1.7.1


* [lustre-devel] [PATCH 14/41] staging: lustre: ptlrpc: Add a tag field to ptlrpc messages
@ 2016-10-03  2:28   ` James Simmons
  0 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Gregoire Pichon, James Simmons

From: Gregoire Pichon <gregoire.pichon@bull.net>

The new tag field serves as a virtual slot index for managing multiple
modifying RPCs. It is set by the client and allows the target to
release in-memory reply data when the tag is reused by a new RPC.

The tag field replaces the unused last_seen field of the ptlrpc_body
structure.

Additionally, the last_xid field now carries the highest XID for which
a reply has been received and which has no unreplied lower-numbered
XID.

Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5319
Reviewed-on: http://review.whamcloud.com/14095
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../lustre/lustre/include/lustre/lustre_idl.h      |   12 ++++--
 drivers/staging/lustre/lustre/include/lustre_net.h |    2 +
 .../staging/lustre/lustre/ptlrpc/pack_generic.c    |   39 +++++++++++++++++++-
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c    |   32 ++++++++++++----
 4 files changed, 72 insertions(+), 13 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index b88807f..4f6eeec 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -1099,8 +1099,10 @@ struct ptlrpc_body_v3 {
 	__u32 pb_version;
 	__u32 pb_opc;
 	__u32 pb_status;
-	__u64 pb_last_xid;
-	__u64 pb_last_seen;
+	__u64 pb_last_xid; /* highest replied XID without lower unreplied XID */
+	__u16 pb_tag;      /* virtual slot idx for multiple modifying RPCs */
+	__u16 pb_padding0;
+	__u32 pb_padding1;
 	__u64 pb_last_committed;
 	__u64 pb_transno;
 	__u32 pb_flags;
@@ -1125,8 +1127,10 @@ struct ptlrpc_body_v2 {
 	__u32 pb_version;
 	__u32 pb_opc;
 	__u32 pb_status;
-	__u64 pb_last_xid;
-	__u64 pb_last_seen;
+	__u64 pb_last_xid; /* highest replied XID without lower unreplied XID */
+	__u16 pb_tag;      /* virtual slot idx for multiple modifying RPCs */
+	__u16 pb_padding0;
+	__u32 pb_padding1;
 	__u64 pb_last_committed;
 	__u64 pb_transno;
 	__u32 pb_flags;
diff --git a/drivers/staging/lustre/lustre/include/lustre_net.h b/drivers/staging/lustre/lustre/include/lustre_net.h
index e9aba99..ab80330 100644
--- a/drivers/staging/lustre/lustre/include/lustre_net.h
+++ b/drivers/staging/lustre/lustre/include/lustre_net.h
@@ -2652,6 +2652,7 @@ struct lustre_handle *lustre_msg_get_handle(struct lustre_msg *msg);
 __u32 lustre_msg_get_type(struct lustre_msg *msg);
 void lustre_msg_add_version(struct lustre_msg *msg, u32 version);
 __u32 lustre_msg_get_opc(struct lustre_msg *msg);
+__u16 lustre_msg_get_tag(struct lustre_msg *msg);
 __u64 lustre_msg_get_last_committed(struct lustre_msg *msg);
 __u64 *lustre_msg_get_versions(struct lustre_msg *msg);
 __u64 lustre_msg_get_transno(struct lustre_msg *msg);
@@ -2670,6 +2671,7 @@ void lustre_msg_set_handle(struct lustre_msg *msg,
 			   struct lustre_handle *handle);
 void lustre_msg_set_type(struct lustre_msg *msg, __u32 type);
 void lustre_msg_set_opc(struct lustre_msg *msg, __u32 opc);
+void lustre_msg_set_tag(struct lustre_msg *msg, __u16 tag);
 void lustre_msg_set_versions(struct lustre_msg *msg, __u64 *versions);
 void lustre_msg_set_transno(struct lustre_msg *msg, __u64 transno);
 void lustre_msg_set_status(struct lustre_msg *msg, __u32 status);
diff --git a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
index 2dc0b79..3055649 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
@@ -942,6 +942,25 @@ __u32 lustre_msg_get_opc(struct lustre_msg *msg)
 }
 EXPORT_SYMBOL(lustre_msg_get_opc);
 
+__u16 lustre_msg_get_tag(struct lustre_msg *msg)
+{
+	switch (msg->lm_magic) {
+	case LUSTRE_MSG_MAGIC_V2: {
+		struct ptlrpc_body *pb = lustre_msg_ptlrpc_body(msg);
+
+		if (!pb) {
+			CERROR("invalid msg %p: no ptlrpc body!\n", msg);
+			return 0;
+		}
+		return pb->pb_tag;
+	}
+	default:
+		CERROR("incorrect message magic: %08x\n", msg->lm_magic);
+		return 0;
+	}
+}
+EXPORT_SYMBOL(lustre_msg_get_tag);
+
 __u64 lustre_msg_get_last_committed(struct lustre_msg *msg)
 {
 	switch (msg->lm_magic) {
@@ -1236,6 +1255,22 @@ void lustre_msg_set_opc(struct lustre_msg *msg, __u32 opc)
 	}
 }
 
+void lustre_msg_set_tag(struct lustre_msg *msg, __u16 tag)
+{
+	switch (msg->lm_magic) {
+	case LUSTRE_MSG_MAGIC_V2: {
+		struct ptlrpc_body *pb = lustre_msg_ptlrpc_body(msg);
+
+		LASSERTF(pb, "invalid msg %p: no ptlrpc body!\n", msg);
+		pb->pb_tag = tag;
+		return;
+	}
+	default:
+		LASSERTF(0, "incorrect message magic: %08x\n", msg->lm_magic);
+	}
+}
+EXPORT_SYMBOL(lustre_msg_set_tag);
+
 void lustre_msg_set_versions(struct lustre_msg *msg, __u64 *versions)
 {
 	switch (msg->lm_magic) {
@@ -1442,7 +1477,7 @@ void lustre_swab_ptlrpc_body(struct ptlrpc_body *b)
 	__swab32s(&b->pb_opc);
 	__swab32s(&b->pb_status);
 	__swab64s(&b->pb_last_xid);
-	__swab64s(&b->pb_last_seen);
+	__swab16s(&b->pb_tag);
 	__swab64s(&b->pb_last_committed);
 	__swab64s(&b->pb_transno);
 	__swab32s(&b->pb_flags);
@@ -1456,6 +1491,8 @@ void lustre_swab_ptlrpc_body(struct ptlrpc_body *b)
 	__swab64s(&b->pb_pre_versions[1]);
 	__swab64s(&b->pb_pre_versions[2]);
 	__swab64s(&b->pb_pre_versions[3]);
+	CLASSERT(offsetof(typeof(*b), pb_padding0) != 0);
+	CLASSERT(offsetof(typeof(*b), pb_padding1) != 0);
 	CLASSERT(offsetof(typeof(*b), pb_padding) != 0);
 	/* While we need to maintain compatibility between
 	 * clients and servers without ptlrpc_body_v2 (< 2.3)
diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
index c299099..d5af8cd 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
@@ -635,10 +635,18 @@ void lustre_assert_wire_constants(void)
 		 (long long)(int)offsetof(struct ptlrpc_body_v3, pb_last_xid));
 	LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_last_xid) == 8, "found %lld\n",
 		 (long long)(int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_last_xid));
-	LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_last_seen) == 32, "found %lld\n",
-		 (long long)(int)offsetof(struct ptlrpc_body_v3, pb_last_seen));
-	LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_last_seen) == 8, "found %lld\n",
-		 (long long)(int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_last_seen));
+	LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_tag) == 32, "found %lld\n",
+		 (long long)(int)offsetof(struct ptlrpc_body_v3, pb_tag));
+	LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_tag) == 2, "found %lld\n",
+		 (long long)(int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_tag));
+	LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_padding0) == 34, "found %lld\n",
+		 (long long)(int)offsetof(struct ptlrpc_body_v3, pb_padding0));
+	LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_padding0) == 2, "found %lld\n",
+		 (long long)(int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_padding0));
+	LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_padding1) == 36, "found %lld\n",
+		 (long long)(int)offsetof(struct ptlrpc_body_v3, pb_padding1));
+	LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_padding1) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_padding1));
 	LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_last_committed) == 40, "found %lld\n",
 		 (long long)(int)offsetof(struct ptlrpc_body_v3, pb_last_committed));
 	LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_last_committed) == 8, "found %lld\n",
@@ -713,10 +721,18 @@ void lustre_assert_wire_constants(void)
 		 (int)offsetof(struct ptlrpc_body_v3, pb_last_xid), (int)offsetof(struct ptlrpc_body_v2, pb_last_xid));
 	LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_last_xid) == (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_last_xid), "%d != %d\n",
 		 (int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_last_xid), (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_last_xid));
-	LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_last_seen) == (int)offsetof(struct ptlrpc_body_v2, pb_last_seen), "%d != %d\n",
-		 (int)offsetof(struct ptlrpc_body_v3, pb_last_seen), (int)offsetof(struct ptlrpc_body_v2, pb_last_seen));
-	LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_last_seen) == (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_last_seen), "%d != %d\n",
-		 (int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_last_seen), (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_last_seen));
+	LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_tag) == (int)offsetof(struct ptlrpc_body_v2, pb_tag), "%d != %d\n",
+		 (int)offsetof(struct ptlrpc_body_v3, pb_tag), (int)offsetof(struct ptlrpc_body_v2, pb_tag));
+	LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_tag) == (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_tag), "%d != %d\n",
+		 (int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_tag), (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_tag));
+	LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_padding0) == (int)offsetof(struct ptlrpc_body_v2, pb_padding0), "%d != %d\n",
+		 (int)offsetof(struct ptlrpc_body_v3, pb_padding0), (int)offsetof(struct ptlrpc_body_v2, pb_padding0));
+	LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_padding0) == (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_padding0), "%d != %d\n",
+		 (int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_padding0), (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_padding0));
+	LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_padding1) == (int)offsetof(struct ptlrpc_body_v2, pb_padding1), "%d != %d\n",
+		 (int)offsetof(struct ptlrpc_body_v3, pb_padding1), (int)offsetof(struct ptlrpc_body_v2, pb_padding1));
+	LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_padding1) == (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_padding1), "%d != %d\n",
+		 (int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_padding1), (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_padding1));
 	LASSERTF((int)offsetof(struct ptlrpc_body_v3, pb_last_committed) == (int)offsetof(struct ptlrpc_body_v2, pb_last_committed), "%d != %d\n",
 		 (int)offsetof(struct ptlrpc_body_v3, pb_last_committed), (int)offsetof(struct ptlrpc_body_v2, pb_last_committed));
 	LASSERTF((int)sizeof(((struct ptlrpc_body_v3 *)0)->pb_last_committed) == (int)sizeof(((struct ptlrpc_body_v2 *)0)->pb_last_committed), "%d != %d\n",
-- 
1.7.1
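
The layout change in this patch can be checked in isolation. The sketch below is a userspace illustration only, not kernel code: the struct names and helper functions are hypothetical, and only the three fields around the changed slot are modeled. It shows that replacing the 8-byte pb_last_seen with pb_tag (2 bytes), pb_padding0 (2 bytes) and pb_padding1 (4 bytes) keeps pb_last_committed at the same offset, and that only the 16-bit tag needs a byte swap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the slot before the patch. */
struct sketch_old_slot {
	uint64_t pb_last_xid;
	uint64_t pb_last_seen;
	uint64_t pb_last_committed;
};

/* Hypothetical mirror of the same slot after the patch. */
struct sketch_new_slot {
	uint64_t pb_last_xid;
	uint16_t pb_tag;
	uint16_t pb_padding0;
	uint32_t pb_padding1;
	uint64_t pb_last_committed;
};

/* Compile-time checks in the spirit of the wiretest.c assertions. */
_Static_assert(sizeof(struct sketch_old_slot) == sizeof(struct sketch_new_slot),
	       "replacing pb_last_seen must not change the size");
_Static_assert(offsetof(struct sketch_new_slot, pb_padding0) ==
	       offsetof(struct sketch_new_slot, pb_tag) + 2,
	       "pb_padding0 directly follows pb_tag");
_Static_assert(offsetof(struct sketch_new_slot, pb_padding1) ==
	       offsetof(struct sketch_new_slot, pb_tag) + 4,
	       "pb_padding1 directly follows pb_padding0");

/* Runtime form of the key invariant: later fields do not move. */
int sketch_layout_preserved(void)
{
	return offsetof(struct sketch_old_slot, pb_last_committed) ==
	       offsetof(struct sketch_new_slot, pb_last_committed);
}

/* Byte swap for the 16-bit tag, standing in for __swab16s(). */
uint16_t sketch_swab16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}
```

Relative to the start of pb_last_xid, the tag sits at offset 8 and pb_last_committed at offset 16, which corresponds to the absolute offsets 32 and 40 asserted in wiretest.c.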

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 15/41] staging: lustre: osc: fix bug when setting max_pages_per_rpc
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Wu Libin,
	James Simmons

From: Wu Libin <lwu@ddn.com>

After a setting such as "lctl set_param -P osc.*.max_pages_per_rpc",
the function osc_obd_max_pages_per_rpc_seq_write may be called at
mount time before ocd_brw_size has been set. A zero ocd_brw_size is
meaningless, so it should not be used as the limit in that case.

Signed-off-by: Wu Libin <lwu@ddn.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6421
Reviewed-on: http://review.whamcloud.com/14333
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
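
The corrected range check can be sketched as a standalone function. This is an illustration only, not the kernel code: the function name, the SKETCH_PAGE_SHIFT value (4 KiB pages assumed) and the parameter types are hypothetical. It rounds the value up to a chunk boundary as the surrounding code does, then applies the upper limit only when the bulk size is known (non-zero).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * Returns 0 if val pages per RPC is acceptable, -ERANGE otherwise.
 * brw_size == 0 means the server limit has not been negotiated yet,
 * so it is not treated as an upper bound.
 */
int sketch_check_max_pages(unsigned long val, uint32_t brw_size,
			   unsigned int chunkbits)
{
	unsigned long chunk_mask;

	chunk_mask = ~((1UL << (chunkbits - SKETCH_PAGE_SHIFT)) - 1);
	/* round up to a whole number of chunks */
	val = (val + ~chunk_mask) & chunk_mask;

	if (!val || (brw_size && val > (brw_size >> SKETCH_PAGE_SHIFT)))
		return -ERANGE;

	return 0;
}
```

With a 1 MiB bulk size and 4 KiB pages the limit is 256 pages; with a bulk size of zero only the val == 0 case is rejected.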
 drivers/staging/lustre/lustre/osc/lproc_osc.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/staging/lustre/lustre/osc/lproc_osc.c b/drivers/staging/lustre/lustre/osc/lproc_osc.c
index f0062d4..a837362 100644
--- a/drivers/staging/lustre/lustre/osc/lproc_osc.c
+++ b/drivers/staging/lustre/lustre/osc/lproc_osc.c
@@ -585,7 +585,8 @@ static ssize_t max_pages_per_rpc_store(struct kobject *kobj,
 	chunk_mask = ~((1 << (cli->cl_chunkbits - PAGE_SHIFT)) - 1);
 	/* max_pages_per_rpc must be chunk aligned */
 	val = (val + ~chunk_mask) & chunk_mask;
-	if (val == 0 || val > ocd->ocd_brw_size >> PAGE_SHIFT) {
+	if (!val || (ocd->ocd_brw_size &&
+		     val > ocd->ocd_brw_size >> PAGE_SHIFT)) {
 		return -ERANGE;
 	}
 	spin_lock(&cli->cl_loi_list_lock);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 16/41] staging: lustre: ldlm: Do not use cbpending for group locks
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Patrick Farrell, James Simmons

From: Patrick Farrell <paf@cray.com>

Currently, the CBPENDING flag is set on group locks when
the osc lock above them is released (osc_cancel_base).

This results in a situation where a new group lock request
on a resource does not match an existing group lock because
LDLM_FL_CBPENDING is set on the existing lock.

So two group locks are granted on the same resource, which
is not valid, since a given client can only have one group
lock on a particular resource.

Since group locks are manually released and not called back
like other LDLM locks, the CBPENDING flag doesn't make
sense. Since they must be manually released, they also
cannot go in the LDLM LRU cache and must be fully released
immediately once they are no longer in use.

This was previously accomplished by setting CBPENDING when
the corresponding osc lock is released, but as noted above,
this prevents the group lock matching some future lock
requests.

This patch uses the fact that group locks have an l_writers
reference which they keep until they are manually released,
so we remove them when they have no more reader or writer
references, without checking cbpending.

Signed-off-by: Patrick Farrell <paf@cray.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6368
Reviewed-on: http://review.whamcloud.com/14093
Reviewed-by: frank zago <fzago@cray.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
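
The decision made on the final reference drop can be sketched outside the kernel. This is a simplified model under stated assumptions: the struct, the enums and both functions are hypothetical, the LRU path is reduced to a single outcome, and lock matching is modeled as "not cbpending" only. It shows why a group lock no longer needs CBPENDING to stay out of the LRU, and therefore stays matchable while it is held.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_mode { SKETCH_LCK_PW, SKETCH_LCK_GROUP };

enum sketch_action {
	SKETCH_KEEP,	/* still referenced */
	SKETCH_CANCEL,	/* hand off for cancellation */
	SKETCH_TO_LRU,	/* cache the unused lock */
};

struct sketch_lock {
	unsigned int readers;
	unsigned int writers;
	bool cbpending;
	enum sketch_mode req_mode;
};

/* Simplified matching rule: a lock marked cbpending is skipped. */
bool sketch_can_match(const struct sketch_lock *lock)
{
	return !lock->cbpending;
}

/* Drop one writer reference and decide what happens to the lock. */
enum sketch_action sketch_decref_writer(struct sketch_lock *lock)
{
	assert(lock->writers > 0);
	lock->writers--;

	if (lock->readers || lock->writers)
		return SKETCH_KEEP;

	/* Group locks are cancelled on last decref without cbpending. */
	if (lock->cbpending || lock->req_mode == SKETCH_LCK_GROUP)
		return SKETCH_CANCEL;

	return SKETCH_TO_LRU;
}
```

In this model a held group lock keeps cbpending clear, so a second request on the same resource can match it instead of being granted a second group lock.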
 drivers/staging/lustre/lustre/ldlm/ldlm_lock.c   |   11 +++++++----
 drivers/staging/lustre/lustre/osc/osc_internal.h |    1 -
 drivers/staging/lustre/lustre/osc/osc_lock.c     |    2 +-
 drivers/staging/lustre/lustre/osc/osc_request.c  |   10 ----------
 4 files changed, 8 insertions(+), 16 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
index ace8cb2..cc116ba 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
@@ -784,11 +784,16 @@ void ldlm_lock_decref_internal(struct ldlm_lock *lock, __u32 mode)
 	}
 
 	if (!lock->l_readers && !lock->l_writers &&
-	    ldlm_is_cbpending(lock)) {
+	    (ldlm_is_cbpending(lock) || lock->l_req_mode == LCK_GROUP)) {
 		/* If we received a blocked AST and this was the last reference,
 		 * run the callback.
+		 * Group locks are special:
+		 * They must not go in LRU, but they are not called back
+		 * like non-group locks, instead they are manually released.
+		 * They have an l_writers reference which they keep until
+		 * they are manually released, so we remove them when they have
+		 * no more reader or writer references. - LU-6368
 		 */
-
 		LDLM_DEBUG(lock, "final decref done on cbpending lock");
 
 		LDLM_LOCK_GET(lock); /* dropped by bl thread */
@@ -844,8 +849,6 @@ EXPORT_SYMBOL(ldlm_lock_decref);
  * Decrease reader/writer refcount for LDLM lock with handle
  * \a lockh and mark it for subsequent cancellation once r/w refcount
  * drops to zero instead of putting into LRU.
- *
- * Typical usage is for GROUP locks which we cannot allow to be cached.
  */
 void ldlm_lock_decref_and_cancel(const struct lustre_handle *lockh, __u32 mode)
 {
diff --git a/drivers/staging/lustre/lustre/osc/osc_internal.h b/drivers/staging/lustre/lustre/osc/osc_internal.h
index 90ea0d1..64684c4 100644
--- a/drivers/staging/lustre/lustre/osc/osc_internal.h
+++ b/drivers/staging/lustre/lustre/osc/osc_internal.h
@@ -112,7 +112,6 @@ int osc_enqueue_base(struct obd_export *exp, struct ldlm_res_id *res_id,
 		     osc_enqueue_upcall_f upcall,
 		     void *cookie, struct ldlm_enqueue_info *einfo,
 		     struct ptlrpc_request_set *rqset, int async, int agl);
-int osc_cancel_base(struct lustre_handle *lockh, __u32 mode);
 
 int osc_match_base(struct obd_export *exp, struct ldlm_res_id *res_id,
 		   __u32 type, ldlm_policy_data_t *policy, __u32 mode,
diff --git a/drivers/staging/lustre/lustre/osc/osc_lock.c b/drivers/staging/lustre/lustre/osc/osc_lock.c
index a42cb98..45d5e6d 100644
--- a/drivers/staging/lustre/lustre/osc/osc_lock.c
+++ b/drivers/staging/lustre/lustre/osc/osc_lock.c
@@ -1009,7 +1009,7 @@ static void osc_lock_detach(const struct lu_env *env, struct osc_lock *olck)
 
 	if (olck->ols_hold) {
 		olck->ols_hold = 0;
-		osc_cancel_base(&olck->ols_handle, olck->ols_einfo.ei_mode);
+		ldlm_lock_decref(&olck->ols_handle, olck->ols_einfo.ei_mode);
 		olck->ols_handle.cookie = 0ULL;
 	}
 
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index de2f522..21cd48b 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -2336,16 +2336,6 @@ int osc_match_base(struct obd_export *exp, struct ldlm_res_id *res_id,
 	return rc;
 }
 
-int osc_cancel_base(struct lustre_handle *lockh, __u32 mode)
-{
-	if (unlikely(mode == LCK_GROUP))
-		ldlm_lock_decref_and_cancel(lockh, mode);
-	else
-		ldlm_lock_decref(lockh, mode);
-
-	return 0;
-}
-
 static int osc_statfs_interpret(const struct lu_env *env,
 				struct ptlrpc_request *req,
 				struct osc_async_args *aa, int rc)
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 17/41] staging: lustre: ptlrpc: remove old protocol compatibility
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Andreas Dilger, James Simmons

From: Andreas Dilger <andreas.dilger@intel.com>

Some old protocol compatibility workarounds are still present in
master that should have been removed when LUSTRE_MSG_MAGIC_V1 was.

In particular, the process for upgrading LUSTRE_MSG_MAGIC_V1 to
LUSTRE_MSG_MAGIC_V2 had the client connect to the server with the
V1 protocol with op_flag=MSG_CONNECT_NEXT_VER set, and if the server
supported the V2 protocol it would reply with LUSTRE_MSG_MAGIC_V2.
This ensured that if the new client connected to an old server the
connection would be allowed.  However, even with V1 protocol support
removed, the 2.x clients are still connecting with NEXT_VER set.
In b1_8 this flag was contingent on LUSTRE_MSG_MAGIC_V1 being used,
which is how it should have been in 2.x as well.

A few other cleanups are done at the same time:
 - disallow 1.8 clients (or at least those that don't understand
   OBD_CONNECT_FULL20) so we can remove workarounds for 1.8 clients
 - remove support for pre-2.1 DLM flock lock handling
 - don't work around the lack of MDS_ATTR_xTIME_SET flags in setattr
 - always set MSGHDR_CKSUM_INCOMPAT18 (it can eventually be removed)

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6349
Reviewed-on: http://review.whamcloud.com/14006
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
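
The effect of collapsing the two conversion tables into one can be sketched in userspace. Everything here is hypothetical: the struct layouts, the type values and the function names are placeholders, and the lfw_owner field is assumed from the removed compat comment, which says old clients had no owner field. With a single table there is no per-export branch, and the flock owner always comes from the wire owner field rather than the pid.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical wire and local flock policy layouts. */
struct sketch_wire_flock {
	uint64_t lfw_start;
	uint64_t lfw_end;
	uint64_t lfw_owner;	/* assumed field name */
	uint32_t lfw_padding;
	uint32_t lfw_pid;
};

struct sketch_local_flock {
	uint64_t start;
	uint64_t end;
	uint64_t owner;
	uint32_t pid;
};

/* Placeholder type values, chosen only to show the index offset. */
enum { SKETCH_MIN_TYPE = 10, SKETCH_PLAIN = 10, SKETCH_FLOCK = 11 };

typedef void (*sketch_convert_t)(const struct sketch_wire_flock *,
				 struct sketch_local_flock *);

static void sketch_plain_wire_to_local(const struct sketch_wire_flock *w,
				       struct sketch_local_flock *l)
{
	(void)w;
	memset(l, 0, sizeof(*l));	/* plain locks carry no policy data */
}

static void sketch_flock_wire_to_local(const struct sketch_wire_flock *w,
				       struct sketch_local_flock *l)
{
	memset(l, 0, sizeof(*l));
	l->start = w->lfw_start;
	l->end = w->lfw_end;
	l->pid = w->lfw_pid;
	l->owner = w->lfw_owner;	/* no pid-as-owner fallback */
}

/* One table for all peers, indexed by type - SKETCH_MIN_TYPE. */
static const sketch_convert_t sketch_wire_to_local[] = {
	[SKETCH_PLAIN - SKETCH_MIN_TYPE] = sketch_plain_wire_to_local,
	[SKETCH_FLOCK - SKETCH_MIN_TYPE] = sketch_flock_wire_to_local,
};

void sketch_convert_policy(int type, const struct sketch_wire_flock *w,
			   struct sketch_local_flock *l)
{
	sketch_wire_to_local[type - SKETCH_MIN_TYPE](w, l);
}
```

The caller no longer inspects connect flags before choosing a converter; the lock type alone selects the table entry.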
 drivers/staging/lustre/lustre/ldlm/ldlm_flock.c    |   18 ++----------------
 drivers/staging/lustre/lustre/ldlm/ldlm_internal.h |    7 ++-----
 drivers/staging/lustre/lustre/ldlm/ldlm_lock.c     |   19 +++----------------
 drivers/staging/lustre/lustre/ptlrpc/import.c      |   11 ++---------
 4 files changed, 9 insertions(+), 46 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c b/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c
index 861f36f..98838e7 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c
@@ -612,22 +612,8 @@ granted:
 }
 EXPORT_SYMBOL(ldlm_flock_completion_ast);
 
-void ldlm_flock_policy_wire18_to_local(const ldlm_wire_policy_data_t *wpolicy,
-				       ldlm_policy_data_t *lpolicy)
-{
-	memset(lpolicy, 0, sizeof(*lpolicy));
-	lpolicy->l_flock.start = wpolicy->l_flock.lfw_start;
-	lpolicy->l_flock.end = wpolicy->l_flock.lfw_end;
-	lpolicy->l_flock.pid = wpolicy->l_flock.lfw_pid;
-	/* Compat code, old clients had no idea about owner field and
-	 * relied solely on pid for ownership. Introduced in LU-104, 2.1,
-	 * April 2011
-	 */
-	lpolicy->l_flock.owner = wpolicy->l_flock.lfw_pid;
-}
-
-void ldlm_flock_policy_wire21_to_local(const ldlm_wire_policy_data_t *wpolicy,
-				       ldlm_policy_data_t *lpolicy)
+void ldlm_flock_policy_wire_to_local(const ldlm_wire_policy_data_t *wpolicy,
+				     ldlm_policy_data_t *lpolicy)
 {
 	memset(lpolicy, 0, sizeof(*lpolicy));
 	lpolicy->l_flock.start = wpolicy->l_flock.lfw_start;
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_internal.h b/drivers/staging/lustre/lustre/ldlm/ldlm_internal.h
index 5e82cfc..0099ff3 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_internal.h
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_internal.h
@@ -329,10 +329,7 @@ void ldlm_extent_policy_wire_to_local(const ldlm_wire_policy_data_t *wpolicy,
 				      ldlm_policy_data_t *lpolicy);
 void ldlm_extent_policy_local_to_wire(const ldlm_policy_data_t *lpolicy,
 				      ldlm_wire_policy_data_t *wpolicy);
-void ldlm_flock_policy_wire18_to_local(const ldlm_wire_policy_data_t *wpolicy,
-				       ldlm_policy_data_t *lpolicy);
-void ldlm_flock_policy_wire21_to_local(const ldlm_wire_policy_data_t *wpolicy,
-				       ldlm_policy_data_t *lpolicy);
-
+void ldlm_flock_policy_wire_to_local(const ldlm_wire_policy_data_t *wpolicy,
+				     ldlm_policy_data_t *lpolicy);
 void ldlm_flock_policy_local_to_wire(const ldlm_policy_data_t *lpolicy,
 				     ldlm_wire_policy_data_t *wpolicy);
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
index cc116ba..22b4a52 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
@@ -63,17 +63,10 @@ static char *ldlm_typename[] = {
 	[LDLM_IBITS]	= "IBT",
 };
 
-static ldlm_policy_wire_to_local_t ldlm_policy_wire18_to_local[] = {
+static ldlm_policy_wire_to_local_t ldlm_policy_wire_to_local[] = {
 	[LDLM_PLAIN - LDLM_MIN_TYPE]	= ldlm_plain_policy_wire_to_local,
 	[LDLM_EXTENT - LDLM_MIN_TYPE]	= ldlm_extent_policy_wire_to_local,
-	[LDLM_FLOCK - LDLM_MIN_TYPE]	= ldlm_flock_policy_wire18_to_local,
-	[LDLM_IBITS - LDLM_MIN_TYPE]	= ldlm_ibits_policy_wire_to_local,
-};
-
-static ldlm_policy_wire_to_local_t ldlm_policy_wire21_to_local[] = {
-	[LDLM_PLAIN - LDLM_MIN_TYPE]	= ldlm_plain_policy_wire_to_local,
-	[LDLM_EXTENT - LDLM_MIN_TYPE]	= ldlm_extent_policy_wire_to_local,
-	[LDLM_FLOCK - LDLM_MIN_TYPE]	= ldlm_flock_policy_wire21_to_local,
+	[LDLM_FLOCK - LDLM_MIN_TYPE]	= ldlm_flock_policy_wire_to_local,
 	[LDLM_IBITS - LDLM_MIN_TYPE]	= ldlm_ibits_policy_wire_to_local,
 };
 
@@ -106,14 +99,8 @@ void ldlm_convert_policy_to_local(struct obd_export *exp, enum ldlm_type type,
 				  ldlm_policy_data_t *lpolicy)
 {
 	ldlm_policy_wire_to_local_t convert;
-	int new_client;
 
-	/** some badness for 2.0.0 clients, but 2.0.0 isn't supported */
-	new_client = (exp_connect_flags(exp) & OBD_CONNECT_FULL20) != 0;
-	if (new_client)
-		convert = ldlm_policy_wire21_to_local[type - LDLM_MIN_TYPE];
-	else
-		convert = ldlm_policy_wire18_to_local[type - LDLM_MIN_TYPE];
+	convert = ldlm_policy_wire_to_local[type - LDLM_MIN_TYPE];
 
 	convert(wpolicy, lpolicy);
 }
diff --git a/drivers/staging/lustre/lustre/ptlrpc/import.c b/drivers/staging/lustre/lustre/ptlrpc/import.c
index a23d0a0..b245784 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/import.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/import.c
@@ -691,8 +691,6 @@ int ptlrpc_connect_import(struct obd_import *imp)
 	request->rq_timeout = INITIAL_CONNECT_TIMEOUT;
 	lustre_msg_set_timeout(request->rq_reqmsg, request->rq_timeout);
 
-	lustre_msg_add_op_flags(request->rq_reqmsg, MSG_CONNECT_NEXT_VER);
-
 	request->rq_no_resend = 1;
 	request->rq_no_delay = 1;
 	request->rq_send_state = LUSTRE_IMP_CONNECTING;
@@ -873,8 +871,7 @@ static int ptlrpc_connect_set_flags(struct obd_import *imp,
 			ocd->ocd_connect_flags;
 	}
 
-	if ((ocd->ocd_connect_flags & OBD_CONNECT_AT) &&
-	    (imp->imp_msg_magic == LUSTRE_MSG_MAGIC_V2))
+	if (ocd->ocd_connect_flags & OBD_CONNECT_AT)
 		/*
 		 * We need a per-message support flag, because
 		 * a. we don't know if the incoming connect reply
@@ -889,11 +886,7 @@ static int ptlrpc_connect_set_flags(struct obd_import *imp,
 	else
 		imp->imp_msghdr_flags &= ~MSGHDR_AT_SUPPORT;
 
-	if ((ocd->ocd_connect_flags & OBD_CONNECT_FULL20) &&
-	    (imp->imp_msg_magic == LUSTRE_MSG_MAGIC_V2))
-		imp->imp_msghdr_flags |= MSGHDR_CKSUM_INCOMPAT18;
-	else
-		imp->imp_msghdr_flags &= ~MSGHDR_CKSUM_INCOMPAT18;
+	imp->imp_msghdr_flags |= MSGHDR_CKSUM_INCOMPAT18;
 
 	return 0;
 }
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [lustre-devel] [PATCH 17/41] staging: lustre: ptlrpc: remove old protocol compatibility
@ 2016-10-03  2:28   ` James Simmons
  0 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, James Simmons

From: Andreas Dilger <andreas.dilger@intel.com>

Some old protocol compatibility workarounds are still present in
master that should have been removed when LUSTRE_MSG_MAGIC_V1 was.

In particular, the process for upgrading LUSTRE_MSG_MAGIC_V1 to
LUSTRE_MSG_MAGIC_V2 had the client connect to the server with the
V1 protocol with op_flag=MSG_CONNECT_NEXT_VER set, and if the server
supported the V2 protocol it would reply with LUSTRE_MSG_MAGIC_V2.
This ensured that if the new client connected to an old server the
connection would be allowed.  However, even with V1 protocol support
removed, the 2.x clients are still connecting with NEXT_VER set.
In b1_8 this flag was contingent on LUSTRE_MSG_MAGIC_V1 being used,
which is how it should have been in 2.x as well.

A few other cleanups are done at the same time:
 - disallow 1.8 clients (or at least those that don't understand
   OBD_CONNECT_FULL20) so we can remove workarounds for 1.8 clients
 - remove support for pre-2.1 DLM flock lock handling
 - don't work around the lack of MDS_ATTR_xTIME_SET flags in setattr
 - always set MSGHDR_CKSUM_INCOMPAT18 (it can eventually be removed)

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6349
Reviewed-on: http://review.whamcloud.com/14006
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/ldlm/ldlm_flock.c    |   18 ++----------------
 drivers/staging/lustre/lustre/ldlm/ldlm_internal.h |    7 ++-----
 drivers/staging/lustre/lustre/ldlm/ldlm_lock.c     |   19 +++----------------
 drivers/staging/lustre/lustre/ptlrpc/import.c      |   11 ++---------
 4 files changed, 9 insertions(+), 46 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c b/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c
index 861f36f..98838e7 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_flock.c
@@ -612,22 +612,8 @@ granted:
 }
 EXPORT_SYMBOL(ldlm_flock_completion_ast);
 
-void ldlm_flock_policy_wire18_to_local(const ldlm_wire_policy_data_t *wpolicy,
-				       ldlm_policy_data_t *lpolicy)
-{
-	memset(lpolicy, 0, sizeof(*lpolicy));
-	lpolicy->l_flock.start = wpolicy->l_flock.lfw_start;
-	lpolicy->l_flock.end = wpolicy->l_flock.lfw_end;
-	lpolicy->l_flock.pid = wpolicy->l_flock.lfw_pid;
-	/* Compat code, old clients had no idea about owner field and
-	 * relied solely on pid for ownership. Introduced in LU-104, 2.1,
-	 * April 2011
-	 */
-	lpolicy->l_flock.owner = wpolicy->l_flock.lfw_pid;
-}
-
-void ldlm_flock_policy_wire21_to_local(const ldlm_wire_policy_data_t *wpolicy,
-				       ldlm_policy_data_t *lpolicy)
+void ldlm_flock_policy_wire_to_local(const ldlm_wire_policy_data_t *wpolicy,
+				     ldlm_policy_data_t *lpolicy)
 {
 	memset(lpolicy, 0, sizeof(*lpolicy));
 	lpolicy->l_flock.start = wpolicy->l_flock.lfw_start;
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_internal.h b/drivers/staging/lustre/lustre/ldlm/ldlm_internal.h
index 5e82cfc..0099ff3 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_internal.h
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_internal.h
@@ -329,10 +329,7 @@ void ldlm_extent_policy_wire_to_local(const ldlm_wire_policy_data_t *wpolicy,
 				      ldlm_policy_data_t *lpolicy);
 void ldlm_extent_policy_local_to_wire(const ldlm_policy_data_t *lpolicy,
 				      ldlm_wire_policy_data_t *wpolicy);
-void ldlm_flock_policy_wire18_to_local(const ldlm_wire_policy_data_t *wpolicy,
-				       ldlm_policy_data_t *lpolicy);
-void ldlm_flock_policy_wire21_to_local(const ldlm_wire_policy_data_t *wpolicy,
-				       ldlm_policy_data_t *lpolicy);
-
+void ldlm_flock_policy_wire_to_local(const ldlm_wire_policy_data_t *wpolicy,
+				     ldlm_policy_data_t *lpolicy);
 void ldlm_flock_policy_local_to_wire(const ldlm_policy_data_t *lpolicy,
 				     ldlm_wire_policy_data_t *wpolicy);
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
index cc116ba..22b4a52 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
@@ -63,17 +63,10 @@ static char *ldlm_typename[] = {
 	[LDLM_IBITS]	= "IBT",
 };
 
-static ldlm_policy_wire_to_local_t ldlm_policy_wire18_to_local[] = {
+static ldlm_policy_wire_to_local_t ldlm_policy_wire_to_local[] = {
 	[LDLM_PLAIN - LDLM_MIN_TYPE]	= ldlm_plain_policy_wire_to_local,
 	[LDLM_EXTENT - LDLM_MIN_TYPE]	= ldlm_extent_policy_wire_to_local,
-	[LDLM_FLOCK - LDLM_MIN_TYPE]	= ldlm_flock_policy_wire18_to_local,
-	[LDLM_IBITS - LDLM_MIN_TYPE]	= ldlm_ibits_policy_wire_to_local,
-};
-
-static ldlm_policy_wire_to_local_t ldlm_policy_wire21_to_local[] = {
-	[LDLM_PLAIN - LDLM_MIN_TYPE]	= ldlm_plain_policy_wire_to_local,
-	[LDLM_EXTENT - LDLM_MIN_TYPE]	= ldlm_extent_policy_wire_to_local,
-	[LDLM_FLOCK - LDLM_MIN_TYPE]	= ldlm_flock_policy_wire21_to_local,
+	[LDLM_FLOCK - LDLM_MIN_TYPE]	= ldlm_flock_policy_wire_to_local,
 	[LDLM_IBITS - LDLM_MIN_TYPE]	= ldlm_ibits_policy_wire_to_local,
 };
 
@@ -106,14 +99,8 @@ void ldlm_convert_policy_to_local(struct obd_export *exp, enum ldlm_type type,
 				  ldlm_policy_data_t *lpolicy)
 {
 	ldlm_policy_wire_to_local_t convert;
-	int new_client;
 
-	/** some badness for 2.0.0 clients, but 2.0.0 isn't supported */
-	new_client = (exp_connect_flags(exp) & OBD_CONNECT_FULL20) != 0;
-	if (new_client)
-		convert = ldlm_policy_wire21_to_local[type - LDLM_MIN_TYPE];
-	else
-		convert = ldlm_policy_wire18_to_local[type - LDLM_MIN_TYPE];
+	convert = ldlm_policy_wire_to_local[type - LDLM_MIN_TYPE];
 
 	convert(wpolicy, lpolicy);
 }
diff --git a/drivers/staging/lustre/lustre/ptlrpc/import.c b/drivers/staging/lustre/lustre/ptlrpc/import.c
index a23d0a0..b245784 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/import.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/import.c
@@ -691,8 +691,6 @@ int ptlrpc_connect_import(struct obd_import *imp)
 	request->rq_timeout = INITIAL_CONNECT_TIMEOUT;
 	lustre_msg_set_timeout(request->rq_reqmsg, request->rq_timeout);
 
-	lustre_msg_add_op_flags(request->rq_reqmsg, MSG_CONNECT_NEXT_VER);
-
 	request->rq_no_resend = 1;
 	request->rq_no_delay = 1;
 	request->rq_send_state = LUSTRE_IMP_CONNECTING;
@@ -873,8 +871,7 @@ static int ptlrpc_connect_set_flags(struct obd_import *imp,
 			ocd->ocd_connect_flags;
 	}
 
-	if ((ocd->ocd_connect_flags & OBD_CONNECT_AT) &&
-	    (imp->imp_msg_magic == LUSTRE_MSG_MAGIC_V2))
+	if (ocd->ocd_connect_flags & OBD_CONNECT_AT)
 		/*
 		 * We need a per-message support flag, because
 		 * a. we don't know if the incoming connect reply
@@ -889,11 +886,7 @@ static int ptlrpc_connect_set_flags(struct obd_import *imp,
 	else
 		imp->imp_msghdr_flags &= ~MSGHDR_AT_SUPPORT;
 
-	if ((ocd->ocd_connect_flags & OBD_CONNECT_FULL20) &&
-	    (imp->imp_msg_magic == LUSTRE_MSG_MAGIC_V2))
-		imp->imp_msghdr_flags |= MSGHDR_CKSUM_INCOMPAT18;
-	else
-		imp->imp_msghdr_flags &= ~MSGHDR_CKSUM_INCOMPAT18;
+	imp->imp_msghdr_flags |= MSGHDR_CKSUM_INCOMPAT18;
 
 	return 0;
 }
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 18/41] staging: lustre: llite: Report first encountered error
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Henri Doreau,
	James Simmons

From: Henri Doreau <henri.doreau@cea.fr>

Failures in ll_ioc_copy_{start,end} are reported to the coordinator.
The return code delivered to the caller should indicate what that
error was, not whether the coordinator was successfully notified.
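
A minimal sketch of the "first error wins" pattern, assuming a
hypothetical notify_coordinator() helper in place of the
obd_iocontrol(LL_IOC_HSM_PROGRESS, ...) call; copy_step() and its
parameters are illustrative only:

```c
#include <errno.h>

/* Hypothetical stand-in for the progress notification call. */
static int notify_coordinator(int notify_result)
{
	return notify_result;
}

/*
 * The operation's own error (rc) takes precedence. The notification
 * result (rc2) is returned only when the operation itself succeeded.
 */
static int copy_step(int op_result, int notify_result)
{
	int rc = op_result;
	int rc2;

	/* The coordinator is notified on success and on failure. */
	rc2 = notify_coordinator(notify_result);

	return rc ? rc : rc2;
}
```

For example, copy_step(-EIO, 0) yields -EIO even though the
notification succeeded, while copy_step(0, -ENOTCONN) yields
-ENOTCONN.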

Signed-off-by: Henri Doreau <henri.doreau@cea.fr>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5683
Reviewed-on: http://review.whamcloud.com/14335
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/dir.c |   16 ++++++++--------
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index 3641327..68675eb 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -681,7 +681,7 @@ static int ll_ioc_copy_start(struct super_block *sb, struct hsm_copy *copy)
 {
 	struct ll_sb_info		*sbi = ll_s2sbi(sb);
 	struct hsm_progress_kernel	 hpk;
-	int				 rc;
+	int rc2, rc = 0;
 
 	/* Forge a hsm_progress based on data from copy. */
 	hpk.hpk_fid = copy->hc_hai.hai_fid;
@@ -731,10 +731,10 @@ progress:
 	/* On error, the request should be considered as completed */
 	if (hpk.hpk_errval > 0)
 		hpk.hpk_flags |= HP_FLAG_COMPLETED;
-	rc = obd_iocontrol(LL_IOC_HSM_PROGRESS, sbi->ll_md_exp, sizeof(hpk),
-			   &hpk, NULL);
+	rc2 = obd_iocontrol(LL_IOC_HSM_PROGRESS, sbi->ll_md_exp, sizeof(hpk),
+			    &hpk, NULL);
 
-	return rc;
+	return rc ? rc : rc2;
 }
 
 /**
@@ -756,7 +756,7 @@ static int ll_ioc_copy_end(struct super_block *sb, struct hsm_copy *copy)
 {
 	struct ll_sb_info		*sbi = ll_s2sbi(sb);
 	struct hsm_progress_kernel	 hpk;
-	int				 rc;
+	int rc2, rc = 0;
 
 	/* If you modify the logic here, also check llapi_hsm_copy_end(). */
 	/* Take care: copy->hc_hai.hai_action, len, gid and data are not
@@ -830,10 +830,10 @@ static int ll_ioc_copy_end(struct super_block *sb, struct hsm_copy *copy)
 	}
 
 progress:
-	rc = obd_iocontrol(LL_IOC_HSM_PROGRESS, sbi->ll_md_exp, sizeof(hpk),
-			   &hpk, NULL);
+	rc2 = obd_iocontrol(LL_IOC_HSM_PROGRESS, sbi->ll_md_exp, sizeof(hpk),
+			    &hpk, NULL);
 
-	return rc;
+	return rc ? rc : rc2;
 }
 
 static int copy_and_ioctl(int cmd, struct obd_export *exp,
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [lustre-devel] [PATCH 18/41] staging: lustre: llite: Report first encountered error
@ 2016-10-03  2:28   ` James Simmons
  0 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Henri Doreau,
	James Simmons

From: Henri Doreau <henri.doreau@cea.fr>

Failures in ll_ioc_copy_{start,end} are reported to the coordinator.
The return code delivered to the caller should indicate what that
error was, not whether the coordinator was successfully notified.

Signed-off-by: Henri Doreau <henri.doreau@cea.fr>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5683
Reviewed-on: http://review.whamcloud.com/14335
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/dir.c |   16 ++++++++--------
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index 3641327..68675eb 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -681,7 +681,7 @@ static int ll_ioc_copy_start(struct super_block *sb, struct hsm_copy *copy)
 {
 	struct ll_sb_info		*sbi = ll_s2sbi(sb);
 	struct hsm_progress_kernel	 hpk;
-	int				 rc;
+	int rc2, rc = 0;
 
 	/* Forge a hsm_progress based on data from copy. */
 	hpk.hpk_fid = copy->hc_hai.hai_fid;
@@ -731,10 +731,10 @@ progress:
 	/* On error, the request should be considered as completed */
 	if (hpk.hpk_errval > 0)
 		hpk.hpk_flags |= HP_FLAG_COMPLETED;
-	rc = obd_iocontrol(LL_IOC_HSM_PROGRESS, sbi->ll_md_exp, sizeof(hpk),
-			   &hpk, NULL);
+	rc2 = obd_iocontrol(LL_IOC_HSM_PROGRESS, sbi->ll_md_exp, sizeof(hpk),
+			    &hpk, NULL);
 
-	return rc;
+	return rc ? rc : rc2;
 }
 
 /**
@@ -756,7 +756,7 @@ static int ll_ioc_copy_end(struct super_block *sb, struct hsm_copy *copy)
 {
 	struct ll_sb_info		*sbi = ll_s2sbi(sb);
 	struct hsm_progress_kernel	 hpk;
-	int				 rc;
+	int rc2, rc = 0;
 
 	/* If you modify the logic here, also check llapi_hsm_copy_end(). */
 	/* Take care: copy->hc_hai.hai_action, len, gid and data are not
@@ -830,10 +830,10 @@ static int ll_ioc_copy_end(struct super_block *sb, struct hsm_copy *copy)
 	}
 
 progress:
-	rc = obd_iocontrol(LL_IOC_HSM_PROGRESS, sbi->ll_md_exp, sizeof(hpk),
-			   &hpk, NULL);
+	rc2 = obd_iocontrol(LL_IOC_HSM_PROGRESS, sbi->ll_md_exp, sizeof(hpk),
+			    &hpk, NULL);
 
-	return rc;
+	return rc ? rc : rc2;
 }
 
 static int copy_and_ioctl(int cmd, struct obd_export *exp,
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 19/41] staging: lustre: ptlrpc: dont take unwrap in req_waittime calculation
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Sebastien Buisson, James Simmons

From: Sebastien Buisson <sebastien.buisson@bull.net>

Do not include unwrap time in the req_waittime calculation on the
client, as decryption is not related to request service time.
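
The elapsed-time arithmetic that this patch moves ahead of the unwrap
step can be sketched in userspace as follows. elapsed_usec() is a
hypothetical helper, and struct timespec stands in for the kernel's
timespec64:

```c
#include <stdint.h>
#include <time.h>

#define USEC_PER_SEC	1000000L
#define NSEC_PER_USEC	1000L

/*
 * Microseconds between two timestamps. The nanosecond difference may
 * be negative; adding it to the scaled seconds difference corrects
 * for that.
 */
static int64_t elapsed_usec(const struct timespec *sent,
			    const struct timespec *now)
{
	return (int64_t)(now->tv_sec - sent->tv_sec) * USEC_PER_SEC +
	       (now->tv_nsec - sent->tv_nsec) / NSEC_PER_USEC;
}
```

Sampling "now" before decryption means the result covers only the
time between sending the request and receiving the reply.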

Signed-off-by: Sebastien Buisson <sebastien.buisson@bull.net>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6356
Reviewed-on: http://review.whamcloud.com/14404
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Jeremy Filizetti <jeremy.filizetti@gmail.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/ptlrpc/client.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
index 8c51d51..fa4d3c9 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -1206,6 +1206,10 @@ static int after_reply(struct ptlrpc_request *req)
 		return 0;
 	}
 
+	ktime_get_real_ts64(&work_start);
+	timediff = (work_start.tv_sec - req->rq_sent_tv.tv_sec) * USEC_PER_SEC +
+		   (work_start.tv_nsec - req->rq_sent_tv.tv_nsec) /
+								 NSEC_PER_USEC;
 	/*
 	 * NB Until this point, the whole of the incoming message,
 	 * including buflens, status etc is in the sender's byte order.
@@ -1258,10 +1262,6 @@ static int after_reply(struct ptlrpc_request *req)
 		return 0;
 	}
 
-	ktime_get_real_ts64(&work_start);
-	timediff = (work_start.tv_sec - req->rq_sent_tv.tv_sec) * USEC_PER_SEC +
-		   (work_start.tv_nsec - req->rq_sent_tv.tv_nsec) /
-								 NSEC_PER_USEC;
 	if (obd->obd_svc_stats) {
 		lprocfs_counter_add(obd->obd_svc_stats, PTLRPC_REQWAIT_CNTR,
 				    timediff);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [lustre-devel] [PATCH 19/41] staging: lustre: ptlrpc: dont take unwrap in req_waittime calculation
@ 2016-10-03  2:28   ` James Simmons
  0 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Sebastien Buisson, James Simmons

From: Sebastien Buisson <sebastien.buisson@bull.net>

Do not include unwrap time in the req_waittime calculation on the
client, as decryption is not related to request service time.

Signed-off-by: Sebastien Buisson <sebastien.buisson@bull.net>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6356
Reviewed-on: http://review.whamcloud.com/14404
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Jeremy Filizetti <jeremy.filizetti@gmail.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/ptlrpc/client.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
index 8c51d51..fa4d3c9 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -1206,6 +1206,10 @@ static int after_reply(struct ptlrpc_request *req)
 		return 0;
 	}
 
+	ktime_get_real_ts64(&work_start);
+	timediff = (work_start.tv_sec - req->rq_sent_tv.tv_sec) * USEC_PER_SEC +
+		   (work_start.tv_nsec - req->rq_sent_tv.tv_nsec) /
+								 NSEC_PER_USEC;
 	/*
 	 * NB Until this point, the whole of the incoming message,
 	 * including buflens, status etc is in the sender's byte order.
@@ -1258,10 +1262,6 @@ static int after_reply(struct ptlrpc_request *req)
 		return 0;
 	}
 
-	ktime_get_real_ts64(&work_start);
-	timediff = (work_start.tv_sec - req->rq_sent_tv.tv_sec) * USEC_PER_SEC +
-		   (work_start.tv_nsec - req->rq_sent_tv.tv_nsec) /
-								 NSEC_PER_USEC;
 	if (obd->obd_svc_stats) {
 		lprocfs_counter_add(obd->obd_svc_stats, PTLRPC_REQWAIT_CNTR,
 				    timediff);
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 20/41] staging: lustre: remove Size on MDS support
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, James Simmons

From: John L. Hammond <john.hammond@intel.com>

Remove unused definitions related to Size on MDS support from
lustre/include/lustre/lustre_idl.h. Remove unused code from several
places in lustre/.
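
The wire structures keep their layout even though the fields are now
unused. A userspace sketch of that constraint for mdt_ioepoch,
assuming an 8-byte handle (struct handle and struct mdt_ioepoch_sketch
are hypothetical stand-ins for the kernel types):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct lustre_handle, assumed to be 8 bytes. */
struct handle {
	uint64_t cookie;
};

/* Renamed fields keep the offsets and sizes of the old ones. */
struct mdt_ioepoch_sketch {
	struct handle mio_handle;
	uint64_t mio_unused1;	/* was ioepoch */
	uint32_t mio_unused2;	/* was flags */
	uint32_t mio_padding;
};

/* Compile-time checks mirroring the wiretest.c size/offset asserts. */
static_assert(sizeof(struct mdt_ioepoch_sketch) == 24, "size");
static_assert(offsetof(struct mdt_ioepoch_sketch, mio_handle) == 0,
	      "mio_handle offset");
static_assert(offsetof(struct mdt_ioepoch_sketch, mio_unused1) == 8,
	      "mio_unused1 offset");
static_assert(offsetof(struct mdt_ioepoch_sketch, mio_unused2) == 16,
	      "mio_unused2 offset");
static_assert(offsetof(struct mdt_ioepoch_sketch, mio_padding) == 20,
	      "mio_padding offset");
```

Keeping the size at 24 bytes lets peers that still fill in the old
fields exchange the same message format.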

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6047
Reviewed-on: http://review.whamcloud.com/13443
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../lustre/lustre/include/lustre/lustre_idl.h      |   41 ++++-----------
 .../lustre/lustre/include/lustre_req_layout.h      |    1 -
 drivers/staging/lustre/lustre/include/obd.h        |    8 +++
 .../staging/lustre/lustre/include/obd_support.h    |    4 +-
 drivers/staging/lustre/lustre/mdc/mdc_lib.c        |    7 ++-
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |    4 +-
 drivers/staging/lustre/lustre/ptlrpc/layout.c      |    6 --
 .../staging/lustre/lustre/ptlrpc/pack_generic.c    |    7 ++-
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c    |   58 ++++++--------------
 9 files changed, 46 insertions(+), 90 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index 4f6eeec..17feb71 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -310,7 +310,7 @@ static inline int range_compare_loc(const struct lu_seq_range *r1,
  */
 enum lma_compat {
 	LMAC_HSM	= 0x00000001,
-	LMAC_SOM	= 0x00000002,
+/*	LMAC_SOM	= 0x00000002, obsolete since 2.8.0 */
 	LMAC_NOT_IN_OI	= 0x00000004, /* the object does NOT need OI mapping */
 	LMAC_FID_ON_OST = 0x00000008, /* For OST-object, its OI mapping is
 				       * under /O/<seq>/d<x>.
@@ -1923,7 +1923,7 @@ enum mds_cmd {
 	MDS_PIN			= 42, /* obsolete, never used in a release */
 	MDS_UNPIN		= 43, /* obsolete, never used in a release */
 	MDS_SYNC		= 44,
-	MDS_DONE_WRITING	= 45,
+	MDS_DONE_WRITING	= 45, /* obsolete since 2.8.0 */
 	MDS_SET_INFO		= 46,
 	MDS_QUOTACHECK		= 47,
 	MDS_QUOTACTL		= 48,
@@ -2021,24 +2021,6 @@ enum {
 #define MDS_STATUS_CONN 1
 #define MDS_STATUS_LOV 2
 
-/* mdt_thread_info.mti_flags. */
-enum md_op_flags {
-	/* The flag indicates Size-on-MDS attributes are changed. */
-	MF_SOM_CHANGE	   = (1 << 0),
-	/* Flags indicates an epoch opens or closes. */
-	MF_EPOCH_OPEN	   = (1 << 1),
-	MF_EPOCH_CLOSE	  = (1 << 2),
-	MF_MDC_CANCEL_FID1      = (1 << 3),
-	MF_MDC_CANCEL_FID2      = (1 << 4),
-	MF_MDC_CANCEL_FID3      = (1 << 5),
-	MF_MDC_CANCEL_FID4      = (1 << 6),
-	/* There is a pending attribute update. */
-	MF_SOM_AU	       = (1 << 7),
-	/* Cancel OST locks while getattr OST attributes. */
-	MF_GETATTR_LOCK	 = (1 << 8),
-	MF_GET_MDT_IDX	  = (1 << 9),
-};
-
 #define LUSTRE_BFLAG_UNCOMMITTED_WRITES   0x1
 
 /* these should be identical to their EXT4_*_FL counterparts, they are
@@ -2123,10 +2105,10 @@ struct mdt_body {
 void lustre_swab_mdt_body(struct mdt_body *b);
 
 struct mdt_ioepoch {
-	struct lustre_handle handle;
-	__u64  ioepoch;
-	__u32  flags;
-	__u32  padding;
+	struct lustre_handle mio_handle;
+	__u64 mio_unused1; /* was ioepoch */
+	__u32 mio_unused2; /* was flags */
+	__u32 mio_padding;
 };
 
 void lustre_swab_mdt_ioepoch(struct mdt_ioepoch *b);
@@ -2195,12 +2177,9 @@ void lustre_swab_mdt_rec_setattr(struct mdt_rec_setattr *sa);
 
 #define MDS_FMODE_CLOSED	 00000000
 #define MDS_FMODE_EXEC	   00000004
-/* IO Epoch is opened on a closed file. */
-#define MDS_FMODE_EPOCH	  01000000
-/* IO Epoch is opened on a file truncate. */
-#define MDS_FMODE_TRUNC	  02000000
-/* Size-on-MDS Attribute Update is pending. */
-#define MDS_FMODE_SOM	    04000000
+/*	MDS_FMODE_EPOCH		01000000 obsolete since 2.8.0 */
+/*	MDS_FMODE_TRUNC		02000000 obsolete since 2.8.0 */
+/*	MDS_FMODE_SOM		04000000 obsolete since 2.8.0 */
 
 #define MDS_OPEN_CREATED	 00000010
 #define MDS_OPEN_CROSS	   00000020
@@ -2246,7 +2225,7 @@ enum mds_op_bias {
 	MDS_CROSS_REF		= 1 << 1,
 	MDS_VTX_BYPASS		= 1 << 2,
 	MDS_PERM_BYPASS		= 1 << 3,
-	MDS_SOM			= 1 << 4,
+/*	MDS_SOM			= 1 << 4, obsolete since 2.8.0 */
 	MDS_QUOTA_IGNORE	= 1 << 5,
 	MDS_CLOSE_CLEANUP	= 1 << 6,
 	MDS_KEEP_ORPHAN		= 1 << 7,
diff --git a/drivers/staging/lustre/lustre/include/lustre_req_layout.h b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
index a13558e..dd8717e 100644
--- a/drivers/staging/lustre/lustre/include/lustre_req_layout.h
+++ b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
@@ -154,7 +154,6 @@ extern struct req_format RQF_MDS_DISCONNECT;
 extern struct req_format RQF_MDS_GET_INFO;
 extern struct req_format RQF_MDS_READPAGE;
 extern struct req_format RQF_MDS_WRITEPAGE;
-extern struct req_format RQF_MDS_DONE_WRITING;
 extern struct req_format RQF_MDS_REINT;
 extern struct req_format RQF_MDS_REINT_CREATE;
 extern struct req_format RQF_MDS_REINT_CREATE_ACL;
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index fe05cc6..5c31376 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -755,6 +755,14 @@ static inline int it_to_lock_mode(struct lookup_intent *it)
 	return -EINVAL;
 }
 
+enum md_op_flags {
+	MF_MDC_CANCEL_FID1	= BIT(0),
+	MF_MDC_CANCEL_FID2      = BIT(1),
+	MF_MDC_CANCEL_FID3      = BIT(2),
+	MF_MDC_CANCEL_FID4      = BIT(3),
+	MF_GET_MDT_IDX          = BIT(4),
+};
+
 enum md_cli_flags {
 	CLI_SET_MEA	= BIT(0),
 	CLI_RM_ENTRY	= BIT(1),
diff --git a/drivers/staging/lustre/lustre/include/obd_support.h b/drivers/staging/lustre/lustre/include/obd_support.h
index b346a7f..9d2d6f8 100644
--- a/drivers/staging/lustre/lustre/include/obd_support.h
+++ b/drivers/staging/lustre/lustre/include/obd_support.h
@@ -172,8 +172,8 @@ extern char obd_jobid_var[];
 #define OBD_FAIL_MDS_ALL_REQUEST_NET     0x123
 #define OBD_FAIL_MDS_SYNC_NET	    0x124
 #define OBD_FAIL_MDS_SYNC_PACK	   0x125
-#define OBD_FAIL_MDS_DONE_WRITING_NET    0x126
-#define OBD_FAIL_MDS_DONE_WRITING_PACK   0x127
+/*	OBD_FAIL_MDS_DONE_WRITING_NET	0x126 obsolete since 2.8.0 */
+/*	OBD_FAIL_MDS_DONE_WRITING_PACK	0x127 obsolete since 2.8.0 */
 #define OBD_FAIL_MDS_ALLOC_OBDO	  0x128
 #define OBD_FAIL_MDS_PAUSE_OPEN	  0x129
 #define OBD_FAIL_MDS_STATFS_LCW_SLEEP    0x12a
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_lib.c b/drivers/staging/lustre/lustre/mdc/mdc_lib.c
index 709440b..1925072 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_lib.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_lib.c
@@ -301,9 +301,10 @@ static void mdc_setattr_pack_rec(struct mdt_rec_setattr *rec,
 static void mdc_ioepoch_pack(struct mdt_ioepoch *epoch,
 			     struct md_op_data *op_data)
 {
-	memcpy(&epoch->handle, &op_data->op_handle, sizeof(epoch->handle));
-	epoch->ioepoch = 0;
-	epoch->flags = 0;
+	epoch->mio_handle = op_data->op_handle;
+	epoch->mio_unused1 = 0;
+	epoch->mio_unused2 = 0;
+	epoch->mio_padding = 0;
 }
 
 void mdc_setattr_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index 3ef1bae..d996326 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -567,9 +567,9 @@ void mdc_replay_open(struct ptlrpc_request *req)
 		LASSERT(epoch);
 
 		if (och)
-			LASSERT(!memcmp(&old, &epoch->handle, sizeof(old)));
+			LASSERT(!memcmp(&old, &epoch->mio_handle, sizeof(old)));
 		DEBUG_REQ(D_HA, close_req, "updating close body with new fh");
-		epoch->handle = body->mbo_handle;
+		epoch->mio_handle = body->mbo_handle;
 	}
 }
 
diff --git a/drivers/staging/lustre/lustre/ptlrpc/layout.c b/drivers/staging/lustre/lustre/ptlrpc/layout.c
index 4ea8454..358c124 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/layout.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/layout.c
@@ -669,7 +669,6 @@ static struct req_format *req_formats[] = {
 	&RQF_MDS_RELEASE_CLOSE,
 	&RQF_MDS_READPAGE,
 	&RQF_MDS_WRITEPAGE,
-	&RQF_MDS_DONE_WRITING,
 	&RQF_MDS_REINT,
 	&RQF_MDS_REINT_CREATE,
 	&RQF_MDS_REINT_CREATE_ACL,
@@ -1386,11 +1385,6 @@ struct req_format RQF_MDS_RELEASE_CLOSE =
 			mdt_release_close_client, mds_last_unlink_server);
 EXPORT_SYMBOL(RQF_MDS_RELEASE_CLOSE);
 
-struct req_format RQF_MDS_DONE_WRITING =
-	DEFINE_REQ_FMT0("MDS_DONE_WRITING",
-			mdt_close_client, mdt_body_only);
-EXPORT_SYMBOL(RQF_MDS_DONE_WRITING);
-
 struct req_format RQF_MDS_READPAGE =
 	DEFINE_REQ_FMT0("MDS_READPAGE",
 			mdt_body_capa, mdt_body_only);
diff --git a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
index 3055649..bca781a 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
@@ -1715,9 +1715,10 @@ void lustre_swab_mdt_body(struct mdt_body *b)
 void lustre_swab_mdt_ioepoch(struct mdt_ioepoch *b)
 {
 	/* handle is opaque */
-	 __swab64s(&b->ioepoch);
-	 __swab32s(&b->flags);
-	 CLASSERT(offsetof(typeof(*b), padding) != 0);
+	/* mio_handle is opaque */
+	CLASSERT(offsetof(typeof(*b), mio_unused1));
+	CLASSERT(offsetof(typeof(*b), mio_unused2));
+	CLASSERT(offsetof(typeof(*b), mio_padding));
 }
 
 void lustre_swab_mgs_target_info(struct mgs_target_info *mti)
diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
index d5af8cd..391e83e 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
@@ -220,24 +220,6 @@ void lustre_assert_wire_constants(void)
 		 (long long)MDS_STATUS_LOV);
 	LASSERTF(LUSTRE_BFLAG_UNCOMMITTED_WRITES == 1, "found %lld\n",
 		 (long long)LUSTRE_BFLAG_UNCOMMITTED_WRITES);
-	LASSERTF(MF_SOM_CHANGE == 0x00000001UL, "found 0x%.8xUL\n",
-		 (unsigned)MF_SOM_CHANGE);
-	LASSERTF(MF_EPOCH_OPEN == 0x00000002UL, "found 0x%.8xUL\n",
-		 (unsigned)MF_EPOCH_OPEN);
-	LASSERTF(MF_EPOCH_CLOSE == 0x00000004UL, "found 0x%.8xUL\n",
-		 (unsigned)MF_EPOCH_CLOSE);
-	LASSERTF(MF_MDC_CANCEL_FID1 == 0x00000008UL, "found 0x%.8xUL\n",
-		 (unsigned)MF_MDC_CANCEL_FID1);
-	LASSERTF(MF_MDC_CANCEL_FID2 == 0x00000010UL, "found 0x%.8xUL\n",
-		 (unsigned)MF_MDC_CANCEL_FID2);
-	LASSERTF(MF_MDC_CANCEL_FID3 == 0x00000020UL, "found 0x%.8xUL\n",
-		 (unsigned)MF_MDC_CANCEL_FID3);
-	LASSERTF(MF_MDC_CANCEL_FID4 == 0x00000040UL, "found 0x%.8xUL\n",
-		 (unsigned)MF_MDC_CANCEL_FID4);
-	LASSERTF(MF_SOM_AU == 0x00000080UL, "found 0x%.8xUL\n",
-		 (unsigned)MF_SOM_AU);
-	LASSERTF(MF_GETATTR_LOCK == 0x00000100UL, "found 0x%.8xUL\n",
-		 (unsigned)MF_GETATTR_LOCK);
 	LASSERTF(MDS_ATTR_MODE == 0x0000000000000001ULL, "found 0x%.16llxULL\n",
 		 (long long)MDS_ATTR_MODE);
 	LASSERTF(MDS_ATTR_UID == 0x0000000000000002ULL, "found 0x%.16llxULL\n",
@@ -423,8 +405,6 @@ void lustre_assert_wire_constants(void)
 		 (unsigned)LMAI_RELEASED);
 	LASSERTF(LMAC_HSM == 0x00000001UL, "found 0x%.8xUL\n",
 		 (unsigned)LMAC_HSM);
-	LASSERTF(LMAC_SOM == 0x00000002UL, "found 0x%.8xUL\n",
-		 (unsigned)LMAC_SOM);
 	LASSERTF(LMAC_NOT_IN_OI == 0x00000004UL, "found 0x%.8xUL\n",
 		 (unsigned)LMAC_NOT_IN_OI);
 	LASSERTF(LMAC_FID_ON_OST == 0x00000008UL, "found 0x%.8xUL\n",
@@ -1883,12 +1863,6 @@ void lustre_assert_wire_constants(void)
 		 MDS_FMODE_CLOSED);
 	LASSERTF(MDS_FMODE_EXEC == 000000000004UL, "found 0%.11oUL\n",
 		 MDS_FMODE_EXEC);
-	LASSERTF(MDS_FMODE_EPOCH == 000001000000UL, "found 0%.11oUL\n",
-		 MDS_FMODE_EPOCH);
-	LASSERTF(MDS_FMODE_TRUNC == 000002000000UL, "found 0%.11oUL\n",
-		 MDS_FMODE_TRUNC);
-	LASSERTF(MDS_FMODE_SOM == 000004000000UL, "found 0%.11oUL\n",
-		 MDS_FMODE_SOM);
 	LASSERTF(MDS_OPEN_CREATED == 000000000010UL, "found 0%.11oUL\n",
 		 MDS_OPEN_CREATED);
 	LASSERTF(MDS_OPEN_CROSS == 000000000020UL, "found 0%.11oUL\n",
@@ -1947,22 +1921,22 @@ void lustre_assert_wire_constants(void)
 	/* Checks for struct mdt_ioepoch */
 	LASSERTF((int)sizeof(struct mdt_ioepoch) == 24, "found %lld\n",
 		 (long long)(int)sizeof(struct mdt_ioepoch));
-	LASSERTF((int)offsetof(struct mdt_ioepoch, handle) == 0, "found %lld\n",
-		 (long long)(int)offsetof(struct mdt_ioepoch, handle));
-	LASSERTF((int)sizeof(((struct mdt_ioepoch *)0)->handle) == 8, "found %lld\n",
-		 (long long)(int)sizeof(((struct mdt_ioepoch *)0)->handle));
-	LASSERTF((int)offsetof(struct mdt_ioepoch, ioepoch) == 8, "found %lld\n",
-		 (long long)(int)offsetof(struct mdt_ioepoch, ioepoch));
-	LASSERTF((int)sizeof(((struct mdt_ioepoch *)0)->ioepoch) == 8, "found %lld\n",
-		 (long long)(int)sizeof(((struct mdt_ioepoch *)0)->ioepoch));
-	LASSERTF((int)offsetof(struct mdt_ioepoch, flags) == 16, "found %lld\n",
-		 (long long)(int)offsetof(struct mdt_ioepoch, flags));
-	LASSERTF((int)sizeof(((struct mdt_ioepoch *)0)->flags) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct mdt_ioepoch *)0)->flags));
-	LASSERTF((int)offsetof(struct mdt_ioepoch, padding) == 20, "found %lld\n",
-		 (long long)(int)offsetof(struct mdt_ioepoch, padding));
-	LASSERTF((int)sizeof(((struct mdt_ioepoch *)0)->padding) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct mdt_ioepoch *)0)->padding));
+	LASSERTF((int)offsetof(struct mdt_ioepoch, mio_handle) == 0, "found %lld\n",
+		 (long long)(int)offsetof(struct mdt_ioepoch, mio_handle));
+	LASSERTF((int)sizeof(((struct mdt_ioepoch *)0)->mio_handle) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct mdt_ioepoch *)0)->mio_handle));
+	LASSERTF((int)offsetof(struct mdt_ioepoch, mio_unused1) == 8, "found %lld\n",
+		 (long long)(int)offsetof(struct mdt_ioepoch, mio_unused1));
+	LASSERTF((int)sizeof(((struct mdt_ioepoch *)0)->mio_unused1) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct mdt_ioepoch *)0)->mio_unused1));
+	LASSERTF((int)offsetof(struct mdt_ioepoch, mio_unused2) == 16, "found %lld\n",
+		 (long long)(int)offsetof(struct mdt_ioepoch, mio_unused2));
+	LASSERTF((int)sizeof(((struct mdt_ioepoch *)0)->mio_unused2) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct mdt_ioepoch *)0)->mio_unused2));
+	LASSERTF((int)offsetof(struct mdt_ioepoch, mio_padding) == 20, "found %lld\n",
+		 (long long)(int)offsetof(struct mdt_ioepoch, mio_padding));
+	LASSERTF((int)sizeof(((struct mdt_ioepoch *)0)->mio_padding) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct mdt_ioepoch *)0)->mio_padding));
 
 	/* Checks for struct mdt_rec_setattr */
 	LASSERTF((int)sizeof(struct mdt_rec_setattr) == 136, "found %lld\n",
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [lustre-devel] [PATCH 20/41] staging: lustre: remove Size on MDS support
@ 2016-10-03  2:28   ` James Simmons
  0 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, James Simmons

From: John L. Hammond <john.hammond@intel.com>

Remove unused definitions related to Size on MDS support from
lustre/include/lustre/lustre_idl.h. Remove unused code from several
places in lustre/.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6047
Reviewed-on: http://review.whamcloud.com/13443
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../lustre/lustre/include/lustre/lustre_idl.h      |   41 ++++-----------
 .../lustre/lustre/include/lustre_req_layout.h      |    1 -
 drivers/staging/lustre/lustre/include/obd.h        |    8 +++
 .../staging/lustre/lustre/include/obd_support.h    |    4 +-
 drivers/staging/lustre/lustre/mdc/mdc_lib.c        |    7 ++-
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |    4 +-
 drivers/staging/lustre/lustre/ptlrpc/layout.c      |    6 --
 .../staging/lustre/lustre/ptlrpc/pack_generic.c    |    7 ++-
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c    |   58 ++++++--------------
 9 files changed, 46 insertions(+), 90 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index 4f6eeec..17feb71 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -310,7 +310,7 @@ static inline int range_compare_loc(const struct lu_seq_range *r1,
  */
 enum lma_compat {
 	LMAC_HSM	= 0x00000001,
-	LMAC_SOM	= 0x00000002,
+/*	LMAC_SOM	= 0x00000002, obsolete since 2.8.0 */
 	LMAC_NOT_IN_OI	= 0x00000004, /* the object does NOT need OI mapping */
 	LMAC_FID_ON_OST = 0x00000008, /* For OST-object, its OI mapping is
 				       * under /O/<seq>/d<x>.
@@ -1923,7 +1923,7 @@ enum mds_cmd {
 	MDS_PIN			= 42, /* obsolete, never used in a release */
 	MDS_UNPIN		= 43, /* obsolete, never used in a release */
 	MDS_SYNC		= 44,
-	MDS_DONE_WRITING	= 45,
+	MDS_DONE_WRITING	= 45, /* obsolete since 2.8.0 */
 	MDS_SET_INFO		= 46,
 	MDS_QUOTACHECK		= 47,
 	MDS_QUOTACTL		= 48,
@@ -2021,24 +2021,6 @@ enum {
 #define MDS_STATUS_CONN 1
 #define MDS_STATUS_LOV 2
 
-/* mdt_thread_info.mti_flags. */
-enum md_op_flags {
-	/* The flag indicates Size-on-MDS attributes are changed. */
-	MF_SOM_CHANGE	   = (1 << 0),
-	/* Flags indicates an epoch opens or closes. */
-	MF_EPOCH_OPEN	   = (1 << 1),
-	MF_EPOCH_CLOSE	  = (1 << 2),
-	MF_MDC_CANCEL_FID1      = (1 << 3),
-	MF_MDC_CANCEL_FID2      = (1 << 4),
-	MF_MDC_CANCEL_FID3      = (1 << 5),
-	MF_MDC_CANCEL_FID4      = (1 << 6),
-	/* There is a pending attribute update. */
-	MF_SOM_AU	       = (1 << 7),
-	/* Cancel OST locks while getattr OST attributes. */
-	MF_GETATTR_LOCK	 = (1 << 8),
-	MF_GET_MDT_IDX	  = (1 << 9),
-};
-
 #define LUSTRE_BFLAG_UNCOMMITTED_WRITES   0x1
 
 /* these should be identical to their EXT4_*_FL counterparts, they are
@@ -2123,10 +2105,10 @@ struct mdt_body {
 void lustre_swab_mdt_body(struct mdt_body *b);
 
 struct mdt_ioepoch {
-	struct lustre_handle handle;
-	__u64  ioepoch;
-	__u32  flags;
-	__u32  padding;
+	struct lustre_handle mio_handle;
+	__u64 mio_unused1; /* was ioepoch */
+	__u32 mio_unused2; /* was flags */
+	__u32 mio_padding;
 };
 
 void lustre_swab_mdt_ioepoch(struct mdt_ioepoch *b);
@@ -2195,12 +2177,9 @@ void lustre_swab_mdt_rec_setattr(struct mdt_rec_setattr *sa);
 
 #define MDS_FMODE_CLOSED	 00000000
 #define MDS_FMODE_EXEC	   00000004
-/* IO Epoch is opened on a closed file. */
-#define MDS_FMODE_EPOCH	  01000000
-/* IO Epoch is opened on a file truncate. */
-#define MDS_FMODE_TRUNC	  02000000
-/* Size-on-MDS Attribute Update is pending. */
-#define MDS_FMODE_SOM	    04000000
+/*	MDS_FMODE_EPOCH		01000000 obsolete since 2.8.0 */
+/*	MDS_FMODE_TRUNC		02000000 obsolete since 2.8.0 */
+/*	MDS_FMODE_SOM		04000000 obsolete since 2.8.0 */
 
 #define MDS_OPEN_CREATED	 00000010
 #define MDS_OPEN_CROSS	   00000020
@@ -2246,7 +2225,7 @@ enum mds_op_bias {
 	MDS_CROSS_REF		= 1 << 1,
 	MDS_VTX_BYPASS		= 1 << 2,
 	MDS_PERM_BYPASS		= 1 << 3,
-	MDS_SOM			= 1 << 4,
+/*	MDS_SOM			= 1 << 4, obsolete since 2.8.0 */
 	MDS_QUOTA_IGNORE	= 1 << 5,
 	MDS_CLOSE_CLEANUP	= 1 << 6,
 	MDS_KEEP_ORPHAN		= 1 << 7,
diff --git a/drivers/staging/lustre/lustre/include/lustre_req_layout.h b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
index a13558e..dd8717e 100644
--- a/drivers/staging/lustre/lustre/include/lustre_req_layout.h
+++ b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
@@ -154,7 +154,6 @@ extern struct req_format RQF_MDS_DISCONNECT;
 extern struct req_format RQF_MDS_GET_INFO;
 extern struct req_format RQF_MDS_READPAGE;
 extern struct req_format RQF_MDS_WRITEPAGE;
-extern struct req_format RQF_MDS_DONE_WRITING;
 extern struct req_format RQF_MDS_REINT;
 extern struct req_format RQF_MDS_REINT_CREATE;
 extern struct req_format RQF_MDS_REINT_CREATE_ACL;
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index fe05cc6..5c31376 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -755,6 +755,14 @@ static inline int it_to_lock_mode(struct lookup_intent *it)
 	return -EINVAL;
 }
 
+enum md_op_flags {
+	MF_MDC_CANCEL_FID1	= BIT(0),
+	MF_MDC_CANCEL_FID2      = BIT(1),
+	MF_MDC_CANCEL_FID3      = BIT(2),
+	MF_MDC_CANCEL_FID4      = BIT(3),
+	MF_GET_MDT_IDX          = BIT(4),
+};
+
 enum md_cli_flags {
 	CLI_SET_MEA	= BIT(0),
 	CLI_RM_ENTRY	= BIT(1),
diff --git a/drivers/staging/lustre/lustre/include/obd_support.h b/drivers/staging/lustre/lustre/include/obd_support.h
index b346a7f..9d2d6f8 100644
--- a/drivers/staging/lustre/lustre/include/obd_support.h
+++ b/drivers/staging/lustre/lustre/include/obd_support.h
@@ -172,8 +172,8 @@ extern char obd_jobid_var[];
 #define OBD_FAIL_MDS_ALL_REQUEST_NET     0x123
 #define OBD_FAIL_MDS_SYNC_NET	    0x124
 #define OBD_FAIL_MDS_SYNC_PACK	   0x125
-#define OBD_FAIL_MDS_DONE_WRITING_NET    0x126
-#define OBD_FAIL_MDS_DONE_WRITING_PACK   0x127
+/*	OBD_FAIL_MDS_DONE_WRITING_NET	0x126 obsolete since 2.8.0 */
+/*	OBD_FAIL_MDS_DONE_WRITING_PACK	0x127 obsolete since 2.8.0 */
 #define OBD_FAIL_MDS_ALLOC_OBDO	  0x128
 #define OBD_FAIL_MDS_PAUSE_OPEN	  0x129
 #define OBD_FAIL_MDS_STATFS_LCW_SLEEP    0x12a
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_lib.c b/drivers/staging/lustre/lustre/mdc/mdc_lib.c
index 709440b..1925072 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_lib.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_lib.c
@@ -301,9 +301,10 @@ static void mdc_setattr_pack_rec(struct mdt_rec_setattr *rec,
 static void mdc_ioepoch_pack(struct mdt_ioepoch *epoch,
 			     struct md_op_data *op_data)
 {
-	memcpy(&epoch->handle, &op_data->op_handle, sizeof(epoch->handle));
-	epoch->ioepoch = 0;
-	epoch->flags = 0;
+	epoch->mio_handle = op_data->op_handle;
+	epoch->mio_unused1 = 0;
+	epoch->mio_unused2 = 0;
+	epoch->mio_padding = 0;
 }
 
 void mdc_setattr_pack(struct ptlrpc_request *req, struct md_op_data *op_data,
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index 3ef1bae..d996326 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -567,9 +567,9 @@ void mdc_replay_open(struct ptlrpc_request *req)
 		LASSERT(epoch);
 
 		if (och)
-			LASSERT(!memcmp(&old, &epoch->handle, sizeof(old)));
+			LASSERT(!memcmp(&old, &epoch->mio_handle, sizeof(old)));
 		DEBUG_REQ(D_HA, close_req, "updating close body with new fh");
-		epoch->handle = body->mbo_handle;
+		epoch->mio_handle = body->mbo_handle;
 	}
 }
 
diff --git a/drivers/staging/lustre/lustre/ptlrpc/layout.c b/drivers/staging/lustre/lustre/ptlrpc/layout.c
index 4ea8454..358c124 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/layout.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/layout.c
@@ -669,7 +669,6 @@ static struct req_format *req_formats[] = {
 	&RQF_MDS_RELEASE_CLOSE,
 	&RQF_MDS_READPAGE,
 	&RQF_MDS_WRITEPAGE,
-	&RQF_MDS_DONE_WRITING,
 	&RQF_MDS_REINT,
 	&RQF_MDS_REINT_CREATE,
 	&RQF_MDS_REINT_CREATE_ACL,
@@ -1386,11 +1385,6 @@ struct req_format RQF_MDS_RELEASE_CLOSE =
 			mdt_release_close_client, mds_last_unlink_server);
 EXPORT_SYMBOL(RQF_MDS_RELEASE_CLOSE);
 
-struct req_format RQF_MDS_DONE_WRITING =
-	DEFINE_REQ_FMT0("MDS_DONE_WRITING",
-			mdt_close_client, mdt_body_only);
-EXPORT_SYMBOL(RQF_MDS_DONE_WRITING);
-
 struct req_format RQF_MDS_READPAGE =
 	DEFINE_REQ_FMT0("MDS_READPAGE",
 			mdt_body_capa, mdt_body_only);
diff --git a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
index 3055649..bca781a 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
@@ -1715,9 +1715,10 @@ void lustre_swab_mdt_body(struct mdt_body *b)
 void lustre_swab_mdt_ioepoch(struct mdt_ioepoch *b)
 {
 	/* handle is opaque */
-	 __swab64s(&b->ioepoch);
-	 __swab32s(&b->flags);
-	 CLASSERT(offsetof(typeof(*b), padding) != 0);
+	/* mio_handle is opaque */
+	CLASSERT(offsetof(typeof(*b), mio_unused1));
+	CLASSERT(offsetof(typeof(*b), mio_unused2));
+	CLASSERT(offsetof(typeof(*b), mio_padding));
 }
 
 void lustre_swab_mgs_target_info(struct mgs_target_info *mti)
diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
index d5af8cd..391e83e 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
@@ -220,24 +220,6 @@ void lustre_assert_wire_constants(void)
 		 (long long)MDS_STATUS_LOV);
 	LASSERTF(LUSTRE_BFLAG_UNCOMMITTED_WRITES == 1, "found %lld\n",
 		 (long long)LUSTRE_BFLAG_UNCOMMITTED_WRITES);
-	LASSERTF(MF_SOM_CHANGE == 0x00000001UL, "found 0x%.8xUL\n",
-		 (unsigned)MF_SOM_CHANGE);
-	LASSERTF(MF_EPOCH_OPEN == 0x00000002UL, "found 0x%.8xUL\n",
-		 (unsigned)MF_EPOCH_OPEN);
-	LASSERTF(MF_EPOCH_CLOSE == 0x00000004UL, "found 0x%.8xUL\n",
-		 (unsigned)MF_EPOCH_CLOSE);
-	LASSERTF(MF_MDC_CANCEL_FID1 == 0x00000008UL, "found 0x%.8xUL\n",
-		 (unsigned)MF_MDC_CANCEL_FID1);
-	LASSERTF(MF_MDC_CANCEL_FID2 == 0x00000010UL, "found 0x%.8xUL\n",
-		 (unsigned)MF_MDC_CANCEL_FID2);
-	LASSERTF(MF_MDC_CANCEL_FID3 == 0x00000020UL, "found 0x%.8xUL\n",
-		 (unsigned)MF_MDC_CANCEL_FID3);
-	LASSERTF(MF_MDC_CANCEL_FID4 == 0x00000040UL, "found 0x%.8xUL\n",
-		 (unsigned)MF_MDC_CANCEL_FID4);
-	LASSERTF(MF_SOM_AU == 0x00000080UL, "found 0x%.8xUL\n",
-		 (unsigned)MF_SOM_AU);
-	LASSERTF(MF_GETATTR_LOCK == 0x00000100UL, "found 0x%.8xUL\n",
-		 (unsigned)MF_GETATTR_LOCK);
 	LASSERTF(MDS_ATTR_MODE == 0x0000000000000001ULL, "found 0x%.16llxULL\n",
 		 (long long)MDS_ATTR_MODE);
 	LASSERTF(MDS_ATTR_UID == 0x0000000000000002ULL, "found 0x%.16llxULL\n",
@@ -423,8 +405,6 @@ void lustre_assert_wire_constants(void)
 		 (unsigned)LMAI_RELEASED);
 	LASSERTF(LMAC_HSM == 0x00000001UL, "found 0x%.8xUL\n",
 		 (unsigned)LMAC_HSM);
-	LASSERTF(LMAC_SOM == 0x00000002UL, "found 0x%.8xUL\n",
-		 (unsigned)LMAC_SOM);
 	LASSERTF(LMAC_NOT_IN_OI == 0x00000004UL, "found 0x%.8xUL\n",
 		 (unsigned)LMAC_NOT_IN_OI);
 	LASSERTF(LMAC_FID_ON_OST == 0x00000008UL, "found 0x%.8xUL\n",
@@ -1883,12 +1863,6 @@ void lustre_assert_wire_constants(void)
 		 MDS_FMODE_CLOSED);
 	LASSERTF(MDS_FMODE_EXEC == 000000000004UL, "found 0%.11oUL\n",
 		 MDS_FMODE_EXEC);
-	LASSERTF(MDS_FMODE_EPOCH == 000001000000UL, "found 0%.11oUL\n",
-		 MDS_FMODE_EPOCH);
-	LASSERTF(MDS_FMODE_TRUNC == 000002000000UL, "found 0%.11oUL\n",
-		 MDS_FMODE_TRUNC);
-	LASSERTF(MDS_FMODE_SOM == 000004000000UL, "found 0%.11oUL\n",
-		 MDS_FMODE_SOM);
 	LASSERTF(MDS_OPEN_CREATED == 000000000010UL, "found 0%.11oUL\n",
 		 MDS_OPEN_CREATED);
 	LASSERTF(MDS_OPEN_CROSS == 000000000020UL, "found 0%.11oUL\n",
@@ -1947,22 +1921,22 @@ void lustre_assert_wire_constants(void)
 	/* Checks for struct mdt_ioepoch */
 	LASSERTF((int)sizeof(struct mdt_ioepoch) == 24, "found %lld\n",
 		 (long long)(int)sizeof(struct mdt_ioepoch));
-	LASSERTF((int)offsetof(struct mdt_ioepoch, handle) == 0, "found %lld\n",
-		 (long long)(int)offsetof(struct mdt_ioepoch, handle));
-	LASSERTF((int)sizeof(((struct mdt_ioepoch *)0)->handle) == 8, "found %lld\n",
-		 (long long)(int)sizeof(((struct mdt_ioepoch *)0)->handle));
-	LASSERTF((int)offsetof(struct mdt_ioepoch, ioepoch) == 8, "found %lld\n",
-		 (long long)(int)offsetof(struct mdt_ioepoch, ioepoch));
-	LASSERTF((int)sizeof(((struct mdt_ioepoch *)0)->ioepoch) == 8, "found %lld\n",
-		 (long long)(int)sizeof(((struct mdt_ioepoch *)0)->ioepoch));
-	LASSERTF((int)offsetof(struct mdt_ioepoch, flags) == 16, "found %lld\n",
-		 (long long)(int)offsetof(struct mdt_ioepoch, flags));
-	LASSERTF((int)sizeof(((struct mdt_ioepoch *)0)->flags) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct mdt_ioepoch *)0)->flags));
-	LASSERTF((int)offsetof(struct mdt_ioepoch, padding) == 20, "found %lld\n",
-		 (long long)(int)offsetof(struct mdt_ioepoch, padding));
-	LASSERTF((int)sizeof(((struct mdt_ioepoch *)0)->padding) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct mdt_ioepoch *)0)->padding));
+	LASSERTF((int)offsetof(struct mdt_ioepoch, mio_handle) == 0, "found %lld\n",
+		 (long long)(int)offsetof(struct mdt_ioepoch, mio_handle));
+	LASSERTF((int)sizeof(((struct mdt_ioepoch *)0)->mio_handle) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct mdt_ioepoch *)0)->mio_handle));
+	LASSERTF((int)offsetof(struct mdt_ioepoch, mio_unused1) == 8, "found %lld\n",
+		 (long long)(int)offsetof(struct mdt_ioepoch, mio_unused1));
+	LASSERTF((int)sizeof(((struct mdt_ioepoch *)0)->mio_unused1) == 8, "found %lld\n",
+		 (long long)(int)sizeof(((struct mdt_ioepoch *)0)->mio_unused1));
+	LASSERTF((int)offsetof(struct mdt_ioepoch, mio_unused2) == 16, "found %lld\n",
+		 (long long)(int)offsetof(struct mdt_ioepoch, mio_unused2));
+	LASSERTF((int)sizeof(((struct mdt_ioepoch *)0)->mio_unused2) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct mdt_ioepoch *)0)->mio_unused2));
+	LASSERTF((int)offsetof(struct mdt_ioepoch, mio_padding) == 20, "found %lld\n",
+		 (long long)(int)offsetof(struct mdt_ioepoch, mio_padding));
+	LASSERTF((int)sizeof(((struct mdt_ioepoch *)0)->mio_padding) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct mdt_ioepoch *)0)->mio_padding));
 
 	/* Checks for struct mdt_rec_setattr */
 	LASSERTF((int)sizeof(struct mdt_rec_setattr) == 136, "found %lld\n",
-- 
1.7.1
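
Note on the layout checks: the wiretest.c hunk keeps struct mdt_ioepoch at 24 bytes, with every field at its old offset under the new mio_ names. Below is a minimal userspace sketch of that kind of check. The names handle_sketch, ioepoch_sketch, assert_wire_layout_sketch() and ioepoch_pack_sketch() are hypothetical stand-ins, not the kernel definitions. Modelling the 8-byte handle as a single 64-bit cookie is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct lustre_handle; assumed here to be one 64-bit cookie. */
struct handle_sketch {
	uint64_t cookie;
};

/* Mirrors the renamed fields of struct mdt_ioepoch shown in the patch. */
struct ioepoch_sketch {
	struct handle_sketch mio_handle;
	uint64_t mio_unused1;	/* was ioepoch */
	uint32_t mio_unused2;	/* was flags */
	uint32_t mio_padding;
};

/* Run-time layout checks in the spirit of the LASSERTF() lines. */
static void assert_wire_layout_sketch(void)
{
	assert(sizeof(struct ioepoch_sketch) == 24);
	assert(offsetof(struct ioepoch_sketch, mio_handle) == 0);
	assert(offsetof(struct ioepoch_sketch, mio_unused1) == 8);
	assert(offsetof(struct ioepoch_sketch, mio_unused2) == 16);
	assert(offsetof(struct ioepoch_sketch, mio_padding) == 20);
}

/* Copy the handle and zero the retired fields, as mdc_ioepoch_pack() does. */
static void ioepoch_pack_sketch(struct ioepoch_sketch *epoch,
				struct handle_sketch handle)
{
	epoch->mio_handle = handle;
	epoch->mio_unused1 = 0;
	epoch->mio_unused2 = 0;
	epoch->mio_padding = 0;
}
```

Renaming retired fields instead of deleting them leaves the on-wire size and offsets unchanged, so peers that still expect the old layout see zeros in those fields instead of a shorter structure.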

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 21/41] staging: lustre: mdc: Removed unneeded NULL check
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Henri Doreau,
	James Simmons

From: Henri Doreau <henri.doreau@cea.fr>

Do not bother checking the return value of changelog_kuc_hdr()
against NULL since this value was dereferenced earlier.

Signed-off-by: Henri Doreau <henri.doreau@cea.fr>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4189
Reviewed-on: http://review.whamcloud.com/12919
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/mdc/mdc_request.c |    6 ++----
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index d996326..206f5d0 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -1867,10 +1867,8 @@ static int mdc_changelog_send_thread(void *csdata)
 
 	/* Send EOF no matter what our result */
 	kuch = changelog_kuc_hdr(cs->cs_buf, sizeof(*kuch), cs->cs_flags);
-	if (kuch) {
-		kuch->kuc_msgtype = CL_EOF;
-		libcfs_kkuc_msg_put(cs->cs_fp, kuch);
-	}
+	kuch->kuc_msgtype = CL_EOF;
+	libcfs_kkuc_msg_put(cs->cs_fp, kuch);
 
 out:
 	fput(cs->cs_fp);
-- 
1.7.1
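
Note on why the check is redundant: the commit message says the pointer has already been dereferenced by the time it is tested. Below is a minimal userspace sketch of that pattern. It assumes the header helper writes through the buffer before returning it. The names hdr_sketch, fill_hdr_sketch() and send_eof_sketch() are hypothetical and do not reproduce changelog_kuc_hdr() or the kkuc API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message header; not the real struct kuc_hdr layout. */
struct hdr_sketch {
	uint16_t msgtype;
	uint16_t msglen;
};

enum { MSG_EOF_SKETCH = 1 };

/*
 * Fills in the header at the start of buf and returns it. The stores
 * below dereference buf, so a NULL buf would already have faulted here.
 */
static struct hdr_sketch *fill_hdr_sketch(void *buf, size_t len)
{
	struct hdr_sketch *h = buf;

	h->msgtype = 0;
	h->msglen = (uint16_t)len;
	return h;
}

static void send_eof_sketch(void *buf)
{
	struct hdr_sketch *h;

	h = fill_hdr_sketch(buf, sizeof(*h));
	/* No "if (h)" test: h equals buf, which was just written through. */
	h->msgtype = MSG_EOF_SKETCH;
}
```

In this sketch the returned pointer is always the caller's buffer, so any NULL test belongs where the buffer is allocated, not after the helper has used it.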

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 22/41] staging: lustre: obd: remove unused LSM parameters
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, Jinshan Xiong, James Simmons

From: John L. Hammond <john.hammond@intel.com>

Remove unused struct lov_stripe_md * parameters from obd_get_info(),
mgc_enqueue(), and osc_brw_prep_request().

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5814
Reviewed-on: http://review.whamcloud.com/13426
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/obd.h        |    3 +--
 drivers/staging/lustre/lustre/include/obd_class.h  |    6 ++----
 drivers/staging/lustre/lustre/llite/dir.c          |    2 +-
 drivers/staging/lustre/lustre/llite/lcommon_misc.c |    2 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c    |    6 +++---
 drivers/staging/lustre/lustre/lmv/lmv_obd.c        |    8 +++-----
 drivers/staging/lustre/lustre/lov/lov_obd.c        |    3 +--
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |    5 ++---
 drivers/staging/lustre/lustre/mgc/mgc_request.c    |    9 ++++-----
 drivers/staging/lustre/lustre/obdclass/obd_mount.c |    2 +-
 drivers/staging/lustre/lustre/osc/osc_request.c    |    7 ++-----
 11 files changed, 21 insertions(+), 32 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index 5c31376..c72a1e1 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -845,8 +845,7 @@ struct obd_ops {
 	int (*iocontrol)(unsigned int cmd, struct obd_export *exp, int len,
 			 void *karg, void __user *uarg);
 	int (*get_info)(const struct lu_env *env, struct obd_export *,
-			__u32 keylen, void *key, __u32 *vallen, void *val,
-			struct lov_stripe_md *lsm);
+			__u32 keylen, void *key, __u32 *vallen, void *val);
 	int (*set_info_async)(const struct lu_env *, struct obd_export *,
 			      __u32 keylen, void *key,
 			      __u32 vallen, void *val,
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index b2ced8b..7a5d75a 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -415,16 +415,14 @@ static inline int class_devno_max(void)
 
 static inline int obd_get_info(const struct lu_env *env,
 			       struct obd_export *exp, __u32 keylen,
-			       void *key, __u32 *vallen, void *val,
-			       struct lov_stripe_md *lsm)
+			       void *key, __u32 *vallen, void *val)
 {
 	int rc;
 
 	EXP_CHECK_DT_OP(exp, get_info);
 	EXP_COUNTER_INCREMENT(exp, get_info);
 
-	rc = OBP(exp->exp_obd, get_info)(env, exp, keylen, key, vallen, val,
-					 lsm);
+	rc = OBP(exp->exp_obd, get_info)(env, exp, keylen, key, vallen, val);
 	return rc;
 }
 
diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index 68675eb..12e9a38 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -1535,7 +1535,7 @@ out_quotactl:
 		exp = count ? sbi->ll_md_exp : sbi->ll_dt_exp;
 		vallen = sizeof(count);
 		rc = obd_get_info(NULL, exp, sizeof(KEY_TGT_COUNT),
-				  KEY_TGT_COUNT, &vallen, &count, NULL);
+				  KEY_TGT_COUNT, &vallen, &count);
 		if (rc) {
 			CERROR("get target count failed: %d\n", rc);
 			return rc;
diff --git a/drivers/staging/lustre/lustre/llite/lcommon_misc.c b/drivers/staging/lustre/lustre/llite/lcommon_misc.c
index fb346c1..07d38e5 100644
--- a/drivers/staging/lustre/lustre/llite/lcommon_misc.c
+++ b/drivers/staging/lustre/lustre/llite/lcommon_misc.c
@@ -54,7 +54,7 @@ int cl_init_ea_size(struct obd_export *md_exp, struct obd_export *dt_exp)
 	__u16 stripes, def_stripes;
 
 	rc = obd_get_info(NULL, dt_exp, sizeof(KEY_LOVDESC), KEY_LOVDESC,
-			  &valsize, &desc, NULL);
+			  &valsize, &desc);
 	if (rc)
 		return rc;
 
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 9112a52..6270301 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -285,7 +285,7 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt,
 
 	size = sizeof(*data);
 	err = obd_get_info(NULL, sbi->ll_md_exp, sizeof(KEY_CONN_DATA),
-			   KEY_CONN_DATA,  &size, data, NULL);
+			   KEY_CONN_DATA,  &size, data);
 	if (err) {
 		CERROR("%s: Get connect data failed: rc = %d\n",
 		       sbi->ll_md_exp->exp_obd->obd_name, err);
@@ -563,7 +563,7 @@ int ll_get_max_mdsize(struct ll_sb_info *sbi, int *lmmsize)
 	*lmmsize = obd_size_diskmd(sbi->ll_dt_exp, NULL);
 	size = sizeof(int);
 	rc = obd_get_info(NULL, sbi->ll_md_exp, sizeof(KEY_MAX_EASIZE),
-			  KEY_MAX_EASIZE, &size, lmmsize, NULL);
+			  KEY_MAX_EASIZE, &size, lmmsize);
 	if (rc)
 		CERROR("Get max mdsize error rc %d\n", rc);
 
@@ -587,7 +587,7 @@ int ll_get_default_mdsize(struct ll_sb_info *sbi, int *lmmsize)
 
 	size = sizeof(int);
 	rc = obd_get_info(NULL, sbi->ll_md_exp, sizeof(KEY_DEFAULT_EASIZE),
-			  KEY_DEFAULT_EASIZE, &size, lmmsize, NULL);
+			  KEY_DEFAULT_EASIZE, &size, lmmsize);
 	if (rc)
 		CERROR("Get default mdsize error rc %d\n", rc);
 
diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index b401ffb..7bd7e15 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -2628,14 +2628,12 @@ static int lmv_precleanup(struct obd_device *obd, enum obd_cleanup_stage stage)
  * \param[in]  key	identifier of key to get value for
  * \param[in]  vallen	size of \a val
  * \param[out] val	pointer to storage location for value
- * \param[in]  lsm	optional striping metadata of object
  *
  * \retval 0		on success
  * \retval negative	negated errno on failure
  */
 static int lmv_get_info(const struct lu_env *env, struct obd_export *exp,
-			__u32 keylen, void *key, __u32 *vallen, void *val,
-			struct lov_stripe_md *lsm)
+			__u32 keylen, void *key, __u32 *vallen, void *val)
 {
 	struct obd_device       *obd;
 	struct lmv_obd	  *lmv;
@@ -2667,7 +2665,7 @@ static int lmv_get_info(const struct lu_env *env, struct obd_export *exp,
 				continue;
 
 			if (!obd_get_info(env, tgt->ltd_exp, keylen, key,
-					  vallen, val, NULL))
+					  vallen, val))
 				return 0;
 		}
 		return -EINVAL;
@@ -2683,7 +2681,7 @@ static int lmv_get_info(const struct lu_env *env, struct obd_export *exp,
 		 * desc.
 		 */
 		rc = obd_get_info(env, lmv->tgts[0]->ltd_exp, keylen, key,
-				  vallen, val, NULL);
+				  vallen, val);
 		if (!rc && KEY_IS(KEY_CONN_DATA))
 			exp->exp_connect_data = *(struct obd_connect_data *)val;
 		return rc;
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index 30903fc..44f53c7 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -1310,8 +1310,7 @@ static int lov_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 }
 
 static int lov_get_info(const struct lu_env *env, struct obd_export *exp,
-			__u32 keylen, void *key, __u32 *vallen, void *val,
-			struct lov_stripe_md *lsm)
+			__u32 keylen, void *key, __u32 *vallen, void *val)
 {
 	struct obd_device *obddev = class_exp2obd(exp);
 	struct lov_obd *lov = &obddev->u.lov;
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index 206f5d0..7b9fb90 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -1456,7 +1456,7 @@ static int mdc_ioc_fid2path(struct obd_export *exp, struct getinfo_fid2path *gf)
 	/* Val is struct getinfo_fid2path result plus path */
 	vallen = sizeof(*gf) + gf->gf_pathlen;
 
-	rc = obd_get_info(NULL, exp, keylen, key, &vallen, gf, NULL);
+	rc = obd_get_info(NULL, exp, keylen, key, &vallen, gf);
 	if (rc != 0 && rc != -EREMOTE)
 		goto out;
 
@@ -2436,8 +2436,7 @@ static int mdc_set_info_async(const struct lu_env *env,
 }
 
 static int mdc_get_info(const struct lu_env *env, struct obd_export *exp,
-			__u32 keylen, void *key, __u32 *vallen, void *val,
-			struct lov_stripe_md *lsm)
+			__u32 keylen, void *key, __u32 *vallen, void *val)
 {
 	int rc = -EINVAL;
 
diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
index 23374ca..1a0e59a 100644
--- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
+++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
@@ -887,8 +887,8 @@ static int mgc_set_mgs_param(struct obd_export *exp,
 }
 
 /* Take a config lock so we can get cancel notifications */
-static int mgc_enqueue(struct obd_export *exp, struct lov_stripe_md *lsm,
-		       __u32 type, ldlm_policy_data_t *policy, __u32 mode,
+static int mgc_enqueue(struct obd_export *exp, __u32 type,
+		       ldlm_policy_data_t *policy, __u32 mode,
 		       __u64 *flags, void *bl_cb, void *cp_cb, void *gl_cb,
 		       void *data, __u32 lvb_len, void *lvb_swabber,
 		       struct lustre_handle *lockh)
@@ -1059,8 +1059,7 @@ static int mgc_set_info_async(const struct lu_env *env, struct obd_export *exp,
 }
 
 static int mgc_get_info(const struct lu_env *env, struct obd_export *exp,
-			__u32 keylen, void *key, __u32 *vallen, void *val,
-			struct lov_stripe_md *unused)
+			__u32 keylen, void *key, __u32 *vallen, void *val)
 {
 	int rc = -EINVAL;
 
@@ -1582,7 +1581,7 @@ int mgc_process_log(struct obd_device *mgc, struct config_llog_data *cld)
 	       cld->cld_cfg.cfg_instance, cld->cld_cfg.cfg_last_idx + 1);
 
 	/* Get the cfg lock on the llog */
-	rcl = mgc_enqueue(mgc->u.cli.cl_mgc_mgsexp, NULL, LDLM_PLAIN, NULL,
+	rcl = mgc_enqueue(mgc->u.cli.cl_mgc_mgsexp, LDLM_PLAIN, NULL,
 			  LCK_CR, &flags, NULL, NULL, NULL,
 			  cld, 0, NULL, &lockh);
 	if (rcl == 0) {
diff --git a/drivers/staging/lustre/lustre/obdclass/obd_mount.c b/drivers/staging/lustre/lustre/obdclass/obd_mount.c
index 0d3a3b0..59fbc29 100644
--- a/drivers/staging/lustre/lustre/obdclass/obd_mount.c
+++ b/drivers/staging/lustre/lustre/obdclass/obd_mount.c
@@ -261,7 +261,7 @@ int lustre_start_mgc(struct super_block *sb)
 
 			rc = obd_get_info(NULL, obd->obd_self_export,
 					  strlen(KEY_CONN_DATA), KEY_CONN_DATA,
-					  &vallen, data, NULL);
+					  &vallen, data);
 			LASSERT(rc == 0);
 			has_ir = OCD_HAS_FLAG(data, IMP_RECOV);
 			if (has_ir ^ !(*flags & LMD_FLG_NOIR)) {
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index 21cd48b..0413b88 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -1139,8 +1139,7 @@ static u32 osc_checksum_bulk(int nob, u32 pg_count,
 }
 
 static int osc_brw_prep_request(int cmd, struct client_obd *cli,
-				struct obdo *oa,
-				struct lov_stripe_md *lsm, u32 page_count,
+				struct obdo *oa, u32 page_count,
 				struct brw_page **pga,
 				struct ptlrpc_request **reqp,
 				int reserve,
@@ -1557,7 +1556,6 @@ static int osc_brw_redo_request(struct ptlrpc_request *request,
 	rc = osc_brw_prep_request(lustre_msg_get_opc(request->rq_reqmsg) ==
 					OST_WRITE ? OBD_BRW_WRITE : OBD_BRW_READ,
 				  aa->aa_cli, aa->aa_oa,
-				  NULL /* lsm unused by osc currently */,
 				  aa->aa_page_count, aa->aa_ppga,
 				  &new_req, 0, 1);
 	if (rc)
@@ -1902,8 +1900,7 @@ int osc_build_rpc(const struct lu_env *env, struct client_obd *cli,
 	}
 
 	sort_brw_pages(pga, page_count);
-	rc = osc_brw_prep_request(cmd, cli, oa, NULL, page_count,
-				  pga, &req, 1, 0);
+	rc = osc_brw_prep_request(cmd, cli, oa, page_count, pga, &req, 1, 0);
 	if (rc != 0) {
 		CERROR("prep_req failed: %d\n", rc);
 		goto out;
-- 
1.7.1
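
Note on the shape of the change: the unused argument has to go from the method-table prototype, from the inline wrapper, from every implementation and from every caller in one patch, otherwise the tree does not build. Below is a minimal userspace sketch of the resulting signature. The names ops_sketch, export_sketch, demo_get_info() and get_info_sketch() are hypothetical, and the "tgt_count" key is an illustration, not the Lustre KEY_TGT_COUNT definition.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

struct export_sketch;

/* Method table: get_info no longer carries a trailing unused argument. */
struct ops_sketch {
	int (*get_info)(struct export_sketch *exp, uint32_t keylen,
			const void *key, uint32_t *vallen, void *val);
};

struct export_sketch {
	const struct ops_sketch *ops;
	uint32_t tgt_count;
};

/* One implementation; its prototype must match the table entry exactly. */
static int demo_get_info(struct export_sketch *exp, uint32_t keylen,
			 const void *key, uint32_t *vallen, void *val)
{
	if (keylen == sizeof("tgt_count") &&
	    !memcmp(key, "tgt_count", keylen)) {
		if (*vallen < sizeof(exp->tgt_count))
			return -EINVAL;
		memcpy(val, &exp->tgt_count, sizeof(exp->tgt_count));
		*vallen = sizeof(exp->tgt_count);
		return 0;
	}
	return -EINVAL;
}

/* Thin wrapper that dispatches through the table, like obd_get_info(). */
static inline int get_info_sketch(struct export_sketch *exp, uint32_t keylen,
				  const void *key, uint32_t *vallen, void *val)
{
	return exp->ops->get_info(exp, keylen, key, vallen, val);
}
```

Callers shrink in the same way as the hunks above: the trailing NULL disappears from each call site.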

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [lustre-devel] [PATCH 22/41] staging: lustre: obd: remove unused LSM parameters
@ 2016-10-03  2:28   ` James Simmons
  0 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, Jinshan Xiong, James Simmons

From: John L. Hammond <john.hammond@intel.com>

Remove unused struct lov_stripe_md * parameters from obd_get_info(),
mgc_enqueue(), and osc_brw_prep_request().

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5814
Reviewed-on: http://review.whamcloud.com/13426
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/obd.h        |    3 +--
 drivers/staging/lustre/lustre/include/obd_class.h  |    6 ++----
 drivers/staging/lustre/lustre/llite/dir.c          |    2 +-
 drivers/staging/lustre/lustre/llite/lcommon_misc.c |    2 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c    |    6 +++---
 drivers/staging/lustre/lustre/lmv/lmv_obd.c        |    8 +++-----
 drivers/staging/lustre/lustre/lov/lov_obd.c        |    3 +--
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |    5 ++---
 drivers/staging/lustre/lustre/mgc/mgc_request.c    |    9 ++++-----
 drivers/staging/lustre/lustre/obdclass/obd_mount.c |    2 +-
 drivers/staging/lustre/lustre/osc/osc_request.c    |    7 ++-----
 11 files changed, 21 insertions(+), 32 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index 5c31376..c72a1e1 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -845,8 +845,7 @@ struct obd_ops {
 	int (*iocontrol)(unsigned int cmd, struct obd_export *exp, int len,
 			 void *karg, void __user *uarg);
 	int (*get_info)(const struct lu_env *env, struct obd_export *,
-			__u32 keylen, void *key, __u32 *vallen, void *val,
-			struct lov_stripe_md *lsm);
+			__u32 keylen, void *key, __u32 *vallen, void *val);
 	int (*set_info_async)(const struct lu_env *, struct obd_export *,
 			      __u32 keylen, void *key,
 			      __u32 vallen, void *val,
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index b2ced8b..7a5d75a 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -415,16 +415,14 @@ static inline int class_devno_max(void)
 
 static inline int obd_get_info(const struct lu_env *env,
 			       struct obd_export *exp, __u32 keylen,
-			       void *key, __u32 *vallen, void *val,
-			       struct lov_stripe_md *lsm)
+			       void *key, __u32 *vallen, void *val)
 {
 	int rc;
 
 	EXP_CHECK_DT_OP(exp, get_info);
 	EXP_COUNTER_INCREMENT(exp, get_info);
 
-	rc = OBP(exp->exp_obd, get_info)(env, exp, keylen, key, vallen, val,
-					 lsm);
+	rc = OBP(exp->exp_obd, get_info)(env, exp, keylen, key, vallen, val);
 	return rc;
 }
 
diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index 68675eb..12e9a38 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -1535,7 +1535,7 @@ out_quotactl:
 		exp = count ? sbi->ll_md_exp : sbi->ll_dt_exp;
 		vallen = sizeof(count);
 		rc = obd_get_info(NULL, exp, sizeof(KEY_TGT_COUNT),
-				  KEY_TGT_COUNT, &vallen, &count, NULL);
+				  KEY_TGT_COUNT, &vallen, &count);
 		if (rc) {
 			CERROR("get target count failed: %d\n", rc);
 			return rc;
diff --git a/drivers/staging/lustre/lustre/llite/lcommon_misc.c b/drivers/staging/lustre/lustre/llite/lcommon_misc.c
index fb346c1..07d38e5 100644
--- a/drivers/staging/lustre/lustre/llite/lcommon_misc.c
+++ b/drivers/staging/lustre/lustre/llite/lcommon_misc.c
@@ -54,7 +54,7 @@ int cl_init_ea_size(struct obd_export *md_exp, struct obd_export *dt_exp)
 	__u16 stripes, def_stripes;
 
 	rc = obd_get_info(NULL, dt_exp, sizeof(KEY_LOVDESC), KEY_LOVDESC,
-			  &valsize, &desc, NULL);
+			  &valsize, &desc);
 	if (rc)
 		return rc;
 
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 9112a52..6270301 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -285,7 +285,7 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt,
 
 	size = sizeof(*data);
 	err = obd_get_info(NULL, sbi->ll_md_exp, sizeof(KEY_CONN_DATA),
-			   KEY_CONN_DATA,  &size, data, NULL);
+			   KEY_CONN_DATA,  &size, data);
 	if (err) {
 		CERROR("%s: Get connect data failed: rc = %d\n",
 		       sbi->ll_md_exp->exp_obd->obd_name, err);
@@ -563,7 +563,7 @@ int ll_get_max_mdsize(struct ll_sb_info *sbi, int *lmmsize)
 	*lmmsize = obd_size_diskmd(sbi->ll_dt_exp, NULL);
 	size = sizeof(int);
 	rc = obd_get_info(NULL, sbi->ll_md_exp, sizeof(KEY_MAX_EASIZE),
-			  KEY_MAX_EASIZE, &size, lmmsize, NULL);
+			  KEY_MAX_EASIZE, &size, lmmsize);
 	if (rc)
 		CERROR("Get max mdsize error rc %d\n", rc);
 
@@ -587,7 +587,7 @@ int ll_get_default_mdsize(struct ll_sb_info *sbi, int *lmmsize)
 
 	size = sizeof(int);
 	rc = obd_get_info(NULL, sbi->ll_md_exp, sizeof(KEY_DEFAULT_EASIZE),
-			  KEY_DEFAULT_EASIZE, &size, lmmsize, NULL);
+			  KEY_DEFAULT_EASIZE, &size, lmmsize);
 	if (rc)
 		CERROR("Get default mdsize error rc %d\n", rc);
 
diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index b401ffb..7bd7e15 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -2628,14 +2628,12 @@ static int lmv_precleanup(struct obd_device *obd, enum obd_cleanup_stage stage)
  * \param[in]  key	identifier of key to get value for
  * \param[in]  vallen	size of \a val
  * \param[out] val	pointer to storage location for value
- * \param[in]  lsm	optional striping metadata of object
  *
  * \retval 0		on success
  * \retval negative	negated errno on failure
  */
 static int lmv_get_info(const struct lu_env *env, struct obd_export *exp,
-			__u32 keylen, void *key, __u32 *vallen, void *val,
-			struct lov_stripe_md *lsm)
+			__u32 keylen, void *key, __u32 *vallen, void *val)
 {
 	struct obd_device       *obd;
 	struct lmv_obd	  *lmv;
@@ -2667,7 +2665,7 @@ static int lmv_get_info(const struct lu_env *env, struct obd_export *exp,
 				continue;
 
 			if (!obd_get_info(env, tgt->ltd_exp, keylen, key,
-					  vallen, val, NULL))
+					  vallen, val))
 				return 0;
 		}
 		return -EINVAL;
@@ -2683,7 +2681,7 @@ static int lmv_get_info(const struct lu_env *env, struct obd_export *exp,
 		 * desc.
 		 */
 		rc = obd_get_info(env, lmv->tgts[0]->ltd_exp, keylen, key,
-				  vallen, val, NULL);
+				  vallen, val);
 		if (!rc && KEY_IS(KEY_CONN_DATA))
 			exp->exp_connect_data = *(struct obd_connect_data *)val;
 		return rc;
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index 30903fc..44f53c7 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -1310,8 +1310,7 @@ static int lov_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 }
 
 static int lov_get_info(const struct lu_env *env, struct obd_export *exp,
-			__u32 keylen, void *key, __u32 *vallen, void *val,
-			struct lov_stripe_md *lsm)
+			__u32 keylen, void *key, __u32 *vallen, void *val)
 {
 	struct obd_device *obddev = class_exp2obd(exp);
 	struct lov_obd *lov = &obddev->u.lov;
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index 206f5d0..7b9fb90 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -1456,7 +1456,7 @@ static int mdc_ioc_fid2path(struct obd_export *exp, struct getinfo_fid2path *gf)
 	/* Val is struct getinfo_fid2path result plus path */
 	vallen = sizeof(*gf) + gf->gf_pathlen;
 
-	rc = obd_get_info(NULL, exp, keylen, key, &vallen, gf, NULL);
+	rc = obd_get_info(NULL, exp, keylen, key, &vallen, gf);
 	if (rc != 0 && rc != -EREMOTE)
 		goto out;
 
@@ -2436,8 +2436,7 @@ static int mdc_set_info_async(const struct lu_env *env,
 }
 
 static int mdc_get_info(const struct lu_env *env, struct obd_export *exp,
-			__u32 keylen, void *key, __u32 *vallen, void *val,
-			struct lov_stripe_md *lsm)
+			__u32 keylen, void *key, __u32 *vallen, void *val)
 {
 	int rc = -EINVAL;
 
diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
index 23374ca..1a0e59a 100644
--- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
+++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
@@ -887,8 +887,8 @@ static int mgc_set_mgs_param(struct obd_export *exp,
 }
 
 /* Take a config lock so we can get cancel notifications */
-static int mgc_enqueue(struct obd_export *exp, struct lov_stripe_md *lsm,
-		       __u32 type, ldlm_policy_data_t *policy, __u32 mode,
+static int mgc_enqueue(struct obd_export *exp, __u32 type,
+		       ldlm_policy_data_t *policy, __u32 mode,
 		       __u64 *flags, void *bl_cb, void *cp_cb, void *gl_cb,
 		       void *data, __u32 lvb_len, void *lvb_swabber,
 		       struct lustre_handle *lockh)
@@ -1059,8 +1059,7 @@ static int mgc_set_info_async(const struct lu_env *env, struct obd_export *exp,
 }
 
 static int mgc_get_info(const struct lu_env *env, struct obd_export *exp,
-			__u32 keylen, void *key, __u32 *vallen, void *val,
-			struct lov_stripe_md *unused)
+			__u32 keylen, void *key, __u32 *vallen, void *val)
 {
 	int rc = -EINVAL;
 
@@ -1582,7 +1581,7 @@ int mgc_process_log(struct obd_device *mgc, struct config_llog_data *cld)
 	       cld->cld_cfg.cfg_instance, cld->cld_cfg.cfg_last_idx + 1);
 
 	/* Get the cfg lock on the llog */
-	rcl = mgc_enqueue(mgc->u.cli.cl_mgc_mgsexp, NULL, LDLM_PLAIN, NULL,
+	rcl = mgc_enqueue(mgc->u.cli.cl_mgc_mgsexp, LDLM_PLAIN, NULL,
 			  LCK_CR, &flags, NULL, NULL, NULL,
 			  cld, 0, NULL, &lockh);
 	if (rcl == 0) {
diff --git a/drivers/staging/lustre/lustre/obdclass/obd_mount.c b/drivers/staging/lustre/lustre/obdclass/obd_mount.c
index 0d3a3b0..59fbc29 100644
--- a/drivers/staging/lustre/lustre/obdclass/obd_mount.c
+++ b/drivers/staging/lustre/lustre/obdclass/obd_mount.c
@@ -261,7 +261,7 @@ int lustre_start_mgc(struct super_block *sb)
 
 			rc = obd_get_info(NULL, obd->obd_self_export,
 					  strlen(KEY_CONN_DATA), KEY_CONN_DATA,
-					  &vallen, data, NULL);
+					  &vallen, data);
 			LASSERT(rc == 0);
 			has_ir = OCD_HAS_FLAG(data, IMP_RECOV);
 			if (has_ir ^ !(*flags & LMD_FLG_NOIR)) {
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index 21cd48b..0413b88 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -1139,8 +1139,7 @@ static u32 osc_checksum_bulk(int nob, u32 pg_count,
 }
 
 static int osc_brw_prep_request(int cmd, struct client_obd *cli,
-				struct obdo *oa,
-				struct lov_stripe_md *lsm, u32 page_count,
+				struct obdo *oa, u32 page_count,
 				struct brw_page **pga,
 				struct ptlrpc_request **reqp,
 				int reserve,
@@ -1557,7 +1556,6 @@ static int osc_brw_redo_request(struct ptlrpc_request *request,
 	rc = osc_brw_prep_request(lustre_msg_get_opc(request->rq_reqmsg) ==
 					OST_WRITE ? OBD_BRW_WRITE : OBD_BRW_READ,
 				  aa->aa_cli, aa->aa_oa,
-				  NULL /* lsm unused by osc currently */,
 				  aa->aa_page_count, aa->aa_ppga,
 				  &new_req, 0, 1);
 	if (rc)
@@ -1902,8 +1900,7 @@ int osc_build_rpc(const struct lu_env *env, struct client_obd *cli,
 	}
 
 	sort_brw_pages(pga, page_count);
-	rc = osc_brw_prep_request(cmd, cli, oa, NULL, page_count,
-				  pga, &req, 1, 0);
+	rc = osc_brw_prep_request(cmd, cli, oa, page_count, pga, &req, 1, 0);
 	if (rc != 0) {
 		CERROR("prep_req failed: %d\n", rc);
 		goto out;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 23/41] staging: lustre: mgc: MGC should retry for invalid import
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, wang di,
	Andreas Dilger, James Simmons

From: wang di <di.wang@intel.com>

Since http://review.whamcloud.com/#/c/9967/ landed, the MGC no
longer waits for the import to be connected (state = FULL) before
it enqueues the lock and retrieves the config log. This can cause
the mount process to fail, especially if the MGC is shared by
multiple targets. So when the MGC enqueue fails, give it one more
chance: wait for the import to recover and, if the import comes
back in time, try the enqueue again. Otherwise use the local
config log.
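
The retry-once behaviour described above can be sketched as a small standalone simulation. This is only an illustration under stated assumptions, not the code in the patch: struct import, the IMP_* states, enqueue(), wait_for_import() and the tick-based timeout are all hypothetical stand-ins for the real import, the DLM enqueue and l_wait_event().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the import states named in the patch. */
enum imp_state { IMP_DISCON, IMP_CONNECTING, IMP_FULL, IMP_CLOSED };

struct import {
	enum imp_state state;
	int ticks_to_full;	/* simulation only: negative means "never" */
};

enum log_source { LOG_FROM_MGS, LOG_LOCAL_COPY };

/* Recovery counts as finished once the import is FULL or CLOSED. */
static bool import_in_recovery(const struct import *imp)
{
	return imp->state != IMP_FULL && imp->state != IMP_CLOSED;
}

/* Simulated enqueue: succeeds only on a fully connected import. */
static int enqueue(const struct import *imp)
{
	return imp->state == IMP_FULL ? 0 : -ESHUTDOWN;
}

/* Simulated bounded wait: one loop iteration stands in for one tick. */
static void wait_for_import(struct import *imp, int timeout_ticks)
{
	for (int t = 0; t < timeout_ticks && import_in_recovery(imp); t++) {
		if (imp->ticks_to_full >= 0 && --imp->ticks_to_full <= 0)
			imp->state = IMP_FULL;
	}
}

/*
 * Try the enqueue; on -ESHUTDOWN wait once for the import and retry.
 * Returns which log would be used and reports the number of attempts.
 */
enum log_source process_log(struct import *imp, int timeout_ticks,
			    int *attempts)
{
	bool retried = false;

	*attempts = 0;
	for (;;) {
		int rc;

		(*attempts)++;
		rc = enqueue(imp);
		if (rc == 0)
			return LOG_FROM_MGS;
		if (rc != -ESHUTDOWN || retried)
			return LOG_LOCAL_COPY;

		wait_for_import(imp, timeout_ticks);
		if (imp->state != IMP_FULL)
			return LOG_LOCAL_COPY;
		retried = true;
	}
}
```

The retried flag is what bounds the loop: a second -ESHUTDOWN falls straight through to the local copy instead of waiting again, so a mount can be delayed by at most one timeout.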

Signed-off-by: wang di <di.wang@intel.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5420
Reviewed-on: http://review.whamcloud.com/11258
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/lustre_ha.h |    1 +
 drivers/staging/lustre/lustre/mgc/mgc_request.c   |   85 +++++++++++++++++++--
 drivers/staging/lustre/lustre/ptlrpc/import.c     |    3 +-
 3 files changed, 81 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre_ha.h b/drivers/staging/lustre/lustre/include/lustre_ha.h
index cde7ed7..dec1e99 100644
--- a/drivers/staging/lustre/lustre/include/lustre_ha.h
+++ b/drivers/staging/lustre/lustre/include/lustre_ha.h
@@ -53,6 +53,7 @@ void ptlrpc_activate_import(struct obd_import *imp);
 void ptlrpc_deactivate_import(struct obd_import *imp);
 void ptlrpc_invalidate_import(struct obd_import *imp);
 void ptlrpc_fail_import(struct obd_import *imp, __u32 conn_cnt);
+void ptlrpc_pinger_force(struct obd_import *imp);
 
 /** @} ha */
 
diff --git a/drivers/staging/lustre/lustre/mgc/mgc_request.c b/drivers/staging/lustre/lustre/mgc/mgc_request.c
index 1a0e59a..7b5ac44 100644
--- a/drivers/staging/lustre/lustre/mgc/mgc_request.c
+++ b/drivers/staging/lustre/lustre/mgc/mgc_request.c
@@ -1552,14 +1552,52 @@ out_free:
 	return rc;
 }
 
-/** Get a config log from the MGS and process it.
- * This func is called for both clients and servers.
- * Copy the log locally before parsing it if appropriate (non-MGS server)
+static bool mgc_import_in_recovery(struct obd_import *imp)
+{
+	bool in_recovery = true;
+
+	spin_lock(&imp->imp_lock);
+	if (imp->imp_state == LUSTRE_IMP_FULL ||
+	    imp->imp_state == LUSTRE_IMP_CLOSED)
+		in_recovery = false;
+	spin_unlock(&imp->imp_lock);
+
+	return in_recovery;
+}
+
+/**
+ * Get a configuration log from the MGS and process it.
+ *
+ * This function is called for both clients and servers to process the
+ * configuration log from the MGS.  The MGC enqueues a DLM lock on the
+ * log from the MGS, and if the lock gets revoked the MGC will be notified
+ * by the lock cancellation callback that the config log has changed,
+ * and will enqueue another MGS lock on it, and then continue processing
+ * the new additions to the end of the log.
+ *
+ * Since the MGC import is not replayable, if the import is being evicted
+ * (rcl == -ESHUTDOWN, \see ptlrpc_import_delay_req()), retry to process
+ * the log until recovery is finished or the import is closed.
+ *
+ * Make a local copy of the log before parsing it if appropriate (non-MGS
+ * server) so that the server can start even when the MGS is down.
+ *
+ * There shouldn't be multiple processes running process_log at once --
+ * sounds like badness.  It actually might be fine, as long as they're not
+ * trying to update from the same log simultaneously, in which case we
+ * should use a per-log semaphore instead of cld_lock.
+ *
+ * \param[in] mgc	MGC device by which to fetch the configuration log
+ * \param[in] cld	log processing state (stored in lock callback data)
+ *
+ * \retval		0 on success
+ * \retval		negative errno on failure
  */
 int mgc_process_log(struct obd_device *mgc, struct config_llog_data *cld)
 {
 	struct lustre_handle lockh = { 0 };
 	__u64 flags = LDLM_FL_NO_LRU;
+	bool retry = false;
 	int rc = 0, rcl;
 
 	LASSERT(cld);
@@ -1569,6 +1607,7 @@ int mgc_process_log(struct obd_device *mgc, struct config_llog_data *cld)
 	 * we're not trying to update from the same log
 	 * simultaneously (in which case we should use a per-log sem.)
 	 */
+restart:
 	mutex_lock(&cld->cld_lock);
 	if (cld->cld_stopping) {
 		mutex_unlock(&cld->cld_lock);
@@ -1592,10 +1631,42 @@ int mgc_process_log(struct obd_device *mgc, struct config_llog_data *cld)
 	} else {
 		CDEBUG(D_MGC, "Can't get cfg lock: %d\n", rcl);
 
-		/* mark cld_lostlock so that it will requeue
-		 * after MGC becomes available.
-		 */
-		cld->cld_lostlock = 1;
+		if (rcl == -ESHUTDOWN &&
+		    atomic_read(&mgc->u.cli.cl_mgc_refcount) > 0 && !retry) {
+			int secs = cfs_time_seconds(obd_timeout);
+			struct obd_import *imp;
+			struct l_wait_info lwi;
+
+			mutex_unlock(&cld->cld_lock);
+			imp = class_exp2cliimp(mgc->u.cli.cl_mgc_mgsexp);
+
+			/*
+			 * Force the pinger and wait for the import to be
+			 * connected. Note: since the MGC import is
+			 * non-replayable, a disconnected import state does
+			 * not mean that "recovery" has stopped, so keep
+			 * waiting until the timeout expires or the import
+			 * state is FULL or CLOSED.
+			 */
+			ptlrpc_pinger_force(imp);
+
+			lwi = LWI_TIMEOUT(secs, NULL, NULL);
+			l_wait_event(imp->imp_recovery_waitq,
+				     !mgc_import_in_recovery(imp), &lwi);
+
+			if (imp->imp_state == LUSTRE_IMP_FULL) {
+				retry = true;
+				goto restart;
+			} else {
+				mutex_lock(&cld->cld_lock);
+				cld->cld_lostlock = 1;
+			}
+		} else {
+			/* mark cld_lostlock so that it will requeue
+			 * after MGC becomes available.
+			 */
+			cld->cld_lostlock = 1;
+		}
 		/* Get extra reference, it will be put in requeue thread */
 		config_log_get(cld);
 	}
diff --git a/drivers/staging/lustre/lustre/ptlrpc/import.c b/drivers/staging/lustre/lustre/ptlrpc/import.c
index b245784..2bdaf2b 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/import.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/import.c
@@ -396,7 +396,7 @@ void ptlrpc_activate_import(struct obd_import *imp)
 }
 EXPORT_SYMBOL(ptlrpc_activate_import);
 
-static void ptlrpc_pinger_force(struct obd_import *imp)
+void ptlrpc_pinger_force(struct obd_import *imp)
 {
 	CDEBUG(D_HA, "%s: waking up pinger s:%s\n", obd2cli_tgt(imp->imp_obd),
 	       ptlrpc_import_state_name(imp->imp_state));
@@ -408,6 +408,7 @@ static void ptlrpc_pinger_force(struct obd_import *imp)
 	if (imp->imp_state != LUSTRE_IMP_CONNECTING)
 		ptlrpc_pinger_wake_up();
 }
+EXPORT_SYMBOL(ptlrpc_pinger_force);
 
 void ptlrpc_fail_import(struct obd_import *imp, __u32 conn_cnt)
 {
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 24/41] staging: lustre: clio: add CIT_DATA_VERSION and remove IOC_LOV_GETINFO
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, Bobi Jam, James Simmons

From: John L. Hammond <john.hammond@intel.com>

During development two new API functions, cl_object_obd_info_get()
and cl_object_data_version(), were added and then later
replaced by a better solution, CIT_DATA_VERSION. For
the upstream client there is no point in introducing
an API only to have it removed later. Because of the
way the patches landed with their dependencies, it is
not possible to separate the two patches. The two
combined patches do the following:

 * Add a new cl_io type CIT_DATA_VERSION to get file
   data version.
 * Remove the unused IOC_LOV_GETINFO ioctl.
 * Remove ll_glimpse_ioctl() and ll_lsm_getattr().
 * Remove the OBD API method obd_getattr_async().
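
As a rough picture of the first item, an io type that returns a value pairs a new enum entry with a new member in the io's per-type union. The sketch below is a standalone illustration only: enum io_type, struct io, struct stripe and io_run() are hypothetical names, and combining the per-stripe versions by summation is an assumption made for the example rather than something stated in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical io types; the last entry plays the role of the new type. */
enum io_type { IO_READ, IO_WRITE, IO_SETATTR, IO_DATA_VERSION };

/* Per-type state lives in a union selected by the type tag. */
struct io {
	enum io_type type;
	union {
		struct { uint64_t size; } setattr;
		struct { uint64_t data_version; int flags; } data_version;
	} u;
};

/* Hypothetical per-stripe object; each keeps its own version counter. */
struct stripe { uint64_t version; };

/* Run a data-version io: fold the per-stripe versions into one result. */
int io_run(struct io *io, const struct stripe *stripes, int nr)
{
	if (io->type != IO_DATA_VERSION)
		return -1;

	io->u.data_version.data_version = 0;
	for (int i = 0; i < nr; i++)
		io->u.data_version.data_version += stripes[i].version;
	return 0;
}
```

Routing the request through the ordinary io path means no dedicated getattr-style method is needed just to fetch one value, which is in line with the removals listed above.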

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5823
Reviewed-on: http://review.whamcloud.com/12748
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6356
Reviewed-on: http://review.whamcloud.com/14649
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/cl_object.h  |    6 +
 .../lustre/lustre/include/lustre/lustre_user.h     |    6 +-
 drivers/staging/lustre/lustre/include/obd.h        |    2 -
 drivers/staging/lustre/lustre/include/obd_class.h  |   13 --
 drivers/staging/lustre/lustre/llite/dir.c          |   71 ----------
 drivers/staging/lustre/lustre/llite/file.c         |  125 +++++-------------
 .../staging/lustre/lustre/llite/llite_internal.h   |    2 -
 drivers/staging/lustre/lustre/lov/lov_internal.h   |    4 -
 drivers/staging/lustre/lustre/lov/lov_io.c         |   40 ++++++
 drivers/staging/lustre/lustre/lov/lov_merge.c      |   50 -------
 drivers/staging/lustre/lustre/lov/lov_obd.c        |   69 ----------
 drivers/staging/lustre/lustre/lov/lov_request.c    |  144 --------------------
 drivers/staging/lustre/lustre/obdclass/cl_io.c     |    1 +
 drivers/staging/lustre/lustre/osc/osc_io.c         |  105 ++++++++++++++
 drivers/staging/lustre/lustre/osc/osc_request.c    |   59 --------
 15 files changed, 189 insertions(+), 508 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/cl_object.h b/drivers/staging/lustre/lustre/include/cl_object.h
index 0b66d02..e46a510 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -1373,6 +1373,8 @@ enum cl_io_type {
 	CIT_WRITE,
 	/** truncate, utime system calls */
 	CIT_SETATTR,
+	/** get data version */
+	CIT_DATA_VERSION,
 	/**
 	 * page fault handling
 	 */
@@ -1777,6 +1779,10 @@ struct cl_io {
 			int		sa_stripe_index;
 			const struct lu_fid	*sa_parent_fid;
 		} ci_setattr;
+		struct cl_data_version_io {
+			u64 dv_data_version;
+			int dv_flags;
+		} ci_data_version;
 		struct cl_fault_io {
 			/** page index within file. */
 			pgoff_t	 ft_index;
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index dced31f..043fc1c 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -63,9 +63,13 @@
 #if __BITS_PER_LONG != 64 || defined(__ARCH_WANT_STAT64)
 typedef struct stat64   lstat_t;
 #define lstat_f  lstat64
+#define fstat_f		fstat64
+#define fstatat_f	fstatat64
 #else
 typedef struct stat     lstat_t;
 #define lstat_f  lstat
+#define fstat_f		fstat
+#define fstatat_f	fstatat
 #endif
 
 #define HAVE_LOV_USER_MDS_DATA
@@ -234,7 +238,7 @@ struct ost_id {
 /* #define LL_IOC_POLL_QUOTACHECK	161 OBD_IOC_POLL_QUOTACHECK */
 /* #define LL_IOC_QUOTACTL		162 OBD_IOC_QUOTACTL */
 #define IOC_OBD_STATFS		  _IOWR('f', 164, struct obd_statfs *)
-#define IOC_LOV_GETINFO		 _IOWR('f', 165, struct lov_user_mds_data *)
+/*	IOC_LOV_GETINFO			165 obsolete */
 #define LL_IOC_FLUSHCTX		 _IOW('f', 166, long)
 /* LL_IOC_RMTACL			167 obsolete */
 #define LL_IOC_GETOBDCOUNT	      _IOR('f', 168, long)
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index c72a1e1..f254d88 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -905,8 +905,6 @@ struct obd_ops {
 		       struct obd_info *oinfo, struct obd_trans_info *oti);
 	int (*getattr)(const struct lu_env *env, struct obd_export *exp,
 		       struct obd_info *oinfo);
-	int (*getattr_async)(struct obd_export *exp, struct obd_info *oinfo,
-			     struct ptlrpc_request_set *set);
 	int (*preprw)(const struct lu_env *env, int cmd,
 		      struct obd_export *exp, struct obdo *oa, int objcount,
 		      struct obd_ioobj *obj, struct niobuf_remote *remote,
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index 7a5d75a..cb7160e 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -721,19 +721,6 @@ static inline int obd_getattr(const struct lu_env *env, struct obd_export *exp,
 	return rc;
 }
 
-static inline int obd_getattr_async(struct obd_export *exp,
-				    struct obd_info *oinfo,
-				    struct ptlrpc_request_set *set)
-{
-	int rc;
-
-	EXP_CHECK_DT_OP(exp, getattr_async);
-	EXP_COUNTER_INCREMENT(exp, getattr_async);
-
-	rc = OBP(exp->exp_obd, getattr_async)(exp, oinfo, set);
-	return rc;
-}
-
 static inline int obd_setattr(const struct lu_env *env, struct obd_export *exp,
 			      struct obd_info *oinfo,
 			      struct obd_trans_info *oti)
diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index 12e9a38..360d97f 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -1369,77 +1369,6 @@ out_req:
 			ll_putname(filename);
 		return rc;
 	}
-	case IOC_LOV_GETINFO: {
-		struct lov_user_mds_data __user *lumd;
-		struct lov_stripe_md *lsm;
-		struct lov_user_md __user *lum;
-		struct lov_mds_md *lmm;
-		int lmmsize;
-		lstat_t st;
-
-		lumd = (struct lov_user_mds_data __user *)arg;
-		lum = &lumd->lmd_lmm;
-
-		rc = ll_get_max_mdsize(sbi, &lmmsize);
-		if (rc)
-			return rc;
-
-		lmm = libcfs_kvzalloc(lmmsize, GFP_NOFS);
-		if (!lmm)
-			return -ENOMEM;
-		if (copy_from_user(lmm, lum, lmmsize)) {
-			rc = -EFAULT;
-			goto free_lmm;
-		}
-
-		switch (lmm->lmm_magic) {
-		case LOV_USER_MAGIC_V1:
-			if (cpu_to_le32(LOV_USER_MAGIC_V1) == LOV_USER_MAGIC_V1)
-				break;
-			/* swab objects first so that stripes num will be sane */
-			lustre_swab_lov_user_md_objects(
-				((struct lov_user_md_v1 *)lmm)->lmm_objects,
-				((struct lov_user_md_v1 *)lmm)->lmm_stripe_count);
-			lustre_swab_lov_user_md_v1((struct lov_user_md_v1 *)lmm);
-			break;
-		case LOV_USER_MAGIC_V3:
-			if (cpu_to_le32(LOV_USER_MAGIC_V3) == LOV_USER_MAGIC_V3)
-				break;
-			/* swab objects first so that stripes num will be sane */
-			lustre_swab_lov_user_md_objects(
-				((struct lov_user_md_v3 *)lmm)->lmm_objects,
-				((struct lov_user_md_v3 *)lmm)->lmm_stripe_count);
-			lustre_swab_lov_user_md_v3((struct lov_user_md_v3 *)lmm);
-			break;
-		default:
-			rc = -EINVAL;
-			goto free_lmm;
-		}
-
-		rc = obd_unpackmd(sbi->ll_dt_exp, &lsm, lmm, lmmsize);
-		if (rc < 0) {
-			rc = -ENOMEM;
-			goto free_lmm;
-		}
-
-		/* Perform glimpse_size operation. */
-		memset(&st, 0, sizeof(st));
-
-		rc = ll_glimpse_ioctl(sbi, lsm, &st);
-		if (rc)
-			goto free_lsm;
-
-		if (copy_to_user(&lumd->lmd_st, &st, sizeof(st))) {
-			rc = -EFAULT;
-			goto free_lsm;
-		}
-
-free_lsm:
-		obd_free_memmd(sbi->ll_dt_exp, &lsm);
-free_lmm:
-		kvfree(lmm);
-		return rc;
-	}
 	case OBD_IOC_QUOTACHECK: {
 		struct obd_quotactl *oqctl;
 		int error = 0;
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 89a2841..8fa65a5 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -865,55 +865,6 @@ static int ll_lease_close(struct obd_client_handle *och, struct inode *inode,
 					 inode, och, NULL);
 }
 
-/* Fills the obdo with the attributes for the lsm */
-static int ll_lsm_getattr(struct lov_stripe_md *lsm, struct obd_export *exp,
-			  struct obdo *obdo, int dv_flags)
-{
-	struct ptlrpc_request_set *set;
-	struct obd_info	    oinfo = { };
-	int			rc;
-
-	LASSERT(lsm);
-
-	oinfo.oi_md = lsm;
-	oinfo.oi_oa = obdo;
-	oinfo.oi_oa->o_oi = lsm->lsm_oi;
-	oinfo.oi_oa->o_mode = S_IFREG;
-	oinfo.oi_oa->o_valid = OBD_MD_FLID | OBD_MD_FLTYPE |
-			       OBD_MD_FLSIZE | OBD_MD_FLBLOCKS |
-			       OBD_MD_FLBLKSZ | OBD_MD_FLATIME |
-			       OBD_MD_FLMTIME | OBD_MD_FLCTIME |
-			       OBD_MD_FLGROUP | OBD_MD_FLDATAVERSION;
-	if (dv_flags & (LL_DV_WR_FLUSH | LL_DV_RD_FLUSH)) {
-		oinfo.oi_oa->o_valid |= OBD_MD_FLFLAGS;
-		oinfo.oi_oa->o_flags |= OBD_FL_SRVLOCK;
-		if (dv_flags & LL_DV_WR_FLUSH)
-			oinfo.oi_oa->o_flags |= OBD_FL_FLUSH;
-	}
-
-	set = ptlrpc_prep_set();
-	if (!set) {
-		CERROR("cannot allocate ptlrpc set: rc = %d\n", -ENOMEM);
-		rc = -ENOMEM;
-	} else {
-		rc = obd_getattr_async(exp, &oinfo, set);
-		if (rc == 0)
-			rc = ptlrpc_set_wait(set);
-		ptlrpc_set_destroy(set);
-	}
-	if (rc == 0) {
-		oinfo.oi_oa->o_valid &= (OBD_MD_FLBLOCKS | OBD_MD_FLBLKSZ |
-					 OBD_MD_FLATIME | OBD_MD_FLMTIME |
-					 OBD_MD_FLCTIME | OBD_MD_FLSIZE |
-					 OBD_MD_FLDATAVERSION | OBD_MD_FLFLAGS);
-		if (dv_flags & LL_DV_WR_FLUSH &&
-		    !(oinfo.oi_oa->o_valid & OBD_MD_FLFLAGS &&
-		      oinfo.oi_oa->o_flags & OBD_FL_FLUSH))
-			return -ENOTSUPP;
-	}
-	return rc;
-}
-
 int ll_merge_attr(const struct lu_env *env, struct inode *inode)
 {
 	struct ll_inode_info *lli = ll_i2info(inode);
@@ -970,23 +921,6 @@ out_size_unlock:
 	return rc;
 }
 
-int ll_glimpse_ioctl(struct ll_sb_info *sbi, struct lov_stripe_md *lsm,
-		     lstat_t *st)
-{
-	struct obdo obdo = { 0 };
-	int rc;
-
-	rc = ll_lsm_getattr(lsm, sbi->ll_dt_exp, &obdo, 0);
-	if (rc == 0) {
-		st->st_size   = obdo.o_size;
-		st->st_blocks = obdo.o_blocks;
-		st->st_mtime  = obdo.o_mtime;
-		st->st_atime  = obdo.o_atime;
-		st->st_ctime  = obdo.o_ctime;
-	}
-	return rc;
-}
-
 static bool file_is_noatime(const struct file *file)
 {
 	const struct vfsmount *mnt = file->f_path.mnt;
@@ -1635,45 +1569,50 @@ gf_free:
  * This value is computed using stripe object version on OST.
  * Version is computed using server side locking.
  *
- * @param sync  if do sync on the OST side;
+ * @param flags if do sync on the OST side;
  *		0: no sync
  *		LL_DV_RD_FLUSH: flush dirty pages, LCK_PR on OSTs
  *		LL_DV_WR_FLUSH: drop all caching pages, LCK_PW on OSTs
  */
 int ll_data_version(struct inode *inode, __u64 *data_version, int flags)
 {
-	struct lov_stripe_md	*lsm = NULL;
-	struct ll_sb_info	*sbi = ll_i2sbi(inode);
-	struct obdo		*obdo = NULL;
-	int			 rc;
+	struct cl_object *obj = ll_i2info(inode)->lli_clob;
+	struct lu_env *env;
+	struct cl_io *io;
+	int refcheck;
+	int result;
 
-	/* If no stripe, we consider version is 0. */
-	lsm = ccc_inode_lsm_get(inode);
-	if (!lsm_has_objects(lsm)) {
+	/* If no file object initialized, we consider its version is 0. */
+	if (!obj) {
 		*data_version = 0;
-		CDEBUG(D_INODE, "No object for inode\n");
-		rc = 0;
-		goto out;
+		return 0;
 	}
 
-	obdo = kzalloc(sizeof(*obdo), GFP_NOFS);
-	if (!obdo) {
-		rc = -ENOMEM;
-		goto out;
-	}
+	env = cl_env_get(&refcheck);
+	if (IS_ERR(env))
+		return PTR_ERR(env);
 
-	rc = ll_lsm_getattr(lsm, sbi->ll_dt_exp, obdo, flags);
-	if (rc == 0) {
-		if (!(obdo->o_valid & OBD_MD_FLDATAVERSION))
-			rc = -EOPNOTSUPP;
-		else
-			*data_version = obdo->o_data_version;
-	}
+	io = vvp_env_thread_io(env);
+	io->ci_obj = obj;
+	io->u.ci_data_version.dv_data_version = 0;
+	io->u.ci_data_version.dv_flags = flags;
 
-	kfree(obdo);
-out:
-	ccc_inode_lsm_put(inode, lsm);
-	return rc;
+restart:
+	if (!cl_io_init(env, io, CIT_DATA_VERSION, io->ci_obj))
+		result = cl_io_loop(env, io);
+	else
+		result = io->ci_result;
+
+	*data_version = io->u.ci_data_version.dv_data_version;
+
+	cl_io_fini(env, io);
+
+	if (unlikely(io->ci_need_restart))
+		goto restart;
+
+	cl_env_put(env, &refcheck);
+
+	return result;
 }
 
 /*
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 02541b1..e249895 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -741,8 +741,6 @@ enum ldlm_mode ll_take_md_lock(struct inode *inode, __u64 bits,
 			       enum ldlm_mode mode);
 int ll_file_open(struct inode *inode, struct file *file);
 int ll_file_release(struct inode *inode, struct file *file);
-int ll_glimpse_ioctl(struct ll_sb_info *sbi,
-		     struct lov_stripe_md *lsm, lstat_t *st);
 int ll_release_openhandle(struct inode *, struct lookup_intent *);
 int ll_md_real_close(struct inode *inode, fmode_t fmode);
 void ll_pack_inode2opdata(struct inode *inode, struct md_op_data *op_data,
diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index 4743c65..fffc18c 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -132,8 +132,6 @@ static inline void lov_put_reqset(struct lov_request_set *set)
 	(char *)((lv)->lov_tgts[index]->ltd_uuid.uuid)
 
 /* lov_merge.c */
-void lov_merge_attrs(struct obdo *tgt, struct obdo *src, u64 valid,
-		     struct lov_stripe_md *lsm, int stripeno, int *set);
 int lov_merge_lvb_kms(struct lov_stripe_md *lsm,
 		      struct ost_lvb *lvb, __u64 *kms_place);
 
@@ -150,8 +148,6 @@ pgoff_t lov_stripe_pgoff(struct lov_stripe_md *lsm, pgoff_t stripe_index,
 			 int stripe);
 
 /* lov_request.c */
-int lov_update_common_set(struct lov_request_set *set,
-			  struct lov_request *req, int rc);
 int lov_prep_getattr_set(struct obd_export *exp, struct obd_info *oinfo,
 			 struct lov_request_set **reqset);
 int lov_fini_getattr_set(struct lov_request_set *set);
diff --git a/drivers/staging/lustre/lustre/lov/lov_io.c b/drivers/staging/lustre/lustre/lov/lov_io.c
index 369718e..d6be613 100644
--- a/drivers/staging/lustre/lustre/lov/lov_io.c
+++ b/drivers/staging/lustre/lustre/lov/lov_io.c
@@ -100,6 +100,12 @@ static void lov_io_sub_inherit(struct cl_io *io, struct lov_io *lio,
 		}
 		break;
 	}
+	case CIT_DATA_VERSION: {
+		io->u.ci_data_version.dv_data_version = 0;
+		io->u.ci_data_version.dv_flags =
+			parent->u.ci_data_version.dv_flags;
+		break;
+	}
 	case CIT_FAULT: {
 		struct cl_object *obj = parent->ci_obj;
 		loff_t off = cl_offset(obj, parent->u.ci_fault.ft_index);
@@ -339,6 +345,11 @@ static int lov_io_slice_init(struct lov_io *lio, struct lov_object *obj,
 		lio->lis_endpos = OBD_OBJECT_EOF;
 		break;
 
+	case CIT_DATA_VERSION:
+		lio->lis_pos = 0;
+		lio->lis_endpos = OBD_OBJECT_EOF;
+		break;
+
 	case CIT_FAULT: {
 		pgoff_t index = io->u.ci_fault.ft_index;
 
@@ -516,6 +527,24 @@ static int lov_io_end_wrapper(const struct lu_env *env, struct cl_io *io)
 	return 0;
 }
 
+static void
+lov_io_data_version_end(const struct lu_env *env, const struct cl_io_slice *ios)
+{
+	struct lov_io *lio = cl2lov_io(env, ios);
+	struct cl_io *parent = lio->lis_cl.cis_io;
+	struct lov_io_sub *sub;
+
+	list_for_each_entry(sub, &lio->lis_active, sub_linkage) {
+		lov_io_end_wrapper(env, sub->sub_io);
+
+		parent->u.ci_data_version.dv_data_version +=
+			sub->sub_io->u.ci_data_version.dv_data_version;
+
+		if (!parent->ci_result)
+			parent->ci_result = sub->sub_io->ci_result;
+	}
+}
+
 static int lov_io_iter_fini_wrapper(const struct lu_env *env, struct cl_io *io)
 {
 	cl_io_iter_fini(env, io);
@@ -838,6 +867,15 @@ static const struct cl_io_operations lov_io_ops = {
 			.cio_start     = lov_io_start,
 			.cio_end       = lov_io_end
 		},
+		[CIT_DATA_VERSION] = {
+			.cio_fini	= lov_io_fini,
+			.cio_iter_init	= lov_io_iter_init,
+			.cio_iter_fini	= lov_io_iter_fini,
+			.cio_lock	= lov_io_lock,
+			.cio_unlock	= lov_io_unlock,
+			.cio_start	= lov_io_start,
+			.cio_end	= lov_io_data_version_end,
+		},
 		[CIT_FAULT] = {
 			.cio_fini      = lov_io_fini,
 			.cio_iter_init = lov_io_iter_init,
@@ -969,6 +1007,7 @@ int lov_io_init_empty(const struct lu_env *env, struct cl_object *obj,
 		break;
 	case CIT_FSYNC:
 	case CIT_SETATTR:
+	case CIT_DATA_VERSION:
 		result = 1;
 		break;
 	case CIT_WRITE:
@@ -1004,6 +1043,7 @@ int lov_io_init_released(const struct lu_env *env, struct cl_object *obj,
 		LASSERTF(0, "invalid type %d\n", io->ci_type);
 	case CIT_MISC:
 	case CIT_FSYNC:
+	case CIT_DATA_VERSION:
 		result = 1;
 		break;
 	case CIT_SETATTR:
diff --git a/drivers/staging/lustre/lustre/lov/lov_merge.c b/drivers/staging/lustre/lustre/lov/lov_merge.c
index 674af10..391dfd2 100644
--- a/drivers/staging/lustre/lustre/lov/lov_merge.c
+++ b/drivers/staging/lustre/lustre/lov/lov_merge.c
@@ -104,53 +104,3 @@ int lov_merge_lvb_kms(struct lov_stripe_md *lsm,
 	lvb->lvb_ctime = current_ctime;
 	return rc;
 }
-
-void lov_merge_attrs(struct obdo *tgt, struct obdo *src, u64 valid,
-		     struct lov_stripe_md *lsm, int stripeno, int *set)
-{
-	valid &= src->o_valid;
-
-	if (*set) {
-		tgt->o_valid &= valid;
-		if (valid & OBD_MD_FLSIZE) {
-			/* this handles sparse files properly */
-			u64 lov_size;
-
-			lov_size = lov_stripe_size(lsm, src->o_size, stripeno);
-			if (lov_size > tgt->o_size)
-				tgt->o_size = lov_size;
-		}
-		if (valid & OBD_MD_FLBLOCKS)
-			tgt->o_blocks += src->o_blocks;
-		if (valid & OBD_MD_FLBLKSZ)
-			tgt->o_blksize += src->o_blksize;
-		if (valid & OBD_MD_FLCTIME && tgt->o_ctime < src->o_ctime)
-			tgt->o_ctime = src->o_ctime;
-		if (valid & OBD_MD_FLMTIME && tgt->o_mtime < src->o_mtime)
-			tgt->o_mtime = src->o_mtime;
-		if (valid & OBD_MD_FLDATAVERSION)
-			tgt->o_data_version += src->o_data_version;
-
-		/* handle flags */
-		if (valid & OBD_MD_FLFLAGS)
-			tgt->o_flags &= src->o_flags;
-		else
-			tgt->o_flags = 0;
-	} else {
-		memcpy(tgt, src, sizeof(*tgt));
-		tgt->o_oi = lsm->lsm_oi;
-		tgt->o_valid = valid;
-		if (valid & OBD_MD_FLSIZE)
-			tgt->o_size = lov_stripe_size(lsm, src->o_size,
-						      stripeno);
-		tgt->o_flags = 0;
-		if (valid & OBD_MD_FLFLAGS)
-			tgt->o_flags = src->o_flags;
-	}
-
-	/* data_version needs to be valid on all stripes to be correct! */
-	if (!(valid & OBD_MD_FLDATAVERSION))
-		tgt->o_valid &= ~OBD_MD_FLDATAVERSION;
-
-	*set += 1;
-}
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index 44f53c7..019fe95 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -979,74 +979,6 @@ do {									    \
 		 "%p->lsm_magic=%x\n", (lsmp), (lsmp)->lsm_magic);	      \
 } while (0)
 
-static int lov_getattr_interpret(struct ptlrpc_request_set *rqset,
-				 void *data, int rc)
-{
-	struct lov_request_set *lovset = (struct lov_request_set *)data;
-	int err;
-
-	/* don't do attribute merge if this async op failed */
-	if (rc)
-		atomic_set(&lovset->set_completes, 0);
-	err = lov_fini_getattr_set(lovset);
-	return rc ? rc : err;
-}
-
-static int lov_getattr_async(struct obd_export *exp, struct obd_info *oinfo,
-			     struct ptlrpc_request_set *rqset)
-{
-	struct lov_request_set *lovset;
-	struct lov_obd *lov;
-	struct lov_request *req;
-	int rc = 0, err;
-
-	LASSERT(oinfo);
-	ASSERT_LSM_MAGIC(oinfo->oi_md);
-
-	if (!exp || !exp->exp_obd)
-		return -ENODEV;
-
-	lov = &exp->exp_obd->u.lov;
-
-	rc = lov_prep_getattr_set(exp, oinfo, &lovset);
-	if (rc)
-		return rc;
-
-	CDEBUG(D_INFO, "objid "DOSTID": %ux%u byte stripes\n",
-	       POSTID(&oinfo->oi_md->lsm_oi), oinfo->oi_md->lsm_stripe_count,
-	       oinfo->oi_md->lsm_stripe_size);
-
-	list_for_each_entry(req, &lovset->set_list, rq_link) {
-		CDEBUG(D_INFO, "objid " DOSTID "[%d] has subobj " DOSTID " at idx%u\n",
-		       POSTID(&oinfo->oi_oa->o_oi), req->rq_stripe,
-		       POSTID(&req->rq_oi.oi_oa->o_oi), req->rq_idx);
-		rc = obd_getattr_async(lov->lov_tgts[req->rq_idx]->ltd_exp,
-				       &req->rq_oi, rqset);
-		if (rc) {
-			CERROR("%s: getattr objid "DOSTID" subobj"
-			       DOSTID" on OST idx %d: rc = %d\n",
-			       exp->exp_obd->obd_name,
-			       POSTID(&oinfo->oi_oa->o_oi),
-			       POSTID(&req->rq_oi.oi_oa->o_oi),
-			       req->rq_idx, rc);
-			goto out;
-		}
-	}
-
-	if (!list_empty(&rqset->set_requests)) {
-		LASSERT(rc == 0);
-		LASSERT(!rqset->set_interpret);
-		rqset->set_interpret = lov_getattr_interpret;
-		rqset->set_arg = (void *)lovset;
-		return rc;
-	}
-out:
-	if (rc)
-		atomic_set(&lovset->set_completes, 0);
-	err = lov_fini_getattr_set(lovset);
-	return rc ? rc : err;
-}
-
 int lov_statfs_interpret(struct ptlrpc_request_set *rqset, void *data, int rc)
 {
 	struct lov_request_set *lovset = (struct lov_request_set *)data;
@@ -1530,7 +1462,6 @@ static struct obd_ops lov_obd_ops = {
 	.statfs_async   = lov_statfs_async,
 	.packmd         = lov_packmd,
 	.unpackmd       = lov_unpackmd,
-	.getattr_async  = lov_getattr_async,
 	.iocontrol      = lov_iocontrol,
 	.get_info       = lov_get_info,
 	.set_info_async = lov_set_info_async,
diff --git a/drivers/staging/lustre/lustre/lov/lov_request.c b/drivers/staging/lustre/lustre/lov/lov_request.c
index 048e597..42e66d1 100644
--- a/drivers/staging/lustre/lustre/lov/lov_request.c
+++ b/drivers/staging/lustre/lustre/lov/lov_request.c
@@ -97,22 +97,6 @@ static void lov_update_set(struct lov_request_set *set,
 	wake_up(&set->set_waitq);
 }
 
-int lov_update_common_set(struct lov_request_set *set,
-			  struct lov_request *req, int rc)
-{
-	struct lov_obd *lov = &set->set_exp->exp_obd->u.lov;
-
-	lov_update_set(set, req, rc);
-
-	/* grace error on inactive ost */
-	if (rc && !(lov->lov_tgts[req->rq_idx] &&
-		    lov->lov_tgts[req->rq_idx]->ltd_active))
-		rc = 0;
-
-	/* FIXME in raid1 regime, should return 0 */
-	return rc;
-}
-
 static void lov_set_add_req(struct lov_request *req,
 			    struct lov_request_set *set)
 {
@@ -183,134 +167,6 @@ out:
 	return rc;
 }
 
-static int common_attr_done(struct lov_request_set *set)
-{
-	struct lov_request *req;
-	struct obdo *tmp_oa;
-	int rc = 0, attrset = 0;
-
-	if (!set->set_oi->oi_oa)
-		return 0;
-
-	if (!atomic_read(&set->set_success))
-		return -EIO;
-
-	tmp_oa = kmem_cache_zalloc(obdo_cachep, GFP_NOFS);
-	if (!tmp_oa) {
-		rc = -ENOMEM;
-		goto out;
-	}
-
-	list_for_each_entry(req, &set->set_list, rq_link) {
-		if (!req->rq_complete || req->rq_rc)
-			continue;
-		if (req->rq_oi.oi_oa->o_valid == 0)   /* inactive stripe */
-			continue;
-		lov_merge_attrs(tmp_oa, req->rq_oi.oi_oa,
-				req->rq_oi.oi_oa->o_valid,
-				set->set_oi->oi_md, req->rq_stripe, &attrset);
-	}
-	if (!attrset) {
-		CERROR("No stripes had valid attrs\n");
-		rc = -EIO;
-	}
-
-	tmp_oa->o_oi = set->set_oi->oi_oa->o_oi;
-	memcpy(set->set_oi->oi_oa, tmp_oa, sizeof(*set->set_oi->oi_oa));
-out:
-	if (tmp_oa)
-		kmem_cache_free(obdo_cachep, tmp_oa);
-	return rc;
-}
-
-int lov_fini_getattr_set(struct lov_request_set *set)
-{
-	int rc = 0;
-
-	if (!set)
-		return 0;
-	LASSERT(set->set_exp);
-	if (atomic_read(&set->set_completes))
-		rc = common_attr_done(set);
-
-	lov_put_reqset(set);
-
-	return rc;
-}
-
-/* The callback for osc_getattr_async that finalizes a request info when a
- * response is received.
- */
-static int cb_getattr_update(void *cookie, int rc)
-{
-	struct obd_info *oinfo = cookie;
-	struct lov_request *lovreq;
-
-	lovreq = container_of(oinfo, struct lov_request, rq_oi);
-	return lov_update_common_set(lovreq->rq_rqset, lovreq, rc);
-}
-
-int lov_prep_getattr_set(struct obd_export *exp, struct obd_info *oinfo,
-			 struct lov_request_set **reqset)
-{
-	struct lov_request_set *set;
-	struct lov_obd *lov = &exp->exp_obd->u.lov;
-	int rc = 0, i;
-
-	set = kzalloc(sizeof(*set), GFP_NOFS);
-	if (!set)
-		return -ENOMEM;
-	lov_init_set(set);
-
-	set->set_exp = exp;
-	set->set_oi = oinfo;
-
-	for (i = 0; i < oinfo->oi_md->lsm_stripe_count; i++) {
-		struct lov_oinfo *loi;
-		struct lov_request *req;
-
-		loi = oinfo->oi_md->lsm_oinfo[i];
-		if (lov_oinfo_is_dummy(loi))
-			continue;
-
-		if (!lov_check_and_wait_active(lov, loi->loi_ost_idx)) {
-			CDEBUG(D_HA, "lov idx %d inactive\n", loi->loi_ost_idx);
-			continue;
-		}
-
-		req = kzalloc(sizeof(*req), GFP_NOFS);
-		if (!req) {
-			rc = -ENOMEM;
-			goto out_set;
-		}
-
-		req->rq_stripe = i;
-		req->rq_idx = loi->loi_ost_idx;
-
-		req->rq_oi.oi_oa = kmem_cache_zalloc(obdo_cachep, GFP_NOFS);
-		if (!req->rq_oi.oi_oa) {
-			kfree(req);
-			rc = -ENOMEM;
-			goto out_set;
-		}
-		memcpy(req->rq_oi.oi_oa, oinfo->oi_oa,
-		       sizeof(*req->rq_oi.oi_oa));
-		req->rq_oi.oi_oa->o_oi = loi->loi_oi;
-		req->rq_oi.oi_cb_up = cb_getattr_update;
-
-		lov_set_add_req(req, set);
-	}
-	if (!set->set_count) {
-		rc = -EIO;
-		goto out_set;
-	}
-	*reqset = set;
-	return rc;
-out_set:
-	lov_fini_getattr_set(set);
-	return rc;
-}
-
 #define LOV_U64_MAX ((__u64)~0ULL)
 #define LOV_SUM_MAX(tot, add)					   \
 	do {							    \
diff --git a/drivers/staging/lustre/lustre/obdclass/cl_io.c b/drivers/staging/lustre/lustre/obdclass/cl_io.c
index 577f76e..c5621ad 100644
--- a/drivers/staging/lustre/lustre/obdclass/cl_io.c
+++ b/drivers/staging/lustre/lustre/obdclass/cl_io.c
@@ -126,6 +126,7 @@ void cl_io_fini(const struct lu_env *env, struct cl_io *io)
 	switch (io->ci_type) {
 	case CIT_READ:
 	case CIT_WRITE:
+	case CIT_DATA_VERSION:
 		break;
 	case CIT_FAULT:
 		break;
diff --git a/drivers/staging/lustre/lustre/osc/osc_io.c b/drivers/staging/lustre/lustre/osc/osc_io.c
index a96addf..b4e062d 100644
--- a/drivers/staging/lustre/lustre/osc/osc_io.c
+++ b/drivers/staging/lustre/lustre/osc/osc_io.c
@@ -606,6 +606,107 @@ static void osc_io_setattr_end(const struct lu_env *env,
 	}
 }
 
+struct osc_data_version_args {
+	struct osc_io *dva_oio;
+};
+
+static int
+osc_data_version_interpret(const struct lu_env *env, struct ptlrpc_request *req,
+			   void *arg, int rc)
+{
+	struct osc_data_version_args *dva = arg;
+	struct osc_io *oio = dva->dva_oio;
+	const struct ost_body *body;
+
+	if (rc < 0)
+		goto out;
+
+	body = req_capsule_server_get(&req->rq_pill, &RMF_OST_BODY);
+	if (!body) {
+		rc = -EPROTO;
+		goto out;
+	}
+
+	lustre_get_wire_obdo(&req->rq_import->imp_connect_data, &oio->oi_oa,
+			     &body->oa);
+out:
+	oio->oi_cbarg.opc_rc = rc;
+	complete(&oio->oi_cbarg.opc_sync);
+
+	return 0;
+}
+
+static int osc_io_data_version_start(const struct lu_env *env,
+				     const struct cl_io_slice *slice)
+{
+	struct cl_data_version_io *dv = &slice->cis_io->u.ci_data_version;
+	struct osc_io *oio = cl2osc_io(env, slice);
+	struct osc_async_cbargs *cbargs = &oio->oi_cbarg;
+	struct osc_object *obj = cl2osc(slice->cis_obj);
+	struct obd_export *exp = osc_export(obj);
+	struct lov_oinfo *loi = obj->oo_oinfo;
+	struct osc_data_version_args *dva;
+	struct obdo *oa = &oio->oi_oa;
+	struct ptlrpc_request *req;
+	struct ost_body *body;
+	int rc;
+
+	memset(oa, 0, sizeof(*oa));
+	oa->o_oi = loi->loi_oi;
+	oa->o_valid = OBD_MD_FLID | OBD_MD_FLGROUP;
+
+	if (dv->dv_flags & (LL_DV_RD_FLUSH | LL_DV_WR_FLUSH)) {
+		oa->o_valid |= OBD_MD_FLFLAGS;
+		oa->o_flags |= OBD_FL_SRVLOCK;
+		if (dv->dv_flags & LL_DV_WR_FLUSH)
+			oa->o_flags |= OBD_FL_FLUSH;
+	}
+
+	init_completion(&cbargs->opc_sync);
+
+	req = ptlrpc_request_alloc(class_exp2cliimp(exp), &RQF_OST_GETATTR);
+	if (!req)
+		return -ENOMEM;
+
+	rc = ptlrpc_request_pack(req, LUSTRE_OST_VERSION, OST_GETATTR);
+	if (rc < 0) {
+		ptlrpc_request_free(req);
+		return rc;
+	}
+
+	body = req_capsule_client_get(&req->rq_pill, &RMF_OST_BODY);
+	lustre_set_wire_obdo(&req->rq_import->imp_connect_data, &body->oa, oa);
+
+	ptlrpc_request_set_replen(req);
+	req->rq_interpret_reply = osc_data_version_interpret;
+	CLASSERT(sizeof(*dva) <= sizeof(req->rq_async_args));
+	dva = ptlrpc_req_async_args(req);
+	dva->dva_oio = oio;
+
+	ptlrpcd_add_req(req);
+
+	return 0;
+}
+
+static void osc_io_data_version_end(const struct lu_env *env,
+				    const struct cl_io_slice *slice)
+{
+	struct cl_data_version_io *dv = &slice->cis_io->u.ci_data_version;
+	struct osc_io *oio = cl2osc_io(env, slice);
+	struct osc_async_cbargs *cbargs = &oio->oi_cbarg;
+
+	wait_for_completion(&cbargs->opc_sync);
+
+	if (cbargs->opc_rc) {
+		slice->cis_io->ci_result = cbargs->opc_rc;
+	} else if (!(oio->oi_oa.o_valid & OBD_MD_FLDATAVERSION)) {
+		slice->cis_io->ci_result = -EOPNOTSUPP;
+	} else {
+		dv->dv_data_version = oio->oi_oa.o_data_version;
+		slice->cis_io->ci_result = 0;
+	}
+}
+
 static int osc_io_read_start(const struct lu_env *env,
 			     const struct cl_io_slice *slice)
 {
@@ -759,6 +860,10 @@ static const struct cl_io_operations osc_io_ops = {
 			.cio_start  = osc_io_setattr_start,
 			.cio_end    = osc_io_setattr_end
 		},
+		[CIT_DATA_VERSION] = {
+			.cio_start	= osc_io_data_version_start,
+			.cio_end	= osc_io_data_version_end,
+		},
 		[CIT_FAULT] = {
 			.cio_start  = osc_io_fault_start,
 			.cio_end    = osc_io_end,
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index 0413b88..2e4d2d5 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -177,64 +177,6 @@ static inline void osc_pack_req_body(struct ptlrpc_request *req,
 			     oinfo->oi_oa);
 }
 
-static int osc_getattr_interpret(const struct lu_env *env,
-				 struct ptlrpc_request *req,
-				 struct osc_async_args *aa, int rc)
-{
-	struct ost_body *body;
-
-	if (rc != 0)
-		goto out;
-
-	body = req_capsule_server_get(&req->rq_pill, &RMF_OST_BODY);
-	if (body) {
-		CDEBUG(D_INODE, "mode: %o\n", body->oa.o_mode);
-		lustre_get_wire_obdo(&req->rq_import->imp_connect_data,
-				     aa->aa_oi->oi_oa, &body->oa);
-
-		/* This should really be sent by the OST */
-		aa->aa_oi->oi_oa->o_blksize = DT_MAX_BRW_SIZE;
-		aa->aa_oi->oi_oa->o_valid |= OBD_MD_FLBLKSZ;
-	} else {
-		CDEBUG(D_INFO, "can't unpack ost_body\n");
-		rc = -EPROTO;
-		aa->aa_oi->oi_oa->o_valid = 0;
-	}
-out:
-	rc = aa->aa_oi->oi_cb_up(aa->aa_oi, rc);
-	return rc;
-}
-
-static int osc_getattr_async(struct obd_export *exp, struct obd_info *oinfo,
-			     struct ptlrpc_request_set *set)
-{
-	struct ptlrpc_request *req;
-	struct osc_async_args *aa;
-	int rc;
-
-	req = ptlrpc_request_alloc(class_exp2cliimp(exp), &RQF_OST_GETATTR);
-	if (!req)
-		return -ENOMEM;
-
-	rc = ptlrpc_request_pack(req, LUSTRE_OST_VERSION, OST_GETATTR);
-	if (rc) {
-		ptlrpc_request_free(req);
-		return rc;
-	}
-
-	osc_pack_req_body(req, oinfo);
-
-	ptlrpc_request_set_replen(req);
-	req->rq_interpret_reply = (ptlrpc_interpterer_t)osc_getattr_interpret;
-
-	CLASSERT(sizeof(*aa) <= sizeof(req->rq_async_args));
-	aa = ptlrpc_req_async_args(req);
-	aa->aa_oi = oinfo;
-
-	ptlrpc_set_add_req(set, req);
-	return 0;
-}
-
 static int osc_getattr(const struct lu_env *env, struct obd_export *exp,
 		       struct obd_info *oinfo)
 {
@@ -2986,7 +2928,6 @@ static struct obd_ops osc_obd_ops = {
 	.create         = osc_create,
 	.destroy        = osc_destroy,
 	.getattr        = osc_getattr,
-	.getattr_async  = osc_getattr_async,
 	.setattr        = osc_setattr,
 	.iocontrol      = osc_iocontrol,
 	.set_info_async = osc_set_info_async,
-- 
1.7.1

* [lustre-devel] [PATCH 24/41] staging: lustre: clio: add CIT_DATA_VERSION and remove IOC_LOV_GETINFO
@ 2016-10-03  2:28   ` James Simmons
  0 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, Bobi Jam, James Simmons

From: John L. Hammond <john.hammond@intel.com>

During development a new API, cl_object_obd_info_get()
and cl_object_data_version(), was introduced and later
replaced by a better solution, CIT_DATA_VERSION. For
the upstream client there is no point in introducing
an API only to have it removed later. Because of the
way the patches landed with their dependencies, it is
not possible to separate the two patches. The two
combined patches do the following:

 * Add a new cl_io type CIT_DATA_VERSION to get file
   data version.
 * Remove the unused IOC_LOV_GETINFO ioctl.
 * Remove ll_glimpse_ioctl() and ll_lsm_getattr().
 * Remove the OBD API method obd_getattr_async().
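
As an illustration only, the way the new CIT_DATA_VERSION
io combines per-stripe results in lov_io_data_version_end()
can be sketched outside the kernel as below. The names
struct stripe_result and merge_data_version() are
hypothetical and exist only for this sketch; they are not
part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the result of one per-stripe sub-io. */
struct stripe_result {
	uint64_t data_version;	/* version reported for this stripe */
	int rc;			/* 0 on success, negative errno otherwise */
};

/*
 * Sum the per-stripe versions into *total and return the first
 * non-zero rc encountered, mirroring the loop in
 * lov_io_data_version_end(). The sum keeps accumulating after an
 * error, as it does in the patch.
 */
static int merge_data_version(const struct stripe_result *subs, size_t count,
			      uint64_t *total)
{
	int result = 0;
	size_t i;

	*total = 0;
	for (i = 0; i < count; i++) {
		*total += subs[i].data_version;
		if (!result)
			result = subs[i].rc;
	}
	return result;
}
```

In this sketch the file version is the sum of the stripe
versions, and a caller should treat the total as
meaningful only when the returned value is 0.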

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5823
Reviewed-on: http://review.whamcloud.com/12748
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6356
Reviewed-on: http://review.whamcloud.com/14649
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/cl_object.h  |    6 +
 .../lustre/lustre/include/lustre/lustre_user.h     |    6 +-
 drivers/staging/lustre/lustre/include/obd.h        |    2 -
 drivers/staging/lustre/lustre/include/obd_class.h  |   13 --
 drivers/staging/lustre/lustre/llite/dir.c          |   71 ----------
 drivers/staging/lustre/lustre/llite/file.c         |  125 +++++-------------
 .../staging/lustre/lustre/llite/llite_internal.h   |    2 -
 drivers/staging/lustre/lustre/lov/lov_internal.h   |    4 -
 drivers/staging/lustre/lustre/lov/lov_io.c         |   40 ++++++
 drivers/staging/lustre/lustre/lov/lov_merge.c      |   50 -------
 drivers/staging/lustre/lustre/lov/lov_obd.c        |   69 ----------
 drivers/staging/lustre/lustre/lov/lov_request.c    |  144 --------------------
 drivers/staging/lustre/lustre/obdclass/cl_io.c     |    1 +
 drivers/staging/lustre/lustre/osc/osc_io.c         |  105 ++++++++++++++
 drivers/staging/lustre/lustre/osc/osc_request.c    |   59 --------
 15 files changed, 189 insertions(+), 508 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/cl_object.h b/drivers/staging/lustre/lustre/include/cl_object.h
index 0b66d02..e46a510 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -1373,6 +1373,8 @@ enum cl_io_type {
 	CIT_WRITE,
 	/** truncate, utime system calls */
 	CIT_SETATTR,
+	/** get data version */
+	CIT_DATA_VERSION,
 	/**
 	 * page fault handling
 	 */
@@ -1777,6 +1779,10 @@ struct cl_io {
 			int		sa_stripe_index;
 			const struct lu_fid	*sa_parent_fid;
 		} ci_setattr;
+		struct cl_data_version_io {
+			u64 dv_data_version;
+			int dv_flags;
+		} ci_data_version;
 		struct cl_fault_io {
 			/** page index within file. */
 			pgoff_t	 ft_index;
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index dced31f..043fc1c 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -63,9 +63,13 @@
 #if __BITS_PER_LONG != 64 || defined(__ARCH_WANT_STAT64)
 typedef struct stat64   lstat_t;
 #define lstat_f  lstat64
+#define fstat_f		fstat64
+#define fstatat_f	fstatat64
 #else
 typedef struct stat     lstat_t;
 #define lstat_f  lstat
+#define fstat_f		fstat
+#define fstatat_f	fstatat
 #endif
 
 #define HAVE_LOV_USER_MDS_DATA
@@ -234,7 +238,7 @@ struct ost_id {
 /* #define LL_IOC_POLL_QUOTACHECK	161 OBD_IOC_POLL_QUOTACHECK */
 /* #define LL_IOC_QUOTACTL		162 OBD_IOC_QUOTACTL */
 #define IOC_OBD_STATFS		  _IOWR('f', 164, struct obd_statfs *)
-#define IOC_LOV_GETINFO		 _IOWR('f', 165, struct lov_user_mds_data *)
+/*	IOC_LOV_GETINFO			165 obsolete */
 #define LL_IOC_FLUSHCTX		 _IOW('f', 166, long)
 /* LL_IOC_RMTACL			167 obsolete */
 #define LL_IOC_GETOBDCOUNT	      _IOR('f', 168, long)
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index c72a1e1..f254d88 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -905,8 +905,6 @@ struct obd_ops {
 		       struct obd_info *oinfo, struct obd_trans_info *oti);
 	int (*getattr)(const struct lu_env *env, struct obd_export *exp,
 		       struct obd_info *oinfo);
-	int (*getattr_async)(struct obd_export *exp, struct obd_info *oinfo,
-			     struct ptlrpc_request_set *set);
 	int (*preprw)(const struct lu_env *env, int cmd,
 		      struct obd_export *exp, struct obdo *oa, int objcount,
 		      struct obd_ioobj *obj, struct niobuf_remote *remote,
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index 7a5d75a..cb7160e 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -721,19 +721,6 @@ static inline int obd_getattr(const struct lu_env *env, struct obd_export *exp,
 	return rc;
 }
 
-static inline int obd_getattr_async(struct obd_export *exp,
-				    struct obd_info *oinfo,
-				    struct ptlrpc_request_set *set)
-{
-	int rc;
-
-	EXP_CHECK_DT_OP(exp, getattr_async);
-	EXP_COUNTER_INCREMENT(exp, getattr_async);
-
-	rc = OBP(exp->exp_obd, getattr_async)(exp, oinfo, set);
-	return rc;
-}
-
 static inline int obd_setattr(const struct lu_env *env, struct obd_export *exp,
 			      struct obd_info *oinfo,
 			      struct obd_trans_info *oti)
diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index 12e9a38..360d97f 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -1369,77 +1369,6 @@ out_req:
 			ll_putname(filename);
 		return rc;
 	}
-	case IOC_LOV_GETINFO: {
-		struct lov_user_mds_data __user *lumd;
-		struct lov_stripe_md *lsm;
-		struct lov_user_md __user *lum;
-		struct lov_mds_md *lmm;
-		int lmmsize;
-		lstat_t st;
-
-		lumd = (struct lov_user_mds_data __user *)arg;
-		lum = &lumd->lmd_lmm;
-
-		rc = ll_get_max_mdsize(sbi, &lmmsize);
-		if (rc)
-			return rc;
-
-		lmm = libcfs_kvzalloc(lmmsize, GFP_NOFS);
-		if (!lmm)
-			return -ENOMEM;
-		if (copy_from_user(lmm, lum, lmmsize)) {
-			rc = -EFAULT;
-			goto free_lmm;
-		}
-
-		switch (lmm->lmm_magic) {
-		case LOV_USER_MAGIC_V1:
-			if (cpu_to_le32(LOV_USER_MAGIC_V1) == LOV_USER_MAGIC_V1)
-				break;
-			/* swab objects first so that stripes num will be sane */
-			lustre_swab_lov_user_md_objects(
-				((struct lov_user_md_v1 *)lmm)->lmm_objects,
-				((struct lov_user_md_v1 *)lmm)->lmm_stripe_count);
-			lustre_swab_lov_user_md_v1((struct lov_user_md_v1 *)lmm);
-			break;
-		case LOV_USER_MAGIC_V3:
-			if (cpu_to_le32(LOV_USER_MAGIC_V3) == LOV_USER_MAGIC_V3)
-				break;
-			/* swab objects first so that stripes num will be sane */
-			lustre_swab_lov_user_md_objects(
-				((struct lov_user_md_v3 *)lmm)->lmm_objects,
-				((struct lov_user_md_v3 *)lmm)->lmm_stripe_count);
-			lustre_swab_lov_user_md_v3((struct lov_user_md_v3 *)lmm);
-			break;
-		default:
-			rc = -EINVAL;
-			goto free_lmm;
-		}
-
-		rc = obd_unpackmd(sbi->ll_dt_exp, &lsm, lmm, lmmsize);
-		if (rc < 0) {
-			rc = -ENOMEM;
-			goto free_lmm;
-		}
-
-		/* Perform glimpse_size operation. */
-		memset(&st, 0, sizeof(st));
-
-		rc = ll_glimpse_ioctl(sbi, lsm, &st);
-		if (rc)
-			goto free_lsm;
-
-		if (copy_to_user(&lumd->lmd_st, &st, sizeof(st))) {
-			rc = -EFAULT;
-			goto free_lsm;
-		}
-
-free_lsm:
-		obd_free_memmd(sbi->ll_dt_exp, &lsm);
-free_lmm:
-		kvfree(lmm);
-		return rc;
-	}
 	case OBD_IOC_QUOTACHECK: {
 		struct obd_quotactl *oqctl;
 		int error = 0;
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 89a2841..8fa65a5 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -865,55 +865,6 @@ static int ll_lease_close(struct obd_client_handle *och, struct inode *inode,
 					 inode, och, NULL);
 }
 
-/* Fills the obdo with the attributes for the lsm */
-static int ll_lsm_getattr(struct lov_stripe_md *lsm, struct obd_export *exp,
-			  struct obdo *obdo, int dv_flags)
-{
-	struct ptlrpc_request_set *set;
-	struct obd_info	    oinfo = { };
-	int			rc;
-
-	LASSERT(lsm);
-
-	oinfo.oi_md = lsm;
-	oinfo.oi_oa = obdo;
-	oinfo.oi_oa->o_oi = lsm->lsm_oi;
-	oinfo.oi_oa->o_mode = S_IFREG;
-	oinfo.oi_oa->o_valid = OBD_MD_FLID | OBD_MD_FLTYPE |
-			       OBD_MD_FLSIZE | OBD_MD_FLBLOCKS |
-			       OBD_MD_FLBLKSZ | OBD_MD_FLATIME |
-			       OBD_MD_FLMTIME | OBD_MD_FLCTIME |
-			       OBD_MD_FLGROUP | OBD_MD_FLDATAVERSION;
-	if (dv_flags & (LL_DV_WR_FLUSH | LL_DV_RD_FLUSH)) {
-		oinfo.oi_oa->o_valid |= OBD_MD_FLFLAGS;
-		oinfo.oi_oa->o_flags |= OBD_FL_SRVLOCK;
-		if (dv_flags & LL_DV_WR_FLUSH)
-			oinfo.oi_oa->o_flags |= OBD_FL_FLUSH;
-	}
-
-	set = ptlrpc_prep_set();
-	if (!set) {
-		CERROR("cannot allocate ptlrpc set: rc = %d\n", -ENOMEM);
-		rc = -ENOMEM;
-	} else {
-		rc = obd_getattr_async(exp, &oinfo, set);
-		if (rc == 0)
-			rc = ptlrpc_set_wait(set);
-		ptlrpc_set_destroy(set);
-	}
-	if (rc == 0) {
-		oinfo.oi_oa->o_valid &= (OBD_MD_FLBLOCKS | OBD_MD_FLBLKSZ |
-					 OBD_MD_FLATIME | OBD_MD_FLMTIME |
-					 OBD_MD_FLCTIME | OBD_MD_FLSIZE |
-					 OBD_MD_FLDATAVERSION | OBD_MD_FLFLAGS);
-		if (dv_flags & LL_DV_WR_FLUSH &&
-		    !(oinfo.oi_oa->o_valid & OBD_MD_FLFLAGS &&
-		      oinfo.oi_oa->o_flags & OBD_FL_FLUSH))
-			return -ENOTSUPP;
-	}
-	return rc;
-}
-
 int ll_merge_attr(const struct lu_env *env, struct inode *inode)
 {
 	struct ll_inode_info *lli = ll_i2info(inode);
@@ -970,23 +921,6 @@ out_size_unlock:
 	return rc;
 }
 
-int ll_glimpse_ioctl(struct ll_sb_info *sbi, struct lov_stripe_md *lsm,
-		     lstat_t *st)
-{
-	struct obdo obdo = { 0 };
-	int rc;
-
-	rc = ll_lsm_getattr(lsm, sbi->ll_dt_exp, &obdo, 0);
-	if (rc == 0) {
-		st->st_size   = obdo.o_size;
-		st->st_blocks = obdo.o_blocks;
-		st->st_mtime  = obdo.o_mtime;
-		st->st_atime  = obdo.o_atime;
-		st->st_ctime  = obdo.o_ctime;
-	}
-	return rc;
-}
-
 static bool file_is_noatime(const struct file *file)
 {
 	const struct vfsmount *mnt = file->f_path.mnt;
@@ -1635,45 +1569,50 @@ gf_free:
  * This value is computed using stripe object version on OST.
  * Version is computed using server side locking.
  *
- * @param sync  if do sync on the OST side;
+ * @param flags if do sync on the OST side;
  *		0: no sync
  *		LL_DV_RD_FLUSH: flush dirty pages, LCK_PR on OSTs
  *		LL_DV_WR_FLUSH: drop all caching pages, LCK_PW on OSTs
  */
 int ll_data_version(struct inode *inode, __u64 *data_version, int flags)
 {
-	struct lov_stripe_md	*lsm = NULL;
-	struct ll_sb_info	*sbi = ll_i2sbi(inode);
-	struct obdo		*obdo = NULL;
-	int			 rc;
+	struct cl_object *obj = ll_i2info(inode)->lli_clob;
+	struct lu_env *env;
+	struct cl_io *io;
+	int refcheck;
+	int result;
 
-	/* If no stripe, we consider version is 0. */
-	lsm = ccc_inode_lsm_get(inode);
-	if (!lsm_has_objects(lsm)) {
+	/* If no file object initialized, we consider its version is 0. */
+	if (!obj) {
 		*data_version = 0;
-		CDEBUG(D_INODE, "No object for inode\n");
-		rc = 0;
-		goto out;
+		return 0;
 	}
 
-	obdo = kzalloc(sizeof(*obdo), GFP_NOFS);
-	if (!obdo) {
-		rc = -ENOMEM;
-		goto out;
-	}
+	env = cl_env_get(&refcheck);
+	if (IS_ERR(env))
+		return PTR_ERR(env);
 
-	rc = ll_lsm_getattr(lsm, sbi->ll_dt_exp, obdo, flags);
-	if (rc == 0) {
-		if (!(obdo->o_valid & OBD_MD_FLDATAVERSION))
-			rc = -EOPNOTSUPP;
-		else
-			*data_version = obdo->o_data_version;
-	}
+	io = vvp_env_thread_io(env);
+	io->ci_obj = obj;
+	io->u.ci_data_version.dv_data_version = 0;
+	io->u.ci_data_version.dv_flags = flags;
 
-	kfree(obdo);
-out:
-	ccc_inode_lsm_put(inode, lsm);
-	return rc;
+restart:
+	if (!cl_io_init(env, io, CIT_DATA_VERSION, io->ci_obj))
+		result = cl_io_loop(env, io);
+	else
+		result = io->ci_result;
+
+	*data_version = io->u.ci_data_version.dv_data_version;
+
+	cl_io_fini(env, io);
+
+	if (unlikely(io->ci_need_restart))
+		goto restart;
+
+	cl_env_put(env, &refcheck);
+
+	return result;
 }
 
 /*
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 02541b1..e249895 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -741,8 +741,6 @@ enum ldlm_mode ll_take_md_lock(struct inode *inode, __u64 bits,
 			       enum ldlm_mode mode);
 int ll_file_open(struct inode *inode, struct file *file);
 int ll_file_release(struct inode *inode, struct file *file);
-int ll_glimpse_ioctl(struct ll_sb_info *sbi,
-		     struct lov_stripe_md *lsm, lstat_t *st);
 int ll_release_openhandle(struct inode *, struct lookup_intent *);
 int ll_md_real_close(struct inode *inode, fmode_t fmode);
 void ll_pack_inode2opdata(struct inode *inode, struct md_op_data *op_data,
diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index 4743c65..fffc18c 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -132,8 +132,6 @@ static inline void lov_put_reqset(struct lov_request_set *set)
 	(char *)((lv)->lov_tgts[index]->ltd_uuid.uuid)
 
 /* lov_merge.c */
-void lov_merge_attrs(struct obdo *tgt, struct obdo *src, u64 valid,
-		     struct lov_stripe_md *lsm, int stripeno, int *set);
 int lov_merge_lvb_kms(struct lov_stripe_md *lsm,
 		      struct ost_lvb *lvb, __u64 *kms_place);
 
@@ -150,8 +148,6 @@ pgoff_t lov_stripe_pgoff(struct lov_stripe_md *lsm, pgoff_t stripe_index,
 			 int stripe);
 
 /* lov_request.c */
-int lov_update_common_set(struct lov_request_set *set,
-			  struct lov_request *req, int rc);
 int lov_prep_getattr_set(struct obd_export *exp, struct obd_info *oinfo,
 			 struct lov_request_set **reqset);
 int lov_fini_getattr_set(struct lov_request_set *set);
diff --git a/drivers/staging/lustre/lustre/lov/lov_io.c b/drivers/staging/lustre/lustre/lov/lov_io.c
index 369718e..d6be613 100644
--- a/drivers/staging/lustre/lustre/lov/lov_io.c
+++ b/drivers/staging/lustre/lustre/lov/lov_io.c
@@ -100,6 +100,12 @@ static void lov_io_sub_inherit(struct cl_io *io, struct lov_io *lio,
 		}
 		break;
 	}
+	case CIT_DATA_VERSION: {
+		io->u.ci_data_version.dv_data_version = 0;
+		io->u.ci_data_version.dv_flags =
+			parent->u.ci_data_version.dv_flags;
+		break;
+	}
 	case CIT_FAULT: {
 		struct cl_object *obj = parent->ci_obj;
 		loff_t off = cl_offset(obj, parent->u.ci_fault.ft_index);
@@ -339,6 +345,11 @@ static int lov_io_slice_init(struct lov_io *lio, struct lov_object *obj,
 		lio->lis_endpos = OBD_OBJECT_EOF;
 		break;
 
+	case CIT_DATA_VERSION:
+		lio->lis_pos = 0;
+		lio->lis_endpos = OBD_OBJECT_EOF;
+		break;
+
 	case CIT_FAULT: {
 		pgoff_t index = io->u.ci_fault.ft_index;
 
@@ -516,6 +527,24 @@ static int lov_io_end_wrapper(const struct lu_env *env, struct cl_io *io)
 	return 0;
 }
 
+static void
+lov_io_data_version_end(const struct lu_env *env, const struct cl_io_slice *ios)
+{
+	struct lov_io *lio = cl2lov_io(env, ios);
+	struct cl_io *parent = lio->lis_cl.cis_io;
+	struct lov_io_sub *sub;
+
+	list_for_each_entry(sub, &lio->lis_active, sub_linkage) {
+		lov_io_end_wrapper(env, sub->sub_io);
+
+		parent->u.ci_data_version.dv_data_version +=
+			sub->sub_io->u.ci_data_version.dv_data_version;
+
+		if (!parent->ci_result)
+			parent->ci_result = sub->sub_io->ci_result;
+	}
+}
+
 static int lov_io_iter_fini_wrapper(const struct lu_env *env, struct cl_io *io)
 {
 	cl_io_iter_fini(env, io);
@@ -838,6 +867,15 @@ static const struct cl_io_operations lov_io_ops = {
 			.cio_start     = lov_io_start,
 			.cio_end       = lov_io_end
 		},
+		[CIT_DATA_VERSION] = {
+			.cio_fini	= lov_io_fini,
+			.cio_iter_init	= lov_io_iter_init,
+			.cio_iter_fini	= lov_io_iter_fini,
+			.cio_lock	= lov_io_lock,
+			.cio_unlock	= lov_io_unlock,
+			.cio_start	= lov_io_start,
+			.cio_end	= lov_io_data_version_end,
+		},
 		[CIT_FAULT] = {
 			.cio_fini      = lov_io_fini,
 			.cio_iter_init = lov_io_iter_init,
@@ -969,6 +1007,7 @@ int lov_io_init_empty(const struct lu_env *env, struct cl_object *obj,
 		break;
 	case CIT_FSYNC:
 	case CIT_SETATTR:
+	case CIT_DATA_VERSION:
 		result = 1;
 		break;
 	case CIT_WRITE:
@@ -1004,6 +1043,7 @@ int lov_io_init_released(const struct lu_env *env, struct cl_object *obj,
 		LASSERTF(0, "invalid type %d\n", io->ci_type);
 	case CIT_MISC:
 	case CIT_FSYNC:
+	case CIT_DATA_VERSION:
 		result = 1;
 		break;
 	case CIT_SETATTR:
diff --git a/drivers/staging/lustre/lustre/lov/lov_merge.c b/drivers/staging/lustre/lustre/lov/lov_merge.c
index 674af10..391dfd2 100644
--- a/drivers/staging/lustre/lustre/lov/lov_merge.c
+++ b/drivers/staging/lustre/lustre/lov/lov_merge.c
@@ -104,53 +104,3 @@ int lov_merge_lvb_kms(struct lov_stripe_md *lsm,
 	lvb->lvb_ctime = current_ctime;
 	return rc;
 }
-
-void lov_merge_attrs(struct obdo *tgt, struct obdo *src, u64 valid,
-		     struct lov_stripe_md *lsm, int stripeno, int *set)
-{
-	valid &= src->o_valid;
-
-	if (*set) {
-		tgt->o_valid &= valid;
-		if (valid & OBD_MD_FLSIZE) {
-			/* this handles sparse files properly */
-			u64 lov_size;
-
-			lov_size = lov_stripe_size(lsm, src->o_size, stripeno);
-			if (lov_size > tgt->o_size)
-				tgt->o_size = lov_size;
-		}
-		if (valid & OBD_MD_FLBLOCKS)
-			tgt->o_blocks += src->o_blocks;
-		if (valid & OBD_MD_FLBLKSZ)
-			tgt->o_blksize += src->o_blksize;
-		if (valid & OBD_MD_FLCTIME && tgt->o_ctime < src->o_ctime)
-			tgt->o_ctime = src->o_ctime;
-		if (valid & OBD_MD_FLMTIME && tgt->o_mtime < src->o_mtime)
-			tgt->o_mtime = src->o_mtime;
-		if (valid & OBD_MD_FLDATAVERSION)
-			tgt->o_data_version += src->o_data_version;
-
-		/* handle flags */
-		if (valid & OBD_MD_FLFLAGS)
-			tgt->o_flags &= src->o_flags;
-		else
-			tgt->o_flags = 0;
-	} else {
-		memcpy(tgt, src, sizeof(*tgt));
-		tgt->o_oi = lsm->lsm_oi;
-		tgt->o_valid = valid;
-		if (valid & OBD_MD_FLSIZE)
-			tgt->o_size = lov_stripe_size(lsm, src->o_size,
-						      stripeno);
-		tgt->o_flags = 0;
-		if (valid & OBD_MD_FLFLAGS)
-			tgt->o_flags = src->o_flags;
-	}
-
-	/* data_version needs to be valid on all stripes to be correct! */
-	if (!(valid & OBD_MD_FLDATAVERSION))
-		tgt->o_valid &= ~OBD_MD_FLDATAVERSION;
-
-	*set += 1;
-}
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index 44f53c7..019fe95 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -979,74 +979,6 @@ do {									    \
 		 "%p->lsm_magic=%x\n", (lsmp), (lsmp)->lsm_magic);	      \
 } while (0)
 
-static int lov_getattr_interpret(struct ptlrpc_request_set *rqset,
-				 void *data, int rc)
-{
-	struct lov_request_set *lovset = (struct lov_request_set *)data;
-	int err;
-
-	/* don't do attribute merge if this async op failed */
-	if (rc)
-		atomic_set(&lovset->set_completes, 0);
-	err = lov_fini_getattr_set(lovset);
-	return rc ? rc : err;
-}
-
-static int lov_getattr_async(struct obd_export *exp, struct obd_info *oinfo,
-			     struct ptlrpc_request_set *rqset)
-{
-	struct lov_request_set *lovset;
-	struct lov_obd *lov;
-	struct lov_request *req;
-	int rc = 0, err;
-
-	LASSERT(oinfo);
-	ASSERT_LSM_MAGIC(oinfo->oi_md);
-
-	if (!exp || !exp->exp_obd)
-		return -ENODEV;
-
-	lov = &exp->exp_obd->u.lov;
-
-	rc = lov_prep_getattr_set(exp, oinfo, &lovset);
-	if (rc)
-		return rc;
-
-	CDEBUG(D_INFO, "objid "DOSTID": %ux%u byte stripes\n",
-	       POSTID(&oinfo->oi_md->lsm_oi), oinfo->oi_md->lsm_stripe_count,
-	       oinfo->oi_md->lsm_stripe_size);
-
-	list_for_each_entry(req, &lovset->set_list, rq_link) {
-		CDEBUG(D_INFO, "objid " DOSTID "[%d] has subobj " DOSTID " at idx%u\n",
-		       POSTID(&oinfo->oi_oa->o_oi), req->rq_stripe,
-		       POSTID(&req->rq_oi.oi_oa->o_oi), req->rq_idx);
-		rc = obd_getattr_async(lov->lov_tgts[req->rq_idx]->ltd_exp,
-				       &req->rq_oi, rqset);
-		if (rc) {
-			CERROR("%s: getattr objid "DOSTID" subobj"
-			       DOSTID" on OST idx %d: rc = %d\n",
-			       exp->exp_obd->obd_name,
-			       POSTID(&oinfo->oi_oa->o_oi),
-			       POSTID(&req->rq_oi.oi_oa->o_oi),
-			       req->rq_idx, rc);
-			goto out;
-		}
-	}
-
-	if (!list_empty(&rqset->set_requests)) {
-		LASSERT(rc == 0);
-		LASSERT(!rqset->set_interpret);
-		rqset->set_interpret = lov_getattr_interpret;
-		rqset->set_arg = (void *)lovset;
-		return rc;
-	}
-out:
-	if (rc)
-		atomic_set(&lovset->set_completes, 0);
-	err = lov_fini_getattr_set(lovset);
-	return rc ? rc : err;
-}
-
 int lov_statfs_interpret(struct ptlrpc_request_set *rqset, void *data, int rc)
 {
 	struct lov_request_set *lovset = (struct lov_request_set *)data;
@@ -1530,7 +1462,6 @@ static struct obd_ops lov_obd_ops = {
 	.statfs_async   = lov_statfs_async,
 	.packmd         = lov_packmd,
 	.unpackmd       = lov_unpackmd,
-	.getattr_async  = lov_getattr_async,
 	.iocontrol      = lov_iocontrol,
 	.get_info       = lov_get_info,
 	.set_info_async = lov_set_info_async,
diff --git a/drivers/staging/lustre/lustre/lov/lov_request.c b/drivers/staging/lustre/lustre/lov/lov_request.c
index 048e597..42e66d1 100644
--- a/drivers/staging/lustre/lustre/lov/lov_request.c
+++ b/drivers/staging/lustre/lustre/lov/lov_request.c
@@ -97,22 +97,6 @@ static void lov_update_set(struct lov_request_set *set,
 	wake_up(&set->set_waitq);
 }
 
-int lov_update_common_set(struct lov_request_set *set,
-			  struct lov_request *req, int rc)
-{
-	struct lov_obd *lov = &set->set_exp->exp_obd->u.lov;
-
-	lov_update_set(set, req, rc);
-
-	/* grace error on inactive ost */
-	if (rc && !(lov->lov_tgts[req->rq_idx] &&
-		    lov->lov_tgts[req->rq_idx]->ltd_active))
-		rc = 0;
-
-	/* FIXME in raid1 regime, should return 0 */
-	return rc;
-}
-
 static void lov_set_add_req(struct lov_request *req,
 			    struct lov_request_set *set)
 {
@@ -183,134 +167,6 @@ out:
 	return rc;
 }
 
-static int common_attr_done(struct lov_request_set *set)
-{
-	struct lov_request *req;
-	struct obdo *tmp_oa;
-	int rc = 0, attrset = 0;
-
-	if (!set->set_oi->oi_oa)
-		return 0;
-
-	if (!atomic_read(&set->set_success))
-		return -EIO;
-
-	tmp_oa = kmem_cache_zalloc(obdo_cachep, GFP_NOFS);
-	if (!tmp_oa) {
-		rc = -ENOMEM;
-		goto out;
-	}
-
-	list_for_each_entry(req, &set->set_list, rq_link) {
-		if (!req->rq_complete || req->rq_rc)
-			continue;
-		if (req->rq_oi.oi_oa->o_valid == 0)   /* inactive stripe */
-			continue;
-		lov_merge_attrs(tmp_oa, req->rq_oi.oi_oa,
-				req->rq_oi.oi_oa->o_valid,
-				set->set_oi->oi_md, req->rq_stripe, &attrset);
-	}
-	if (!attrset) {
-		CERROR("No stripes had valid attrs\n");
-		rc = -EIO;
-	}
-
-	tmp_oa->o_oi = set->set_oi->oi_oa->o_oi;
-	memcpy(set->set_oi->oi_oa, tmp_oa, sizeof(*set->set_oi->oi_oa));
-out:
-	if (tmp_oa)
-		kmem_cache_free(obdo_cachep, tmp_oa);
-	return rc;
-}
-
-int lov_fini_getattr_set(struct lov_request_set *set)
-{
-	int rc = 0;
-
-	if (!set)
-		return 0;
-	LASSERT(set->set_exp);
-	if (atomic_read(&set->set_completes))
-		rc = common_attr_done(set);
-
-	lov_put_reqset(set);
-
-	return rc;
-}
-
-/* The callback for osc_getattr_async that finalizes a request info when a
- * response is received.
- */
-static int cb_getattr_update(void *cookie, int rc)
-{
-	struct obd_info *oinfo = cookie;
-	struct lov_request *lovreq;
-
-	lovreq = container_of(oinfo, struct lov_request, rq_oi);
-	return lov_update_common_set(lovreq->rq_rqset, lovreq, rc);
-}
-
-int lov_prep_getattr_set(struct obd_export *exp, struct obd_info *oinfo,
-			 struct lov_request_set **reqset)
-{
-	struct lov_request_set *set;
-	struct lov_obd *lov = &exp->exp_obd->u.lov;
-	int rc = 0, i;
-
-	set = kzalloc(sizeof(*set), GFP_NOFS);
-	if (!set)
-		return -ENOMEM;
-	lov_init_set(set);
-
-	set->set_exp = exp;
-	set->set_oi = oinfo;
-
-	for (i = 0; i < oinfo->oi_md->lsm_stripe_count; i++) {
-		struct lov_oinfo *loi;
-		struct lov_request *req;
-
-		loi = oinfo->oi_md->lsm_oinfo[i];
-		if (lov_oinfo_is_dummy(loi))
-			continue;
-
-		if (!lov_check_and_wait_active(lov, loi->loi_ost_idx)) {
-			CDEBUG(D_HA, "lov idx %d inactive\n", loi->loi_ost_idx);
-			continue;
-		}
-
-		req = kzalloc(sizeof(*req), GFP_NOFS);
-		if (!req) {
-			rc = -ENOMEM;
-			goto out_set;
-		}
-
-		req->rq_stripe = i;
-		req->rq_idx = loi->loi_ost_idx;
-
-		req->rq_oi.oi_oa = kmem_cache_zalloc(obdo_cachep, GFP_NOFS);
-		if (!req->rq_oi.oi_oa) {
-			kfree(req);
-			rc = -ENOMEM;
-			goto out_set;
-		}
-		memcpy(req->rq_oi.oi_oa, oinfo->oi_oa,
-		       sizeof(*req->rq_oi.oi_oa));
-		req->rq_oi.oi_oa->o_oi = loi->loi_oi;
-		req->rq_oi.oi_cb_up = cb_getattr_update;
-
-		lov_set_add_req(req, set);
-	}
-	if (!set->set_count) {
-		rc = -EIO;
-		goto out_set;
-	}
-	*reqset = set;
-	return rc;
-out_set:
-	lov_fini_getattr_set(set);
-	return rc;
-}
-
 #define LOV_U64_MAX ((__u64)~0ULL)
 #define LOV_SUM_MAX(tot, add)					   \
 	do {							    \
diff --git a/drivers/staging/lustre/lustre/obdclass/cl_io.c b/drivers/staging/lustre/lustre/obdclass/cl_io.c
index 577f76e..c5621ad 100644
--- a/drivers/staging/lustre/lustre/obdclass/cl_io.c
+++ b/drivers/staging/lustre/lustre/obdclass/cl_io.c
@@ -126,6 +126,7 @@ void cl_io_fini(const struct lu_env *env, struct cl_io *io)
 	switch (io->ci_type) {
 	case CIT_READ:
 	case CIT_WRITE:
+	case CIT_DATA_VERSION:
 		break;
 	case CIT_FAULT:
 		break;
diff --git a/drivers/staging/lustre/lustre/osc/osc_io.c b/drivers/staging/lustre/lustre/osc/osc_io.c
index a96addf..b4e062d 100644
--- a/drivers/staging/lustre/lustre/osc/osc_io.c
+++ b/drivers/staging/lustre/lustre/osc/osc_io.c
@@ -606,6 +606,107 @@ static void osc_io_setattr_end(const struct lu_env *env,
 	}
 }
 
+struct osc_data_version_args {
+	struct osc_io *dva_oio;
+};
+
+static int
+osc_data_version_interpret(const struct lu_env *env, struct ptlrpc_request *req,
+			   void *arg, int rc)
+{
+	struct osc_data_version_args *dva = arg;
+	struct osc_io *oio = dva->dva_oio;
+	const struct ost_body *body;
+
+	if (rc < 0)
+		goto out;
+
+	body = req_capsule_server_get(&req->rq_pill, &RMF_OST_BODY);
+	if (!body) {
+		rc = -EPROTO;
+		goto out;
+	}
+
+	lustre_get_wire_obdo(&req->rq_import->imp_connect_data, &oio->oi_oa,
+			     &body->oa);
+out:
+	oio->oi_cbarg.opc_rc = rc;
+	complete(&oio->oi_cbarg.opc_sync);
+
+	return 0;
+}
+
+static int osc_io_data_version_start(const struct lu_env *env,
+				     const struct cl_io_slice *slice)
+{
+	struct cl_data_version_io *dv = &slice->cis_io->u.ci_data_version;
+	struct osc_io *oio = cl2osc_io(env, slice);
+	struct osc_async_cbargs *cbargs = &oio->oi_cbarg;
+	struct osc_object *obj = cl2osc(slice->cis_obj);
+	struct obd_export *exp = osc_export(obj);
+	struct lov_oinfo *loi = obj->oo_oinfo;
+	struct osc_data_version_args *dva;
+	struct obdo *oa = &oio->oi_oa;
+	struct ptlrpc_request *req;
+	struct ost_body *body;
+	int rc;
+
+	memset(oa, 0, sizeof(*oa));
+	oa->o_oi = loi->loi_oi;
+	oa->o_valid = OBD_MD_FLID | OBD_MD_FLGROUP;
+
+	if (dv->dv_flags & (LL_DV_RD_FLUSH | LL_DV_WR_FLUSH)) {
+		oa->o_valid |= OBD_MD_FLFLAGS;
+		oa->o_flags |= OBD_FL_SRVLOCK;
+		if (dv->dv_flags & LL_DV_WR_FLUSH)
+			oa->o_flags |= OBD_FL_FLUSH;
+	}
+
+	init_completion(&cbargs->opc_sync);
+
+	req = ptlrpc_request_alloc(class_exp2cliimp(exp), &RQF_OST_GETATTR);
+	if (!req)
+		return -ENOMEM;
+
+	rc = ptlrpc_request_pack(req, LUSTRE_OST_VERSION, OST_GETATTR);
+	if (rc < 0) {
+		ptlrpc_request_free(req);
+		return rc;
+	}
+
+	body = req_capsule_client_get(&req->rq_pill, &RMF_OST_BODY);
+	lustre_set_wire_obdo(&req->rq_import->imp_connect_data, &body->oa, oa);
+
+	ptlrpc_request_set_replen(req);
+	req->rq_interpret_reply = osc_data_version_interpret;
+	CLASSERT(sizeof(*dva) <= sizeof(req->rq_async_args));
+	dva = ptlrpc_req_async_args(req);
+	dva->dva_oio = oio;
+
+	ptlrpcd_add_req(req);
+
+	return 0;
+}
+
+static void osc_io_data_version_end(const struct lu_env *env,
+				    const struct cl_io_slice *slice)
+{
+	struct cl_data_version_io *dv = &slice->cis_io->u.ci_data_version;
+	struct osc_io *oio = cl2osc_io(env, slice);
+	struct osc_async_cbargs *cbargs = &oio->oi_cbarg;
+
+	wait_for_completion(&cbargs->opc_sync);
+
+	if (cbargs->opc_rc) {
+		slice->cis_io->ci_result = cbargs->opc_rc;
+	} else if (!(oio->oi_oa.o_valid & OBD_MD_FLDATAVERSION)) {
+		slice->cis_io->ci_result = -EOPNOTSUPP;
+	} else {
+		dv->dv_data_version = oio->oi_oa.o_data_version;
+		slice->cis_io->ci_result = 0;
+	}
+}
+
 static int osc_io_read_start(const struct lu_env *env,
 			     const struct cl_io_slice *slice)
 {
@@ -759,6 +860,10 @@ static const struct cl_io_operations osc_io_ops = {
 			.cio_start  = osc_io_setattr_start,
 			.cio_end    = osc_io_setattr_end
 		},
+		[CIT_DATA_VERSION] = {
+			.cio_start	= osc_io_data_version_start,
+			.cio_end	= osc_io_data_version_end,
+		},
 		[CIT_FAULT] = {
 			.cio_start  = osc_io_fault_start,
 			.cio_end    = osc_io_end,
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index 0413b88..2e4d2d5 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -177,64 +177,6 @@ static inline void osc_pack_req_body(struct ptlrpc_request *req,
 			     oinfo->oi_oa);
 }
 
-static int osc_getattr_interpret(const struct lu_env *env,
-				 struct ptlrpc_request *req,
-				 struct osc_async_args *aa, int rc)
-{
-	struct ost_body *body;
-
-	if (rc != 0)
-		goto out;
-
-	body = req_capsule_server_get(&req->rq_pill, &RMF_OST_BODY);
-	if (body) {
-		CDEBUG(D_INODE, "mode: %o\n", body->oa.o_mode);
-		lustre_get_wire_obdo(&req->rq_import->imp_connect_data,
-				     aa->aa_oi->oi_oa, &body->oa);
-
-		/* This should really be sent by the OST */
-		aa->aa_oi->oi_oa->o_blksize = DT_MAX_BRW_SIZE;
-		aa->aa_oi->oi_oa->o_valid |= OBD_MD_FLBLKSZ;
-	} else {
-		CDEBUG(D_INFO, "can't unpack ost_body\n");
-		rc = -EPROTO;
-		aa->aa_oi->oi_oa->o_valid = 0;
-	}
-out:
-	rc = aa->aa_oi->oi_cb_up(aa->aa_oi, rc);
-	return rc;
-}
-
-static int osc_getattr_async(struct obd_export *exp, struct obd_info *oinfo,
-			     struct ptlrpc_request_set *set)
-{
-	struct ptlrpc_request *req;
-	struct osc_async_args *aa;
-	int rc;
-
-	req = ptlrpc_request_alloc(class_exp2cliimp(exp), &RQF_OST_GETATTR);
-	if (!req)
-		return -ENOMEM;
-
-	rc = ptlrpc_request_pack(req, LUSTRE_OST_VERSION, OST_GETATTR);
-	if (rc) {
-		ptlrpc_request_free(req);
-		return rc;
-	}
-
-	osc_pack_req_body(req, oinfo);
-
-	ptlrpc_request_set_replen(req);
-	req->rq_interpret_reply = (ptlrpc_interpterer_t)osc_getattr_interpret;
-
-	CLASSERT(sizeof(*aa) <= sizeof(req->rq_async_args));
-	aa = ptlrpc_req_async_args(req);
-	aa->aa_oi = oinfo;
-
-	ptlrpc_set_add_req(set, req);
-	return 0;
-}
-
 static int osc_getattr(const struct lu_env *env, struct obd_export *exp,
 		       struct obd_info *oinfo)
 {
@@ -2986,7 +2928,6 @@ static struct obd_ops osc_obd_ops = {
 	.create         = osc_create,
 	.destroy        = osc_destroy,
 	.getattr        = osc_getattr,
-	.getattr_async  = osc_getattr_async,
 	.setattr        = osc_setattr,
 	.iocontrol      = osc_iocontrol,
 	.set_info_async = osc_set_info_async,
-- 
1.7.1
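
Reader's note on the lov_io_data_version_end() hunk above: the parent
I/O's data version is the sum of the per-stripe versions, and the parent
keeps the first non-zero sub-I/O result. The following is a minimal
userspace sketch of that merge rule only. The names sub_io, parent_io and
merge_data_version() are hypothetical stand-ins, not Lustre APIs, and no
locking or RPC handling is modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one per-stripe sub-I/O. */
struct sub_io {
	uint64_t data_version;	/* version reported for this stripe */
	int result;		/* 0 on success, negative errno on failure */
};

/* Hypothetical stand-in for the file-level (parent) I/O. */
struct parent_io {
	uint64_t data_version;
	int result;
};

/*
 * Fold every sub-I/O into the parent: versions are summed, and the
 * first non-zero result is kept (later errors do not overwrite it).
 */
static void merge_data_version(struct parent_io *parent,
			       const struct sub_io *subs, size_t count)
{
	size_t i;

	assert(parent != NULL);
	assert(subs != NULL || count == 0);

	for (i = 0; i < count; i++) {
		parent->data_version += subs[i].data_version;
		if (!parent->result)
			parent->result = subs[i].result;
	}
}
```

A caller would zero-initialise the parent, as lov_io_sub_inherit() does
for dv_data_version in the patch, before merging. Since the result is
non-zero if any stripe failed, the summed version should only be trusted
when the merged result is 0.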


* [PATCH 25/41] staging: lustre: lov: add cl_object_layout_get()
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, Jinshan Xiong, James Simmons

From: John L. Hammond <john.hammond@intel.com>

Add cl_object_layout_get() to return the layout and generation of an
object. Replace some direct accesses to object LSM with calls to this
function.

In ll_getxattr(), factor out the LOV xattr-specific handling into a new
function, ll_getxattr_lov(), which calls cl_object_layout_get(). In
ll_listxattr(), call ll_getxattr_lov() to determine whether a lustre.lov
xattr should be emitted. Add lov_lsm_pack() to generate LOV xattrs
from an LSM.

Remove the unused functions ccc_inode_lsm_{get,put}() and
lov_lsm_get().

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5814
Reviewed-on: http://review.whamcloud.com/13680
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/cl_object.h  |   27 +++
 .../lustre/lustre/include/lustre/lustre_user.h     |    3 +
 drivers/staging/lustre/lustre/llite/file.c         |  149 +++++++------
 drivers/staging/lustre/lustre/llite/lcommon_cl.c   |   19 --
 .../staging/lustre/lustre/llite/llite_internal.h   |    5 -
 drivers/staging/lustre/lustre/llite/llite_lib.c    |   33 +++-
 drivers/staging/lustre/lustre/llite/vvp_internal.h |   13 -
 drivers/staging/lustre/lustre/llite/vvp_object.c   |    4 +-
 drivers/staging/lustre/lustre/llite/xattr.c        |  234 ++++++++++----------
 drivers/staging/lustre/lustre/lov/lov_internal.h   |    2 +
 drivers/staging/lustre/lustre/lov/lov_object.c     |   51 +++--
 drivers/staging/lustre/lustre/lov/lov_pack.c       |   99 +++++----
 drivers/staging/lustre/lustre/obdclass/cl_object.c |   14 ++
 13 files changed, 364 insertions(+), 289 deletions(-)
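
Reader's note: the cl_object.h hunk below adds two sentinel generation
values alongside struct cl_layout. The following is a minimal userspace
sketch of how a caller might tell them apart from a real generation. The
sentinel values and field names are copied from the hunk. The trimmed
struct (cl_buf omitted), the gen_kind enum and layout_gen_kind() are
hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Sentinels as in the patch. Macros are used in this sketch because
 * values above INT_MAX are not portable enum constants in ISO C.
 */
#define CL_LAYOUT_GEN_NONE	((uint32_t)-2)	/* layout lock was cancelled */
#define CL_LAYOUT_GEN_EMPTY	((uint32_t)-1)	/* for empty layout */

/* Trimmed userspace copy of struct cl_layout; cl_buf is left out. */
struct cl_layout {
	size_t   cl_size;		/* size of layout in lov_mds_md format */
	uint32_t cl_layout_gen;		/* layout generation */
	bool     cl_is_released;	/* true if this is a released file */
};

/* Hypothetical classification of a returned generation. */
enum gen_kind {
	GEN_KIND_NONE,	/* no usable generation: lock was cancelled */
	GEN_KIND_EMPTY,	/* object has an empty layout */
	GEN_KIND_VALID,	/* an ordinary generation number */
};

static enum gen_kind layout_gen_kind(const struct cl_layout *cl)
{
	assert(cl != NULL);

	switch (cl->cl_layout_gen) {
	case CL_LAYOUT_GEN_NONE:
		return GEN_KIND_NONE;
	case CL_LAYOUT_GEN_EMPTY:
		return GEN_KIND_EMPTY;
	default:
		return GEN_KIND_VALID;
	}
}
```

In the patch itself, ll_layout_conf() passes a struct cl_layout to
cl_object_layout_get() and stores the returned cl_layout_gen with
ll_layout_version_set(). The helper above only illustrates that the top
two u32 values are reserved and are not ordinary generations.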

diff --git a/drivers/staging/lustre/lustre/include/cl_object.h b/drivers/staging/lustre/lustre/include/cl_object.h
index e46a510..b80539d 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -301,6 +301,26 @@ enum {
 	OBJECT_CONF_WAIT = 2
 };
 
+enum {
+	CL_LAYOUT_GEN_NONE	= (u32)-2,	/* layout lock was cancelled */
+	CL_LAYOUT_GEN_EMPTY	= (u32)-1,	/* for empty layout */
+};
+
+struct cl_layout {
+	/** the buffer to return the layout in lov_mds_md format. */
+	struct lu_buf	cl_buf;
+	/** size of layout in lov_mds_md format. */
+	size_t		cl_size;
+	/** Layout generation. */
+	u32		cl_layout_gen;
+	/**
+	 * True if this is a released file.
+	 * Temporarily added for released file truncate in ll_setattr_raw().
+	 * It will be removed later. -Jinshan
+	 */
+	bool		cl_is_released;
+};
+
 /**
  * Operations implemented for each cl object layer.
  *
@@ -406,6 +426,11 @@ struct cl_object_operations {
 	int (*coo_fiemap)(const struct lu_env *env, struct cl_object *obj,
 			  struct ll_fiemap_info_key *fmkey,
 			  struct fiemap *fiemap, size_t *buflen);
+	/**
+	 * Get layout and generation of the object.
+	 */
+	int (*coo_layout_get)(const struct lu_env *env, struct cl_object *obj,
+			      struct cl_layout *layout);
 };
 
 /**
@@ -2200,6 +2225,8 @@ int  cl_object_getstripe(const struct lu_env *env, struct cl_object *obj,
 int cl_object_fiemap(const struct lu_env *env, struct cl_object *obj,
 		     struct ll_fiemap_info_key *fmkey, struct fiemap *fiemap,
 		     size_t *buflen);
+int cl_object_layout_get(const struct lu_env *env, struct cl_object *obj,
+			 struct cl_layout *cl);
 
 /**
  * Returns true, iff \a o0 and \a o1 are slices of the same object.
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index 043fc1c..80fecba 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -346,6 +346,9 @@ enum ll_lease_type {
 #define LOV_ALL_STRIPES       0xffff /* only valid for directories */
 #define LOV_V1_INSANE_STRIPE_COUNT 65532 /* maximum stripe count bz13933 */
 
+#define XATTR_LUSTRE_PREFIX	"lustre."
+#define XATTR_LUSTRE_LOV	"lustre.lov"
+
 #define lov_user_ost_data lov_user_ost_data_v1
 struct lov_user_ost_data_v1 {     /* per-stripe data structure */
 	struct ost_id l_ost_oi;	  /* OST object ID */
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 8fa65a5..73ea446 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -3187,35 +3187,51 @@ ll_iocontrol_call(struct inode *inode, struct file *file,
 int ll_layout_conf(struct inode *inode, const struct cl_object_conf *conf)
 {
 	struct ll_inode_info *lli = ll_i2info(inode);
+	struct cl_object *obj = lli->lli_clob;
 	struct cl_env_nest nest;
 	struct lu_env *env;
-	int result;
+	int rc;
 
-	if (!lli->lli_clob)
+	if (!obj)
 		return 0;
 
 	env = cl_env_nested_get(&nest);
 	if (IS_ERR(env))
 		return PTR_ERR(env);
 
-	result = cl_conf_set(env, lli->lli_clob, conf);
-	cl_env_nested_put(&nest, env);
+	rc = cl_conf_set(env, obj, conf);
+	if (rc < 0)
+		goto out;
 
 	if (conf->coc_opc == OBJECT_CONF_SET) {
 		struct ldlm_lock *lock = conf->coc_lock;
+		struct cl_layout cl = {
+			.cl_layout_gen = 0,
+		};
 
 		LASSERT(lock);
 		LASSERT(ldlm_has_layout(lock));
-		if (result == 0) {
-			/* it can only be allowed to match after layout is
-			 * applied to inode otherwise false layout would be
-			 * seen. Applying layout should happen before dropping
-			 * the intent lock.
-			 */
-			ldlm_lock_allow_match(lock);
-		}
+
+		/* it can only be allowed to match after layout is
+		 * applied to inode otherwise false layout would be
+		 * seen. Applying layout should happen before dropping
+		 * the intent lock.
+		 */
+		ldlm_lock_allow_match(lock);
+
+		rc = cl_object_layout_get(env, obj, &cl);
+		if (rc < 0)
+			goto out;
+
+		CDEBUG(D_VFSTRACE, DFID ": layout version change: %u -> %u\n",
+		       PFID(&lli->lli_fid), ll_layout_version_get(lli),
+		       cl.cl_layout_gen);
+		ll_layout_version_set(lli, cl.cl_layout_gen);
+		lli->lli_has_smd = lsm_has_objects(conf->u.coc_md->lsm);
 	}
-	return result;
+out:
+	cl_env_nested_put(&nest, env);
+	return rc;
 }
 
 /* Fetch layout from MDT with getxattr request, if it's not ready yet */
@@ -3294,7 +3310,7 @@ out:
  * in this function.
  */
 static int ll_layout_lock_set(struct lustre_handle *lockh, enum ldlm_mode mode,
-			      struct inode *inode, __u32 *gen, bool reconf)
+			      struct inode *inode)
 {
 	struct ll_inode_info *lli = ll_i2info(inode);
 	struct ll_sb_info    *sbi = ll_i2sbi(inode);
@@ -3311,8 +3327,8 @@ static int ll_layout_lock_set(struct lustre_handle *lockh, enum ldlm_mode mode,
 	LASSERT(lock);
 	LASSERT(ldlm_has_layout(lock));
 
-	LDLM_DEBUG(lock, "File "DFID"(%p) being reconfigured: %d",
-		   PFID(&lli->lli_fid), inode, reconf);
+	LDLM_DEBUG(lock, "File " DFID "(%p) being reconfigured",
+		   PFID(&lli->lli_fid), inode);
 
 	/* in case this is a caching lock and reinstate with new inode */
 	md_set_lock_data(sbi->ll_md_exp, lockh, inode, NULL);
@@ -3323,15 +3339,8 @@ static int ll_layout_lock_set(struct lustre_handle *lockh, enum ldlm_mode mode,
 	/* checking lvb_ready is racy but this is okay. The worst case is
 	 * that multi processes may configure the file on the same time.
 	 */
-	if (lvb_ready || !reconf) {
-		rc = -ENODATA;
-		if (lvb_ready) {
-			/* layout_gen must be valid if layout lock is not
-			 * cancelled and stripe has already set
-			 */
-			*gen = ll_layout_version_get(lli);
-			rc = 0;
-		}
+	if (lvb_ready) {
+		rc = 0;
 		goto out;
 	}
 
@@ -3347,19 +3356,17 @@ static int ll_layout_lock_set(struct lustre_handle *lockh, enum ldlm_mode mode,
 	if (lock->l_lvb_data) {
 		rc = obd_unpackmd(sbi->ll_dt_exp, &md.lsm,
 				  lock->l_lvb_data, lock->l_lvb_len);
-		if (rc >= 0) {
-			*gen = LL_LAYOUT_GEN_EMPTY;
-			if (md.lsm)
-				*gen = md.lsm->lsm_layout_gen;
-			rc = 0;
-		} else {
+		if (rc < 0) {
 			CERROR("%s: file " DFID " unpackmd error: %d\n",
 			       ll_get_fsname(inode->i_sb, NULL, 0),
 			       PFID(&lli->lli_fid), rc);
+			goto out;
 		}
+
+		LASSERTF(md.lsm, "lvb_data = %p, lvb_len = %u\n",
+			 lock->l_lvb_data, lock->l_lvb_len);
+		rc = 0;
 	}
-	if (rc < 0)
-		goto out;
 
 	/* set layout to file. Unlikely this will fail as old layout was
 	 * surely eliminated
@@ -3401,20 +3408,7 @@ out:
 	return rc;
 }
 
-/**
- * This function checks if there exists a LAYOUT lock on the client side,
- * or enqueues it if it doesn't have one in cache.
- *
- * This function will not hold layout lock so it may be revoked any time after
- * this function returns. Any operations depend on layout should be redone
- * in that case.
- *
- * This function should be called before lov_io_init() to get an uptodate
- * layout version, the caller should save the version number and after IO
- * is finished, this function should be called again to verify that layout
- * is not changed during IO time.
- */
-int ll_layout_refresh(struct inode *inode, __u32 *gen)
+static int ll_layout_refresh_locked(struct inode *inode)
 {
 	struct ll_inode_info  *lli = ll_i2info(inode);
 	struct ll_sb_info     *sbi = ll_i2sbi(inode);
@@ -3430,17 +3424,6 @@ int ll_layout_refresh(struct inode *inode, __u32 *gen)
 	};
 	int rc;
 
-	*gen = ll_layout_version_get(lli);
-	if (!(sbi->ll_flags & LL_SBI_LAYOUT_LOCK) || *gen != LL_LAYOUT_GEN_NONE)
-		return 0;
-
-	/* sanity checks */
-	LASSERT(fid_is_sane(ll_inode2fid(inode)));
-	LASSERT(S_ISREG(inode->i_mode));
-
-	/* take layout lock mutex to enqueue layout lock exclusively. */
-	mutex_lock(&lli->lli_layout_mutex);
-
 again:
 	/* mostly layout lock is caching on the local side, so try to match
 	 * it before grabbing layout lock mutex.
@@ -3448,20 +3431,16 @@ again:
 	mode = ll_take_md_lock(inode, MDS_INODELOCK_LAYOUT, &lockh, 0,
 			       LCK_CR | LCK_CW | LCK_PR | LCK_PW);
 	if (mode != 0) { /* hit cached lock */
-		rc = ll_layout_lock_set(&lockh, mode, inode, gen, true);
+		rc = ll_layout_lock_set(&lockh, mode, inode);
 		if (rc == -EAGAIN)
 			goto again;
-
-		mutex_unlock(&lli->lli_layout_mutex);
 		return rc;
 	}
 
 	op_data = ll_prep_md_op_data(NULL, inode, inode, NULL,
 				     0, 0, LUSTRE_OPC_ANY, NULL);
-	if (IS_ERR(op_data)) {
-		mutex_unlock(&lli->lli_layout_mutex);
+	if (IS_ERR(op_data))
 		return PTR_ERR(op_data);
-	}
 
 	/* have to enqueue one */
 	memset(&it, 0, sizeof(it));
@@ -3485,10 +3464,50 @@ again:
 	if (rc == 0) {
 		/* set lock data in case this is a new lock */
 		ll_set_lock_data(sbi->ll_md_exp, inode, &it, NULL);
-		rc = ll_layout_lock_set(&lockh, mode, inode, gen, true);
+		rc = ll_layout_lock_set(&lockh, mode, inode);
 		if (rc == -EAGAIN)
 			goto again;
 	}
+
+	return rc;
+}
+
+/**
+ * This function checks if there exists a LAYOUT lock on the client side,
+ * or enqueues it if it doesn't have one in cache.
+ *
+ * This function will not hold layout lock so it may be revoked any time after
+ * this function returns. Any operations depend on layout should be redone
+ * in that case.
+ *
+ * This function should be called before lov_io_init() to get an uptodate
+ * layout version, the caller should save the version number and after IO
+ * is finished, this function should be called again to verify that layout
+ * is not changed during IO time.
+ */
+int ll_layout_refresh(struct inode *inode, __u32 *gen)
+{
+	struct ll_inode_info *lli = ll_i2info(inode);
+	struct ll_sb_info *sbi = ll_i2sbi(inode);
+	int rc;
+
+	*gen = ll_layout_version_get(lli);
+	if (!(sbi->ll_flags & LL_SBI_LAYOUT_LOCK) || *gen != CL_LAYOUT_GEN_NONE)
+		return 0;
+
+	/* sanity checks */
+	LASSERT(fid_is_sane(ll_inode2fid(inode)));
+	LASSERT(S_ISREG(inode->i_mode));
+
+	/* take layout lock mutex to enqueue layout lock exclusively. */
+	mutex_lock(&lli->lli_layout_mutex);
+
+	rc = ll_layout_refresh_locked(inode);
+	if (rc < 0)
+		goto out;
+
+	*gen = ll_layout_version_get(lli);
+out:
 	mutex_unlock(&lli->lli_layout_mutex);
 
 	return rc;
diff --git a/drivers/staging/lustre/lustre/llite/lcommon_cl.c b/drivers/staging/lustre/lustre/llite/lcommon_cl.c
index 64f4aed..bd98ec2 100644
--- a/drivers/staging/lustre/lustre/llite/lcommon_cl.c
+++ b/drivers/staging/lustre/lustre/llite/lcommon_cl.c
@@ -304,22 +304,3 @@ __u32 cl_fid_build_gen(const struct lu_fid *fid)
 	gen = fid_flatten(fid) >> 32;
 	return gen;
 }
-
-/* lsm is unreliable after hsm implementation as layout can be changed at
- * any time. This is only to support old, non-clio-ized interfaces. It will
- * cause deadlock if clio operations are called with this extra layout refcount
- * because in case the layout changed during the IO, ll_layout_refresh() will
- * have to wait for the refcount to become zero to destroy the older layout.
- *
- * Notice that the lsm returned by this function may not be valid unless called
- * inside layout lock - MDS_INODELOCK_LAYOUT.
- */
-struct lov_stripe_md *ccc_inode_lsm_get(struct inode *inode)
-{
-	return lov_lsm_get(ll_i2info(inode)->lli_clob);
-}
-
-inline void ccc_inode_lsm_put(struct inode *inode, struct lov_stripe_md *lsm)
-{
-	lov_lsm_put(ll_i2info(inode)->lli_clob, lsm);
-}
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index e249895..c89e1b8 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -1327,11 +1327,6 @@ static inline void d_lustre_revalidate(struct dentry *dentry)
 	spin_unlock(&dentry->d_lock);
 }
 
-enum {
-	LL_LAYOUT_GEN_NONE  = ((__u32)-2),	/* layout lock was cancelled */
-	LL_LAYOUT_GEN_EMPTY = ((__u32)-1)	/* for empty layout */
-};
-
 int ll_layout_conf(struct inode *inode, const struct cl_object_conf *conf);
 int ll_layout_refresh(struct inode *inode, __u32 *gen);
 int ll_layout_restore(struct inode *inode, loff_t start, __u64 length);
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 6270301..4b53119 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -800,7 +800,7 @@ void ll_lli_init(struct ll_inode_info *lli)
 	spin_lock_init(&lli->lli_agl_lock);
 	lli->lli_has_smd = false;
 	spin_lock_init(&lli->lli_layout_lock);
-	ll_layout_version_set(lli, LL_LAYOUT_GEN_NONE);
+	ll_layout_version_set(lli, CL_LAYOUT_GEN_NONE);
 	lli->lli_clob = NULL;
 
 	init_rwsem(&lli->lli_xattrs_list_rwsem);
@@ -1441,14 +1441,33 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import)
 	 * but other attributes must be set
 	 */
 	if (S_ISREG(inode->i_mode)) {
-		struct lov_stripe_md *lsm;
+		struct cl_layout cl = {
+			.cl_is_released = false,
+		};
+		struct lu_env *env;
+		int refcheck;
 		__u32 gen;
 
-		ll_layout_refresh(inode, &gen);
-		lsm = ccc_inode_lsm_get(inode);
-		if (lsm && lsm->lsm_pattern & LOV_PATTERN_F_RELEASED)
-			file_is_released = true;
-		ccc_inode_lsm_put(inode, lsm);
+		rc = ll_layout_refresh(inode, &gen);
+		if (rc < 0)
+			goto out;
+
+		/*
+		 * XXX: the only place we need to know the layout type,
+		 * this will be removed by a later patch. -Jinshan
+		 */
+		env = cl_env_get(&refcheck);
+		if (IS_ERR(env)) {
+			rc = PTR_ERR(env);
+			goto out;
+		}
+
+		rc = cl_object_layout_get(env, lli->lli_clob, &cl);
+		cl_env_put(env, &refcheck);
+		if (rc < 0)
+			goto out;
+
+		file_is_released = cl.cl_is_released;
 
 		if (!hsm_import && attr->ia_valid & ATTR_SIZE) {
 			if (file_is_released) {
diff --git a/drivers/staging/lustre/lustre/llite/vvp_internal.h b/drivers/staging/lustre/lustre/llite/vvp_internal.h
index 47d035e..a025b35 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_internal.h
+++ b/drivers/staging/lustre/lustre/llite/vvp_internal.h
@@ -323,21 +323,8 @@ static inline struct vvp_lock *cl2vvp_lock(const struct cl_lock_slice *slice)
 # define CLOBINVRNT(env, clob, expr)					\
 	((void)sizeof(env), (void)sizeof(clob), (void)sizeof(!!(expr)))
 
-/**
- * New interfaces to get and put lov_stripe_md from lov layer. This violates
- * layering because lov_stripe_md is supposed to be a private data in lov.
- *
- * NB: If you find you have to use these interfaces for your new code, please
- * think about it again. These interfaces may be removed in the future for
- * better layering.
- */
-struct lov_stripe_md *lov_lsm_get(struct cl_object *clobj);
-void lov_lsm_put(struct cl_object *clobj, struct lov_stripe_md *lsm);
 int lov_read_and_clear_async_rc(struct cl_object *clob);
 
-struct lov_stripe_md *ccc_inode_lsm_get(struct inode *inode);
-void ccc_inode_lsm_put(struct inode *inode, struct lov_stripe_md *lsm);
-
 int vvp_io_init(const struct lu_env *env, struct cl_object *obj,
 		struct cl_io *io);
 int vvp_io_write_commit(const struct lu_env *env, struct cl_io *io);
diff --git a/drivers/staging/lustre/lustre/llite/vvp_object.c b/drivers/staging/lustre/lustre/llite/vvp_object.c
index 3214885..420a649 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_object.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_object.c
@@ -132,7 +132,7 @@ static int vvp_conf_set(const struct lu_env *env, struct cl_object *obj,
 		CDEBUG(D_VFSTRACE, DFID ": losing layout lock\n",
 		       PFID(&lli->lli_fid));
 
-		ll_layout_version_set(lli, LL_LAYOUT_GEN_NONE);
+		ll_layout_version_set(lli, CL_LAYOUT_GEN_NONE);
 
 		/* Clean up page mmap for this inode.
 		 * The reason for us to do this is that if the page has
@@ -164,7 +164,7 @@ static int vvp_conf_set(const struct lu_env *env, struct cl_object *obj,
 		       PFID(&lli->lli_fid), lli->lli_layout_gen);
 
 		lli->lli_has_smd = false;
-		ll_layout_version_set(lli, LL_LAYOUT_GEN_EMPTY);
+		ll_layout_version_set(lli, CL_LAYOUT_GEN_EMPTY);
 	}
 	return 0;
 }
diff --git a/drivers/staging/lustre/lustre/llite/xattr.c b/drivers/staging/lustre/lustre/llite/xattr.c
index e070adb..b8b5c32 100644
--- a/drivers/staging/lustre/lustre/llite/xattr.c
+++ b/drivers/staging/lustre/lustre/llite/xattr.c
@@ -353,80 +353,99 @@ static int ll_xattr_get_common(const struct xattr_handler *handler,
 			     OBD_MD_FLXATTR);
 }
 
-static int ll_xattr_get(const struct xattr_handler *handler,
-			struct dentry *dentry, struct inode *inode,
-			const char *name, void *buffer, size_t size)
+static ssize_t ll_getxattr_lov(struct inode *inode, void *buf, size_t buf_size)
 {
-	LASSERT(inode);
-	LASSERT(name);
+	ssize_t rc;
 
-	CDEBUG(D_VFSTRACE, "VFS Op:inode="DFID"(%p), xattr %s\n",
-	       PFID(ll_inode2fid(inode)), inode, name);
-
-	if (!strcmp(name, "lov")) {
-		struct lov_stripe_md *lsm;
-		struct lov_user_md *lump;
-		struct lov_mds_md *lmm = NULL;
-		struct ptlrpc_request *request = NULL;
-		int rc = 0, lmmsize = 0;
+	if (S_ISREG(inode->i_mode)) {
+		struct cl_object *obj = ll_i2info(inode)->lli_clob;
+		struct cl_layout cl = {
+			.cl_buf.lb_buf = buf,
+			.cl_buf.lb_len = buf_size,
+		};
+		struct lu_env *env;
+		int refcheck;
+
+		if (!obj)
+			return -ENODATA;
 
-		ll_stats_ops_tally(ll_i2sbi(inode), LPROC_LL_GETXATTR, 1);
+		env = cl_env_get(&refcheck);
+		if (IS_ERR(env))
+			return PTR_ERR(env);
 
-		if (!S_ISREG(inode->i_mode) && !S_ISDIR(inode->i_mode))
-			return -ENODATA;
+		rc = cl_object_layout_get(env, obj, &cl);
+		if (rc < 0)
+			goto out_env;
 
-		lsm = ccc_inode_lsm_get(inode);
-		if (!lsm) {
-			if (S_ISDIR(inode->i_mode)) {
-				rc = ll_dir_getstripe(inode, (void **)&lmm,
-						      &lmmsize, &request, 0);
-			} else {
-				rc = -ENODATA;
-			}
-		} else {
-			/* LSM is present already after lookup/getattr call.
-			 * we need to grab layout lock once it is implemented
-			 */
-			rc = obd_packmd(ll_i2dtexp(inode), &lmm, lsm);
-			lmmsize = rc;
+		if (!cl.cl_size) {
+			rc = -ENODATA;
+			goto out_env;
 		}
-		ccc_inode_lsm_put(inode, lsm);
 
+		rc = cl.cl_size;
+
+		if (!buf_size)
+			goto out_env;
+
+		LASSERT(buf && rc <= buf_size);
+
+		/*
+		 * Do not return layout gen for getxattr() since
+		 * otherwise it would confuse tar --xattr by
+		 * recognizing layout gen as stripe offset when the
+		 * file is restored. See LU-2809.
+		 */
+		((struct lov_mds_md *)buf)->lmm_layout_gen = 0;
+out_env:
+		cl_env_put(env, &refcheck);
+
+		return rc;
+	} else if (S_ISDIR(inode->i_mode)) {
+		struct ptlrpc_request *req = NULL;
+		struct lov_mds_md *lmm = NULL;
+		int lmm_size = 0;
+
+		rc = ll_dir_getstripe(inode, (void **)&lmm, &lmm_size,
+				      &req, 0);
 		if (rc < 0)
-			goto out;
+			goto out_req;
 
-		if (size == 0) {
-			/* used to call ll_get_max_mdsize() forward to get
-			 * the maximum buffer size, while some apps (such as
-			 * rsync 3.0.x) care much about the exact xattr value
-			 * size
-			 */
-			rc = lmmsize;
-			goto out;
+		if (!buf_size) {
+			rc = lmm_size;
+			goto out_req;
 		}
 
-		if (size < lmmsize) {
-			CERROR("server bug: replied size %d > %d for %pd (%s)\n",
-			       lmmsize, (int)size, dentry, name);
+		if (buf_size < lmm_size) {
 			rc = -ERANGE;
-			goto out;
+			goto out_req;
 		}
 
-		lump = buffer;
-		memcpy(lump, lmm, lmmsize);
-		/* do not return layout gen for getxattr otherwise it would
-		 * confuse tar --xattr by recognizing layout gen as stripe
-		 * offset when the file is restored. See LU-2809.
-		 */
-		lump->lmm_layout_gen = 0;
+		memcpy(buf, lmm, lmm_size);
+		rc = lmm_size;
+out_req:
+		if (req)
+			ptlrpc_req_finished(req);
 
-		rc = lmmsize;
-out:
-		if (request)
-			ptlrpc_req_finished(request);
-		else if (lmm)
-			obd_free_diskmd(ll_i2dtexp(inode), &lmm);
 		return rc;
+	} else {
+		return -ENODATA;
+	}
+}
+
+static int ll_xattr_get(const struct xattr_handler *handler,
+			struct dentry *dentry, struct inode *inode,
+			const char *name, void *buffer, size_t size)
+{
+	LASSERT(inode);
+	LASSERT(name);
+
+	CDEBUG(D_VFSTRACE, "VFS Op:inode=" DFID "(%p), xattr %s\n",
+	       PFID(ll_inode2fid(inode)), inode, name);
+
+	if (!strcmp(name, "lov")) {
+		ll_stats_ops_tally(ll_i2sbi(inode), LPROC_LL_GETXATTR, 1);
+
+		return ll_getxattr_lov(inode, buffer, size);
 	}
 
 	return ll_xattr_get_common(handler, dentry, inode, name, buffer, size);
@@ -435,10 +454,10 @@ out:
 ssize_t ll_listxattr(struct dentry *dentry, char *buffer, size_t size)
 {
 	struct inode *inode = d_inode(dentry);
-	int rc = 0, rc2 = 0;
-	struct lov_mds_md *lmm = NULL;
-	struct ptlrpc_request *request = NULL;
-	int lmmsize;
+	struct ll_sb_info *sbi = ll_i2sbi(inode);
+	char *xattr_name;
+	ssize_t rc, rc2;
+	size_t len, rem;
 
 	LASSERT(inode);
 
@@ -450,65 +469,48 @@ ssize_t ll_listxattr(struct dentry *dentry, char *buffer, size_t size)
 	rc = ll_xattr_list(inode, NULL, XATTR_OTHER_T, buffer, size,
 			   OBD_MD_FLXATTRLS);
 	if (rc < 0)
-		goto out;
-
-	if (buffer) {
-		struct ll_sb_info *sbi = ll_i2sbi(inode);
-		char *xattr_name = buffer;
-		int xlen, rem = rc;
-
-		while (rem > 0) {
-			xlen = strnlen(xattr_name, rem - 1) + 1;
-			rem -= xlen;
-			if (xattr_type_filter(sbi,
-					get_xattr_type(xattr_name)) == 0) {
-				/* skip OK xattr type
-				 * leave it in buffer
-				 */
-				xattr_name += xlen;
-				continue;
-			}
-			/* move up remaining xattrs in buffer
-			 * removing the xattr that is not OK
-			 */
-			memmove(xattr_name, xattr_name + xlen, rem);
-			rc -= xlen;
+		return rc;
+	/*
+	 * If we're being called to get the size of the xattr list
+	 * (buf_size == 0) then just assume that a lustre.lov xattr
+	 * exists.
+	 */
+	if (!size)
+		return rc + sizeof(XATTR_LUSTRE_LOV);
+
+	xattr_name = buffer;
+	rem = rc;
+
+	while (rem > 0) {
+		len = strnlen(xattr_name, rem - 1) + 1;
+		rem -= len;
+		if (!xattr_type_filter(sbi, get_xattr_type(xattr_name))) {
+			/* Skip OK xattr type leave it in buffer */
+			xattr_name += len;
+			continue;
 		}
-	}
-	if (S_ISREG(inode->i_mode)) {
-		if (!ll_i2info(inode)->lli_has_smd)
-			rc2 = -1;
-	} else if (S_ISDIR(inode->i_mode)) {
-		rc2 = ll_dir_getstripe(inode, (void **)&lmm, &lmmsize,
-				       &request, 0);
+
+		/*
+		 * Move up remaining xattrs in buffer
+		 * removing the xattr that is not OK
+		 */
+		memmove(xattr_name, xattr_name + len, rem);
+		rc -= len;
 	}
 
-	if (rc2 < 0) {
-		rc2 = 0;
-		goto out;
-	} else if (S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode)) {
-		const int prefix_len = sizeof(XATTR_LUSTRE_PREFIX) - 1;
-		const size_t name_len   = sizeof("lov") - 1;
-		const size_t total_len  = prefix_len + name_len + 1;
-
-		if (((rc + total_len) > size) && buffer) {
-			ptlrpc_req_finished(request);
-			return -ERANGE;
-		}
+	rc2 = ll_getxattr_lov(inode, NULL, 0);
+	if (rc2 == -ENODATA)
+		return rc;
 
-		if (buffer) {
-			buffer += rc;
-			memcpy(buffer, XATTR_LUSTRE_PREFIX, prefix_len);
-			memcpy(buffer + prefix_len, "lov", name_len);
-			buffer[prefix_len + name_len] = '\0';
-		}
-		rc2 = total_len;
-	}
-out:
-	ptlrpc_req_finished(request);
-	rc = rc + rc2;
+	if (rc2 < 0)
+		return rc2;
 
-	return rc;
+	if (size < rc + sizeof(XATTR_LUSTRE_LOV))
+		return -ERANGE;
+
+	memcpy(buffer + rc, XATTR_LUSTRE_LOV, sizeof(XATTR_LUSTRE_LOV));
+
+	return rc + sizeof(XATTR_LUSTRE_LOV);
 }
 
 static const struct xattr_handler ll_user_xattr_handler = {
diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index fffc18c..60397a2 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -176,6 +176,8 @@ int lov_del_target(struct obd_device *obd, __u32 index,
 		   struct obd_uuid *uuidp, int gen);
 
 /* lov_pack.c */
+ssize_t lov_lsm_pack(const struct lov_stripe_md *lsm, void *buf,
+		     size_t buf_size);
 int lov_packmd(struct obd_export *exp, struct lov_mds_md **lmm,
 	       struct lov_stripe_md *lsm);
 int lov_unpackmd(struct obd_export *exp, struct lov_stripe_md **lsmp,
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index 07bef44..d39724a 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -75,12 +75,11 @@ struct lov_layout_operations {
 
 static int lov_layout_wait(const struct lu_env *env, struct lov_object *lov);
 
-void lov_lsm_put(struct cl_object *unused, struct lov_stripe_md *lsm)
+static void lov_lsm_put(struct lov_stripe_md *lsm)
 {
 	if (lsm)
 		lov_free_memmd(&lsm);
 }
-EXPORT_SYMBOL(lov_lsm_put);
 
 /*****************************************************************************
  *
@@ -1408,7 +1407,7 @@ obj_put:
 		cl_object_put(env, subobj);
 out:
 	kvfree(fm_local);
-	lov_lsm_put(obj, lsm);
+	lov_lsm_put(lsm);
 	return rc;
 }
 
@@ -1424,10 +1423,37 @@ static int lov_object_getstripe(const struct lu_env *env, struct cl_object *obj,
 		return -ENODATA;
 
 	rc = lov_getstripe(cl2lov(obj), lsm, lum);
-	lov_lsm_put(obj, lsm);
+	lov_lsm_put(lsm);
 	return rc;
 }
 
+static int lov_object_layout_get(const struct lu_env *env,
+				 struct cl_object *obj,
+				 struct cl_layout *cl)
+{
+	struct lov_object *lov = cl2lov(obj);
+	struct lov_stripe_md *lsm = lov_lsm_addref(lov);
+	struct lu_buf *buf = &cl->cl_buf;
+	ssize_t rc;
+
+	if (!lsm) {
+		cl->cl_size = 0;
+		cl->cl_layout_gen = CL_LAYOUT_GEN_EMPTY;
+		cl->cl_is_released = false;
+
+		return 0;
+	}
+
+	cl->cl_size = lov_mds_md_size(lsm->lsm_stripe_count, lsm->lsm_magic);
+	cl->cl_layout_gen = lsm->lsm_layout_gen;
+	cl->cl_is_released = lsm_is_released(lsm);
+
+	rc = lov_lsm_pack(lsm, buf->lb_buf, buf->lb_len);
+	lov_lsm_put(lsm);
+
+	return rc < 0 ? rc : 0;
+}
+
 static const struct cl_object_operations lov_ops = {
 	.coo_page_init = lov_page_init,
 	.coo_lock_init = lov_lock_init,
@@ -1436,6 +1462,7 @@ static const struct cl_object_operations lov_ops = {
 	.coo_attr_update = lov_attr_update,
 	.coo_conf_set  = lov_conf_set,
 	.coo_getstripe = lov_object_getstripe,
+	.coo_layout_get	 = lov_object_layout_get,
 	.coo_fiemap	 = lov_object_fiemap,
 };
 
@@ -1488,22 +1515,6 @@ struct lov_stripe_md *lov_lsm_addref(struct lov_object *lov)
 	return lsm;
 }
 
-struct lov_stripe_md *lov_lsm_get(struct cl_object *clobj)
-{
-	struct lu_object *luobj;
-	struct lov_stripe_md *lsm = NULL;
-
-	if (!clobj)
-		return NULL;
-
-	luobj = lu_object_locate(&cl_object_header(clobj)->coh_lu,
-				 &lov_device_type);
-	if (luobj)
-		lsm = lov_lsm_addref(lu2lov(luobj));
-	return lsm;
-}
-EXPORT_SYMBOL(lov_lsm_get);
-
 int lov_read_and_clear_async_rc(struct cl_object *clob)
 {
 	struct lu_object *luobj;
diff --git a/drivers/staging/lustre/lustre/lov/lov_pack.c b/drivers/staging/lustre/lustre/lov/lov_pack.c
index be6e985..1156ef9 100644
--- a/drivers/staging/lustre/lustre/lov/lov_pack.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pack.c
@@ -97,6 +97,62 @@ void lov_dump_lmm_v3(int level, struct lov_mds_md_v3 *lmm)
 			     le16_to_cpu(lmm->lmm_stripe_count));
 }
 
+/**
+ * Pack LOV striping metadata for disk storage format (in little
+ * endian byte order).
+ *
+ * This follows the getxattr() conventions. If \a buf_size is zero
+ * then return the size needed. If \a buf_size is too small then
+ * return -ERANGE. Otherwise return the size of the result.
+ */
+ssize_t lov_lsm_pack(const struct lov_stripe_md *lsm, void *buf,
+		     size_t buf_size)
+{
+	struct lov_ost_data_v1 *lmm_objects;
+	struct lov_mds_md_v1 *lmmv1 = buf;
+	struct lov_mds_md_v3 *lmmv3 = buf;
+	size_t lmm_size;
+	unsigned int i;
+
+	lmm_size = lov_mds_md_size(lsm->lsm_stripe_count, lsm->lsm_magic);
+	if (!buf_size)
+		return lmm_size;
+
+	if (buf_size < lmm_size)
+		return -ERANGE;
+
+	/*
+	 * lmmv1 and lmmv3 point to the same struct and have the
+	 * same first fields
+	 */
+	lmmv1->lmm_magic = cpu_to_le32(lsm->lsm_magic);
+	lmm_oi_cpu_to_le(&lmmv1->lmm_oi, &lsm->lsm_oi);
+	lmmv1->lmm_stripe_size = cpu_to_le32(lsm->lsm_stripe_size);
+	lmmv1->lmm_stripe_count = cpu_to_le16(lsm->lsm_stripe_count);
+	lmmv1->lmm_pattern = cpu_to_le32(lsm->lsm_pattern);
+	lmmv1->lmm_layout_gen = cpu_to_le16(lsm->lsm_layout_gen);
+
+	if (lsm->lsm_magic == LOV_MAGIC_V3) {
+		CLASSERT(sizeof(lsm->lsm_pool_name) ==
+			 sizeof(lmmv3->lmm_pool_name));
+		strlcpy(lmmv3->lmm_pool_name, lsm->lsm_pool_name,
+			sizeof(lmmv3->lmm_pool_name));
+		lmm_objects = lmmv3->lmm_objects;
+	} else {
+		lmm_objects = lmmv1->lmm_objects;
+	}
+
+	for (i = 0; i < lsm->lsm_stripe_count; i++) {
+		struct lov_oinfo *loi = lsm->lsm_oinfo[i];
+
+		ostid_cpu_to_le(&loi->loi_oi, &lmm_objects[i].l_ost_oi);
+		lmm_objects[i].l_ost_gen = cpu_to_le32(loi->loi_ost_gen);
+		lmm_objects[i].l_ost_idx = cpu_to_le32(loi->loi_ost_idx);
+	}
+
+	return lmm_size;
+}
+
 /* Pack LOV object metadata for disk storage.  It is packed in LE byte
  * order and is opaque to the networking layer.
  *
@@ -108,13 +164,8 @@ void lov_dump_lmm_v3(int level, struct lov_mds_md_v3 *lmm)
 int lov_obd_packmd(struct lov_obd *lov, struct lov_mds_md **lmmp,
 		   struct lov_stripe_md *lsm)
 {
-	struct lov_mds_md_v1 *lmmv1;
-	struct lov_mds_md_v3 *lmmv3;
 	__u16 stripe_count;
-	struct lov_ost_data_v1 *lmm_objects;
 	int lmm_size, lmm_magic;
-	int i;
-	int cplen = 0;
 
 	if (lsm) {
 		lmm_magic = lsm->lsm_magic;
@@ -177,46 +228,10 @@ int lov_obd_packmd(struct lov_obd *lov, struct lov_mds_md **lmmp,
 	CDEBUG(D_INFO, "lov_packmd: LOV_MAGIC 0x%08X, lmm_size = %d\n",
 	       lmm_magic, lmm_size);
 
-	lmmv1 = *lmmp;
-	lmmv3 = (struct lov_mds_md_v3 *)*lmmp;
-	if (lmm_magic == LOV_MAGIC_V3)
-		lmmv3->lmm_magic = cpu_to_le32(LOV_MAGIC_V3);
-	else
-		lmmv1->lmm_magic = cpu_to_le32(LOV_MAGIC_V1);
-
 	if (!lsm)
 		return lmm_size;
 
-	/* lmmv1 and lmmv3 point to the same struct and have the
-	 * same first fields
-	 */
-	lmm_oi_cpu_to_le(&lmmv1->lmm_oi, &lsm->lsm_oi);
-	lmmv1->lmm_stripe_size = cpu_to_le32(lsm->lsm_stripe_size);
-	lmmv1->lmm_stripe_count = cpu_to_le16(stripe_count);
-	lmmv1->lmm_pattern = cpu_to_le32(lsm->lsm_pattern);
-	lmmv1->lmm_layout_gen = cpu_to_le16(lsm->lsm_layout_gen);
-	if (lsm->lsm_magic == LOV_MAGIC_V3) {
-		cplen = strlcpy(lmmv3->lmm_pool_name, lsm->lsm_pool_name,
-				sizeof(lmmv3->lmm_pool_name));
-		if (cplen >= sizeof(lmmv3->lmm_pool_name))
-			return -E2BIG;
-		lmm_objects = lmmv3->lmm_objects;
-	} else {
-		lmm_objects = lmmv1->lmm_objects;
-	}
-
-	for (i = 0; i < stripe_count; i++) {
-		struct lov_oinfo *loi = lsm->lsm_oinfo[i];
-		/* XXX LOV STACKING call down to osc_packmd() to do packing */
-		LASSERTF(ostid_id(&loi->loi_oi) != 0, "lmm_oi "DOSTID
-			 " stripe %u/%u idx %u\n", POSTID(&lmmv1->lmm_oi),
-			 i, stripe_count, loi->loi_ost_idx);
-		ostid_cpu_to_le(&loi->loi_oi, &lmm_objects[i].l_ost_oi);
-		lmm_objects[i].l_ost_gen = cpu_to_le32(loi->loi_ost_gen);
-		lmm_objects[i].l_ost_idx = cpu_to_le32(loi->loi_ost_idx);
-	}
-
-	return lmm_size;
+	return lov_lsm_pack(lsm, *lmmp, lmm_size);
 }
 
 int lov_packmd(struct obd_export *exp, struct lov_mds_md **lmmp,
diff --git a/drivers/staging/lustre/lustre/obdclass/cl_object.c b/drivers/staging/lustre/lustre/obdclass/cl_object.c
index 4ad2ee5..dd80b83 100644
--- a/drivers/staging/lustre/lustre/obdclass/cl_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/cl_object.c
@@ -374,6 +374,20 @@ int cl_object_fiemap(const struct lu_env *env, struct cl_object *obj,
 }
 EXPORT_SYMBOL(cl_object_fiemap);
 
+int cl_object_layout_get(const struct lu_env *env, struct cl_object *obj,
+			 struct cl_layout *cl)
+{
+	struct lu_object_header *top = obj->co_lu.lo_header;
+
+	list_for_each_entry(obj, &top->loh_layers, co_lu.lo_linkage) {
+		if (obj->co_ops->coo_layout_get)
+			return obj->co_ops->coo_layout_get(env, obj, cl);
+	}
+
+	return -EOPNOTSUPP;
+}
+EXPORT_SYMBOL(cl_object_layout_get);
+
 /**
  * Helper function removing all object locks, and marking object for
  * deletion. All object pages must have been deleted at this point.
-- 
1.7.1


* [lustre-devel] [PATCH 25/41] staging: lustre: lov: add cl_object_layout_get()
@ 2016-10-03  2:28   ` James Simmons
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, Jinshan Xiong, James Simmons

From: John L. Hammond <john.hammond@intel.com>

Add cl_object_layout_get() to return the layout and generation of an
object. Replace some direct accesses to object LSM with calls to this
function.
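
As a rough illustration of the dispatch that cl_object_layout_get() performs
(walk the object's layers and call the first one that implements the
operation, otherwise return -EOPNOTSUPP), here is a minimal userspace sketch.
All demo_* names are hypothetical stand-ins and are not part of the patch;
the real code walks top->loh_layers and calls coo_layout_get().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct cl_layout. */
struct demo_layout_info {
	unsigned int gen;
};

/* Hypothetical layer: the operation pointer may be NULL if unimplemented. */
struct demo_layer {
	int (*layout_get)(struct demo_layout_info *out);
};

/* Call the first layer that implements layout_get, as the patch does. */
static int demo_layout_get(const struct demo_layer *layers, size_t n,
			   struct demo_layout_info *out)
{
	size_t i;

	assert(out);
	for (i = 0; i < n; i++) {
		if (layers[i].layout_get)
			return layers[i].layout_get(out);
	}

	return -EOPNOTSUPP;
}

/* Hypothetical "lov-like" layer that reports a fixed generation. */
static int demo_lov_layout_get(struct demo_layout_info *out)
{
	out->gen = 7;
	return 0;
}
```

With a stack whose first layer leaves the pointer NULL and whose second layer
supplies demo_lov_layout_get(), the call returns 0 and fills in the
generation; a stack with no implementing layer yields -EOPNOTSUPP.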

In ll_xattr_get() factor out the LOV xattr specific handling into a new
function ll_getxattr_lov() which calls cl_object_layout_get(). In
ll_listxattr() call ll_getxattr_lov() to determine if a lustre.lov
xattr should be emitted.  Add lov_lsm_pack() to generate LOV xattrs
from an LSM.
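
The size convention that lov_lsm_pack() documents (a zero buffer size returns
the size needed, a too-small buffer returns -ERANGE, otherwise the packed
size is returned) can be sketched in userspace as below. This is only an
assumption-laden illustration: struct demo_layout and demo_layout_pack() are
hypothetical, and a plain memcpy() stands in for the real little-endian
packing of lov_mds_md.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-in for an already-serialized layout. */
struct demo_layout {
	const void *data;
	size_t size;
};

/*
 * getxattr()-style contract:
 *   buf_size == 0      -> return the size needed, buf is not touched
 *   buf_size too small -> return -ERANGE
 *   otherwise          -> copy the layout and return its size
 */
static ssize_t demo_layout_pack(const struct demo_layout *l, void *buf,
				size_t buf_size)
{
	if (!buf_size)
		return (ssize_t)l->size;

	if (buf_size < l->size)
		return -ERANGE;

	assert(buf);
	memcpy(buf, l->data, l->size);

	return (ssize_t)l->size;
}
```

A caller would typically probe with a zero size first, allocate that many
bytes, and then call again with the real buffer, which is the same pattern
ll_listxattr() relies on when it calls ll_getxattr_lov(inode, NULL, 0).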

Remove the now-unused functions ccc_inode_lsm_{get,put}() and
lov_lsm_get().

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5814
Reviewed-on: http://review.whamcloud.com/13680
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/cl_object.h  |   27 +++
 .../lustre/lustre/include/lustre/lustre_user.h     |    3 +
 drivers/staging/lustre/lustre/llite/file.c         |  149 +++++++------
 drivers/staging/lustre/lustre/llite/lcommon_cl.c   |   19 --
 .../staging/lustre/lustre/llite/llite_internal.h   |    5 -
 drivers/staging/lustre/lustre/llite/llite_lib.c    |   33 +++-
 drivers/staging/lustre/lustre/llite/vvp_internal.h |   13 -
 drivers/staging/lustre/lustre/llite/vvp_object.c   |    4 +-
 drivers/staging/lustre/lustre/llite/xattr.c        |  234 ++++++++++----------
 drivers/staging/lustre/lustre/lov/lov_internal.h   |    2 +
 drivers/staging/lustre/lustre/lov/lov_object.c     |   51 +++--
 drivers/staging/lustre/lustre/lov/lov_pack.c       |   99 +++++----
 drivers/staging/lustre/lustre/obdclass/cl_object.c |   14 ++
 13 files changed, 364 insertions(+), 289 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/cl_object.h b/drivers/staging/lustre/lustre/include/cl_object.h
index e46a510..b80539d 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -301,6 +301,26 @@ enum {
 	OBJECT_CONF_WAIT = 2
 };
 
+enum {
+	CL_LAYOUT_GEN_NONE	= (u32)-2,	/* layout lock was cancelled */
+	CL_LAYOUT_GEN_EMPTY	= (u32)-1,	/* for empty layout */
+};
+
+struct cl_layout {
+	/** the buffer to return the layout in lov_mds_md format. */
+	struct lu_buf	cl_buf;
+	/** size of layout in lov_mds_md format. */
+	size_t		cl_size;
+	/** Layout generation. */
+	u32		cl_layout_gen;
+	/**
+	 * True if this is a released file.
+	 * Temporarily added for released file truncate in ll_setattr_raw().
+	 * It will be removed later. -Jinshan
+	 */
+	bool		cl_is_released;
+};
+
 /**
  * Operations implemented for each cl object layer.
  *
@@ -406,6 +426,11 @@ struct cl_object_operations {
 	int (*coo_fiemap)(const struct lu_env *env, struct cl_object *obj,
 			  struct ll_fiemap_info_key *fmkey,
 			  struct fiemap *fiemap, size_t *buflen);
+	/**
+	 * Get layout and generation of the object.
+	 */
+	int (*coo_layout_get)(const struct lu_env *env, struct cl_object *obj,
+			      struct cl_layout *layout);
 };
 
 /**
@@ -2200,6 +2225,8 @@ int  cl_object_getstripe(const struct lu_env *env, struct cl_object *obj,
 int cl_object_fiemap(const struct lu_env *env, struct cl_object *obj,
 		     struct ll_fiemap_info_key *fmkey, struct fiemap *fiemap,
 		     size_t *buflen);
+int cl_object_layout_get(const struct lu_env *env, struct cl_object *obj,
+			 struct cl_layout *cl);
 
 /**
  * Returns true, iff \a o0 and \a o1 are slices of the same object.
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index 043fc1c..80fecba 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -346,6 +346,9 @@ enum ll_lease_type {
 #define LOV_ALL_STRIPES       0xffff /* only valid for directories */
 #define LOV_V1_INSANE_STRIPE_COUNT 65532 /* maximum stripe count bz13933 */
 
+#define XATTR_LUSTRE_PREFIX	"lustre."
+#define XATTR_LUSTRE_LOV	"lustre.lov"
+
 #define lov_user_ost_data lov_user_ost_data_v1
 struct lov_user_ost_data_v1 {     /* per-stripe data structure */
 	struct ost_id l_ost_oi;	  /* OST object ID */
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 8fa65a5..73ea446 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -3187,35 +3187,51 @@ ll_iocontrol_call(struct inode *inode, struct file *file,
 int ll_layout_conf(struct inode *inode, const struct cl_object_conf *conf)
 {
 	struct ll_inode_info *lli = ll_i2info(inode);
+	struct cl_object *obj = lli->lli_clob;
 	struct cl_env_nest nest;
 	struct lu_env *env;
-	int result;
+	int rc;
 
-	if (!lli->lli_clob)
+	if (!obj)
 		return 0;
 
 	env = cl_env_nested_get(&nest);
 	if (IS_ERR(env))
 		return PTR_ERR(env);
 
-	result = cl_conf_set(env, lli->lli_clob, conf);
-	cl_env_nested_put(&nest, env);
+	rc = cl_conf_set(env, obj, conf);
+	if (rc < 0)
+		goto out;
 
 	if (conf->coc_opc == OBJECT_CONF_SET) {
 		struct ldlm_lock *lock = conf->coc_lock;
+		struct cl_layout cl = {
+			.cl_layout_gen = 0,
+		};
 
 		LASSERT(lock);
 		LASSERT(ldlm_has_layout(lock));
-		if (result == 0) {
-			/* it can only be allowed to match after layout is
-			 * applied to inode otherwise false layout would be
-			 * seen. Applying layout should happen before dropping
-			 * the intent lock.
-			 */
-			ldlm_lock_allow_match(lock);
-		}
+
+		/* The lock may be allowed to match only after the layout
+		 * has been applied to the inode; otherwise a wrong layout
+		 * could be seen. The layout must be applied before the
+		 * intent lock is dropped.
+		 */
+		ldlm_lock_allow_match(lock);
+
+		rc = cl_object_layout_get(env, obj, &cl);
+		if (rc < 0)
+			goto out;
+
+		CDEBUG(D_VFSTRACE, DFID ": layout version change: %u -> %u\n",
+		       PFID(&lli->lli_fid), ll_layout_version_get(lli),
+		       cl.cl_layout_gen);
+		ll_layout_version_set(lli, cl.cl_layout_gen);
+		lli->lli_has_smd = lsm_has_objects(conf->u.coc_md->lsm);
 	}
-	return result;
+out:
+	cl_env_nested_put(&nest, env);
+	return rc;
 }
 
 /* Fetch layout from MDT with getxattr request, if it's not ready yet */
@@ -3294,7 +3310,7 @@ out:
  * in this function.
  */
 static int ll_layout_lock_set(struct lustre_handle *lockh, enum ldlm_mode mode,
-			      struct inode *inode, __u32 *gen, bool reconf)
+			      struct inode *inode)
 {
 	struct ll_inode_info *lli = ll_i2info(inode);
 	struct ll_sb_info    *sbi = ll_i2sbi(inode);
@@ -3311,8 +3327,8 @@ static int ll_layout_lock_set(struct lustre_handle *lockh, enum ldlm_mode mode,
 	LASSERT(lock);
 	LASSERT(ldlm_has_layout(lock));
 
-	LDLM_DEBUG(lock, "File "DFID"(%p) being reconfigured: %d",
-		   PFID(&lli->lli_fid), inode, reconf);
+	LDLM_DEBUG(lock, "File " DFID "(%p) being reconfigured",
+		   PFID(&lli->lli_fid), inode);
 
 	/* in case this is a caching lock and reinstate with new inode */
 	md_set_lock_data(sbi->ll_md_exp, lockh, inode, NULL);
@@ -3323,15 +3339,8 @@ static int ll_layout_lock_set(struct lustre_handle *lockh, enum ldlm_mode mode,
 	/* checking lvb_ready is racy but this is okay. The worst case is
 	 * that multi processes may configure the file on the same time.
 	 */
-	if (lvb_ready || !reconf) {
-		rc = -ENODATA;
-		if (lvb_ready) {
-			/* layout_gen must be valid if layout lock is not
-			 * cancelled and stripe has already set
-			 */
-			*gen = ll_layout_version_get(lli);
-			rc = 0;
-		}
+	if (lvb_ready) {
+		rc = 0;
 		goto out;
 	}
 
@@ -3347,19 +3356,17 @@ static int ll_layout_lock_set(struct lustre_handle *lockh, enum ldlm_mode mode,
 	if (lock->l_lvb_data) {
 		rc = obd_unpackmd(sbi->ll_dt_exp, &md.lsm,
 				  lock->l_lvb_data, lock->l_lvb_len);
-		if (rc >= 0) {
-			*gen = LL_LAYOUT_GEN_EMPTY;
-			if (md.lsm)
-				*gen = md.lsm->lsm_layout_gen;
-			rc = 0;
-		} else {
+		if (rc < 0) {
 			CERROR("%s: file " DFID " unpackmd error: %d\n",
 			       ll_get_fsname(inode->i_sb, NULL, 0),
 			       PFID(&lli->lli_fid), rc);
+			goto out;
 		}
+
+		LASSERTF(md.lsm, "lvb_data = %p, lvb_len = %u\n",
+			 lock->l_lvb_data, lock->l_lvb_len);
+		rc = 0;
 	}
-	if (rc < 0)
-		goto out;
 
 	/* set layout to file. Unlikely this will fail as old layout was
 	 * surely eliminated
@@ -3401,20 +3408,7 @@ out:
 	return rc;
 }
 
-/**
- * This function checks if there exists a LAYOUT lock on the client side,
- * or enqueues it if it doesn't have one in cache.
- *
- * This function will not hold layout lock so it may be revoked any time after
- * this function returns. Any operations depend on layout should be redone
- * in that case.
- *
- * This function should be called before lov_io_init() to get an uptodate
- * layout version, the caller should save the version number and after IO
- * is finished, this function should be called again to verify that layout
- * is not changed during IO time.
- */
-int ll_layout_refresh(struct inode *inode, __u32 *gen)
+static int ll_layout_refresh_locked(struct inode *inode)
 {
 	struct ll_inode_info  *lli = ll_i2info(inode);
 	struct ll_sb_info     *sbi = ll_i2sbi(inode);
@@ -3430,17 +3424,6 @@ int ll_layout_refresh(struct inode *inode, __u32 *gen)
 	};
 	int rc;
 
-	*gen = ll_layout_version_get(lli);
-	if (!(sbi->ll_flags & LL_SBI_LAYOUT_LOCK) || *gen != LL_LAYOUT_GEN_NONE)
-		return 0;
-
-	/* sanity checks */
-	LASSERT(fid_is_sane(ll_inode2fid(inode)));
-	LASSERT(S_ISREG(inode->i_mode));
-
-	/* take layout lock mutex to enqueue layout lock exclusively. */
-	mutex_lock(&lli->lli_layout_mutex);
-
 again:
 	/* mostly layout lock is caching on the local side, so try to match
 	 * it before grabbing layout lock mutex.
@@ -3448,20 +3431,16 @@ again:
 	mode = ll_take_md_lock(inode, MDS_INODELOCK_LAYOUT, &lockh, 0,
 			       LCK_CR | LCK_CW | LCK_PR | LCK_PW);
 	if (mode != 0) { /* hit cached lock */
-		rc = ll_layout_lock_set(&lockh, mode, inode, gen, true);
+		rc = ll_layout_lock_set(&lockh, mode, inode);
 		if (rc == -EAGAIN)
 			goto again;
-
-		mutex_unlock(&lli->lli_layout_mutex);
 		return rc;
 	}
 
 	op_data = ll_prep_md_op_data(NULL, inode, inode, NULL,
 				     0, 0, LUSTRE_OPC_ANY, NULL);
-	if (IS_ERR(op_data)) {
-		mutex_unlock(&lli->lli_layout_mutex);
+	if (IS_ERR(op_data))
 		return PTR_ERR(op_data);
-	}
 
 	/* have to enqueue one */
 	memset(&it, 0, sizeof(it));
@@ -3485,10 +3464,50 @@ again:
 	if (rc == 0) {
 		/* set lock data in case this is a new lock */
 		ll_set_lock_data(sbi->ll_md_exp, inode, &it, NULL);
-		rc = ll_layout_lock_set(&lockh, mode, inode, gen, true);
+		rc = ll_layout_lock_set(&lockh, mode, inode);
 		if (rc == -EAGAIN)
 			goto again;
 	}
+
+	return rc;
+}
+
+/**
+ * This function checks whether a LAYOUT lock exists on the client side,
+ * and enqueues one if none is cached.
+ *
+ * This function does not hold the layout lock, so the lock may be revoked
+ * at any time after it returns. Any operation that depends on the layout
+ * should be redone in that case.
+ *
+ * This function should be called before lov_io_init() to get an up-to-date
+ * layout version. The caller should save the version number and, after IO
+ * has finished, call this function again to verify that the layout did not
+ * change during the IO.
+ */
+int ll_layout_refresh(struct inode *inode, __u32 *gen)
+{
+	struct ll_inode_info *lli = ll_i2info(inode);
+	struct ll_sb_info *sbi = ll_i2sbi(inode);
+	int rc;
+
+	*gen = ll_layout_version_get(lli);
+	if (!(sbi->ll_flags & LL_SBI_LAYOUT_LOCK) || *gen != CL_LAYOUT_GEN_NONE)
+		return 0;
+
+	/* sanity checks */
+	LASSERT(fid_is_sane(ll_inode2fid(inode)));
+	LASSERT(S_ISREG(inode->i_mode));
+
+	/* take layout lock mutex to enqueue layout lock exclusively. */
+	mutex_lock(&lli->lli_layout_mutex);
+
+	rc = ll_layout_refresh_locked(inode);
+	if (rc < 0)
+		goto out;
+
+	*gen = ll_layout_version_get(lli);
+out:
 	mutex_unlock(&lli->lli_layout_mutex);
 
 	return rc;
diff --git a/drivers/staging/lustre/lustre/llite/lcommon_cl.c b/drivers/staging/lustre/lustre/llite/lcommon_cl.c
index 64f4aed..bd98ec2 100644
--- a/drivers/staging/lustre/lustre/llite/lcommon_cl.c
+++ b/drivers/staging/lustre/lustre/llite/lcommon_cl.c
@@ -304,22 +304,3 @@ __u32 cl_fid_build_gen(const struct lu_fid *fid)
 	gen = fid_flatten(fid) >> 32;
 	return gen;
 }
-
-/* lsm is unreliable after hsm implementation as layout can be changed at
- * any time. This is only to support old, non-clio-ized interfaces. It will
- * cause deadlock if clio operations are called with this extra layout refcount
- * because in case the layout changed during the IO, ll_layout_refresh() will
- * have to wait for the refcount to become zero to destroy the older layout.
- *
- * Notice that the lsm returned by this function may not be valid unless called
- * inside layout lock - MDS_INODELOCK_LAYOUT.
- */
-struct lov_stripe_md *ccc_inode_lsm_get(struct inode *inode)
-{
-	return lov_lsm_get(ll_i2info(inode)->lli_clob);
-}
-
-inline void ccc_inode_lsm_put(struct inode *inode, struct lov_stripe_md *lsm)
-{
-	lov_lsm_put(ll_i2info(inode)->lli_clob, lsm);
-}
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index e249895..c89e1b8 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -1327,11 +1327,6 @@ static inline void d_lustre_revalidate(struct dentry *dentry)
 	spin_unlock(&dentry->d_lock);
 }
 
-enum {
-	LL_LAYOUT_GEN_NONE  = ((__u32)-2),	/* layout lock was cancelled */
-	LL_LAYOUT_GEN_EMPTY = ((__u32)-1)	/* for empty layout */
-};
-
 int ll_layout_conf(struct inode *inode, const struct cl_object_conf *conf);
 int ll_layout_refresh(struct inode *inode, __u32 *gen);
 int ll_layout_restore(struct inode *inode, loff_t start, __u64 length);
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 6270301..4b53119 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -800,7 +800,7 @@ void ll_lli_init(struct ll_inode_info *lli)
 	spin_lock_init(&lli->lli_agl_lock);
 	lli->lli_has_smd = false;
 	spin_lock_init(&lli->lli_layout_lock);
-	ll_layout_version_set(lli, LL_LAYOUT_GEN_NONE);
+	ll_layout_version_set(lli, CL_LAYOUT_GEN_NONE);
 	lli->lli_clob = NULL;
 
 	init_rwsem(&lli->lli_xattrs_list_rwsem);
@@ -1441,14 +1441,33 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import)
 	 * but other attributes must be set
 	 */
 	if (S_ISREG(inode->i_mode)) {
-		struct lov_stripe_md *lsm;
+		struct cl_layout cl = {
+			.cl_is_released = false,
+		};
+		struct lu_env *env;
+		int refcheck;
 		__u32 gen;
 
-		ll_layout_refresh(inode, &gen);
-		lsm = ccc_inode_lsm_get(inode);
-		if (lsm && lsm->lsm_pattern & LOV_PATTERN_F_RELEASED)
-			file_is_released = true;
-		ccc_inode_lsm_put(inode, lsm);
+		rc = ll_layout_refresh(inode, &gen);
+		if (rc < 0)
+			goto out;
+
+		/*
+		 * XXX: the only place we need to know the layout type,
+		 * this will be removed by a later patch. -Jinshan
+		 */
+		env = cl_env_get(&refcheck);
+		if (IS_ERR(env)) {
+			rc = PTR_ERR(env);
+			goto out;
+		}
+
+		rc = cl_object_layout_get(env, lli->lli_clob, &cl);
+		cl_env_put(env, &refcheck);
+		if (rc < 0)
+			goto out;
+
+		file_is_released = cl.cl_is_released;
 
 		if (!hsm_import && attr->ia_valid & ATTR_SIZE) {
 			if (file_is_released) {
diff --git a/drivers/staging/lustre/lustre/llite/vvp_internal.h b/drivers/staging/lustre/lustre/llite/vvp_internal.h
index 47d035e..a025b35 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_internal.h
+++ b/drivers/staging/lustre/lustre/llite/vvp_internal.h
@@ -323,21 +323,8 @@ static inline struct vvp_lock *cl2vvp_lock(const struct cl_lock_slice *slice)
 # define CLOBINVRNT(env, clob, expr)					\
 	((void)sizeof(env), (void)sizeof(clob), (void)sizeof(!!(expr)))
 
-/**
- * New interfaces to get and put lov_stripe_md from lov layer. This violates
- * layering because lov_stripe_md is supposed to be a private data in lov.
- *
- * NB: If you find you have to use these interfaces for your new code, please
- * think about it again. These interfaces may be removed in the future for
- * better layering.
- */
-struct lov_stripe_md *lov_lsm_get(struct cl_object *clobj);
-void lov_lsm_put(struct cl_object *clobj, struct lov_stripe_md *lsm);
 int lov_read_and_clear_async_rc(struct cl_object *clob);
 
-struct lov_stripe_md *ccc_inode_lsm_get(struct inode *inode);
-void ccc_inode_lsm_put(struct inode *inode, struct lov_stripe_md *lsm);
-
 int vvp_io_init(const struct lu_env *env, struct cl_object *obj,
 		struct cl_io *io);
 int vvp_io_write_commit(const struct lu_env *env, struct cl_io *io);
diff --git a/drivers/staging/lustre/lustre/llite/vvp_object.c b/drivers/staging/lustre/lustre/llite/vvp_object.c
index 3214885..420a649 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_object.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_object.c
@@ -132,7 +132,7 @@ static int vvp_conf_set(const struct lu_env *env, struct cl_object *obj,
 		CDEBUG(D_VFSTRACE, DFID ": losing layout lock\n",
 		       PFID(&lli->lli_fid));
 
-		ll_layout_version_set(lli, LL_LAYOUT_GEN_NONE);
+		ll_layout_version_set(lli, CL_LAYOUT_GEN_NONE);
 
 		/* Clean up page mmap for this inode.
 		 * The reason for us to do this is that if the page has
@@ -164,7 +164,7 @@ static int vvp_conf_set(const struct lu_env *env, struct cl_object *obj,
 		       PFID(&lli->lli_fid), lli->lli_layout_gen);
 
 		lli->lli_has_smd = false;
-		ll_layout_version_set(lli, LL_LAYOUT_GEN_EMPTY);
+		ll_layout_version_set(lli, CL_LAYOUT_GEN_EMPTY);
 	}
 	return 0;
 }
diff --git a/drivers/staging/lustre/lustre/llite/xattr.c b/drivers/staging/lustre/lustre/llite/xattr.c
index e070adb..b8b5c32 100644
--- a/drivers/staging/lustre/lustre/llite/xattr.c
+++ b/drivers/staging/lustre/lustre/llite/xattr.c
@@ -353,80 +353,99 @@ static int ll_xattr_get_common(const struct xattr_handler *handler,
 			     OBD_MD_FLXATTR);
 }
 
-static int ll_xattr_get(const struct xattr_handler *handler,
-			struct dentry *dentry, struct inode *inode,
-			const char *name, void *buffer, size_t size)
+static ssize_t ll_getxattr_lov(struct inode *inode, void *buf, size_t buf_size)
 {
-	LASSERT(inode);
-	LASSERT(name);
+	ssize_t rc;
 
-	CDEBUG(D_VFSTRACE, "VFS Op:inode="DFID"(%p), xattr %s\n",
-	       PFID(ll_inode2fid(inode)), inode, name);
-
-	if (!strcmp(name, "lov")) {
-		struct lov_stripe_md *lsm;
-		struct lov_user_md *lump;
-		struct lov_mds_md *lmm = NULL;
-		struct ptlrpc_request *request = NULL;
-		int rc = 0, lmmsize = 0;
+	if (S_ISREG(inode->i_mode)) {
+		struct cl_object *obj = ll_i2info(inode)->lli_clob;
+		struct cl_layout cl = {
+			.cl_buf.lb_buf = buf,
+			.cl_buf.lb_len = buf_size,
+		};
+		struct lu_env *env;
+		int refcheck;
+
+		if (!obj)
+			return -ENODATA;
 
-		ll_stats_ops_tally(ll_i2sbi(inode), LPROC_LL_GETXATTR, 1);
+		env = cl_env_get(&refcheck);
+		if (IS_ERR(env))
+			return PTR_ERR(env);
 
-		if (!S_ISREG(inode->i_mode) && !S_ISDIR(inode->i_mode))
-			return -ENODATA;
+		rc = cl_object_layout_get(env, obj, &cl);
+		if (rc < 0)
+			goto out_env;
 
-		lsm = ccc_inode_lsm_get(inode);
-		if (!lsm) {
-			if (S_ISDIR(inode->i_mode)) {
-				rc = ll_dir_getstripe(inode, (void **)&lmm,
-						      &lmmsize, &request, 0);
-			} else {
-				rc = -ENODATA;
-			}
-		} else {
-			/* LSM is present already after lookup/getattr call.
-			 * we need to grab layout lock once it is implemented
-			 */
-			rc = obd_packmd(ll_i2dtexp(inode), &lmm, lsm);
-			lmmsize = rc;
+		if (!cl.cl_size) {
+			rc = -ENODATA;
+			goto out_env;
 		}
-		ccc_inode_lsm_put(inode, lsm);
 
+		rc = cl.cl_size;
+
+		if (!buf_size)
+			goto out_env;
+
+		LASSERT(buf && rc <= buf_size);
+
+		/*
+		 * Do not return layout gen for getxattr() since
+		 * otherwise it would confuse tar --xattr by
+		 * recognizing layout gen as stripe offset when the
+		 * file is restored. See LU-2809.
+		 */
+		((struct lov_mds_md *)buf)->lmm_layout_gen = 0;
+out_env:
+		cl_env_put(env, &refcheck);
+
+		return rc;
+	} else if (S_ISDIR(inode->i_mode)) {
+		struct ptlrpc_request *req = NULL;
+		struct lov_mds_md *lmm = NULL;
+		int lmm_size = 0;
+
+		rc = ll_dir_getstripe(inode, (void **)&lmm, &lmm_size,
+				      &req, 0);
 		if (rc < 0)
-			goto out;
+			goto out_req;
 
-		if (size == 0) {
-			/* used to call ll_get_max_mdsize() forward to get
-			 * the maximum buffer size, while some apps (such as
-			 * rsync 3.0.x) care much about the exact xattr value
-			 * size
-			 */
-			rc = lmmsize;
-			goto out;
+		if (!buf_size) {
+			rc = lmm_size;
+			goto out_req;
 		}
 
-		if (size < lmmsize) {
-			CERROR("server bug: replied size %d > %d for %pd (%s)\n",
-			       lmmsize, (int)size, dentry, name);
+		if (buf_size < lmm_size) {
 			rc = -ERANGE;
-			goto out;
+			goto out_req;
 		}
 
-		lump = buffer;
-		memcpy(lump, lmm, lmmsize);
-		/* do not return layout gen for getxattr otherwise it would
-		 * confuse tar --xattr by recognizing layout gen as stripe
-		 * offset when the file is restored. See LU-2809.
-		 */
-		lump->lmm_layout_gen = 0;
+		memcpy(buf, lmm, lmm_size);
+		rc = lmm_size;
+out_req:
+		if (req)
+			ptlrpc_req_finished(req);
 
-		rc = lmmsize;
-out:
-		if (request)
-			ptlrpc_req_finished(request);
-		else if (lmm)
-			obd_free_diskmd(ll_i2dtexp(inode), &lmm);
 		return rc;
+	} else {
+		return -ENODATA;
+	}
+}
+
+static int ll_xattr_get(const struct xattr_handler *handler,
+			struct dentry *dentry, struct inode *inode,
+			const char *name, void *buffer, size_t size)
+{
+	LASSERT(inode);
+	LASSERT(name);
+
+	CDEBUG(D_VFSTRACE, "VFS Op:inode=" DFID "(%p), xattr %s\n",
+	       PFID(ll_inode2fid(inode)), inode, name);
+
+	if (!strcmp(name, "lov")) {
+		ll_stats_ops_tally(ll_i2sbi(inode), LPROC_LL_GETXATTR, 1);
+
+		return ll_getxattr_lov(inode, buffer, size);
 	}
 
 	return ll_xattr_get_common(handler, dentry, inode, name, buffer, size);
@@ -435,10 +454,10 @@ out:
 ssize_t ll_listxattr(struct dentry *dentry, char *buffer, size_t size)
 {
 	struct inode *inode = d_inode(dentry);
-	int rc = 0, rc2 = 0;
-	struct lov_mds_md *lmm = NULL;
-	struct ptlrpc_request *request = NULL;
-	int lmmsize;
+	struct ll_sb_info *sbi = ll_i2sbi(inode);
+	char *xattr_name;
+	ssize_t rc, rc2;
+	size_t len, rem;
 
 	LASSERT(inode);
 
@@ -450,65 +469,48 @@ ssize_t ll_listxattr(struct dentry *dentry, char *buffer, size_t size)
 	rc = ll_xattr_list(inode, NULL, XATTR_OTHER_T, buffer, size,
 			   OBD_MD_FLXATTRLS);
 	if (rc < 0)
-		goto out;
-
-	if (buffer) {
-		struct ll_sb_info *sbi = ll_i2sbi(inode);
-		char *xattr_name = buffer;
-		int xlen, rem = rc;
-
-		while (rem > 0) {
-			xlen = strnlen(xattr_name, rem - 1) + 1;
-			rem -= xlen;
-			if (xattr_type_filter(sbi,
-					get_xattr_type(xattr_name)) == 0) {
-				/* skip OK xattr type
-				 * leave it in buffer
-				 */
-				xattr_name += xlen;
-				continue;
-			}
-			/* move up remaining xattrs in buffer
-			 * removing the xattr that is not OK
-			 */
-			memmove(xattr_name, xattr_name + xlen, rem);
-			rc -= xlen;
+		return rc;
+	/*
+	 * If we're being called to get the size of the xattr list
+	 * (size == 0) then just assume that a lustre.lov xattr
+	 * exists.
+	 */
+	if (!size)
+		return rc + sizeof(XATTR_LUSTRE_LOV);
+
+	xattr_name = buffer;
+	rem = rc;
+
+	while (rem > 0) {
+		len = strnlen(xattr_name, rem - 1) + 1;
+		rem -= len;
+		if (!xattr_type_filter(sbi, get_xattr_type(xattr_name))) {
+			/* Skip OK xattr type, leave it in buffer */
+			xattr_name += len;
+			continue;
 		}
-	}
-	if (S_ISREG(inode->i_mode)) {
-		if (!ll_i2info(inode)->lli_has_smd)
-			rc2 = -1;
-	} else if (S_ISDIR(inode->i_mode)) {
-		rc2 = ll_dir_getstripe(inode, (void **)&lmm, &lmmsize,
-				       &request, 0);
+
+		/*
+		 * Move up the remaining xattrs in the buffer,
+		 * removing the xattr that is not OK
+		 */
+		memmove(xattr_name, xattr_name + len, rem);
+		rc -= len;
 	}
 
-	if (rc2 < 0) {
-		rc2 = 0;
-		goto out;
-	} else if (S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode)) {
-		const int prefix_len = sizeof(XATTR_LUSTRE_PREFIX) - 1;
-		const size_t name_len   = sizeof("lov") - 1;
-		const size_t total_len  = prefix_len + name_len + 1;
-
-		if (((rc + total_len) > size) && buffer) {
-			ptlrpc_req_finished(request);
-			return -ERANGE;
-		}
+	rc2 = ll_getxattr_lov(inode, NULL, 0);
+	if (rc2 == -ENODATA)
+		return rc;
 
-		if (buffer) {
-			buffer += rc;
-			memcpy(buffer, XATTR_LUSTRE_PREFIX, prefix_len);
-			memcpy(buffer + prefix_len, "lov", name_len);
-			buffer[prefix_len + name_len] = '\0';
-		}
-		rc2 = total_len;
-	}
-out:
-	ptlrpc_req_finished(request);
-	rc = rc + rc2;
+	if (rc2 < 0)
+		return rc2;
 
-	return rc;
+	if (size < rc + sizeof(XATTR_LUSTRE_LOV))
+		return -ERANGE;
+
+	memcpy(buffer + rc, XATTR_LUSTRE_LOV, sizeof(XATTR_LUSTRE_LOV));
+
+	return rc + sizeof(XATTR_LUSTRE_LOV);
 }
 
 static const struct xattr_handler ll_user_xattr_handler = {
diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index fffc18c..60397a2 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -176,6 +176,8 @@ int lov_del_target(struct obd_device *obd, __u32 index,
 		   struct obd_uuid *uuidp, int gen);
 
 /* lov_pack.c */
+ssize_t lov_lsm_pack(const struct lov_stripe_md *lsm, void *buf,
+		     size_t buf_size);
 int lov_packmd(struct obd_export *exp, struct lov_mds_md **lmm,
 	       struct lov_stripe_md *lsm);
 int lov_unpackmd(struct obd_export *exp, struct lov_stripe_md **lsmp,
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index 07bef44..d39724a 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -75,12 +75,11 @@ struct lov_layout_operations {
 
 static int lov_layout_wait(const struct lu_env *env, struct lov_object *lov);
 
-void lov_lsm_put(struct cl_object *unused, struct lov_stripe_md *lsm)
+static void lov_lsm_put(struct lov_stripe_md *lsm)
 {
 	if (lsm)
 		lov_free_memmd(&lsm);
 }
-EXPORT_SYMBOL(lov_lsm_put);
 
 /*****************************************************************************
  *
@@ -1408,7 +1407,7 @@ obj_put:
 		cl_object_put(env, subobj);
 out:
 	kvfree(fm_local);
-	lov_lsm_put(obj, lsm);
+	lov_lsm_put(lsm);
 	return rc;
 }
 
@@ -1424,10 +1423,37 @@ static int lov_object_getstripe(const struct lu_env *env, struct cl_object *obj,
 		return -ENODATA;
 
 	rc = lov_getstripe(cl2lov(obj), lsm, lum);
-	lov_lsm_put(obj, lsm);
+	lov_lsm_put(lsm);
 	return rc;
 }
 
+static int lov_object_layout_get(const struct lu_env *env,
+				 struct cl_object *obj,
+				 struct cl_layout *cl)
+{
+	struct lov_object *lov = cl2lov(obj);
+	struct lov_stripe_md *lsm = lov_lsm_addref(lov);
+	struct lu_buf *buf = &cl->cl_buf;
+	ssize_t rc;
+
+	if (!lsm) {
+		cl->cl_size = 0;
+		cl->cl_layout_gen = CL_LAYOUT_GEN_EMPTY;
+		cl->cl_is_released = false;
+
+		return 0;
+	}
+
+	cl->cl_size = lov_mds_md_size(lsm->lsm_stripe_count, lsm->lsm_magic);
+	cl->cl_layout_gen = lsm->lsm_layout_gen;
+	cl->cl_is_released = lsm_is_released(lsm);
+
+	rc = lov_lsm_pack(lsm, buf->lb_buf, buf->lb_len);
+	lov_lsm_put(lsm);
+
+	return rc < 0 ? rc : 0;
+}
+
 static const struct cl_object_operations lov_ops = {
 	.coo_page_init = lov_page_init,
 	.coo_lock_init = lov_lock_init,
@@ -1436,6 +1462,7 @@ static const struct cl_object_operations lov_ops = {
 	.coo_attr_update = lov_attr_update,
 	.coo_conf_set  = lov_conf_set,
 	.coo_getstripe = lov_object_getstripe,
+	.coo_layout_get	 = lov_object_layout_get,
 	.coo_fiemap	 = lov_object_fiemap,
 };
 
@@ -1488,22 +1515,6 @@ struct lov_stripe_md *lov_lsm_addref(struct lov_object *lov)
 	return lsm;
 }
 
-struct lov_stripe_md *lov_lsm_get(struct cl_object *clobj)
-{
-	struct lu_object *luobj;
-	struct lov_stripe_md *lsm = NULL;
-
-	if (!clobj)
-		return NULL;
-
-	luobj = lu_object_locate(&cl_object_header(clobj)->coh_lu,
-				 &lov_device_type);
-	if (luobj)
-		lsm = lov_lsm_addref(lu2lov(luobj));
-	return lsm;
-}
-EXPORT_SYMBOL(lov_lsm_get);
-
 int lov_read_and_clear_async_rc(struct cl_object *clob)
 {
 	struct lu_object *luobj;
diff --git a/drivers/staging/lustre/lustre/lov/lov_pack.c b/drivers/staging/lustre/lustre/lov/lov_pack.c
index be6e985..1156ef9 100644
--- a/drivers/staging/lustre/lustre/lov/lov_pack.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pack.c
@@ -97,6 +97,62 @@ void lov_dump_lmm_v3(int level, struct lov_mds_md_v3 *lmm)
 			     le16_to_cpu(lmm->lmm_stripe_count));
 }
 
+/**
+ * Pack LOV striping metadata for disk storage format (in little
+ * endian byte order).
+ *
+ * This follows the getxattr() conventions. If \a buf_size is zero
+ * then return the size needed. If \a buf_size is too small then
+ * return -ERANGE. Otherwise return the size of the result.
+ */
+ssize_t lov_lsm_pack(const struct lov_stripe_md *lsm, void *buf,
+		     size_t buf_size)
+{
+	struct lov_ost_data_v1 *lmm_objects;
+	struct lov_mds_md_v1 *lmmv1 = buf;
+	struct lov_mds_md_v3 *lmmv3 = buf;
+	size_t lmm_size;
+	unsigned int i;
+
+	lmm_size = lov_mds_md_size(lsm->lsm_stripe_count, lsm->lsm_magic);
+	if (!buf_size)
+		return lmm_size;
+
+	if (buf_size < lmm_size)
+		return -ERANGE;
+
+	/*
+	 * lmmv1 and lmmv3 point to the same struct and have the
+	 * same first fields
+	 */
+	lmmv1->lmm_magic = cpu_to_le32(lsm->lsm_magic);
+	lmm_oi_cpu_to_le(&lmmv1->lmm_oi, &lsm->lsm_oi);
+	lmmv1->lmm_stripe_size = cpu_to_le32(lsm->lsm_stripe_size);
+	lmmv1->lmm_stripe_count = cpu_to_le16(lsm->lsm_stripe_count);
+	lmmv1->lmm_pattern = cpu_to_le32(lsm->lsm_pattern);
+	lmmv1->lmm_layout_gen = cpu_to_le16(lsm->lsm_layout_gen);
+
+	if (lsm->lsm_magic == LOV_MAGIC_V3) {
+		CLASSERT(sizeof(lsm->lsm_pool_name) ==
+			 sizeof(lmmv3->lmm_pool_name));
+		strlcpy(lmmv3->lmm_pool_name, lsm->lsm_pool_name,
+			sizeof(lmmv3->lmm_pool_name));
+		lmm_objects = lmmv3->lmm_objects;
+	} else {
+		lmm_objects = lmmv1->lmm_objects;
+	}
+
+	for (i = 0; i < lsm->lsm_stripe_count; i++) {
+		struct lov_oinfo *loi = lsm->lsm_oinfo[i];
+
+		ostid_cpu_to_le(&loi->loi_oi, &lmm_objects[i].l_ost_oi);
+		lmm_objects[i].l_ost_gen = cpu_to_le32(loi->loi_ost_gen);
+		lmm_objects[i].l_ost_idx = cpu_to_le32(loi->loi_ost_idx);
+	}
+
+	return lmm_size;
+}
+
 /* Pack LOV object metadata for disk storage.  It is packed in LE byte
  * order and is opaque to the networking layer.
  *
@@ -108,13 +164,8 @@ void lov_dump_lmm_v3(int level, struct lov_mds_md_v3 *lmm)
 int lov_obd_packmd(struct lov_obd *lov, struct lov_mds_md **lmmp,
 		   struct lov_stripe_md *lsm)
 {
-	struct lov_mds_md_v1 *lmmv1;
-	struct lov_mds_md_v3 *lmmv3;
 	__u16 stripe_count;
-	struct lov_ost_data_v1 *lmm_objects;
 	int lmm_size, lmm_magic;
-	int i;
-	int cplen = 0;
 
 	if (lsm) {
 		lmm_magic = lsm->lsm_magic;
@@ -177,46 +228,10 @@ int lov_obd_packmd(struct lov_obd *lov, struct lov_mds_md **lmmp,
 	CDEBUG(D_INFO, "lov_packmd: LOV_MAGIC 0x%08X, lmm_size = %d\n",
 	       lmm_magic, lmm_size);
 
-	lmmv1 = *lmmp;
-	lmmv3 = (struct lov_mds_md_v3 *)*lmmp;
-	if (lmm_magic == LOV_MAGIC_V3)
-		lmmv3->lmm_magic = cpu_to_le32(LOV_MAGIC_V3);
-	else
-		lmmv1->lmm_magic = cpu_to_le32(LOV_MAGIC_V1);
-
 	if (!lsm)
 		return lmm_size;
 
-	/* lmmv1 and lmmv3 point to the same struct and have the
-	 * same first fields
-	 */
-	lmm_oi_cpu_to_le(&lmmv1->lmm_oi, &lsm->lsm_oi);
-	lmmv1->lmm_stripe_size = cpu_to_le32(lsm->lsm_stripe_size);
-	lmmv1->lmm_stripe_count = cpu_to_le16(stripe_count);
-	lmmv1->lmm_pattern = cpu_to_le32(lsm->lsm_pattern);
-	lmmv1->lmm_layout_gen = cpu_to_le16(lsm->lsm_layout_gen);
-	if (lsm->lsm_magic == LOV_MAGIC_V3) {
-		cplen = strlcpy(lmmv3->lmm_pool_name, lsm->lsm_pool_name,
-				sizeof(lmmv3->lmm_pool_name));
-		if (cplen >= sizeof(lmmv3->lmm_pool_name))
-			return -E2BIG;
-		lmm_objects = lmmv3->lmm_objects;
-	} else {
-		lmm_objects = lmmv1->lmm_objects;
-	}
-
-	for (i = 0; i < stripe_count; i++) {
-		struct lov_oinfo *loi = lsm->lsm_oinfo[i];
-		/* XXX LOV STACKING call down to osc_packmd() to do packing */
-		LASSERTF(ostid_id(&loi->loi_oi) != 0, "lmm_oi "DOSTID
-			 " stripe %u/%u idx %u\n", POSTID(&lmmv1->lmm_oi),
-			 i, stripe_count, loi->loi_ost_idx);
-		ostid_cpu_to_le(&loi->loi_oi, &lmm_objects[i].l_ost_oi);
-		lmm_objects[i].l_ost_gen = cpu_to_le32(loi->loi_ost_gen);
-		lmm_objects[i].l_ost_idx = cpu_to_le32(loi->loi_ost_idx);
-	}
-
-	return lmm_size;
+	return lov_lsm_pack(lsm, *lmmp, lmm_size);
 }
 
 int lov_packmd(struct obd_export *exp, struct lov_mds_md **lmmp,
diff --git a/drivers/staging/lustre/lustre/obdclass/cl_object.c b/drivers/staging/lustre/lustre/obdclass/cl_object.c
index 4ad2ee5..dd80b83 100644
--- a/drivers/staging/lustre/lustre/obdclass/cl_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/cl_object.c
@@ -374,6 +374,20 @@ int cl_object_fiemap(const struct lu_env *env, struct cl_object *obj,
 }
 EXPORT_SYMBOL(cl_object_fiemap);
 
+int cl_object_layout_get(const struct lu_env *env, struct cl_object *obj,
+			 struct cl_layout *cl)
+{
+	struct lu_object_header *top = obj->co_lu.lo_header;
+
+	list_for_each_entry(obj, &top->loh_layers, co_lu.lo_linkage) {
+		if (obj->co_ops->coo_layout_get)
+			return obj->co_ops->coo_layout_get(env, obj, cl);
+	}
+
+	return -EOPNOTSUPP;
+}
+EXPORT_SYMBOL(cl_object_layout_get);
+
 /**
  * Helper function removing all object locks, and marking object for
  * deletion. All object pages must have been deleted at this point.
-- 
1.7.1

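The comment on lov_lsm_pack() in the patch above describes a
getxattr()-style contract: a zero buffer size is a size probe, a buffer
that is too small yields -ERANGE, and otherwise the packed size is
returned. A minimal userspace sketch of that contract follows. It is an
illustration only: struct demo_layout, demo_packed_size() and demo_pack()
are hypothetical names, the packed format is invented, and the data is
left in host byte order rather than converted to little endian as the
kernel code does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-in for struct lov_stripe_md; not the kernel type. */
struct demo_layout {
	unsigned int stripe_count;
	unsigned int stripe_ids[8];
};

/* Bytes needed to pack: one count word plus one word per stripe. */
static size_t demo_packed_size(const struct demo_layout *l)
{
	return sizeof(unsigned int) * (1 + l->stripe_count);
}

/*
 * getxattr()-style contract, as described for lov_lsm_pack():
 *   buf_size == 0       -> return the size needed, write nothing
 *   buf_size too small  -> return -ERANGE
 *   otherwise           -> pack into buf, return the bytes written
 */
static ssize_t demo_pack(const struct demo_layout *l, void *buf,
			 size_t buf_size)
{
	size_t need = demo_packed_size(l);
	unsigned int *out = buf;

	if (!buf_size)
		return need;

	if (buf_size < need)
		return -ERANGE;

	/* A non-zero size implies the caller passed a real buffer. */
	assert(buf);

	out[0] = l->stripe_count;
	memcpy(&out[1], l->stripe_ids,
	       sizeof(unsigned int) * l->stripe_count);

	return need;
}
```

A caller would typically probe with demo_pack(l, NULL, 0), allocate that
many bytes, and then call again with the real buffer, which is the same
two-step pattern userspace uses with getxattr(2).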

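The reworked ll_listxattr() in the patch above appends the "lustre.lov"
name with sizeof(XATTR_LUSTRE_LOV), which counts the trailing NUL, so a
single memcpy() writes a complete NUL-terminated list entry. The sketch
below isolates that arithmetic in userspace. It is an assumption-laden
illustration: demo_list_append() and DEMO_XATTR_LOV are hypothetical
names, and the sketch leaves out the xattr filtering and the -ENODATA
check that the real function performs before appending.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical copy of the name defined by XATTR_LUSTRE_LOV. */
#define DEMO_XATTR_LOV "lustre.lov"

/*
 * Append one extra name to a listxattr()-style buffer that already
 * holds `used` bytes of NUL-terminated names.
 */
static ssize_t demo_list_append(char *buffer, size_t size, size_t used)
{
	/* sizeof() on a string literal includes the trailing NUL. */
	const size_t name_len = sizeof(DEMO_XATTR_LOV);

	/* Size probe: report the total without touching the buffer. */
	if (!size)
		return used + name_len;

	if (size < used + name_len)
		return -ERANGE;

	/* A non-zero size implies the caller passed a real buffer. */
	assert(buffer);

	memcpy(buffer + used, DEMO_XATTR_LOV, name_len);

	return used + name_len;
}
```

With 7 bytes already used (for example "user.a" plus its NUL), the total
becomes 7 + 11 = 18 bytes, because "lustre.lov" is 10 characters plus
the terminator.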
* [PATCH 26/41] staging: lustre: llite: remove lli_has_smd
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Jinshan Xiong, John L. Hammond, James Simmons

From: Jinshan Xiong <jinshan.xiong@intel.com>

Remove the lli_has_smd flag from struct ll_inode_info. The empty
layout case will be handled by the LOV layer. Remove the unused
function cl_local_size().

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5814
Reviewed-on: http://review.whamcloud.com/13690
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/llite/file.c         |    7 -
 drivers/staging/lustre/lustre/llite/glimpse.c      |  134 +++++++-------------
 drivers/staging/lustre/lustre/llite/lcommon_cl.c   |    1 -
 .../staging/lustre/lustre/llite/llite_internal.h   |    2 -
 drivers/staging/lustre/lustre/llite/llite_lib.c    |    8 +-
 drivers/staging/lustre/lustre/llite/rw26.c         |    4 -
 drivers/staging/lustre/lustre/llite/vvp_object.c   |   19 ---
 drivers/staging/lustre/lustre/lov/lov_io.c         |    9 ++-
 8 files changed, 55 insertions(+), 129 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 73ea446..94caf4f 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -632,12 +632,6 @@ restart:
 	if (!S_ISREG(inode->i_mode))
 		goto out_och_free;
 
-	if (!lli->lli_has_smd &&
-	    (cl_is_lov_delay_create(file->f_flags) ||
-	     (file->f_mode & FMODE_WRITE) == 0)) {
-		CDEBUG(D_INODE, "object creation was delayed\n");
-		goto out_och_free;
-	}
 	cl_lov_delay_create_clear(&file->f_flags);
 	goto out_och_free;
 
@@ -3227,7 +3221,6 @@ int ll_layout_conf(struct inode *inode, const struct cl_object_conf *conf)
 		       PFID(&lli->lli_fid), ll_layout_version_get(lli),
 		       cl.cl_layout_gen);
 		ll_layout_version_set(lli, cl.cl_layout_gen);
-		lli->lli_has_smd = lsm_has_objects(conf->u.coc_md->lsm);
 	}
 out:
 	cl_env_nested_put(&nest, env);
diff --git a/drivers/staging/lustre/lustre/llite/glimpse.c b/drivers/staging/lustre/lustre/llite/glimpse.c
index 0d1ffad..504498d 100644
--- a/drivers/staging/lustre/lustre/llite/glimpse.c
+++ b/drivers/staging/lustre/lustre/llite/glimpse.c
@@ -80,66 +80,60 @@ blkcnt_t dirty_cnt(struct inode *inode)
 int cl_glimpse_lock(const struct lu_env *env, struct cl_io *io,
 		    struct inode *inode, struct cl_object *clob, int agl)
 {
-	struct ll_inode_info *lli   = ll_i2info(inode);
 	const struct lu_fid  *fid   = lu_object_fid(&clob->co_lu);
+	struct cl_lock *lock = vvp_env_lock(env);
+	struct cl_lock_descr *descr = &lock->cll_descr;
 	int result = 0;
 
 	CDEBUG(D_DLMTRACE, "Glimpsing inode " DFID "\n", PFID(fid));
-	if (lli->lli_has_smd) {
-		struct cl_lock *lock = vvp_env_lock(env);
-		struct cl_lock_descr *descr = &lock->cll_descr;
-
-		/* NOTE: this looks like DLM lock request, but it may
-		 *       not be one. Due to CEF_ASYNC flag (translated
-		 *       to LDLM_FL_HAS_INTENT by osc), this is
-		 *       glimpse request, that won't revoke any
-		 *       conflicting DLM locks held. Instead,
-		 *       ll_glimpse_callback() will be called on each
-		 *       client holding a DLM lock against this file,
-		 *       and resulting size will be returned for each
-		 *       stripe. DLM lock on [0, EOF] is acquired only
-		 *       if there were no conflicting locks. If there
-		 *       were conflicting locks, enqueuing or waiting
-		 *       fails with -ENAVAIL, but valid inode
-		 *       attributes are returned anyway.
-		 */
-		*descr = whole_file;
-		descr->cld_obj = clob;
-		descr->cld_mode = CLM_READ;
-		descr->cld_enq_flags = CEF_ASYNC | CEF_MUST;
-		if (agl)
-			descr->cld_enq_flags |= CEF_AGL;
-		/*
-		 * CEF_ASYNC is used because glimpse sub-locks cannot
-		 * deadlock (because they never conflict with other
-		 * locks) and, hence, can be enqueued out-of-order.
-		 *
-		 * CEF_MUST protects glimpse lock from conversion into
-		 * a lockless mode.
-		 */
-		result = cl_lock_request(env, io, lock);
-		if (result < 0)
-			return result;
-
-		if (!agl) {
-			ll_merge_attr(env, inode);
-			if (i_size_read(inode) > 0 && !inode->i_blocks) {
-				/*
-				 * LU-417: Add dirty pages block count
-				 * lest i_blocks reports 0, some "cp" or
-				 * "tar" may think it's a completely
-				 * sparse file and skip it.
-				 */
-				inode->i_blocks = dirty_cnt(inode);
-			}
-		}
 
-		cl_lock_release(env, lock);
-	} else {
-		CDEBUG(D_DLMTRACE, "No objects for inode\n");
+	/* NOTE: this looks like DLM lock request, but it may
+	 *       not be one. Due to CEF_ASYNC flag (translated
+	 *       to LDLM_FL_HAS_INTENT by osc), this is
+	 *       glimpse request, that won't revoke any
+	 *       conflicting DLM locks held. Instead,
+	 *       ll_glimpse_callback() will be called on each
+	 *       client holding a DLM lock against this file,
+	 *       and resulting size will be returned for each
+	 *       stripe. DLM lock on [0, EOF] is acquired only
+	 *       if there were no conflicting locks. If there
+	 *       were conflicting locks, enqueuing or waiting
+	 *       fails with -ENAVAIL, but valid inode
+	 *       attributes are returned anyway.
+	 */
+	*descr = whole_file;
+	descr->cld_obj = clob;
+	descr->cld_mode = CLM_READ;
+	descr->cld_enq_flags = CEF_ASYNC | CEF_MUST;
+	if (agl)
+		descr->cld_enq_flags |= CEF_AGL;
+	/*
+	 * CEF_ASYNC is used because glimpse sub-locks cannot
+	 * deadlock (because they never conflict with other
+	 * locks) and, hence, can be enqueued out-of-order.
+	 *
+	 * CEF_MUST protects glimpse lock from conversion into
+	 * a lockless mode.
+	 */
+	result = cl_lock_request(env, io, lock);
+	if (result < 0)
+		return result;
+
+	if (!agl) {
 		ll_merge_attr(env, inode);
+		if (i_size_read(inode) > 0 && !inode->i_blocks) {
+			/*
+			 * LU-417: Add dirty pages block count
+			 * lest i_blocks reports 0, some "cp" or
+			 * "tar" may think it's a completely
+			 * sparse file and skip it.
+			 */
+			inode->i_blocks = dirty_cnt(inode);
+		}
 	}
 
+	cl_lock_release(env, lock);
+
 	return result;
 }
 
@@ -209,39 +203,3 @@ again:
 	}
 	return result;
 }
-
-int cl_local_size(struct inode *inode)
-{
-	struct lu_env	   *env = NULL;
-	struct cl_io	    *io  = NULL;
-	struct cl_object	*clob;
-	int		      result;
-	int		      refcheck;
-
-	if (!ll_i2info(inode)->lli_has_smd)
-		return 0;
-
-	result = cl_io_get(inode, &env, &io, &refcheck);
-	if (result <= 0)
-		return result;
-
-	clob = io->ci_obj;
-	result = cl_io_init(env, io, CIT_MISC, clob);
-	if (result > 0) {
-		result = io->ci_result;
-	} else if (result == 0) {
-		struct cl_lock *lock = vvp_env_lock(env);
-
-		lock->cll_descr = whole_file;
-		lock->cll_descr.cld_enq_flags = CEF_PEEK;
-		lock->cll_descr.cld_obj = clob;
-		result = cl_lock_request(env, io, lock);
-		if (result == 0) {
-			ll_merge_attr(env, inode);
-			cl_lock_release(env, lock);
-		}
-	}
-	cl_io_fini(env, io);
-	cl_env_put(env, &refcheck);
-	return result;
-}
diff --git a/drivers/staging/lustre/lustre/llite/lcommon_cl.c b/drivers/staging/lustre/lustre/llite/lcommon_cl.c
index bd98ec2..4087db0 100644
--- a/drivers/staging/lustre/lustre/llite/lcommon_cl.c
+++ b/drivers/staging/lustre/lustre/llite/lcommon_cl.c
@@ -184,7 +184,6 @@ int cl_file_inode_init(struct inode *inode, struct lustre_md *md)
 			 * locked by I_NEW bit.
 			 */
 			lli->lli_clob = clob;
-			lli->lli_has_smd = lsm_has_objects(md->lsm);
 			lu_object_ref_add(&clob->co_lu, "inode", inode);
 		} else {
 			result = PTR_ERR(clob);
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index c89e1b8..913e532 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -226,7 +226,6 @@ struct ll_inode_info {
 	 *      In the future, if more members are added only for directory,
 	 *      some of the following members can be moved into u.f.
 	 */
-	bool			    lli_has_smd;
 	struct cl_object	       *lli_clob;
 
 	/* mutex to request for layout lock exclusively. */
@@ -1348,7 +1347,6 @@ extern int cl_inode_fini_refcheck;
 
 int cl_file_inode_init(struct inode *inode, struct lustre_md *md);
 void cl_inode_fini(struct inode *inode);
-int cl_local_size(struct inode *inode);
 
 __u64 cl_fid_build_ino(const struct lu_fid *fid, int api32);
 __u32 cl_fid_build_gen(const struct lu_fid *fid);
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 4b53119..5400cbe 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -798,7 +798,6 @@ void ll_lli_init(struct ll_inode_info *lli)
 	lli->lli_open_fd_exec_count = 0;
 	mutex_init(&lli->lli_och_mutex);
 	spin_lock_init(&lli->lli_agl_lock);
-	lli->lli_has_smd = false;
 	spin_lock_init(&lli->lli_layout_lock);
 	ll_layout_version_set(lli, CL_LAYOUT_GEN_NONE);
 	lli->lli_clob = NULL;
@@ -1290,7 +1289,6 @@ void ll_clear_inode(struct inode *inode)
 	 * cl_object still uses inode lsm.
 	 */
 	cl_inode_fini(inode);
-	lli->lli_has_smd = false;
 }
 
 #define TIMES_SET_FLAGS (ATTR_MTIME_SET | ATTR_ATIME_SET | ATTR_TIMES_SET)
@@ -1688,9 +1686,7 @@ int ll_update_inode(struct inode *inode, struct lustre_md *md)
 
 	LASSERT((lsm != NULL) == ((body->mbo_valid & OBD_MD_FLEASIZE) != 0));
 	if (lsm) {
-		if (!lli->lli_has_smd &&
-		    !(sbi->ll_flags & LL_SBI_LAYOUT_LOCK))
-			cl_file_inode_init(inode, md);
+		cl_file_inode_init(inode, md);
 
 		lli->lli_maxbytes = lsm->lsm_maxbytes;
 		if (lli->lli_maxbytes > MAX_LFS_FILESIZE)
@@ -1802,8 +1798,6 @@ int ll_read_inode2(struct inode *inode, void *opaque)
 	CDEBUG(D_VFSTRACE, "VFS Op:inode="DFID"(%p)\n",
 	       PFID(&lli->lli_fid), inode);
 
-	LASSERT(!lli->lli_has_smd);
-
 	/* Core attributes from the MDS first.  This is a new inode, and
 	 * the VFS doesn't zero times in the core inode so we have to do
 	 * it ourselves.  They will be overwritten by either MDS or OST
diff --git a/drivers/staging/lustre/lustre/llite/rw26.c b/drivers/staging/lustre/lustre/llite/rw26.c
index 26f3a37..67010be 100644
--- a/drivers/staging/lustre/lustre/llite/rw26.c
+++ b/drivers/staging/lustre/lustre/llite/rw26.c
@@ -347,13 +347,9 @@ static ssize_t ll_direct_IO_26(struct kiocb *iocb, struct iov_iter *iter)
 	loff_t file_offset = iocb->ki_pos;
 	ssize_t count = iov_iter_count(iter);
 	ssize_t tot_bytes = 0, result = 0;
-	struct ll_inode_info *lli = ll_i2info(inode);
 	long size = MAX_DIO_SIZE;
 	int refcheck;
 
-	if (!lli->lli_has_smd)
-		return -EBADF;
-
 	/* FIXME: io smaller than PAGE_SIZE is broken on ia64 ??? */
 	if ((file_offset & ~PAGE_MASK) || (count & ~PAGE_MASK))
 		return -EINVAL;
diff --git a/drivers/staging/lustre/lustre/llite/vvp_object.c b/drivers/staging/lustre/lustre/llite/vvp_object.c
index 420a649..cc0f3da 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_object.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_object.c
@@ -145,27 +145,8 @@ static int vvp_conf_set(const struct lu_env *env, struct cl_object *obj,
 		 */
 		unmap_mapping_range(conf->coc_inode->i_mapping,
 				    0, OBD_OBJECT_EOF, 0);
-
-		return 0;
 	}
 
-	if (conf->coc_opc != OBJECT_CONF_SET)
-		return 0;
-
-	if (conf->u.coc_md && conf->u.coc_md->lsm) {
-		CDEBUG(D_VFSTRACE, DFID ": layout version change: %u -> %u\n",
-		       PFID(&lli->lli_fid), lli->lli_layout_gen,
-		       conf->u.coc_md->lsm->lsm_layout_gen);
-
-		lli->lli_has_smd = lsm_has_objects(conf->u.coc_md->lsm);
-		ll_layout_version_set(lli, conf->u.coc_md->lsm->lsm_layout_gen);
-	} else {
-		CDEBUG(D_VFSTRACE, DFID ": layout nuked: %u.\n",
-		       PFID(&lli->lli_fid), lli->lli_layout_gen);
-
-		lli->lli_has_smd = false;
-		ll_layout_version_set(lli, CL_LAYOUT_GEN_EMPTY);
-	}
 	return 0;
 }
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_io.c b/drivers/staging/lustre/lustre/lov/lov_io.c
index d6be613..a1d1ec9 100644
--- a/drivers/staging/lustre/lustre/lov/lov_io.c
+++ b/drivers/staging/lustre/lustre/lov/lov_io.c
@@ -918,6 +918,13 @@ static void lov_empty_io_fini(const struct lu_env *env,
 		wake_up_all(&lov->lo_waitq);
 }
 
+static int lov_empty_io_submit(const struct lu_env *env,
+			       const struct cl_io_slice *ios,
+			       enum cl_req_type crt, struct cl_2queue *queue)
+{
+	return -EBADF;
+}
+
 static void lov_empty_impossible(const struct lu_env *env,
 				 struct cl_io_slice *ios)
 {
@@ -968,7 +975,7 @@ static const struct cl_io_operations lov_empty_io_ops = {
 			.cio_fini   = lov_empty_io_fini
 		}
 	},
-	.cio_submit                    = LOV_EMPTY_IMPOSSIBLE,
+	.cio_submit			= lov_empty_io_submit,
 	.cio_commit_async              = LOV_EMPTY_IMPOSSIBLE
 };
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread
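
Note on the patch above: it drops the `if (!lli->lli_has_smd) return -EBADF;`
test from ll_direct_IO_26() and instead points .cio_submit of
lov_empty_io_ops at a handler that returns -EBADF. A minimal user-space
sketch of that idea follows. The types and names here (io_slice, io_ops,
direct_io_submit) are hypothetical stand-ins, not the Lustre structures.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for cl_io_slice / cl_io_operations. */
struct io_slice;

struct io_ops {
	int (*submit)(const struct io_slice *ios);
};

struct io_slice {
	const struct io_ops *ops;
};

/* Layout with objects: pretend the pages were queued successfully. */
static int striped_io_submit(const struct io_slice *ios)
{
	(void)ios;
	return 0;
}

/* Empty layout: nothing to submit to, so report -EBADF from this layer. */
static int empty_io_submit(const struct io_slice *ios)
{
	(void)ios;
	return -EBADF;
}

static const struct io_ops striped_io_ops = { .submit = striped_io_submit };
static const struct io_ops empty_io_ops   = { .submit = empty_io_submit };

/* The caller no longer checks a cached per-inode flag; it only dispatches. */
static int direct_io_submit(const struct io_slice *ios)
{
	assert(ios && ios->ops && ios->ops->submit);
	return ios->ops->submit(ios);
}
```

In this sketch the decision lives in the ops table chosen for the layout,
so the caller has no cached copy of the layout state to keep in sync.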

* [PATCH 27/41] staging: lustre: llite: add cl_object_maxbytes()
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, Jinshan Xiong, James Simmons

From: John L. Hammond <john.hammond@intel.com>

Add cl_object_maxbytes() to return the maximum supported size of a
cl_object. Remove the lli_maxbytes member from struct
ll_inode_info. Change the lsm_maxbytes member of struct lov_stripe_md
from __u64 to loff_t. Correct the computation of lsm_maxbytes in the
released layout case.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5814
Reviewed-on: http://review.whamcloud.com/13694
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
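Illustration only, not part of the commit: a user-space sketch of how
lsm_maxbytes is derived after this change, as read from the lov_ea.c hunks
below. The struct and function names (tgt, tgt_maxbytes, layout_maxbytes)
and the FALLBACK_STRIPE_MAXBYTES value are assumptions made for the example;
the real code uses lov_tgt_desc and LUSTRE_EXT3_STRIPE_MAXBYTES.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Placeholder value, assumed for illustration; stands in for
 * LUSTRE_EXT3_STRIPE_MAXBYTES. */
#define FALLBACK_STRIPE_MAXBYTES (1LL << 41)

/* Hypothetical reduction of a target descriptor to the two fields used. */
struct tgt {
	int active;          /* target is usable (cf. ltd_active) */
	long long maxbytes;  /* limit learned at connect time, <= 0 if unknown */
};

/* Per-target limit: fall back when the target is inactive or has no limit. */
static long long tgt_maxbytes(const struct tgt *t)
{
	if (!t->active || t->maxbytes <= 0)
		return FALLBACK_STRIPE_MAXBYTES;
	return t->maxbytes;
}

/* Layout limit: smallest per-stripe limit times the number of stripes. */
static long long layout_maxbytes(const struct tgt *stripes,
				 unsigned int stripe_count, int released,
				 unsigned int tgt_count)
{
	long long stripe_max = LLONG_MAX;
	unsigned int n = released ? 0 : stripe_count;
	unsigned int i;

	assert(n == 0 || stripes);

	for (i = 0; i < n; i++) {
		long long b = tgt_maxbytes(&stripes[i]);

		if (b < stripe_max)
			stripe_max = b;
	}

	/* No stripe was visited (released or empty layout): use the fallback
	 * rather than multiplying the sentinel. */
	if (stripe_max == LLONG_MAX)
		stripe_max = FALLBACK_STRIPE_MAXBYTES;

	return stripe_max * (stripe_count ? stripe_count : tgt_count);
}
```

The sentinel check is what seems to matter for the released case: the
removed code started from OBD_OBJECT_EOF and multiplied it by
lsm_stripe_count even when the loop had visited no stripes.
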
 drivers/staging/lustre/lustre/include/cl_object.h  |    5 +
 drivers/staging/lustre/lustre/include/obd.h        |    2 +-
 .../staging/lustre/lustre/llite/llite_internal.h   |   10 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c    |    8 +-
 drivers/staging/lustre/lustre/lov/lov_ea.c         |  191 +++++++++-----------
 drivers/staging/lustre/lustre/lov/lov_object.c     |   17 ++
 drivers/staging/lustre/lustre/obdclass/cl_object.c |   15 ++
 7 files changed, 130 insertions(+), 118 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/cl_object.h b/drivers/staging/lustre/lustre/include/cl_object.h
index b80539d..dfef4bc 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -431,6 +431,10 @@ struct cl_object_operations {
 	 */
 	int (*coo_layout_get)(const struct lu_env *env, struct cl_object *obj,
 			      struct cl_layout *layout);
+	/**
+	 * Get maximum size of the object.
+	 */
+	loff_t (*coo_maxbytes)(struct cl_object *obj);
 };
 
 /**
@@ -2227,6 +2231,7 @@ int cl_object_fiemap(const struct lu_env *env, struct cl_object *obj,
 		     size_t *buflen);
 int cl_object_layout_get(const struct lu_env *env, struct cl_object *obj,
 			 struct cl_layout *cl);
+loff_t cl_object_maxbytes(struct cl_object *obj);
 
 /**
  * Returns true, iff \a o0 and \a o1 are slices of the same object.
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index f254d88..8372deb 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -88,7 +88,7 @@ struct lov_stripe_md {
 	/* maximum possible file size, might change as OSTs status changes,
 	 * e.g. disconnected, deactivated
 	 */
-	__u64		lsm_maxbytes;
+	loff_t		lsm_maxbytes;
 	struct ost_id	lsm_oi;
 	__u32		lsm_magic;
 	__u32		lsm_stripe_size;
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 913e532..cf95a72 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -185,7 +185,6 @@ struct ll_inode_info {
 		struct {
 			struct mutex			lli_size_mutex;
 			char			       *lli_symlink_name;
-			__u64				lli_maxbytes;
 			/*
 			 * struct rw_semaphore {
 			 *    signed long	count;     // align d.d_def_acl
@@ -988,9 +987,14 @@ static inline struct lu_fid *ll_inode2fid(struct inode *inode)
 	return fid;
 }
 
-static inline __u64 ll_file_maxbytes(struct inode *inode)
+static inline loff_t ll_file_maxbytes(struct inode *inode)
 {
-	return ll_i2info(inode)->lli_maxbytes;
+	struct cl_object *obj = ll_i2info(inode)->lli_clob;
+
+	if (!obj)
+		return MAX_LFS_FILESIZE;
+
+	return min_t(loff_t, cl_object_maxbytes(obj), MAX_LFS_FILESIZE);
 }
 
 /* llite/xattr.c */
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 5400cbe..25a06f8 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -785,7 +785,6 @@ void ll_lli_init(struct ll_inode_info *lli)
 {
 	lli->lli_inode_magic = LLI_INODE_MAGIC;
 	lli->lli_flags = 0;
-	lli->lli_maxbytes = MAX_LFS_FILESIZE;
 	spin_lock_init(&lli->lli_lock);
 	lli->lli_posix_acl = NULL;
 	/* Do not set lli_fid, it has been initialized already. */
@@ -1685,14 +1684,9 @@ int ll_update_inode(struct inode *inode, struct lustre_md *md)
 	struct ll_sb_info *sbi = ll_i2sbi(inode);
 
 	LASSERT((lsm != NULL) == ((body->mbo_valid & OBD_MD_FLEASIZE) != 0));
-	if (lsm) {
+	if (lsm)
 		cl_file_inode_init(inode, md);
 
-		lli->lli_maxbytes = lsm->lsm_maxbytes;
-		if (lli->lli_maxbytes > MAX_LFS_FILESIZE)
-			lli->lli_maxbytes = MAX_LFS_FILESIZE;
-	}
-
 	if (S_ISDIR(inode->i_mode)) {
 		int rc;
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_ea.c b/drivers/staging/lustre/lustre/lov/lov_ea.c
index 214c561..63dcd29 100644
--- a/drivers/staging/lustre/lustre/lov/lov_ea.c
+++ b/drivers/staging/lustre/lustre/lov/lov_ea.c
@@ -117,9 +117,43 @@ void lsm_free_plain(struct lov_stripe_md *lsm)
 	kvfree(lsm);
 }
 
-static void lsm_unpackmd_common(struct lov_stripe_md *lsm,
-				struct lov_mds_md *lmm)
+/*
+ * Find minimum stripe maxbytes value.  For inactive or
+ * reconnecting targets use LUSTRE_EXT3_STRIPE_MAXBYTES.
+ */
+static loff_t lov_tgt_maxbytes(struct lov_tgt_desc *tgt)
+{
+	loff_t maxbytes = LUSTRE_EXT3_STRIPE_MAXBYTES;
+	struct obd_import *imp;
+
+	if (!tgt->ltd_active)
+		return maxbytes;
+
+	imp = tgt->ltd_obd->u.cli.cl_import;
+	if (!imp)
+		return maxbytes;
+
+	spin_lock(&imp->imp_lock);
+	if (imp->imp_state == LUSTRE_IMP_FULL &&
+	    (imp->imp_connect_data.ocd_connect_flags & OBD_CONNECT_MAXBYTES) &&
+	     imp->imp_connect_data.ocd_maxbytes > 0)
+		maxbytes = imp->imp_connect_data.ocd_maxbytes;
+
+	spin_unlock(&imp->imp_lock);
+
+	return maxbytes;
+}
+
+static int lsm_unpackmd_common(struct lov_obd *lov,
+			       struct lov_stripe_md *lsm,
+			       struct lov_mds_md *lmm,
+			       struct lov_ost_data_v1 *objects)
 {
+	loff_t stripe_maxbytes = LLONG_MAX;
+	unsigned int stripe_count;
+	struct lov_oinfo *loi;
+	unsigned int i;
+
 	/*
 	 * This supposes lov_mds_md_v1/v3 first fields are
 	 * are the same
@@ -129,6 +163,45 @@ static void lsm_unpackmd_common(struct lov_stripe_md *lsm,
 	lsm->lsm_pattern = le32_to_cpu(lmm->lmm_pattern);
 	lsm->lsm_layout_gen = le16_to_cpu(lmm->lmm_layout_gen);
 	lsm->lsm_pool_name[0] = '\0';
+
+	stripe_count = lsm_is_released(lsm) ? 0 : lsm->lsm_stripe_count;
+
+	for (i = 0; i < stripe_count; i++) {
+		loff_t tgt_bytes;
+
+		loi = lsm->lsm_oinfo[i];
+		ostid_le_to_cpu(&objects[i].l_ost_oi, &loi->loi_oi);
+		loi->loi_ost_idx = le32_to_cpu(objects[i].l_ost_idx);
+		loi->loi_ost_gen = le32_to_cpu(objects[i].l_ost_gen);
+		if (lov_oinfo_is_dummy(loi))
+			continue;
+
+		if (loi->loi_ost_idx >= lov->desc.ld_tgt_count) {
+			CERROR("OST index %d more than OST count %d\n",
+			       loi->loi_ost_idx, lov->desc.ld_tgt_count);
+			lov_dump_lmm_v1(D_WARNING, lmm);
+			return -EINVAL;
+		}
+
+		if (!lov->lov_tgts[loi->loi_ost_idx]) {
+			CERROR("OST index %d missing\n", loi->loi_ost_idx);
+			lov_dump_lmm_v1(D_WARNING, lmm);
+			return -EINVAL;
+		}
+
+		tgt_bytes = lov_tgt_maxbytes(lov->lov_tgts[loi->loi_ost_idx]);
+		stripe_maxbytes = min_t(loff_t, stripe_maxbytes, tgt_bytes);
+	}
+
+	if (stripe_maxbytes == LLONG_MAX)
+		stripe_maxbytes = LUSTRE_EXT3_STRIPE_MAXBYTES;
+
+	if (!lsm->lsm_stripe_count)
+		lsm->lsm_maxbytes = stripe_maxbytes * lov->desc.ld_tgt_count;
+	else
+		lsm->lsm_maxbytes = stripe_maxbytes * lsm->lsm_stripe_count;
+
+	return 0;
 }
 
 static void
@@ -147,30 +220,6 @@ lsm_stripe_by_offset_plain(struct lov_stripe_md *lsm, int *stripeno,
 		*swidth = (u64)lsm->lsm_stripe_size * lsm->lsm_stripe_count;
 }
 
-/* Find minimum stripe maxbytes value.  For inactive or
- * reconnecting targets use LUSTRE_EXT3_STRIPE_MAXBYTES.
- */
-static void lov_tgt_maxbytes(struct lov_tgt_desc *tgt, __u64 *stripe_maxbytes)
-{
-	struct obd_import *imp = tgt->ltd_obd->u.cli.cl_import;
-
-	if (!imp || !tgt->ltd_active) {
-		*stripe_maxbytes = LUSTRE_EXT3_STRIPE_MAXBYTES;
-		return;
-	}
-
-	spin_lock(&imp->imp_lock);
-	if (imp->imp_state == LUSTRE_IMP_FULL &&
-	    (imp->imp_connect_data.ocd_connect_flags & OBD_CONNECT_MAXBYTES) &&
-	    imp->imp_connect_data.ocd_maxbytes > 0) {
-		if (*stripe_maxbytes > imp->imp_connect_data.ocd_maxbytes)
-			*stripe_maxbytes = imp->imp_connect_data.ocd_maxbytes;
-	} else {
-		*stripe_maxbytes = LUSTRE_EXT3_STRIPE_MAXBYTES;
-	}
-	spin_unlock(&imp->imp_lock);
-}
-
 static int lsm_lmm_verify_v1(struct lov_mds_md_v1 *lmm, int lmm_bytes,
 			     __u16 *stripe_count)
 {
@@ -197,45 +246,7 @@ static int lsm_lmm_verify_v1(struct lov_mds_md_v1 *lmm, int lmm_bytes,
 static int lsm_unpackmd_v1(struct lov_obd *lov, struct lov_stripe_md *lsm,
 			   struct lov_mds_md_v1 *lmm)
 {
-	struct lov_oinfo *loi;
-	int i;
-	int stripe_count;
-	__u64 stripe_maxbytes = OBD_OBJECT_EOF;
-
-	lsm_unpackmd_common(lsm, lmm);
-
-	stripe_count = lsm_is_released(lsm) ? 0 : lsm->lsm_stripe_count;
-
-	for (i = 0; i < stripe_count; i++) {
-		/* XXX LOV STACKING call down to osc_unpackmd() */
-		loi = lsm->lsm_oinfo[i];
-		ostid_le_to_cpu(&lmm->lmm_objects[i].l_ost_oi, &loi->loi_oi);
-		loi->loi_ost_idx = le32_to_cpu(lmm->lmm_objects[i].l_ost_idx);
-		loi->loi_ost_gen = le32_to_cpu(lmm->lmm_objects[i].l_ost_gen);
-		if (lov_oinfo_is_dummy(loi))
-			continue;
-
-		if (loi->loi_ost_idx >= lov->desc.ld_tgt_count) {
-			CERROR("OST index %d more than OST count %d\n",
-			       loi->loi_ost_idx, lov->desc.ld_tgt_count);
-			lov_dump_lmm_v1(D_WARNING, lmm);
-			return -EINVAL;
-		}
-		if (!lov->lov_tgts[loi->loi_ost_idx]) {
-			CERROR("OST index %d missing\n", loi->loi_ost_idx);
-			lov_dump_lmm_v1(D_WARNING, lmm);
-			return -EINVAL;
-		}
-		/* calculate the minimum stripe max bytes */
-		lov_tgt_maxbytes(lov->lov_tgts[loi->loi_ost_idx],
-				 &stripe_maxbytes);
-	}
-
-	lsm->lsm_maxbytes = stripe_maxbytes * lsm->lsm_stripe_count;
-	if (lsm->lsm_stripe_count == 0)
-		lsm->lsm_maxbytes = stripe_maxbytes * lov->desc.ld_tgt_count;
-
-	return 0;
+	return lsm_unpackmd_common(lov, lsm, lmm, lmm->lmm_objects);
 }
 
 const struct lsm_operations lsm_v1_ops = {
@@ -275,55 +286,21 @@ static int lsm_lmm_verify_v3(struct lov_mds_md *lmmv1, int lmm_bytes,
 }
 
 static int lsm_unpackmd_v3(struct lov_obd *lov, struct lov_stripe_md *lsm,
-			   struct lov_mds_md *lmmv1)
+			   struct lov_mds_md *lmm)
 {
-	struct lov_mds_md_v3 *lmm;
-	struct lov_oinfo *loi;
-	int i;
-	int stripe_count;
-	__u64 stripe_maxbytes = OBD_OBJECT_EOF;
-	int cplen = 0;
-
-	lmm = (struct lov_mds_md_v3 *)lmmv1;
+	struct lov_mds_md_v3 *lmm_v3 = (struct lov_mds_md_v3 *)lmm;
+	size_t cplen = 0;
+	int rc;
 
-	lsm_unpackmd_common(lsm, (struct lov_mds_md_v1 *)lmm);
+	rc = lsm_unpackmd_common(lov, lsm, lmm, lmm_v3->lmm_objects);
+	if (rc)
+		return rc;
 
-	stripe_count = lsm_is_released(lsm) ? 0 : lsm->lsm_stripe_count;
-
-	cplen = strlcpy(lsm->lsm_pool_name, lmm->lmm_pool_name,
+	cplen = strlcpy(lsm->lsm_pool_name, lmm_v3->lmm_pool_name,
 			sizeof(lsm->lsm_pool_name));
 	if (cplen >= sizeof(lsm->lsm_pool_name))
 		return -E2BIG;
 
-	for (i = 0; i < stripe_count; i++) {
-		/* XXX LOV STACKING call down to osc_unpackmd() */
-		loi = lsm->lsm_oinfo[i];
-		ostid_le_to_cpu(&lmm->lmm_objects[i].l_ost_oi, &loi->loi_oi);
-		loi->loi_ost_idx = le32_to_cpu(lmm->lmm_objects[i].l_ost_idx);
-		loi->loi_ost_gen = le32_to_cpu(lmm->lmm_objects[i].l_ost_gen);
-		if (lov_oinfo_is_dummy(loi))
-			continue;
-
-		if (loi->loi_ost_idx >= lov->desc.ld_tgt_count) {
-			CERROR("OST index %d more than OST count %d\n",
-			       loi->loi_ost_idx, lov->desc.ld_tgt_count);
-			lov_dump_lmm_v3(D_WARNING, lmm);
-			return -EINVAL;
-		}
-		if (!lov->lov_tgts[loi->loi_ost_idx]) {
-			CERROR("OST index %d missing\n", loi->loi_ost_idx);
-			lov_dump_lmm_v3(D_WARNING, lmm);
-			return -EINVAL;
-		}
-		/* calculate the minimum stripe max bytes */
-		lov_tgt_maxbytes(lov->lov_tgts[loi->loi_ost_idx],
-				 &stripe_maxbytes);
-	}
-
-	lsm->lsm_maxbytes = stripe_maxbytes * lsm->lsm_stripe_count;
-	if (lsm->lsm_stripe_count == 0)
-		lsm->lsm_maxbytes = stripe_maxbytes * lov->desc.ld_tgt_count;
-
 	return 0;
 }
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index d39724a..82b99e0 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -1454,6 +1454,22 @@ static int lov_object_layout_get(const struct lu_env *env,
 	return rc < 0 ? rc : 0;
 }
 
+static loff_t lov_object_maxbytes(struct cl_object *obj)
+{
+	struct lov_object *lov = cl2lov(obj);
+	struct lov_stripe_md *lsm = lov_lsm_addref(lov);
+	loff_t maxbytes;
+
+	if (!lsm)
+		return LLONG_MAX;
+
+	maxbytes = lsm->lsm_maxbytes;
+
+	lov_lsm_put(lsm);
+
+	return maxbytes;
+}
+
 static const struct cl_object_operations lov_ops = {
 	.coo_page_init = lov_page_init,
 	.coo_lock_init = lov_lock_init,
@@ -1463,6 +1479,7 @@ static const struct cl_object_operations lov_ops = {
 	.coo_conf_set  = lov_conf_set,
 	.coo_getstripe = lov_object_getstripe,
 	.coo_layout_get	 = lov_object_layout_get,
+	.coo_maxbytes	 = lov_object_maxbytes,
 	.coo_fiemap	 = lov_object_fiemap,
 };
 
diff --git a/drivers/staging/lustre/lustre/obdclass/cl_object.c b/drivers/staging/lustre/lustre/obdclass/cl_object.c
index dd80b83..cda01b8 100644
--- a/drivers/staging/lustre/lustre/obdclass/cl_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/cl_object.c
@@ -388,6 +388,21 @@ int cl_object_layout_get(const struct lu_env *env, struct cl_object *obj,
 }
 EXPORT_SYMBOL(cl_object_layout_get);
 
+loff_t cl_object_maxbytes(struct cl_object *obj)
+{
+	struct lu_object_header *top = obj->co_lu.lo_header;
+	loff_t maxbytes = LLONG_MAX;
+
+	list_for_each_entry(obj, &top->loh_layers, co_lu.lo_linkage) {
+		if (obj->co_ops->coo_maxbytes)
+			maxbytes = min_t(loff_t, obj->co_ops->coo_maxbytes(obj),
+					 maxbytes);
+	}
+
+	return maxbytes;
+}
+EXPORT_SYMBOL(cl_object_maxbytes);
+
 /**
  * Helper function removing all object locks, and marking object for
  * deletion. All object pages must have been deleted at this point.
-- 
1.7.1

* [lustre-devel] [PATCH 27/41] staging: lustre: llite: add cl_object_maxbytes()
@ 2016-10-03  2:28   ` James Simmons
  0 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, Jinshan Xiong, James Simmons

From: John L. Hammond <john.hammond@intel.com>

Add cl_object_maxbytes() to return the maximum supported size of a
cl_object. Remove the lli_maxbytes member from struct
ll_inode_info. Change the lsm_maxbytes member of struct lov_stripe_md
from __u64 to loff_t. Correct the computation of lsm_maxbytes in the
released layout case.
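
As a rough illustration of the two computations described above, here is a
minimal userspace sketch in C. It is not the kernel code: struct layer,
layout_maxbytes(), object_maxbytes() and the value of
DEFAULT_STRIPE_MAXBYTES are hypothetical stand-ins chosen for the example.
The assumption is that the per-layout limit is the smallest per-target
limit scaled by the stripe count, and that the per-object limit is the
smallest limit reported by any layer that implements the optional hook.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for LUSTRE_EXT3_STRIPE_MAXBYTES; value is illustrative. */
#define DEFAULT_STRIPE_MAXBYTES (1LL << 40)

/*
 * Per-layout limit: smallest per-target limit, falling back to the
 * default when no stripe was examined (e.g. a released layout), then
 * scaled by the stripe count (or the target count if that is zero).
 */
static long long layout_maxbytes(const long long *tgt_limits,
				 unsigned int stripe_count,
				 unsigned int tgt_count, int released)
{
	long long stripe_max = LLONG_MAX;
	unsigned int n = released ? 0 : stripe_count;
	unsigned int i;

	for (i = 0; i < n; i++)
		if (tgt_limits[i] < stripe_max)
			stripe_max = tgt_limits[i];

	if (stripe_max == LLONG_MAX)
		stripe_max = DEFAULT_STRIPE_MAXBYTES;

	return stripe_max * (stripe_count ? stripe_count : tgt_count);
}

/* Hypothetical model of one object layer with an optional size hook. */
struct layer {
	long long (*maxbytes)(const struct layer *l);	/* may be NULL */
	long long limit;
};

/* Example hook that simply reports the stored limit. */
static long long layer_limit(const struct layer *l)
{
	return l->limit;
}

/* Per-object limit: minimum over all layers that provide the hook. */
static long long object_maxbytes(const struct layer *layers, size_t n)
{
	long long maxbytes = LLONG_MAX;
	size_t i;

	for (i = 0; i < n; i++) {
		long long v;

		if (!layers[i].maxbytes)
			continue;	/* layer imposes no limit */

		v = layers[i].maxbytes(&layers[i]);
		if (v < maxbytes)
			maxbytes = v;
	}

	return maxbytes;
}
```

In this model a caller would still clamp the result of object_maxbytes()
to the filesystem-wide ceiling, as ll_file_maxbytes() does with
MAX_LFS_FILESIZE in the patch below.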

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5814
Reviewed-on: http://review.whamcloud.com/13694
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/cl_object.h  |    5 +
 drivers/staging/lustre/lustre/include/obd.h        |    2 +-
 .../staging/lustre/lustre/llite/llite_internal.h   |   10 +-
 drivers/staging/lustre/lustre/llite/llite_lib.c    |    8 +-
 drivers/staging/lustre/lustre/lov/lov_ea.c         |  191 +++++++++-----------
 drivers/staging/lustre/lustre/lov/lov_object.c     |   17 ++
 drivers/staging/lustre/lustre/obdclass/cl_object.c |   15 ++
 7 files changed, 130 insertions(+), 118 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/cl_object.h b/drivers/staging/lustre/lustre/include/cl_object.h
index b80539d..dfef4bc 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -431,6 +431,10 @@ struct cl_object_operations {
 	 */
 	int (*coo_layout_get)(const struct lu_env *env, struct cl_object *obj,
 			      struct cl_layout *layout);
+	/**
+	 * Get maximum size of the object.
+	 */
+	loff_t (*coo_maxbytes)(struct cl_object *obj);
 };
 
 /**
@@ -2227,6 +2231,7 @@ int cl_object_fiemap(const struct lu_env *env, struct cl_object *obj,
 		     size_t *buflen);
 int cl_object_layout_get(const struct lu_env *env, struct cl_object *obj,
 			 struct cl_layout *cl);
+loff_t cl_object_maxbytes(struct cl_object *obj);
 
 /**
  * Returns true, iff \a o0 and \a o1 are slices of the same object.
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index f254d88..8372deb 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -88,7 +88,7 @@ struct lov_stripe_md {
 	/* maximum possible file size, might change as OSTs status changes,
 	 * e.g. disconnected, deactivated
 	 */
-	__u64		lsm_maxbytes;
+	loff_t		lsm_maxbytes;
 	struct ost_id	lsm_oi;
 	__u32		lsm_magic;
 	__u32		lsm_stripe_size;
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 913e532..cf95a72 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -185,7 +185,6 @@ struct ll_inode_info {
 		struct {
 			struct mutex			lli_size_mutex;
 			char			       *lli_symlink_name;
-			__u64				lli_maxbytes;
 			/*
 			 * struct rw_semaphore {
 			 *    signed long	count;     // align d.d_def_acl
@@ -988,9 +987,14 @@ static inline struct lu_fid *ll_inode2fid(struct inode *inode)
 	return fid;
 }
 
-static inline __u64 ll_file_maxbytes(struct inode *inode)
+static inline loff_t ll_file_maxbytes(struct inode *inode)
 {
-	return ll_i2info(inode)->lli_maxbytes;
+	struct cl_object *obj = ll_i2info(inode)->lli_clob;
+
+	if (!obj)
+		return MAX_LFS_FILESIZE;
+
+	return min_t(loff_t, cl_object_maxbytes(obj), MAX_LFS_FILESIZE);
 }
 
 /* llite/xattr.c */
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 5400cbe..25a06f8 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -785,7 +785,6 @@ void ll_lli_init(struct ll_inode_info *lli)
 {
 	lli->lli_inode_magic = LLI_INODE_MAGIC;
 	lli->lli_flags = 0;
-	lli->lli_maxbytes = MAX_LFS_FILESIZE;
 	spin_lock_init(&lli->lli_lock);
 	lli->lli_posix_acl = NULL;
 	/* Do not set lli_fid, it has been initialized already. */
@@ -1685,14 +1684,9 @@ int ll_update_inode(struct inode *inode, struct lustre_md *md)
 	struct ll_sb_info *sbi = ll_i2sbi(inode);
 
 	LASSERT((lsm != NULL) == ((body->mbo_valid & OBD_MD_FLEASIZE) != 0));
-	if (lsm) {
+	if (lsm)
 		cl_file_inode_init(inode, md);
 
-		lli->lli_maxbytes = lsm->lsm_maxbytes;
-		if (lli->lli_maxbytes > MAX_LFS_FILESIZE)
-			lli->lli_maxbytes = MAX_LFS_FILESIZE;
-	}
-
 	if (S_ISDIR(inode->i_mode)) {
 		int rc;
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_ea.c b/drivers/staging/lustre/lustre/lov/lov_ea.c
index 214c561..63dcd29 100644
--- a/drivers/staging/lustre/lustre/lov/lov_ea.c
+++ b/drivers/staging/lustre/lustre/lov/lov_ea.c
@@ -117,9 +117,43 @@ void lsm_free_plain(struct lov_stripe_md *lsm)
 	kvfree(lsm);
 }
 
-static void lsm_unpackmd_common(struct lov_stripe_md *lsm,
-				struct lov_mds_md *lmm)
+/*
+ * Find minimum stripe maxbytes value.  For inactive or
+ * reconnecting targets use LUSTRE_EXT3_STRIPE_MAXBYTES.
+ */
+static loff_t lov_tgt_maxbytes(struct lov_tgt_desc *tgt)
+{
+	loff_t maxbytes = LUSTRE_EXT3_STRIPE_MAXBYTES;
+	struct obd_import *imp;
+
+	if (!tgt->ltd_active)
+		return maxbytes;
+
+	imp = tgt->ltd_obd->u.cli.cl_import;
+	if (!imp)
+		return maxbytes;
+
+	spin_lock(&imp->imp_lock);
+	if (imp->imp_state == LUSTRE_IMP_FULL &&
+	    (imp->imp_connect_data.ocd_connect_flags & OBD_CONNECT_MAXBYTES) &&
+	     imp->imp_connect_data.ocd_maxbytes > 0)
+		maxbytes = imp->imp_connect_data.ocd_maxbytes;
+
+	spin_unlock(&imp->imp_lock);
+
+	return maxbytes;
+}
+
+static int lsm_unpackmd_common(struct lov_obd *lov,
+			       struct lov_stripe_md *lsm,
+			       struct lov_mds_md *lmm,
+			       struct lov_ost_data_v1 *objects)
 {
+	loff_t stripe_maxbytes = LLONG_MAX;
+	unsigned int stripe_count;
+	struct lov_oinfo *loi;
+	unsigned int i;
+
 	/*
 	 * This supposes lov_mds_md_v1/v3 first fields are
 	 * are the same
@@ -129,6 +163,45 @@ static void lsm_unpackmd_common(struct lov_stripe_md *lsm,
 	lsm->lsm_pattern = le32_to_cpu(lmm->lmm_pattern);
 	lsm->lsm_layout_gen = le16_to_cpu(lmm->lmm_layout_gen);
 	lsm->lsm_pool_name[0] = '\0';
+
+	stripe_count = lsm_is_released(lsm) ? 0 : lsm->lsm_stripe_count;
+
+	for (i = 0; i < stripe_count; i++) {
+		loff_t tgt_bytes;
+
+		loi = lsm->lsm_oinfo[i];
+		ostid_le_to_cpu(&objects[i].l_ost_oi, &loi->loi_oi);
+		loi->loi_ost_idx = le32_to_cpu(objects[i].l_ost_idx);
+		loi->loi_ost_gen = le32_to_cpu(objects[i].l_ost_gen);
+		if (lov_oinfo_is_dummy(loi))
+			continue;
+
+		if (loi->loi_ost_idx >= lov->desc.ld_tgt_count) {
+			CERROR("OST index %d more than OST count %d\n",
+			       loi->loi_ost_idx, lov->desc.ld_tgt_count);
+			lov_dump_lmm_v1(D_WARNING, lmm);
+			return -EINVAL;
+		}
+
+		if (!lov->lov_tgts[loi->loi_ost_idx]) {
+			CERROR("OST index %d missing\n", loi->loi_ost_idx);
+			lov_dump_lmm_v1(D_WARNING, lmm);
+			return -EINVAL;
+		}
+
+		tgt_bytes = lov_tgt_maxbytes(lov->lov_tgts[loi->loi_ost_idx]);
+		stripe_maxbytes = min_t(loff_t, stripe_maxbytes, tgt_bytes);
+	}
+
+	if (stripe_maxbytes == LLONG_MAX)
+		stripe_maxbytes = LUSTRE_EXT3_STRIPE_MAXBYTES;
+
+	if (!lsm->lsm_stripe_count)
+		lsm->lsm_maxbytes = stripe_maxbytes * lov->desc.ld_tgt_count;
+	else
+		lsm->lsm_maxbytes = stripe_maxbytes * lsm->lsm_stripe_count;
+
+	return 0;
 }
 
 static void
@@ -147,30 +220,6 @@ lsm_stripe_by_offset_plain(struct lov_stripe_md *lsm, int *stripeno,
 		*swidth = (u64)lsm->lsm_stripe_size * lsm->lsm_stripe_count;
 }
 
-/* Find minimum stripe maxbytes value.  For inactive or
- * reconnecting targets use LUSTRE_EXT3_STRIPE_MAXBYTES.
- */
-static void lov_tgt_maxbytes(struct lov_tgt_desc *tgt, __u64 *stripe_maxbytes)
-{
-	struct obd_import *imp = tgt->ltd_obd->u.cli.cl_import;
-
-	if (!imp || !tgt->ltd_active) {
-		*stripe_maxbytes = LUSTRE_EXT3_STRIPE_MAXBYTES;
-		return;
-	}
-
-	spin_lock(&imp->imp_lock);
-	if (imp->imp_state == LUSTRE_IMP_FULL &&
-	    (imp->imp_connect_data.ocd_connect_flags & OBD_CONNECT_MAXBYTES) &&
-	    imp->imp_connect_data.ocd_maxbytes > 0) {
-		if (*stripe_maxbytes > imp->imp_connect_data.ocd_maxbytes)
-			*stripe_maxbytes = imp->imp_connect_data.ocd_maxbytes;
-	} else {
-		*stripe_maxbytes = LUSTRE_EXT3_STRIPE_MAXBYTES;
-	}
-	spin_unlock(&imp->imp_lock);
-}
-
 static int lsm_lmm_verify_v1(struct lov_mds_md_v1 *lmm, int lmm_bytes,
 			     __u16 *stripe_count)
 {
@@ -197,45 +246,7 @@ static int lsm_lmm_verify_v1(struct lov_mds_md_v1 *lmm, int lmm_bytes,
 static int lsm_unpackmd_v1(struct lov_obd *lov, struct lov_stripe_md *lsm,
 			   struct lov_mds_md_v1 *lmm)
 {
-	struct lov_oinfo *loi;
-	int i;
-	int stripe_count;
-	__u64 stripe_maxbytes = OBD_OBJECT_EOF;
-
-	lsm_unpackmd_common(lsm, lmm);
-
-	stripe_count = lsm_is_released(lsm) ? 0 : lsm->lsm_stripe_count;
-
-	for (i = 0; i < stripe_count; i++) {
-		/* XXX LOV STACKING call down to osc_unpackmd() */
-		loi = lsm->lsm_oinfo[i];
-		ostid_le_to_cpu(&lmm->lmm_objects[i].l_ost_oi, &loi->loi_oi);
-		loi->loi_ost_idx = le32_to_cpu(lmm->lmm_objects[i].l_ost_idx);
-		loi->loi_ost_gen = le32_to_cpu(lmm->lmm_objects[i].l_ost_gen);
-		if (lov_oinfo_is_dummy(loi))
-			continue;
-
-		if (loi->loi_ost_idx >= lov->desc.ld_tgt_count) {
-			CERROR("OST index %d more than OST count %d\n",
-			       loi->loi_ost_idx, lov->desc.ld_tgt_count);
-			lov_dump_lmm_v1(D_WARNING, lmm);
-			return -EINVAL;
-		}
-		if (!lov->lov_tgts[loi->loi_ost_idx]) {
-			CERROR("OST index %d missing\n", loi->loi_ost_idx);
-			lov_dump_lmm_v1(D_WARNING, lmm);
-			return -EINVAL;
-		}
-		/* calculate the minimum stripe max bytes */
-		lov_tgt_maxbytes(lov->lov_tgts[loi->loi_ost_idx],
-				 &stripe_maxbytes);
-	}
-
-	lsm->lsm_maxbytes = stripe_maxbytes * lsm->lsm_stripe_count;
-	if (lsm->lsm_stripe_count == 0)
-		lsm->lsm_maxbytes = stripe_maxbytes * lov->desc.ld_tgt_count;
-
-	return 0;
+	return lsm_unpackmd_common(lov, lsm, lmm, lmm->lmm_objects);
 }
 
 const struct lsm_operations lsm_v1_ops = {
@@ -275,55 +286,21 @@ static int lsm_lmm_verify_v3(struct lov_mds_md *lmmv1, int lmm_bytes,
 }
 
 static int lsm_unpackmd_v3(struct lov_obd *lov, struct lov_stripe_md *lsm,
-			   struct lov_mds_md *lmmv1)
+			   struct lov_mds_md *lmm)
 {
-	struct lov_mds_md_v3 *lmm;
-	struct lov_oinfo *loi;
-	int i;
-	int stripe_count;
-	__u64 stripe_maxbytes = OBD_OBJECT_EOF;
-	int cplen = 0;
-
-	lmm = (struct lov_mds_md_v3 *)lmmv1;
+	struct lov_mds_md_v3 *lmm_v3 = (struct lov_mds_md_v3 *)lmm;
+	size_t cplen = 0;
+	int rc;
 
-	lsm_unpackmd_common(lsm, (struct lov_mds_md_v1 *)lmm);
+	rc = lsm_unpackmd_common(lov, lsm, lmm, lmm_v3->lmm_objects);
+	if (rc)
+		return rc;
 
-	stripe_count = lsm_is_released(lsm) ? 0 : lsm->lsm_stripe_count;
-
-	cplen = strlcpy(lsm->lsm_pool_name, lmm->lmm_pool_name,
+	cplen = strlcpy(lsm->lsm_pool_name, lmm_v3->lmm_pool_name,
 			sizeof(lsm->lsm_pool_name));
 	if (cplen >= sizeof(lsm->lsm_pool_name))
 		return -E2BIG;
 
-	for (i = 0; i < stripe_count; i++) {
-		/* XXX LOV STACKING call down to osc_unpackmd() */
-		loi = lsm->lsm_oinfo[i];
-		ostid_le_to_cpu(&lmm->lmm_objects[i].l_ost_oi, &loi->loi_oi);
-		loi->loi_ost_idx = le32_to_cpu(lmm->lmm_objects[i].l_ost_idx);
-		loi->loi_ost_gen = le32_to_cpu(lmm->lmm_objects[i].l_ost_gen);
-		if (lov_oinfo_is_dummy(loi))
-			continue;
-
-		if (loi->loi_ost_idx >= lov->desc.ld_tgt_count) {
-			CERROR("OST index %d more than OST count %d\n",
-			       loi->loi_ost_idx, lov->desc.ld_tgt_count);
-			lov_dump_lmm_v3(D_WARNING, lmm);
-			return -EINVAL;
-		}
-		if (!lov->lov_tgts[loi->loi_ost_idx]) {
-			CERROR("OST index %d missing\n", loi->loi_ost_idx);
-			lov_dump_lmm_v3(D_WARNING, lmm);
-			return -EINVAL;
-		}
-		/* calculate the minimum stripe max bytes */
-		lov_tgt_maxbytes(lov->lov_tgts[loi->loi_ost_idx],
-				 &stripe_maxbytes);
-	}
-
-	lsm->lsm_maxbytes = stripe_maxbytes * lsm->lsm_stripe_count;
-	if (lsm->lsm_stripe_count == 0)
-		lsm->lsm_maxbytes = stripe_maxbytes * lov->desc.ld_tgt_count;
-
 	return 0;
 }
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index d39724a..82b99e0 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -1454,6 +1454,22 @@ static int lov_object_layout_get(const struct lu_env *env,
 	return rc < 0 ? rc : 0;
 }
 
+static loff_t lov_object_maxbytes(struct cl_object *obj)
+{
+	struct lov_object *lov = cl2lov(obj);
+	struct lov_stripe_md *lsm = lov_lsm_addref(lov);
+	loff_t maxbytes;
+
+	if (!lsm)
+		return LLONG_MAX;
+
+	maxbytes = lsm->lsm_maxbytes;
+
+	lov_lsm_put(lsm);
+
+	return maxbytes;
+}
+
 static const struct cl_object_operations lov_ops = {
 	.coo_page_init = lov_page_init,
 	.coo_lock_init = lov_lock_init,
@@ -1463,6 +1479,7 @@ static const struct cl_object_operations lov_ops = {
 	.coo_conf_set  = lov_conf_set,
 	.coo_getstripe = lov_object_getstripe,
 	.coo_layout_get	 = lov_object_layout_get,
+	.coo_maxbytes	 = lov_object_maxbytes,
 	.coo_fiemap	 = lov_object_fiemap,
 };
 
diff --git a/drivers/staging/lustre/lustre/obdclass/cl_object.c b/drivers/staging/lustre/lustre/obdclass/cl_object.c
index dd80b83..cda01b8 100644
--- a/drivers/staging/lustre/lustre/obdclass/cl_object.c
+++ b/drivers/staging/lustre/lustre/obdclass/cl_object.c
@@ -388,6 +388,21 @@ int cl_object_layout_get(const struct lu_env *env, struct cl_object *obj,
 }
 EXPORT_SYMBOL(cl_object_layout_get);
 
+loff_t cl_object_maxbytes(struct cl_object *obj)
+{
+	struct lu_object_header *top = obj->co_lu.lo_header;
+	loff_t maxbytes = LLONG_MAX;
+
+	list_for_each_entry(obj, &top->loh_layers, co_lu.lo_linkage) {
+		if (obj->co_ops->coo_maxbytes)
+			maxbytes = min_t(loff_t, obj->co_ops->coo_maxbytes(obj),
+					 maxbytes);
+	}
+
+	return maxbytes;
+}
+EXPORT_SYMBOL(cl_object_maxbytes);
+
 /**
  * Helper function removing all object locks, and marking object for
  * deletion. All object pages must have been deleted at this point.
-- 
1.7.1

* [PATCH 28/41] staging: lustre: hsm: make HSM modification requests replayable
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Mikhail Pershin, James Simmons

From: Mikhail Pershin <mike.pershin@intel.com>

Several HSM requests modify data on the server and therefore rely
on Lustre recovery, i.e. their changes should be replayed if the
server goes through recovery.

This patch allows such requests to be replayed during recovery. The
client now issues them under mdc_rpc_lock, which serializes them and
avoids concurrent last_rcvd updates on the server.
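
The serialization pattern can be sketched in userspace C as follows. This
is only a model under stated assumptions: struct rpc_lock, rpc_lock_get(),
rpc_lock_put(), send_serialized() and count_request() are hypothetical
names, and the C11 atomic_flag spin loop merely stands in for whatever the
kernel helpers mdc_get_rpc_lock()/mdc_put_rpc_lock() do internally, which
this patch does not show.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the client's RPC lock; not the kernel type. */
struct rpc_lock {
	atomic_flag busy;
};

static void rpc_lock_get(struct rpc_lock *lck)
{
	/* Spin until the flag was previously clear, i.e. we now own it. */
	while (atomic_flag_test_and_set(&lck->busy))
		;
}

static void rpc_lock_put(struct rpc_lock *lck)
{
	atomic_flag_clear(&lck->busy);
}

/*
 * Send one modifying request while holding the lock, so that at most
 * one such request is in flight at a time.
 */
static int send_serialized(struct rpc_lock *lck,
			   int (*queue_wait)(void *req), void *req)
{
	int rc;

	rpc_lock_get(lck);
	rc = queue_wait(req);	/* returns once the request has completed */
	rpc_lock_put(lck);

	return rc;
}

/* Dummy transport for the example: counts requests and reports success. */
static int count_request(void *req)
{
	++*(int *)req;
	return 0;
}
```

A caller would initialize the lock with ATOMIC_FLAG_INIT and pass its
request through send_serialized(), mirroring the get/queue/put sequence
that the three hunks below add to the HSM ioctl paths.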

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5939
Reviewed-on: http://review.whamcloud.com/13684
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/mdc/mdc_request.c |   12 +++++++++---
 1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index 7b9fb90..1d1eaa5 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -1505,7 +1505,9 @@ static int mdc_ioc_hsm_progress(struct obd_export *exp,
 
 	ptlrpc_request_set_replen(req);
 
-	rc = mdc_queue_wait(req);
+	mdc_get_rpc_lock(exp->exp_obd->u.cli.cl_rpc_lock, NULL);
+	rc = ptlrpc_queue_wait(req);
+	mdc_put_rpc_lock(exp->exp_obd->u.cli.cl_rpc_lock, NULL);
 out:
 	ptlrpc_req_finished(req);
 	return rc;
@@ -1683,7 +1685,9 @@ static int mdc_ioc_hsm_state_set(struct obd_export *exp,
 
 	ptlrpc_request_set_replen(req);
 
-	rc = mdc_queue_wait(req);
+	mdc_get_rpc_lock(exp->exp_obd->u.cli.cl_rpc_lock, NULL);
+	rc = ptlrpc_queue_wait(req);
+	mdc_put_rpc_lock(exp->exp_obd->u.cli.cl_rpc_lock, NULL);
 out:
 	ptlrpc_req_finished(req);
 	return rc;
@@ -1746,7 +1750,9 @@ static int mdc_ioc_hsm_request(struct obd_export *exp,
 
 	ptlrpc_request_set_replen(req);
 
-	rc = mdc_queue_wait(req);
+	mdc_get_rpc_lock(exp->exp_obd->u.cli.cl_rpc_lock, NULL);
+	rc = ptlrpc_queue_wait(req);
+	mdc_put_rpc_lock(exp->exp_obd->u.cli.cl_rpc_lock, NULL);
 out:
 	ptlrpc_req_finished(req);
 	return rc;
-- 
1.7.1

* [PATCH 29/41] staging: lustre: ptlrpc: Move NRS structures out of lustre_net.h
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Chris Horn,
	James Simmons

From: Chris Horn <hornc@cray.com>

NRS-specific structures are not needed in the rest of the PtlRPC code.
It is more appropriate for these structures to be defined in a
separate header. This commit creates a lustre_nrs.h header for the
generic NRS structures, and policy-specific headers for the various
NRS policies.

Signed-off-by: Chris Horn <hornc@cray.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2667
Reviewed-on: http://review.whamcloud.com/13966
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/lustre_net.h |  712 +-------------------
 drivers/staging/lustre/lustre/include/lustre_nrs.h |  717 ++++++++++++++++++++
 .../lustre/lustre/include/lustre_nrs_fifo.h        |   70 ++
 3 files changed, 788 insertions(+), 711 deletions(-)
 create mode 100644 drivers/staging/lustre/lustre/include/lustre_nrs.h
 create mode 100644 drivers/staging/lustre/lustre/include/lustre_nrs_fifo.h

diff --git a/drivers/staging/lustre/lustre/include/lustre_net.h b/drivers/staging/lustre/lustre/include/lustre_net.h
index ab80330..7302238 100644
--- a/drivers/staging/lustre/lustre/include/lustre_net.h
+++ b/drivers/staging/lustre/lustre/include/lustre_net.h
@@ -515,717 +515,7 @@ struct lu_env;
 
 struct ldlm_lock;
 
-/**
- * \defgroup nrs Network Request Scheduler
- * @{
- */
-struct ptlrpc_nrs_policy;
-struct ptlrpc_nrs_resource;
-struct ptlrpc_nrs_request;
-
-/**
- * NRS control operations.
- *
- * These are common for all policies.
- */
-enum ptlrpc_nrs_ctl {
-	/**
-	 * Not a valid opcode.
-	 */
-	PTLRPC_NRS_CTL_INVALID,
-	/**
-	 * Activate the policy.
-	 */
-	PTLRPC_NRS_CTL_START,
-	/**
-	 * Reserved for multiple primary policies, which may be a possibility
-	 * in the future.
-	 */
-	PTLRPC_NRS_CTL_STOP,
-	/**
-	 * Policies can start using opcodes from this value and onwards for
-	 * their own purposes; the assigned value itself is arbitrary.
-	 */
-	PTLRPC_NRS_CTL_1ST_POL_SPEC = 0x20,
-};
-
-/**
- * ORR policy operations
- */
-enum nrs_ctl_orr {
-	NRS_CTL_ORR_RD_QUANTUM = PTLRPC_NRS_CTL_1ST_POL_SPEC,
-	NRS_CTL_ORR_WR_QUANTUM,
-	NRS_CTL_ORR_RD_OFF_TYPE,
-	NRS_CTL_ORR_WR_OFF_TYPE,
-	NRS_CTL_ORR_RD_SUPP_REQ,
-	NRS_CTL_ORR_WR_SUPP_REQ,
-};
-
-/**
- * NRS policy operations.
- *
- * These determine the behaviour of a policy, and are called in response to
- * NRS core events.
- */
-struct ptlrpc_nrs_pol_ops {
-	/**
-	 * Called during policy registration; this operation is optional.
-	 *
-	 * \param[in,out] policy The policy being initialized
-	 */
-	int	(*op_policy_init)(struct ptlrpc_nrs_policy *policy);
-	/**
-	 * Called during policy unregistration; this operation is optional.
-	 *
-	 * \param[in,out] policy The policy being unregistered/finalized
-	 */
-	void	(*op_policy_fini)(struct ptlrpc_nrs_policy *policy);
-	/**
-	 * Called when activating a policy via lprocfs; policies allocate and
-	 * initialize their resources here; this operation is optional.
-	 *
-	 * \param[in,out] policy The policy being started
-	 *
-	 * \see nrs_policy_start_locked()
-	 */
-	int	(*op_policy_start)(struct ptlrpc_nrs_policy *policy);
-	/**
-	 * Called when deactivating a policy via lprocfs; policies deallocate
-	 * their resources here; this operation is optional
-	 *
-	 * \param[in,out] policy The policy being stopped
-	 *
-	 * \see nrs_policy_stop0()
-	 */
-	void	(*op_policy_stop)(struct ptlrpc_nrs_policy *policy);
-	/**
-	 * Used for policy-specific operations; i.e. not generic ones like
-	 * \e PTLRPC_NRS_CTL_START and \e PTLRPC_NRS_CTL_GET_INFO; analogous
-	 * to an ioctl; this operation is optional.
-	 *
-	 * \param[in,out]	 policy The policy carrying out operation \a opc
-	 * \param[in]	  opc	 The command operation being carried out
-	 * \param[in,out] arg	 An generic buffer for communication between the
-	 *			 user and the control operation
-	 *
-	 * \retval -ve error
-	 * \retval   0 success
-	 *
-	 * \see ptlrpc_nrs_policy_control()
-	 */
-	int	(*op_policy_ctl)(struct ptlrpc_nrs_policy *policy,
-				 enum ptlrpc_nrs_ctl opc, void *arg);
-
-	/**
-	 * Called when obtaining references to the resources of the resource
-	 * hierarchy for a request that has arrived for handling at the PTLRPC
-	 * service. Policies should return -ve for requests they do not wish
-	 * to handle. This operation is mandatory.
-	 *
-	 * \param[in,out] policy  The policy we're getting resources for.
-	 * \param[in,out] nrq	  The request we are getting resources for.
-	 * \param[in]	  parent  The parent resource of the resource being
-	 *			  requested; set to NULL if none.
-	 * \param[out]	  resp	  The resource is to be returned here; the
-	 *			  fallback policy in an NRS head should
-	 *			  \e always return a non-NULL pointer value.
-	 * \param[in]  moving_req When set, signifies that this is an attempt
-	 *			  to obtain resources for a request being moved
-	 *			  to the high-priority NRS head by
-	 *			  ldlm_lock_reorder_req().
-	 *			  This implies two things:
-	 *			  1. We are under obd_export::exp_rpc_lock and
-	 *			  so should not sleep.
-	 *			  2. We should not perform non-idempotent or can
-	 *			  skip performing idempotent operations that
-	 *			  were carried out when resources were first
-	 *			  taken for the request when it was initialized
-	 *			  in ptlrpc_nrs_req_initialize().
-	 *
-	 * \retval 0, +ve The level of the returned resource in the resource
-	 *		  hierarchy; currently only 0 (for a non-leaf resource)
-	 *		  and 1 (for a leaf resource) are supported by the
-	 *		  framework.
-	 * \retval -ve	  error
-	 *
-	 * \see ptlrpc_nrs_req_initialize()
-	 * \see ptlrpc_nrs_hpreq_add_nolock()
-	 */
-	int	(*op_res_get)(struct ptlrpc_nrs_policy *policy,
-			      struct ptlrpc_nrs_request *nrq,
-			      const struct ptlrpc_nrs_resource *parent,
-			      struct ptlrpc_nrs_resource **resp,
-			      bool moving_req);
-	/**
-	 * Called when releasing references taken for resources in the resource
-	 * hierarchy for the request; this operation is optional.
-	 *
-	 * \param[in,out] policy The policy the resource belongs to
-	 * \param[in] res	 The resource to be freed
-	 *
-	 * \see ptlrpc_nrs_req_finalize()
-	 * \see ptlrpc_nrs_hpreq_add_nolock()
-	 */
-	void	(*op_res_put)(struct ptlrpc_nrs_policy *policy,
-			      const struct ptlrpc_nrs_resource *res);
-
-	/**
-	 * Obtains a request for handling from the policy, and optionally
-	 * removes the request from the policy; this operation is mandatory.
-	 *
-	 * \param[in,out] policy The policy to poll
-	 * \param[in]	  peek	 When set, signifies that we just want to
-	 *			 examine the request, and not handle it, so the
-	 *			 request is not removed from the policy.
-	 * \param[in]	  force	 When set, it will force a policy to return a
-	 *			 request if it has one queued.
-	 *
-	 * \retval NULL No request available for handling
-	 * \retval valid-pointer The request polled for handling
-	 *
-	 * \see ptlrpc_nrs_req_get_nolock()
-	 */
-	struct ptlrpc_nrs_request *
-		(*op_req_get)(struct ptlrpc_nrs_policy *policy, bool peek,
-			      bool force);
-	/**
-	 * Called when attempting to add a request to a policy for later
-	 * handling; this operation is mandatory.
-	 *
-	 * \param[in,out] policy  The policy on which to enqueue \a nrq
-	 * \param[in,out] nrq The request to enqueue
-	 *
-	 * \retval 0	success
-	 * \retval != 0	error
-	 *
-	 * \see ptlrpc_nrs_req_add_nolock()
-	 */
-	int	(*op_req_enqueue)(struct ptlrpc_nrs_policy *policy,
-				  struct ptlrpc_nrs_request *nrq);
-	/**
-	 * Removes a request from the policy's set of pending requests. Normally
-	 * called after a request has been polled successfully from the policy
-	 * for handling; this operation is mandatory.
-	 *
-	 * \param[in,out] policy The policy the request \a nrq belongs to
-	 * \param[in,out] nrq    The request to dequeue
-	 */
-	void	(*op_req_dequeue)(struct ptlrpc_nrs_policy *policy,
-				  struct ptlrpc_nrs_request *nrq);
-	/**
-	 * Called after the request being carried out. Could be used for
-	 * job/resource control; this operation is optional.
-	 *
-	 * \param[in,out] policy The policy which is stopping to handle request
-	 *			 \a nrq
-	 * \param[in,out] nrq	 The request
-	 *
-	 * \pre assert_spin_locked(&svcpt->scp_req_lock)
-	 *
-	 * \see ptlrpc_nrs_req_stop_nolock()
-	 */
-	void	(*op_req_stop)(struct ptlrpc_nrs_policy *policy,
-			       struct ptlrpc_nrs_request *nrq);
-	/**
-	 * Registers the policy's lprocfs interface with a PTLRPC service.
-	 *
-	 * \param[in] svc The service
-	 *
-	 * \retval 0	success
-	 * \retval != 0	error
-	 */
-	int	(*op_lprocfs_init)(struct ptlrpc_service *svc);
-	/**
-	 * Unegisters the policy's lprocfs interface with a PTLRPC service.
-	 *
-	 * In cases of failed policy registration in
-	 * \e ptlrpc_nrs_policy_register(), this function may be called for a
-	 * service which has not registered the policy successfully, so
-	 * implementations of this method should make sure their operations are
-	 * safe in such cases.
-	 *
-	 * \param[in] svc The service
-	 */
-	void	(*op_lprocfs_fini)(struct ptlrpc_service *svc);
-};
-
-/**
- * Policy flags
- */
-enum nrs_policy_flags {
-	/**
-	 * Fallback policy, use this flag only on a single supported policy per
-	 * service. The flag cannot be used on policies that use
-	 * \e PTLRPC_NRS_FL_REG_EXTERN
-	 */
-	PTLRPC_NRS_FL_FALLBACK		= (1 << 0),
-	/**
-	 * Start policy immediately after registering.
-	 */
-	PTLRPC_NRS_FL_REG_START		= (1 << 1),
-	/**
-	 * This is a policy registering from a module different to the one NRS
-	 * core ships in (currently ptlrpc).
-	 */
-	PTLRPC_NRS_FL_REG_EXTERN	= (1 << 2),
-};
-
-/**
- * NRS queue type.
- *
- * Denotes whether an NRS instance is for handling normal or high-priority
- * RPCs, or whether an operation pertains to one or both of the NRS instances
- * in a service.
- */
-enum ptlrpc_nrs_queue_type {
-	PTLRPC_NRS_QUEUE_REG	= (1 << 0),
-	PTLRPC_NRS_QUEUE_HP	= (1 << 1),
-	PTLRPC_NRS_QUEUE_BOTH	= (PTLRPC_NRS_QUEUE_REG | PTLRPC_NRS_QUEUE_HP)
-};
-
-/**
- * NRS head
- *
- * A PTLRPC service has at least one NRS head instance for handling normal
- * priority RPCs, and may optionally have a second NRS head instance for
- * handling high-priority RPCs. Each NRS head maintains a list of available
- * policies, of which one and only one policy is acting as the fallback policy,
- * and optionally a different policy may be acting as the primary policy. For
- * all RPCs handled by this NRS head instance, NRS core will first attempt to
- * enqueue the RPC using the primary policy (if any). The fallback policy is
- * used in the following cases:
- * - when there was no primary policy in the
- *   ptlrpc_nrs_pol_state::NRS_POL_STATE_STARTED state at the time the request
- *   was initialized.
- * - when the primary policy that was at the
- *   ptlrpc_nrs_pol_state::PTLRPC_NRS_POL_STATE_STARTED state at the time the
- *   RPC was initialized, denoted it did not wish, or for some other reason was
- *   not able to handle the request, by returning a non-valid NRS resource
- *   reference.
- * - when the primary policy that was at the
- *   ptlrpc_nrs_pol_state::PTLRPC_NRS_POL_STATE_STARTED state at the time the
- *   RPC was initialized, fails later during the request enqueueing stage.
- *
- * \see nrs_resource_get_safe()
- * \see nrs_request_enqueue()
- */
-struct ptlrpc_nrs {
-	spinlock_t			nrs_lock;
-	/** XXX Possibly replace svcpt->scp_req_lock with another lock here. */
-	/**
-	 * List of registered policies
-	 */
-	struct list_head			nrs_policy_list;
-	/**
-	 * List of policies with queued requests. Policies that have any
-	 * outstanding requests are queued here, and this list is queried
-	 * in a round-robin manner from NRS core when obtaining a request
-	 * for handling. This ensures that requests from policies that at some
-	 * point transition away from the
-	 * ptlrpc_nrs_pol_state::NRS_POL_STATE_STARTED state are drained.
-	 */
-	struct list_head			nrs_policy_queued;
-	/**
-	 * Service partition for this NRS head
-	 */
-	struct ptlrpc_service_part     *nrs_svcpt;
-	/**
-	 * Primary policy, which is the preferred policy for handling RPCs
-	 */
-	struct ptlrpc_nrs_policy       *nrs_policy_primary;
-	/**
-	 * Fallback policy, which is the backup policy for handling RPCs
-	 */
-	struct ptlrpc_nrs_policy       *nrs_policy_fallback;
-	/**
-	 * This NRS head handles either HP or regular requests
-	 */
-	enum ptlrpc_nrs_queue_type	nrs_queue_type;
-	/**
-	 * # queued requests from all policies in this NRS head
-	 */
-	unsigned long			nrs_req_queued;
-	/**
-	 * # scheduled requests from all policies in this NRS head
-	 */
-	unsigned long			nrs_req_started;
-	/**
-	 * # policies on this NRS
-	 */
-	unsigned			nrs_num_pols;
-	/**
-	 * This NRS head is in progress of starting a policy
-	 */
-	unsigned			nrs_policy_starting:1;
-	/**
-	 * In progress of shutting down the whole NRS head; used during
-	 * unregistration
-	 */
-	unsigned			nrs_stopping:1;
-};
-
-#define NRS_POL_NAME_MAX		16
-
-struct ptlrpc_nrs_pol_desc;
-
-/**
- * Service compatibility predicate; this determines whether a policy is adequate
- * for handling RPCs of a particular PTLRPC service.
- *
- * XXX:This should give the same result during policy registration and
- * unregistration, and for all partitions of a service; so the result should not
- * depend on temporal service or other properties, that may influence the
- * result.
- */
-typedef bool (*nrs_pol_desc_compat_t) (const struct ptlrpc_service *svc,
-				       const struct ptlrpc_nrs_pol_desc *desc);
-
-struct ptlrpc_nrs_pol_conf {
-	/**
-	 * Human-readable policy name
-	 */
-	char				   nc_name[NRS_POL_NAME_MAX];
-	/**
-	 * NRS operations for this policy
-	 */
-	const struct ptlrpc_nrs_pol_ops	  *nc_ops;
-	/**
-	 * Service compatibility predicate
-	 */
-	nrs_pol_desc_compat_t		   nc_compat;
-	/**
-	 * Set for policies that support a single ptlrpc service, i.e. ones that
-	 * have \a pd_compat set to nrs_policy_compat_one(). The variable value
-	 * depicts the name of the single service that such policies are
-	 * compatible with.
-	 */
-	const char			  *nc_compat_svc_name;
-	/**
-	 * Owner module for this policy descriptor; policies registering from a
-	 * different module to the one the NRS framework is held within
-	 * (currently ptlrpc), should set this field to THIS_MODULE.
-	 */
-	struct module			  *nc_owner;
-	/**
-	 * Policy registration flags; a bitmask of \e nrs_policy_flags
-	 */
-	unsigned			   nc_flags;
-};
-
-/**
- * NRS policy registering descriptor
- *
- * Is used to hold a description of a policy that can be passed to NRS core in
- * order to register the policy with NRS heads in different PTLRPC services.
- */
-struct ptlrpc_nrs_pol_desc {
-	/**
-	 * Human-readable policy name
-	 */
-	char					pd_name[NRS_POL_NAME_MAX];
-	/**
-	 * Link into nrs_core::nrs_policies
-	 */
-	struct list_head				pd_list;
-	/**
-	 * NRS operations for this policy
-	 */
-	const struct ptlrpc_nrs_pol_ops	       *pd_ops;
-	/**
-	 * Service compatibility predicate
-	 */
-	nrs_pol_desc_compat_t			pd_compat;
-	/**
-	 * Set for policies that are compatible with only one PTLRPC service.
-	 *
-	 * \see ptlrpc_nrs_pol_conf::nc_compat_svc_name
-	 */
-	const char			       *pd_compat_svc_name;
-	/**
-	 * Owner module for this policy descriptor.
-	 *
-	 * We need to hold a reference to the module whenever we might make use
-	 * of any of the module's contents, i.e.
-	 * - If one or more instances of the policy are at a state where they
-	 *   might be handling a request, i.e.
-	 *   ptlrpc_nrs_pol_state::NRS_POL_STATE_STARTED or
-	 *   ptlrpc_nrs_pol_state::NRS_POL_STATE_STOPPING as we will have to
-	 *   call into the policy's ptlrpc_nrs_pol_ops() handlers. A reference
-	 *   is taken on the module when
-	 *   \e ptlrpc_nrs_pol_desc::pd_refs becomes 1, and released when it
-	 *   becomes 0, so that we hold only one reference to the module maximum
-	 *   at any time.
-	 *
-	 *   We do not need to hold a reference to the module, even though we
-	 *   might use code and data from the module, in the following cases:
-	 * - During external policy registration, because this should happen in
-	 *   the module's init() function, in which case the module is safe from
-	 *   removal because a reference is being held on the module by the
-	 *   kernel, and iirc kmod (and I guess module-init-tools also) will
-	 *   serialize any racing processes properly anyway.
-	 * - During external policy unregistration, because this should happen
-	 *   in a module's exit() function, and any attempts to start a policy
-	 *   instance would need to take a reference on the module, and this is
-	 *   not possible once we have reached the point where the exit()
-	 *   handler is called.
-	 * - During service registration and unregistration, as service setup
-	 *   and cleanup, and policy registration, unregistration and policy
-	 *   instance starting, are serialized by \e nrs_core::nrs_mutex, so
-	 *   as long as users adhere to the convention of registering policies
-	 *   in init() and unregistering them in module exit() functions, there
-	 *   should not be a race between these operations.
-	 * - During any policy-specific lprocfs operations, because a reference
-	 *   is held by the kernel on a proc entry that has been entered by a
-	 *   syscall, so as long as proc entries are removed during unregistration time,
-	 *   then unregistration and lprocfs operations will be properly
-	 *   serialized.
-	 */
-	struct module			       *pd_owner;
-	/**
-	 * Bitmask of \e nrs_policy_flags
-	 */
-	unsigned				pd_flags;
-	/**
-	 * # of references on this descriptor
-	 */
-	atomic_t				pd_refs;
-};
-
-/**
- * NRS policy state
- *
- * Policies transition from one state to the other during their lifetime
- */
-enum ptlrpc_nrs_pol_state {
-	/**
-	 * Not a valid policy state.
-	 */
-	NRS_POL_STATE_INVALID,
-	/**
-	 * Policies are at this state either at the start of their life, or
-	 * transition here when the user selects a different policy to act
-	 * as the primary one.
-	 */
-	NRS_POL_STATE_STOPPED,
-	/**
-	 * Policy is progress of stopping
-	 */
-	NRS_POL_STATE_STOPPING,
-	/**
-	 * Policy is in progress of starting
-	 */
-	NRS_POL_STATE_STARTING,
-	/**
-	 * A policy is in this state in two cases:
-	 * - it is the fallback policy, which is always in this state.
-	 * - it has been activated by the user; i.e. it is the primary policy,
-	 */
-	NRS_POL_STATE_STARTED,
-};
-
-/**
- * NRS policy information
- *
- * Used for obtaining information for the status of a policy via lprocfs
- */
-struct ptlrpc_nrs_pol_info {
-	/**
-	 * Policy name
-	 */
-	char				pi_name[NRS_POL_NAME_MAX];
-	/**
-	 * Current policy state
-	 */
-	enum ptlrpc_nrs_pol_state	pi_state;
-	/**
-	 * # RPCs enqueued for later dispatching by the policy
-	 */
-	long				pi_req_queued;
-	/**
-	 * # RPCs started for dispatch by the policy
-	 */
-	long				pi_req_started;
-	/**
-	 * Is this a fallback policy?
-	 */
-	unsigned			pi_fallback:1;
-};
-
-/**
- * NRS policy
- *
- * There is one instance of this for each policy in each NRS head of each
- * PTLRPC service partition.
- */
-struct ptlrpc_nrs_policy {
-	/**
-	 * Linkage into the NRS head's list of policies,
-	 * ptlrpc_nrs:nrs_policy_list
-	 */
-	struct list_head			pol_list;
-	/**
-	 * Linkage into the NRS head's list of policies with enqueued
-	 * requests ptlrpc_nrs:nrs_policy_queued
-	 */
-	struct list_head			pol_list_queued;
-	/**
-	 * Current state of this policy
-	 */
-	enum ptlrpc_nrs_pol_state	pol_state;
-	/**
-	 * Bitmask of nrs_policy_flags
-	 */
-	unsigned			pol_flags;
-	/**
-	 * # RPCs enqueued for later dispatching by the policy
-	 */
-	long				pol_req_queued;
-	/**
-	 * # RPCs started for dispatch by the policy
-	 */
-	long				pol_req_started;
-	/**
-	 * Usage Reference count taken on the policy instance
-	 */
-	long				pol_ref;
-	/**
-	 * The NRS head this policy has been created at
-	 */
-	struct ptlrpc_nrs	       *pol_nrs;
-	/**
-	 * Private policy data; varies by policy type
-	 */
-	void			       *pol_private;
-	/**
-	 * Policy descriptor for this policy instance.
-	 */
-	struct ptlrpc_nrs_pol_desc     *pol_desc;
-};
-
-/**
- * NRS resource
- *
- * Resources are embedded into two types of NRS entities:
- * - Inside NRS policies, in the policy's private data in
- *   ptlrpc_nrs_policy::pol_private
- * - In objects that act as prime-level scheduling entities in different NRS
- *   policies; e.g. on a policy that performs round robin or similar order
- *   scheduling across client NIDs, there would be one NRS resource per unique
- *   client NID. On a policy which performs round robin scheduling across
- *   backend filesystem objects, there would be one resource associated with
- *   each of the backend filesystem objects partaking in the scheduling
- *   performed by the policy.
- *
- * NRS resources share a parent-child relationship, in which resources embedded
- * in policy instances are the parent entities, with all scheduling entities
- * a policy schedules across being the children, thus forming a simple resource
- * hierarchy. This hierarchy may be extended with one or more levels in the
- * future if the ability to have more than one primary policy is added.
- *
- * Upon request initialization, references to the then active NRS policies are
- * taken and used to later handle the dispatching of the request with one of
- * these policies.
- *
- * \see nrs_resource_get_safe()
- * \see ptlrpc_nrs_req_add()
- */
-struct ptlrpc_nrs_resource {
-	/**
-	 * This NRS resource's parent; is NULL for resources embedded in NRS
-	 * policy instances; i.e. those are top-level ones.
-	 */
-	struct ptlrpc_nrs_resource     *res_parent;
-	/**
-	 * The policy associated with this resource.
-	 */
-	struct ptlrpc_nrs_policy       *res_policy;
-};
-
-enum {
-	NRS_RES_FALLBACK,
-	NRS_RES_PRIMARY,
-	NRS_RES_MAX
-};
-
-/* \name fifo
- *
- * FIFO policy
- *
- * This policy is a logical wrapper around previous, non-NRS functionality.
- * It dispatches RPCs in the same order as they arrive from the network. This
- * policy is currently used as the fallback policy, and the only enabled policy
- * on all NRS heads of all PTLRPC service partitions.
- * @{
- */
-
-/**
- * Private data structure for the FIFO policy
- */
-struct nrs_fifo_head {
-	/**
-	 * Resource object for policy instance.
-	 */
-	struct ptlrpc_nrs_resource	fh_res;
-	/**
-	 * List of queued requests.
-	 */
-	struct list_head			fh_list;
-	/**
-	 * For debugging purposes.
-	 */
-	__u64				fh_sequence;
-};
-
-struct nrs_fifo_req {
-	struct list_head		fr_list;
-	__u64			fr_sequence;
-};
-
-/** @} fifo */
-
-/**
- * NRS request
- *
- * Instances of this object exist embedded within ptlrpc_request; the main
- * purpose of this object is to hold references to the request's resources
- * for the lifetime of the request, and to hold properties that policies use
- * use for determining the request's scheduling priority.
- */
-struct ptlrpc_nrs_request {
-	/**
-	 * The request's resource hierarchy.
-	 */
-	struct ptlrpc_nrs_resource     *nr_res_ptrs[NRS_RES_MAX];
-	/**
-	 * Index into ptlrpc_nrs_request::nr_res_ptrs of the resource of the
-	 * policy that was used to enqueue the request.
-	 *
-	 * \see nrs_request_enqueue()
-	 */
-	unsigned			nr_res_idx;
-	unsigned			nr_initialized:1;
-	unsigned			nr_enqueued:1;
-	unsigned			nr_started:1;
-	unsigned			nr_finalized:1;
-
-	/**
-	 * Policy-specific fields, used for determining a request's scheduling
-	 * priority, and other supporting functionality.
-	 */
-	union {
-		/**
-		 * Fields for the FIFO policy
-		 */
-		struct nrs_fifo_req	fifo;
-	} nr_u;
-	/**
-	 * Externally-registering policies may want to use this to allocate
-	 * their own request properties.
-	 */
-	void			       *ext;
-};
-
-/** @} nrs */
+#include "lustre_nrs.h"
 
 /**
  * Basic request prioritization operations structure.
diff --git a/drivers/staging/lustre/lustre/include/lustre_nrs.h b/drivers/staging/lustre/lustre/include/lustre_nrs.h
new file mode 100644
index 0000000..a5028aa
--- /dev/null
+++ b/drivers/staging/lustre/lustre/include/lustre_nrs.h
@@ -0,0 +1,717 @@
+/*
+ * GPL HEADER START
+ *
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 only,
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License version 2 for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * version 2 along with this program; If not, see
+ * http://www.gnu.org/licenses/gpl-2.0.html
+ *
+ * GPL HEADER END
+ */
+/*
+ * Copyright (c) 2014, Intel Corporation.
+ *
+ * Copyright 2012 Xyratex Technology Limited
+ */
+/*
+ *
+ * Network Request Scheduler (NRS)
+ *
+ */
+
+#ifndef _LUSTRE_NRS_H
+#define _LUSTRE_NRS_H
+
+/**
+ * \defgroup nrs Network Request Scheduler
+ * @{
+ */
+struct ptlrpc_nrs_policy;
+struct ptlrpc_nrs_resource;
+struct ptlrpc_nrs_request;
+
+/**
+ * NRS control operations.
+ *
+ * These are common for all policies.
+ */
+enum ptlrpc_nrs_ctl {
+	/**
+	 * Not a valid opcode.
+	 */
+	PTLRPC_NRS_CTL_INVALID,
+	/**
+	 * Activate the policy.
+	 */
+	PTLRPC_NRS_CTL_START,
+	/**
+	 * Reserved for multiple primary policies, which may be a possibility
+	 * in the future.
+	 */
+	PTLRPC_NRS_CTL_STOP,
+	/**
+	 * Policies can start using opcodes from this value and onwards for
+	 * their own purposes; the assigned value itself is arbitrary.
+	 */
+	PTLRPC_NRS_CTL_1ST_POL_SPEC = 0x20,
+};
+
+/**
+ * NRS policy operations.
+ *
+ * These determine the behaviour of a policy, and are called in response to
+ * NRS core events.
+ */
+struct ptlrpc_nrs_pol_ops {
+	/**
+	 * Called during policy registration; this operation is optional.
+	 *
+	 * \param[in,out] policy The policy being initialized
+	 */
+	int	(*op_policy_init)(struct ptlrpc_nrs_policy *policy);
+	/**
+	 * Called during policy unregistration; this operation is optional.
+	 *
+	 * \param[in,out] policy The policy being unregistered/finalized
+	 */
+	void	(*op_policy_fini)(struct ptlrpc_nrs_policy *policy);
+	/**
+	 * Called when activating a policy via lprocfs; policies allocate and
+	 * initialize their resources here; this operation is optional.
+	 *
+	 * \param[in,out] policy The policy being started
+	 *
+	 * \see nrs_policy_start_locked()
+	 */
+	int	(*op_policy_start)(struct ptlrpc_nrs_policy *policy);
+	/**
+	 * Called when deactivating a policy via lprocfs; policies deallocate
+	 * their resources here; this operation is optional.
+	 *
+	 * \param[in,out] policy The policy being stopped
+	 *
+	 * \see nrs_policy_stop0()
+	 */
+	void	(*op_policy_stop)(struct ptlrpc_nrs_policy *policy);
+	/**
+	 * Used for policy-specific operations; i.e. not generic ones like
+	 * \e PTLRPC_NRS_CTL_START and \e PTLRPC_NRS_CTL_GET_INFO; analogous
+	 * to an ioctl; this operation is optional.
+	 *
+	 * \param[in,out]	 policy The policy carrying out operation \a opc
+	 * \param[in]	  opc	 The command operation being carried out
+	 * \param[in,out] arg	 A generic buffer for communication between the
+	 *			 user and the control operation
+	 *
+	 * \retval -ve error
+	 * \retval   0 success
+	 *
+	 * \see ptlrpc_nrs_policy_control()
+	 */
+	int	(*op_policy_ctl)(struct ptlrpc_nrs_policy *policy,
+				 enum ptlrpc_nrs_ctl opc, void *arg);
+
+	/**
+	 * Called when obtaining references to the resources of the resource
+	 * hierarchy for a request that has arrived for handling at the PTLRPC
+	 * service. Policies should return -ve for requests they do not wish
+	 * to handle. This operation is mandatory.
+	 *
+	 * \param[in,out] policy  The policy we're getting resources for.
+	 * \param[in,out] nrq	  The request we are getting resources for.
+	 * \param[in]	  parent  The parent resource of the resource being
+	 *			  requested; set to NULL if none.
+	 * \param[out]	  resp	  The resource is to be returned here; the
+	 *			  fallback policy in an NRS head should
+	 *			  \e always return a non-NULL pointer value.
+	 * \param[in]  moving_req When set, signifies that this is an attempt
+	 *			  to obtain resources for a request being moved
+	 *			  to the high-priority NRS head by
+	 *			  ldlm_lock_reorder_req().
+	 *			  This implies two things:
+	 *			  1. We are under obd_export::exp_rpc_lock and
+	 *			  so should not sleep.
+	 *			  2. We should not perform non-idempotent
+	 *			  operations, and can skip idempotent ones,
+	 *			  that were carried out when resources were
+	 *			  first taken for the request when it was
+	 *			  initialized in ptlrpc_nrs_req_initialize().
+	 *
+	 * \retval 0, +ve The level of the returned resource in the resource
+	 *		  hierarchy; currently only 0 (for a non-leaf resource)
+	 *		  and 1 (for a leaf resource) are supported by the
+	 *		  framework.
+	 * \retval -ve	  error
+	 *
+	 * \see ptlrpc_nrs_req_initialize()
+	 * \see ptlrpc_nrs_hpreq_add_nolock()
+	 * \see ptlrpc_nrs_req_hp_move()
+	 */
+	int	(*op_res_get)(struct ptlrpc_nrs_policy *policy,
+			      struct ptlrpc_nrs_request *nrq,
+			      const struct ptlrpc_nrs_resource *parent,
+			      struct ptlrpc_nrs_resource **resp,
+			      bool moving_req);
+	/**
+	 * Called when releasing references taken for resources in the resource
+	 * hierarchy for the request; this operation is optional.
+	 *
+	 * \param[in,out] policy The policy the resource belongs to
+	 * \param[in] res	 The resource to be freed
+	 *
+	 * \see ptlrpc_nrs_req_finalize()
+	 * \see ptlrpc_nrs_hpreq_add_nolock()
+	 * \see ptlrpc_nrs_req_hp_move()
+	 */
+	void	(*op_res_put)(struct ptlrpc_nrs_policy *policy,
+			      const struct ptlrpc_nrs_resource *res);
+
+	/**
+	 * Obtains a request for handling from the policy, and optionally
+	 * removes the request from the policy; this operation is mandatory.
+	 *
+	 * \param[in,out] policy The policy to poll
+	 * \param[in]	  peek	 When set, signifies that we just want to
+	 *			 examine the request, and not handle it, so the
+	 *			 request is not removed from the policy.
+	 * \param[in]	  force  When set, it will force a policy to return a
+	 *			 request if it has one queued.
+	 *
+	 * \retval NULL No request available for handling
+	 * \retval valid-pointer The request polled for handling
+	 *
+	 * \see ptlrpc_nrs_req_get_nolock()
+	 */
+	struct ptlrpc_nrs_request *
+		(*op_req_get)(struct ptlrpc_nrs_policy *policy, bool peek,
+			      bool force);
+	/**
+	 * Called when attempting to add a request to a policy for later
+	 * handling; this operation is mandatory.
+	 *
+	 * \param[in,out] policy  The policy on which to enqueue \a nrq
+	 * \param[in,out] nrq The request to enqueue
+	 *
+	 * \retval 0	success
+	 * \retval != 0 error
+	 *
+	 * \see ptlrpc_nrs_req_add_nolock()
+	 */
+	int	(*op_req_enqueue)(struct ptlrpc_nrs_policy *policy,
+				  struct ptlrpc_nrs_request *nrq);
+	/**
+	 * Removes a request from the policy's set of pending requests. Normally
+	 * called after a request has been polled successfully from the policy
+	 * for handling; this operation is mandatory.
+	 *
+	 * \param[in,out] policy The policy the request \a nrq belongs to
+	 * \param[in,out] nrq	 The request to dequeue
+	 *
+	 * \see ptlrpc_nrs_req_del_nolock()
+	 */
+	void	(*op_req_dequeue)(struct ptlrpc_nrs_policy *policy,
+				  struct ptlrpc_nrs_request *nrq);
+	/**
+	 * Called after the request has been carried out. Could be used for
+	 * job/resource control; this operation is optional.
+	 *
+	 * \param[in,out] policy The policy that has finished handling request
+	 *			 \a nrq
+	 * \param[in,out] nrq	 The request
+	 *
+	 * \pre assert_spin_locked(&svcpt->scp_req_lock)
+	 *
+	 * \see ptlrpc_nrs_req_stop_nolock()
+	 */
+	void	(*op_req_stop)(struct ptlrpc_nrs_policy *policy,
+			       struct ptlrpc_nrs_request *nrq);
+	/**
+	 * Registers the policy's lprocfs interface with a PTLRPC service.
+	 *
+	 * \param[in] svc The service
+	 *
+	 * \retval 0	success
+	 * \retval != 0 error
+	 */
+	int	(*op_lprocfs_init)(struct ptlrpc_service *svc);
+	/**
+	 * Unregisters the policy's lprocfs interface from a PTLRPC service.
+	 *
+	 * In cases of failed policy registration in
+	 * \e ptlrpc_nrs_policy_register(), this function may be called for a
+	 * service which has not registered the policy successfully, so
+	 * implementations of this method should make sure their operations are
+	 * safe in such cases.
+	 *
+	 * \param[in] svc The service
+	 */
+	void	(*op_lprocfs_fini)(struct ptlrpc_service *svc);
+};
+
+/**
+ * Policy flags
+ */
+enum nrs_policy_flags {
+	/**
+	 * Fallback policy, use this flag only on a single supported policy per
+	 * service. The flag cannot be used on policies that use
+	 * \e PTLRPC_NRS_FL_REG_EXTERN
+	 */
+	PTLRPC_NRS_FL_FALLBACK		= BIT(0),
+	/**
+	 * Start policy immediately after registering.
+	 */
+	PTLRPC_NRS_FL_REG_START		= BIT(1),
+	/**
+	 * This is a policy registering from a module different to the one NRS
+	 * core ships in (currently ptlrpc).
+	 */
+	PTLRPC_NRS_FL_REG_EXTERN	= BIT(2),
+};
+
+/**
+ * NRS queue type.
+ *
+ * Denotes whether an NRS instance is for handling normal or high-priority
+ * RPCs, or whether an operation pertains to one or both of the NRS instances
+ * in a service.
+ */
+enum ptlrpc_nrs_queue_type {
+	PTLRPC_NRS_QUEUE_REG	= BIT(0),
+	PTLRPC_NRS_QUEUE_HP	= BIT(1),
+	PTLRPC_NRS_QUEUE_BOTH	= (PTLRPC_NRS_QUEUE_REG | PTLRPC_NRS_QUEUE_HP)
+};
+
+/**
+ * NRS head
+ *
+ * A PTLRPC service has at least one NRS head instance for handling normal
+ * priority RPCs, and may optionally have a second NRS head instance for
+ * handling high-priority RPCs. Each NRS head maintains a list of available
+ * policies, of which one and only one policy is acting as the fallback policy,
+ * and optionally a different policy may be acting as the primary policy. For
+ * all RPCs handled by this NRS head instance, NRS core will first attempt to
+ * enqueue the RPC using the primary policy (if any). The fallback policy is
+ * used in the following cases:
+ * - when there was no primary policy in the
+ *   ptlrpc_nrs_pol_state::NRS_POL_STATE_STARTED state at the time the request
+ *   was initialized.
+ * - when the primary policy that was in the
+ *   ptlrpc_nrs_pol_state::NRS_POL_STATE_STARTED state at the time the RPC
+ *   was initialized indicated that it did not wish to, or for some other
+ *   reason was not able to, handle the request, by returning an invalid NRS
+ *   resource reference.
+ * - when the primary policy that was in the
+ *   ptlrpc_nrs_pol_state::NRS_POL_STATE_STARTED state at the time the RPC
+ *   was initialized fails later during the request enqueueing stage.
+ *
+ * \see nrs_resource_get_safe()
+ * \see nrs_request_enqueue()
+ */
+struct ptlrpc_nrs {
+	spinlock_t			nrs_lock;
+	/** XXX Possibly replace svcpt->scp_req_lock with another lock here. */
+	/**
+	 * List of registered policies
+	 */
+	struct list_head		nrs_policy_list;
+	/**
+	 * List of policies with queued requests. Policies that have any
+	 * outstanding requests are queued here, and this list is queried
+	 * in a round-robin manner from NRS core when obtaining a request
+	 * for handling. This ensures that requests from policies that at some
+	 * point transition away from the
+	 * ptlrpc_nrs_pol_state::NRS_POL_STATE_STARTED state are drained.
+	 */
+	struct list_head		nrs_policy_queued;
+	/**
+	 * Service partition for this NRS head
+	 */
+	struct ptlrpc_service_part     *nrs_svcpt;
+	/**
+	 * Primary policy, which is the preferred policy for handling RPCs
+	 */
+	struct ptlrpc_nrs_policy       *nrs_policy_primary;
+	/**
+	 * Fallback policy, which is the backup policy for handling RPCs
+	 */
+	struct ptlrpc_nrs_policy       *nrs_policy_fallback;
+	/**
+	 * This NRS head handles either HP or regular requests
+	 */
+	enum ptlrpc_nrs_queue_type	nrs_queue_type;
+	/**
+	 * # queued requests from all policies in this NRS head
+	 */
+	unsigned long			nrs_req_queued;
+	/**
+	 * # scheduled requests from all policies in this NRS head
+	 */
+	unsigned long			nrs_req_started;
+	/**
+	 * # policies on this NRS
+	 */
+	unsigned int			nrs_num_pols;
+	/**
+	 * This NRS head is in the process of starting a policy
+	 */
+	unsigned int			nrs_policy_starting:1;
+	/**
+	 * In the process of shutting down the whole NRS head; used during
+	 * unregistration
+	 */
+	unsigned int			nrs_stopping:1;
+	/**
+	 * NRS policy is throttling request
+	 */
+	unsigned int			nrs_throttling:1;
+};
+
+#define NRS_POL_NAME_MAX		16
+#define NRS_POL_ARG_MAX			16
+
+struct ptlrpc_nrs_pol_desc;
+
+/**
+ * Service compatibility predicate; this determines whether a policy is adequate
+ * for handling RPCs of a particular PTLRPC service.
+ *
+ * XXX:This should give the same result during policy registration and
+ * unregistration, and for all partitions of a service; so the result should not
+ * depend on temporal service or other properties, that may influence the
+ * result.
+ */
+typedef bool (*nrs_pol_desc_compat_t)(const struct ptlrpc_service *svc,
+				      const struct ptlrpc_nrs_pol_desc *desc);
+
+struct ptlrpc_nrs_pol_conf {
+	/**
+	 * Human-readable policy name
+	 */
+	char				   nc_name[NRS_POL_NAME_MAX];
+	/**
+	 * NRS operations for this policy
+	 */
+	const struct ptlrpc_nrs_pol_ops   *nc_ops;
+	/**
+	 * Service compatibility predicate
+	 */
+	nrs_pol_desc_compat_t		   nc_compat;
+	/**
+	 * Set for policies that support a single ptlrpc service, i.e. ones that
+	 * have \a pd_compat set to nrs_policy_compat_one(). The variable value
+	 * depicts the name of the single service that such policies are
+	 * compatible with.
+	 */
+	const char			  *nc_compat_svc_name;
+	/**
+	 * Owner module for this policy descriptor; policies registering from a
+	 * different module to the one the NRS framework is held within
+	 * (currently ptlrpc), should set this field to THIS_MODULE.
+	 */
+	struct module			  *nc_owner;
+	/**
+	 * Policy registration flags; a bitmask of \e nrs_policy_flags
+	 */
+	unsigned int			   nc_flags;
+};
+
+/**
+ * NRS policy registering descriptor
+ *
+ * Is used to hold a description of a policy that can be passed to NRS core in
+ * order to register the policy with NRS heads in different PTLRPC services.
+ */
+struct ptlrpc_nrs_pol_desc {
+	/**
+	 * Human-readable policy name
+	 */
+	char					pd_name[NRS_POL_NAME_MAX];
+	/**
+	 * Link into nrs_core::nrs_policies
+	 */
+	struct list_head			pd_list;
+	/**
+	 * NRS operations for this policy
+	 */
+	const struct ptlrpc_nrs_pol_ops        *pd_ops;
+	/**
+	 * Service compatibility predicate
+	 */
+	nrs_pol_desc_compat_t			pd_compat;
+	/**
+	 * Set for policies that are compatible with only one PTLRPC service.
+	 *
+	 * \see ptlrpc_nrs_pol_conf::nc_compat_svc_name
+	 */
+	const char			       *pd_compat_svc_name;
+	/**
+	 * Owner module for this policy descriptor.
+	 *
+	 * We need to hold a reference to the module whenever we might make use
+	 * of any of the module's contents, i.e.
+	 * - If one or more instances of the policy are at a state where they
+	 *   might be handling a request, i.e.
+	 *   ptlrpc_nrs_pol_state::NRS_POL_STATE_STARTED or
+	 *   ptlrpc_nrs_pol_state::NRS_POL_STATE_STOPPING as we will have to
+	 *   call into the policy's ptlrpc_nrs_pol_ops() handlers. A reference
+	 *   is taken on the module when
+	 *   \e ptlrpc_nrs_pol_desc::pd_refs becomes 1, and released when it
+	 *   becomes 0, so that we hold only one reference to the module maximum
+	 *   at any time.
+	 *
+	 *   We do not need to hold a reference to the module, even though we
+	 *   might use code and data from the module, in the following cases:
+	 * - During external policy registration, because this should happen in
+	 *   the module's init() function, in which case the module is safe from
+	 *   removal because a reference is being held on the module by the
+	 *   kernel, and iirc kmod (and I guess module-init-tools also) will
+	 *   serialize any racing processes properly anyway.
+	 * - During external policy unregistration, because this should happen
+	 *   in a module's exit() function, and any attempts to start a policy
+	 *   instance would need to take a reference on the module, and this is
+	 *   not possible once we have reached the point where the exit()
+	 *   handler is called.
+	 * - During service registration and unregistration, as service setup
+	 *   and cleanup, and policy registration, unregistration and policy
+	 *   instance starting, are serialized by \e nrs_core::nrs_mutex, so
+	 *   as long as users adhere to the convention of registering policies
+	 *   in init() and unregistering them in module exit() functions, there
+	 *   should not be a race between these operations.
+	 * - During any policy-specific lprocfs operations, because a reference
+	 *   is held by the kernel on a proc entry that has been entered by a
+	 *   syscall, so as long as proc entries are removed during
+	 *   unregistration time, then unregistration and lprocfs operations
+	 *   will be properly serialized.
+	 */
+	struct module			       *pd_owner;
+	/**
+	 * Bitmask of \e nrs_policy_flags
+	 */
+	unsigned int				pd_flags;
+	/**
+	 * # of references on this descriptor
+	 */
+	atomic_t				pd_refs;
+};
+
+/**
+ * NRS policy state
+ *
+ * Policies transition from one state to the other during their lifetime
+ */
+enum ptlrpc_nrs_pol_state {
+	/**
+	 * Not a valid policy state.
+	 */
+	NRS_POL_STATE_INVALID,
+	/**
+	 * Policies are at this state either at the start of their life, or
+	 * transition here when the user selects a different policy to act
+	 * as the primary one.
+	 */
+	NRS_POL_STATE_STOPPED,
+	/**
+	 * Policy is in progress of stopping
+	 */
+	NRS_POL_STATE_STOPPING,
+	/**
+	 * Policy is in progress of starting
+	 */
+	NRS_POL_STATE_STARTING,
+	/**
+	 * A policy is in this state in two cases:
+	 * - it is the fallback policy, which is always in this state.
+	 * - it has been activated by the user; i.e. it is the primary policy.
+	 */
+	NRS_POL_STATE_STARTED,
+};
+
+/**
+ * NRS policy information
+ *
+ * Used for obtaining information for the status of a policy via lprocfs
+ */
+struct ptlrpc_nrs_pol_info {
+	/**
+	 * Policy name
+	 */
+	char				pi_name[NRS_POL_NAME_MAX];
+	/**
+	 * Policy argument
+	 */
+	char				pi_arg[NRS_POL_ARG_MAX];
+	/**
+	 * Current policy state
+	 */
+	enum ptlrpc_nrs_pol_state	pi_state;
+	/**
+	 * # RPCs enqueued for later dispatching by the policy
+	 */
+	long				pi_req_queued;
+	/**
+	 * # RPCs started for dispatch by the policy
+	 */
+	long				pi_req_started;
+	/**
+	 * Is this a fallback policy?
+	 */
+	unsigned			pi_fallback:1;
+};
+
+/**
+ * NRS policy
+ *
+ * There is one instance of this for each policy in each NRS head of each
+ * PTLRPC service partition.
+ */
+struct ptlrpc_nrs_policy {
+	/**
+	 * Linkage into the NRS head's list of policies,
+	 * ptlrpc_nrs:nrs_policy_list
+	 */
+	struct list_head		pol_list;
+	/**
+	 * Linkage into the NRS head's list of policies with enqueued
+	 * requests ptlrpc_nrs:nrs_policy_queued
+	 */
+	struct list_head		pol_list_queued;
+	/**
+	 * Current state of this policy
+	 */
+	enum ptlrpc_nrs_pol_state	pol_state;
+	/**
+	 * Bitmask of nrs_policy_flags
+	 */
+	unsigned int			pol_flags;
+	/**
+	 * # RPCs enqueued for later dispatching by the policy
+	 */
+	long				pol_req_queued;
+	/**
+	 * # RPCs started for dispatch by the policy
+	 */
+	long				pol_req_started;
+	/**
+	 * Usage Reference count taken on the policy instance
+	 */
+	long				pol_ref;
+	/**
+	 * Human-readable policy argument
+	 */
+	char				pol_arg[NRS_POL_ARG_MAX];
+	/**
+	 * The NRS head this policy has been created at
+	 */
+	struct ptlrpc_nrs	       *pol_nrs;
+	/**
+	 * Private policy data; varies by policy type
+	 */
+	void			       *pol_private;
+	/**
+	 * Policy descriptor for this policy instance.
+	 */
+	struct ptlrpc_nrs_pol_desc     *pol_desc;
+};
+
+/**
+ * NRS resource
+ *
+ * Resources are embedded into two types of NRS entities:
+ * - Inside NRS policies, in the policy's private data in
+ *   ptlrpc_nrs_policy::pol_private
+ * - In objects that act as prime-level scheduling entities in different NRS
+ *   policies; e.g. on a policy that performs round robin or similar order
+ *   scheduling across client NIDs, there would be one NRS resource per unique
+ *   client NID. On a policy which performs round robin scheduling across
+ *   backend filesystem objects, there would be one resource associated with
+ *   each of the backend filesystem objects partaking in the scheduling
+ *   performed by the policy.
+ *
+ * NRS resources share a parent-child relationship, in which resources embedded
+ * in policy instances are the parent entities, with all scheduling entities
+ * a policy schedules across being the children, thus forming a simple resource
+ * hierarchy. This hierarchy may be extended with one or more levels in the
+ * future if the ability to have more than one primary policy is added.
+ *
+ * Upon request initialization, references to the then active NRS policies are
+ * taken and used to later handle the dispatching of the request with one of
+ * these policies.
+ *
+ * \see nrs_resource_get_safe()
+ * \see ptlrpc_nrs_req_add()
+ */
+struct ptlrpc_nrs_resource {
+	/**
+	 * This NRS resource's parent; is NULL for resources embedded in NRS
+	 * policy instances; i.e. those are top-level ones.
+	 */
+	struct ptlrpc_nrs_resource     *res_parent;
+	/**
+	 * The policy associated with this resource.
+	 */
+	struct ptlrpc_nrs_policy       *res_policy;
+};
+
+enum {
+	NRS_RES_FALLBACK,
+	NRS_RES_PRIMARY,
+	NRS_RES_MAX
+};
+
+#include "lustre_nrs_fifo.h"
+
+/**
+ * NRS request
+ *
+ * Instances of this object exist embedded within ptlrpc_request; the main
+ * purpose of this object is to hold references to the request's resources
+ * for the lifetime of the request, and to hold properties that policies use
+ * for determining the request's scheduling priority.
+ **/
+struct ptlrpc_nrs_request {
+	/**
+	 * The request's resource hierarchy.
+	 */
+	struct ptlrpc_nrs_resource     *nr_res_ptrs[NRS_RES_MAX];
+	/**
+	 * Index into ptlrpc_nrs_request::nr_res_ptrs of the resource of the
+	 * policy that was used to enqueue the request.
+	 *
+	 * \see nrs_request_enqueue()
+	 */
+	unsigned int			nr_res_idx;
+	unsigned int			nr_initialized:1;
+	unsigned int			nr_enqueued:1;
+	unsigned int			nr_started:1;
+	unsigned int			nr_finalized:1;
+
+	/**
+	 * Policy-specific fields, used for determining a request's scheduling
+	 * priority, and other supporting functionality.
+	 */
+	union {
+		/**
+		 * Fields for the FIFO policy
+		 */
+		struct nrs_fifo_req	fifo;
+	} nr_u;
+	/**
+	 * Externally-registering policies may want to use this to allocate
+	 * their own request properties.
+	 */
+	void			       *ext;
+};
+
+/** @} nrs */
+#endif
diff --git a/drivers/staging/lustre/lustre/include/lustre_nrs_fifo.h b/drivers/staging/lustre/lustre/include/lustre_nrs_fifo.h
new file mode 100644
index 0000000..3b5418e
--- /dev/null
+++ b/drivers/staging/lustre/lustre/include/lustre_nrs_fifo.h
@@ -0,0 +1,70 @@
+/*
+ * GPL HEADER START
+ *
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 only,
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License version 2 for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * version 2 along with this program; If not, see
+ * http://www.gnu.org/licenses/gpl-2.0.html
+ *
+ * GPL HEADER END
+ */
+/*
+ * Copyright (c) 2014, Intel Corporation.
+ *
+ * Copyright 2012 Xyratex Technology Limited
+ */
+/*
+ *
+ * Network Request Scheduler (NRS) First-in First-out (FIFO) policy
+ *
+ */
+
+#ifndef _LUSTRE_NRS_FIFO_H
+#define _LUSTRE_NRS_FIFO_H
+
+/* \name fifo
+ *
+ * FIFO policy
+ *
+ * This policy is a logical wrapper around previous, non-NRS functionality.
+ * It dispatches RPCs in the same order as they arrive from the network. This
+ * policy is currently used as the fallback policy, and the only enabled policy
+ * on all NRS heads of all PTLRPC service partitions.
+ * @{
+ */
+
+/**
+ * Private data structure for the FIFO policy
+ */
+struct nrs_fifo_head {
+	/**
+	 * Resource object for policy instance.
+	 */
+	struct ptlrpc_nrs_resource	fh_res;
+	/**
+	 * List of queued requests.
+	 */
+	struct list_head		fh_list;
+	/**
+	 * For debugging purposes.
+	 */
+	__u64				fh_sequence;
+};
+
+struct nrs_fifo_req {
+	struct list_head	fr_list;
+	__u64			fr_sequence;
+};
+
+/** @} fifo */
+#endif
-- 
1.7.1
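
For anyone reading lustre_nrs_fifo.h in isolation, the behaviour its header
comment describes (RPCs dispatched in arrival order, with a sequence number
kept for debugging) can be sketched in userspace roughly as below. This is
only an illustration and not the ptlrpc code: the fifo_* names and the
hand-rolled singly linked queue are made up for this sketch, whereas the
kernel structures use struct list_head and are driven through the
ptlrpc_nrs_pol_ops callbacks.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct nrs_fifo_req. */
struct fifo_req {
	struct fifo_req *next;		/* replaces the kernel's fr_list */
	uint64_t sequence;		/* mirrors fr_sequence */
};

/* Hypothetical userspace stand-in for struct nrs_fifo_head. */
struct fifo_head {
	struct fifo_req *first;		/* oldest queued request */
	struct fifo_req *last;		/* newest queued request */
	uint64_t sequence;		/* mirrors fh_sequence: next number to hand out */
};

static void fifo_init(struct fifo_head *head)
{
	head->first = NULL;
	head->last = NULL;
	head->sequence = 0;
}

/* Append a request at the tail and stamp it with its arrival order. */
static void fifo_enqueue(struct fifo_head *head, struct fifo_req *req)
{
	req->next = NULL;
	req->sequence = head->sequence++;
	if (head->last)
		head->last->next = req;
	else
		head->first = req;
	head->last = req;
}

/*
 * Return the oldest queued request, or NULL if the queue is empty.
 * With peek set, the request is only examined and stays queued.
 */
static struct fifo_req *fifo_get(struct fifo_head *head, int peek)
{
	struct fifo_req *req = head->first;

	if (!req || peek)
		return req;

	head->first = req->next;
	if (!head->first)
		head->last = NULL;
	req->next = NULL;
	return req;
}
```

The peek flag loosely mirrors the peek argument of op_req_get(): a peek
returns the oldest request without removing it from the policy.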


* [lustre-devel] [PATCH 29/41] staging: lustre: ptlrpc: Move NRS structures out of lustre_net.h
@ 2016-10-03  2:28   ` James Simmons
  0 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Chris Horn,
	James Simmons

From: Chris Horn <hornc@cray.com>

NRS-specific structures are not needed in the rest of the PtlRPC code.
It is more appropriate for these structures to be defined in a
separate header. This commit creates a lustre_nrs.h header for the
generic NRS structures, and policy-specific headers for the various
NRS policies.

Signed-off-by: Chris Horn <hornc@cray.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2667
Reviewed-on: http://review.whamcloud.com/13966
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/lustre_net.h |  712 +-------------------
 drivers/staging/lustre/lustre/include/lustre_nrs.h |  717 ++++++++++++++++++++
 .../lustre/lustre/include/lustre_nrs_fifo.h        |   70 ++
 3 files changed, 788 insertions(+), 711 deletions(-)
 create mode 100644 drivers/staging/lustre/lustre/include/lustre_nrs.h
 create mode 100644 drivers/staging/lustre/lustre/include/lustre_nrs_fifo.h
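
As a reading aid for the pd_owner comment that moves along with
struct ptlrpc_nrs_pol_desc: the rule "take a module reference when pd_refs
becomes 1, release it when pd_refs becomes 0" can be sketched in userspace
roughly as below. Everything here is hypothetical (struct fake_module,
pol_desc_get() and pol_desc_put() are not ptlrpc functions), and the sketch
assumes callers are serialized externally, in the way the comment says
policy instance starting is serialized by nrs_core::nrs_mutex.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct module; it only counts references. */
struct fake_module {
	int refs;
	bool unloading;		/* set once the exit() handler has been reached */
};

/* Hypothetical cut-down policy descriptor. */
struct pol_desc {
	struct fake_module *owner;	/* mirrors pd_owner */
	atomic_int refs;		/* mirrors pd_refs */
};

/* Taking a module reference fails once the module is on its way out. */
static bool fake_module_get(struct fake_module *mod)
{
	if (mod->unloading)
		return false;
	mod->refs++;
	return true;
}

static void fake_module_put(struct fake_module *mod)
{
	mod->refs--;
}

/*
 * Take a descriptor reference. Only the 0 -> 1 transition touches the
 * owner module, so at most one module reference is held at any time.
 */
static bool pol_desc_get(struct pol_desc *desc)
{
	if (atomic_fetch_add(&desc->refs, 1) == 0 &&
	    !fake_module_get(desc->owner)) {
		/* Module is going away: undo the count and report failure. */
		atomic_fetch_sub(&desc->refs, 1);
		return false;
	}
	return true;
}

/* Drop a descriptor reference; the 1 -> 0 transition releases the module. */
static void pol_desc_put(struct pol_desc *desc)
{
	if (atomic_fetch_sub(&desc->refs, 1) == 1)
		fake_module_put(desc->owner);
}
```

However many policy instances are started, the owner's count in this sketch
never exceeds one, which is the "only one reference to the module maximum"
property the comment describes.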

diff --git a/drivers/staging/lustre/lustre/include/lustre_net.h b/drivers/staging/lustre/lustre/include/lustre_net.h
index ab80330..7302238 100644
--- a/drivers/staging/lustre/lustre/include/lustre_net.h
+++ b/drivers/staging/lustre/lustre/include/lustre_net.h
@@ -515,717 +515,7 @@ struct lu_env;
 
 struct ldlm_lock;
 
-/**
- * \defgroup nrs Network Request Scheduler
- * @{
- */
-struct ptlrpc_nrs_policy;
-struct ptlrpc_nrs_resource;
-struct ptlrpc_nrs_request;
-
-/**
- * NRS control operations.
- *
- * These are common for all policies.
- */
-enum ptlrpc_nrs_ctl {
-	/**
-	 * Not a valid opcode.
-	 */
-	PTLRPC_NRS_CTL_INVALID,
-	/**
-	 * Activate the policy.
-	 */
-	PTLRPC_NRS_CTL_START,
-	/**
-	 * Reserved for multiple primary policies, which may be a possibility
-	 * in the future.
-	 */
-	PTLRPC_NRS_CTL_STOP,
-	/**
-	 * Policies can start using opcodes from this value and onwards for
-	 * their own purposes; the assigned value itself is arbitrary.
-	 */
-	PTLRPC_NRS_CTL_1ST_POL_SPEC = 0x20,
-};
-
-/**
- * ORR policy operations
- */
-enum nrs_ctl_orr {
-	NRS_CTL_ORR_RD_QUANTUM = PTLRPC_NRS_CTL_1ST_POL_SPEC,
-	NRS_CTL_ORR_WR_QUANTUM,
-	NRS_CTL_ORR_RD_OFF_TYPE,
-	NRS_CTL_ORR_WR_OFF_TYPE,
-	NRS_CTL_ORR_RD_SUPP_REQ,
-	NRS_CTL_ORR_WR_SUPP_REQ,
-};
-
-/**
- * NRS policy operations.
- *
- * These determine the behaviour of a policy, and are called in response to
- * NRS core events.
- */
-struct ptlrpc_nrs_pol_ops {
-	/**
-	 * Called during policy registration; this operation is optional.
-	 *
-	 * \param[in,out] policy The policy being initialized
-	 */
-	int	(*op_policy_init)(struct ptlrpc_nrs_policy *policy);
-	/**
-	 * Called during policy unregistration; this operation is optional.
-	 *
-	 * \param[in,out] policy The policy being unregistered/finalized
-	 */
-	void	(*op_policy_fini)(struct ptlrpc_nrs_policy *policy);
-	/**
-	 * Called when activating a policy via lprocfs; policies allocate and
-	 * initialize their resources here; this operation is optional.
-	 *
-	 * \param[in,out] policy The policy being started
-	 *
-	 * \see nrs_policy_start_locked()
-	 */
-	int	(*op_policy_start)(struct ptlrpc_nrs_policy *policy);
-	/**
-	 * Called when deactivating a policy via lprocfs; policies deallocate
-	 * their resources here; this operation is optional
-	 *
-	 * \param[in,out] policy The policy being stopped
-	 *
-	 * \see nrs_policy_stop0()
-	 */
-	void	(*op_policy_stop)(struct ptlrpc_nrs_policy *policy);
-	/**
-	 * Used for policy-specific operations; i.e. not generic ones like
-	 * \e PTLRPC_NRS_CTL_START and \e PTLRPC_NRS_CTL_GET_INFO; analogous
-	 * to an ioctl; this operation is optional.
-	 *
-	 * \param[in,out]	 policy The policy carrying out operation \a opc
-	 * \param[in]	  opc	 The command operation being carried out
-	 * \param[in,out] arg	 An generic buffer for communication between the
-	 *			 user and the control operation
-	 *
-	 * \retval -ve error
-	 * \retval   0 success
-	 *
-	 * \see ptlrpc_nrs_policy_control()
-	 */
-	int	(*op_policy_ctl)(struct ptlrpc_nrs_policy *policy,
-				 enum ptlrpc_nrs_ctl opc, void *arg);
-
-	/**
-	 * Called when obtaining references to the resources of the resource
-	 * hierarchy for a request that has arrived for handling at the PTLRPC
-	 * service. Policies should return -ve for requests they do not wish
-	 * to handle. This operation is mandatory.
-	 *
-	 * \param[in,out] policy  The policy we're getting resources for.
-	 * \param[in,out] nrq	  The request we are getting resources for.
-	 * \param[in]	  parent  The parent resource of the resource being
-	 *			  requested; set to NULL if none.
-	 * \param[out]	  resp	  The resource is to be returned here; the
-	 *			  fallback policy in an NRS head should
-	 *			  \e always return a non-NULL pointer value.
-	 * \param[in]  moving_req When set, signifies that this is an attempt
-	 *			  to obtain resources for a request being moved
-	 *			  to the high-priority NRS head by
-	 *			  ldlm_lock_reorder_req().
-	 *			  This implies two things:
-	 *			  1. We are under obd_export::exp_rpc_lock and
-	 *			  so should not sleep.
-	 *			  2. We should not perform non-idempotent or can
-	 *			  skip performing idempotent operations that
-	 *			  were carried out when resources were first
-	 *			  taken for the request when it was initialized
-	 *			  in ptlrpc_nrs_req_initialize().
-	 *
-	 * \retval 0, +ve The level of the returned resource in the resource
-	 *		  hierarchy; currently only 0 (for a non-leaf resource)
-	 *		  and 1 (for a leaf resource) are supported by the
-	 *		  framework.
-	 * \retval -ve	  error
-	 *
-	 * \see ptlrpc_nrs_req_initialize()
-	 * \see ptlrpc_nrs_hpreq_add_nolock()
-	 */
-	int	(*op_res_get)(struct ptlrpc_nrs_policy *policy,
-			      struct ptlrpc_nrs_request *nrq,
-			      const struct ptlrpc_nrs_resource *parent,
-			      struct ptlrpc_nrs_resource **resp,
-			      bool moving_req);
-	/**
-	 * Called when releasing references taken for resources in the resource
-	 * hierarchy for the request; this operation is optional.
-	 *
-	 * \param[in,out] policy The policy the resource belongs to
-	 * \param[in] res	 The resource to be freed
-	 *
-	 * \see ptlrpc_nrs_req_finalize()
-	 * \see ptlrpc_nrs_hpreq_add_nolock()
-	 */
-	void	(*op_res_put)(struct ptlrpc_nrs_policy *policy,
-			      const struct ptlrpc_nrs_resource *res);
-
-	/**
-	 * Obtains a request for handling from the policy, and optionally
-	 * removes the request from the policy; this operation is mandatory.
-	 *
-	 * \param[in,out] policy The policy to poll
-	 * \param[in]	  peek	 When set, signifies that we just want to
-	 *			 examine the request, and not handle it, so the
-	 *			 request is not removed from the policy.
-	 * \param[in]	  force	 When set, it will force a policy to return a
-	 *			 request if it has one queued.
-	 *
-	 * \retval NULL No request available for handling
-	 * \retval valid-pointer The request polled for handling
-	 *
-	 * \see ptlrpc_nrs_req_get_nolock()
-	 */
-	struct ptlrpc_nrs_request *
-		(*op_req_get)(struct ptlrpc_nrs_policy *policy, bool peek,
-			      bool force);
-	/**
-	 * Called when attempting to add a request to a policy for later
-	 * handling; this operation is mandatory.
-	 *
-	 * \param[in,out] policy  The policy on which to enqueue \a nrq
-	 * \param[in,out] nrq The request to enqueue
-	 *
-	 * \retval 0	success
-	 * \retval != 0	error
-	 *
-	 * \see ptlrpc_nrs_req_add_nolock()
-	 */
-	int	(*op_req_enqueue)(struct ptlrpc_nrs_policy *policy,
-				  struct ptlrpc_nrs_request *nrq);
-	/**
-	 * Removes a request from the policy's set of pending requests. Normally
-	 * called after a request has been polled successfully from the policy
-	 * for handling; this operation is mandatory.
-	 *
-	 * \param[in,out] policy The policy the request \a nrq belongs to
-	 * \param[in,out] nrq    The request to dequeue
-	 */
-	void	(*op_req_dequeue)(struct ptlrpc_nrs_policy *policy,
-				  struct ptlrpc_nrs_request *nrq);
-	/**
-	 * Called after the request being carried out. Could be used for
-	 * job/resource control; this operation is optional.
-	 *
-	 * \param[in,out] policy The policy which is stopping to handle request
-	 *			 \a nrq
-	 * \param[in,out] nrq	 The request
-	 *
-	 * \pre assert_spin_locked(&svcpt->scp_req_lock)
-	 *
-	 * \see ptlrpc_nrs_req_stop_nolock()
-	 */
-	void	(*op_req_stop)(struct ptlrpc_nrs_policy *policy,
-			       struct ptlrpc_nrs_request *nrq);
-	/**
-	 * Registers the policy's lprocfs interface with a PTLRPC service.
-	 *
-	 * \param[in] svc The service
-	 *
-	 * \retval 0	success
-	 * \retval != 0	error
-	 */
-	int	(*op_lprocfs_init)(struct ptlrpc_service *svc);
-	/**
-	 * Unegisters the policy's lprocfs interface with a PTLRPC service.
-	 *
-	 * In cases of failed policy registration in
-	 * \e ptlrpc_nrs_policy_register(), this function may be called for a
-	 * service which has not registered the policy successfully, so
-	 * implementations of this method should make sure their operations are
-	 * safe in such cases.
-	 *
-	 * \param[in] svc The service
-	 */
-	void	(*op_lprocfs_fini)(struct ptlrpc_service *svc);
-};
-
-/**
- * Policy flags
- */
-enum nrs_policy_flags {
-	/**
-	 * Fallback policy, use this flag only on a single supported policy per
-	 * service. The flag cannot be used on policies that use
-	 * \e PTLRPC_NRS_FL_REG_EXTERN
-	 */
-	PTLRPC_NRS_FL_FALLBACK		= (1 << 0),
-	/**
-	 * Start policy immediately after registering.
-	 */
-	PTLRPC_NRS_FL_REG_START		= (1 << 1),
-	/**
-	 * This is a policy registering from a module different to the one NRS
-	 * core ships in (currently ptlrpc).
-	 */
-	PTLRPC_NRS_FL_REG_EXTERN	= (1 << 2),
-};
-
-/**
- * NRS queue type.
- *
- * Denotes whether an NRS instance is for handling normal or high-priority
- * RPCs, or whether an operation pertains to one or both of the NRS instances
- * in a service.
- */
-enum ptlrpc_nrs_queue_type {
-	PTLRPC_NRS_QUEUE_REG	= (1 << 0),
-	PTLRPC_NRS_QUEUE_HP	= (1 << 1),
-	PTLRPC_NRS_QUEUE_BOTH	= (PTLRPC_NRS_QUEUE_REG | PTLRPC_NRS_QUEUE_HP)
-};
-
-/**
- * NRS head
- *
- * A PTLRPC service has at least one NRS head instance for handling normal
- * priority RPCs, and may optionally have a second NRS head instance for
- * handling high-priority RPCs. Each NRS head maintains a list of available
- * policies, of which one and only one policy is acting as the fallback policy,
- * and optionally a different policy may be acting as the primary policy. For
- * all RPCs handled by this NRS head instance, NRS core will first attempt to
- * enqueue the RPC using the primary policy (if any). The fallback policy is
- * used in the following cases:
- * - when there was no primary policy in the
- *   ptlrpc_nrs_pol_state::NRS_POL_STATE_STARTED state at the time the request
- *   was initialized.
- * - when the primary policy that was at the
- *   ptlrpc_nrs_pol_state::PTLRPC_NRS_POL_STATE_STARTED state at the time the
- *   RPC was initialized, denoted it did not wish, or for some other reason was
- *   not able to handle the request, by returning a non-valid NRS resource
- *   reference.
- * - when the primary policy that was at the
- *   ptlrpc_nrs_pol_state::PTLRPC_NRS_POL_STATE_STARTED state at the time the
- *   RPC was initialized, fails later during the request enqueueing stage.
- *
- * \see nrs_resource_get_safe()
- * \see nrs_request_enqueue()
- */
-struct ptlrpc_nrs {
-	spinlock_t			nrs_lock;
-	/** XXX Possibly replace svcpt->scp_req_lock with another lock here. */
-	/**
-	 * List of registered policies
-	 */
-	struct list_head			nrs_policy_list;
-	/**
-	 * List of policies with queued requests. Policies that have any
-	 * outstanding requests are queued here, and this list is queried
-	 * in a round-robin manner from NRS core when obtaining a request
-	 * for handling. This ensures that requests from policies that at some
-	 * point transition away from the
-	 * ptlrpc_nrs_pol_state::NRS_POL_STATE_STARTED state are drained.
-	 */
-	struct list_head			nrs_policy_queued;
-	/**
-	 * Service partition for this NRS head
-	 */
-	struct ptlrpc_service_part     *nrs_svcpt;
-	/**
-	 * Primary policy, which is the preferred policy for handling RPCs
-	 */
-	struct ptlrpc_nrs_policy       *nrs_policy_primary;
-	/**
-	 * Fallback policy, which is the backup policy for handling RPCs
-	 */
-	struct ptlrpc_nrs_policy       *nrs_policy_fallback;
-	/**
-	 * This NRS head handles either HP or regular requests
-	 */
-	enum ptlrpc_nrs_queue_type	nrs_queue_type;
-	/**
-	 * # queued requests from all policies in this NRS head
-	 */
-	unsigned long			nrs_req_queued;
-	/**
-	 * # scheduled requests from all policies in this NRS head
-	 */
-	unsigned long			nrs_req_started;
-	/**
-	 * # policies on this NRS
-	 */
-	unsigned			nrs_num_pols;
-	/**
-	 * This NRS head is in progress of starting a policy
-	 */
-	unsigned			nrs_policy_starting:1;
-	/**
-	 * In progress of shutting down the whole NRS head; used during
-	 * unregistration
-	 */
-	unsigned			nrs_stopping:1;
-};
-
-#define NRS_POL_NAME_MAX		16
-
-struct ptlrpc_nrs_pol_desc;
-
-/**
- * Service compatibility predicate; this determines whether a policy is adequate
- * for handling RPCs of a particular PTLRPC service.
- *
- * XXX:This should give the same result during policy registration and
- * unregistration, and for all partitions of a service; so the result should not
- * depend on temporal service or other properties, that may influence the
- * result.
- */
-typedef bool (*nrs_pol_desc_compat_t) (const struct ptlrpc_service *svc,
-				       const struct ptlrpc_nrs_pol_desc *desc);
-
-struct ptlrpc_nrs_pol_conf {
-	/**
-	 * Human-readable policy name
-	 */
-	char				   nc_name[NRS_POL_NAME_MAX];
-	/**
-	 * NRS operations for this policy
-	 */
-	const struct ptlrpc_nrs_pol_ops	  *nc_ops;
-	/**
-	 * Service compatibility predicate
-	 */
-	nrs_pol_desc_compat_t		   nc_compat;
-	/**
-	 * Set for policies that support a single ptlrpc service, i.e. ones that
-	 * have \a pd_compat set to nrs_policy_compat_one(). The variable value
-	 * depicts the name of the single service that such policies are
-	 * compatible with.
-	 */
-	const char			  *nc_compat_svc_name;
-	/**
-	 * Owner module for this policy descriptor; policies registering from a
-	 * different module to the one the NRS framework is held within
-	 * (currently ptlrpc), should set this field to THIS_MODULE.
-	 */
-	struct module			  *nc_owner;
-	/**
-	 * Policy registration flags; a bitmask of \e nrs_policy_flags
-	 */
-	unsigned			   nc_flags;
-};
-
-/**
- * NRS policy registering descriptor
- *
- * Is used to hold a description of a policy that can be passed to NRS core in
- * order to register the policy with NRS heads in different PTLRPC services.
- */
-struct ptlrpc_nrs_pol_desc {
-	/**
-	 * Human-readable policy name
-	 */
-	char					pd_name[NRS_POL_NAME_MAX];
-	/**
-	 * Link into nrs_core::nrs_policies
-	 */
-	struct list_head				pd_list;
-	/**
-	 * NRS operations for this policy
-	 */
-	const struct ptlrpc_nrs_pol_ops	       *pd_ops;
-	/**
-	 * Service compatibility predicate
-	 */
-	nrs_pol_desc_compat_t			pd_compat;
-	/**
-	 * Set for policies that are compatible with only one PTLRPC service.
-	 *
-	 * \see ptlrpc_nrs_pol_conf::nc_compat_svc_name
-	 */
-	const char			       *pd_compat_svc_name;
-	/**
-	 * Owner module for this policy descriptor.
-	 *
-	 * We need to hold a reference to the module whenever we might make use
-	 * of any of the module's contents, i.e.
-	 * - If one or more instances of the policy are at a state where they
-	 *   might be handling a request, i.e.
-	 *   ptlrpc_nrs_pol_state::NRS_POL_STATE_STARTED or
-	 *   ptlrpc_nrs_pol_state::NRS_POL_STATE_STOPPING as we will have to
-	 *   call into the policy's ptlrpc_nrs_pol_ops() handlers. A reference
-	 *   is taken on the module when
-	 *   \e ptlrpc_nrs_pol_desc::pd_refs becomes 1, and released when it
-	 *   becomes 0, so that we hold only one reference to the module maximum
-	 *   at any time.
-	 *
-	 *   We do not need to hold a reference to the module, even though we
-	 *   might use code and data from the module, in the following cases:
-	 * - During external policy registration, because this should happen in
-	 *   the module's init() function, in which case the module is safe from
-	 *   removal because a reference is being held on the module by the
-	 *   kernel, and iirc kmod (and I guess module-init-tools also) will
-	 *   serialize any racing processes properly anyway.
-	 * - During external policy unregistration, because this should happen
-	 *   in a module's exit() function, and any attempts to start a policy
-	 *   instance would need to take a reference on the module, and this is
-	 *   not possible once we have reached the point where the exit()
-	 *   handler is called.
-	 * - During service registration and unregistration, as service setup
-	 *   and cleanup, and policy registration, unregistration and policy
-	 *   instance starting, are serialized by \e nrs_core::nrs_mutex, so
-	 *   as long as users adhere to the convention of registering policies
-	 *   in init() and unregistering them in module exit() functions, there
-	 *   should not be a race between these operations.
-	 * - During any policy-specific lprocfs operations, because a reference
-	 *   is held by the kernel on a proc entry that has been entered by a
-	 *   syscall, so as long as proc entries are removed during unregistration time,
-	 *   then unregistration and lprocfs operations will be properly
-	 *   serialized.
-	 */
-	struct module			       *pd_owner;
-	/**
-	 * Bitmask of \e nrs_policy_flags
-	 */
-	unsigned				pd_flags;
-	/**
-	 * # of references on this descriptor
-	 */
-	atomic_t				pd_refs;
-};
-
-/**
- * NRS policy state
- *
- * Policies transition from one state to the other during their lifetime
- */
-enum ptlrpc_nrs_pol_state {
-	/**
-	 * Not a valid policy state.
-	 */
-	NRS_POL_STATE_INVALID,
-	/**
-	 * Policies are at this state either at the start of their life, or
-	 * transition here when the user selects a different policy to act
-	 * as the primary one.
-	 */
-	NRS_POL_STATE_STOPPED,
-	/**
-	 * Policy is progress of stopping
-	 */
-	NRS_POL_STATE_STOPPING,
-	/**
-	 * Policy is in progress of starting
-	 */
-	NRS_POL_STATE_STARTING,
-	/**
-	 * A policy is in this state in two cases:
-	 * - it is the fallback policy, which is always in this state.
-	 * - it has been activated by the user; i.e. it is the primary policy,
-	 */
-	NRS_POL_STATE_STARTED,
-};
-
-/**
- * NRS policy information
- *
- * Used for obtaining information for the status of a policy via lprocfs
- */
-struct ptlrpc_nrs_pol_info {
-	/**
-	 * Policy name
-	 */
-	char				pi_name[NRS_POL_NAME_MAX];
-	/**
-	 * Current policy state
-	 */
-	enum ptlrpc_nrs_pol_state	pi_state;
-	/**
-	 * # RPCs enqueued for later dispatching by the policy
-	 */
-	long				pi_req_queued;
-	/**
-	 * # RPCs started for dispatch by the policy
-	 */
-	long				pi_req_started;
-	/**
-	 * Is this a fallback policy?
-	 */
-	unsigned			pi_fallback:1;
-};
-
-/**
- * NRS policy
- *
- * There is one instance of this for each policy in each NRS head of each
- * PTLRPC service partition.
- */
-struct ptlrpc_nrs_policy {
-	/**
-	 * Linkage into the NRS head's list of policies,
-	 * ptlrpc_nrs:nrs_policy_list
-	 */
-	struct list_head			pol_list;
-	/**
-	 * Linkage into the NRS head's list of policies with enqueued
-	 * requests ptlrpc_nrs:nrs_policy_queued
-	 */
-	struct list_head			pol_list_queued;
-	/**
-	 * Current state of this policy
-	 */
-	enum ptlrpc_nrs_pol_state	pol_state;
-	/**
-	 * Bitmask of nrs_policy_flags
-	 */
-	unsigned			pol_flags;
-	/**
-	 * # RPCs enqueued for later dispatching by the policy
-	 */
-	long				pol_req_queued;
-	/**
-	 * # RPCs started for dispatch by the policy
-	 */
-	long				pol_req_started;
-	/**
-	 * Usage Reference count taken on the policy instance
-	 */
-	long				pol_ref;
-	/**
-	 * The NRS head this policy has been created at
-	 */
-	struct ptlrpc_nrs	       *pol_nrs;
-	/**
-	 * Private policy data; varies by policy type
-	 */
-	void			       *pol_private;
-	/**
-	 * Policy descriptor for this policy instance.
-	 */
-	struct ptlrpc_nrs_pol_desc     *pol_desc;
-};
-
-/**
- * NRS resource
- *
- * Resources are embedded into two types of NRS entities:
- * - Inside NRS policies, in the policy's private data in
- *   ptlrpc_nrs_policy::pol_private
- * - In objects that act as prime-level scheduling entities in different NRS
- *   policies; e.g. on a policy that performs round robin or similar order
- *   scheduling across client NIDs, there would be one NRS resource per unique
- *   client NID. On a policy which performs round robin scheduling across
- *   backend filesystem objects, there would be one resource associated with
- *   each of the backend filesystem objects partaking in the scheduling
- *   performed by the policy.
- *
- * NRS resources share a parent-child relationship, in which resources embedded
- * in policy instances are the parent entities, with all scheduling entities
- * a policy schedules across being the children, thus forming a simple resource
- * hierarchy. This hierarchy may be extended with one or more levels in the
- * future if the ability to have more than one primary policy is added.
- *
- * Upon request initialization, references to the then active NRS policies are
- * taken and used to later handle the dispatching of the request with one of
- * these policies.
- *
- * \see nrs_resource_get_safe()
- * \see ptlrpc_nrs_req_add()
- */
-struct ptlrpc_nrs_resource {
-	/**
-	 * This NRS resource's parent; is NULL for resources embedded in NRS
-	 * policy instances; i.e. those are top-level ones.
-	 */
-	struct ptlrpc_nrs_resource     *res_parent;
-	/**
-	 * The policy associated with this resource.
-	 */
-	struct ptlrpc_nrs_policy       *res_policy;
-};
-
-enum {
-	NRS_RES_FALLBACK,
-	NRS_RES_PRIMARY,
-	NRS_RES_MAX
-};
-
-/* \name fifo
- *
- * FIFO policy
- *
- * This policy is a logical wrapper around previous, non-NRS functionality.
- * It dispatches RPCs in the same order as they arrive from the network. This
- * policy is currently used as the fallback policy, and the only enabled policy
- * on all NRS heads of all PTLRPC service partitions.
- * @{
- */
-
-/**
- * Private data structure for the FIFO policy
- */
-struct nrs_fifo_head {
-	/**
-	 * Resource object for policy instance.
-	 */
-	struct ptlrpc_nrs_resource	fh_res;
-	/**
-	 * List of queued requests.
-	 */
-	struct list_head			fh_list;
-	/**
-	 * For debugging purposes.
-	 */
-	__u64				fh_sequence;
-};
-
-struct nrs_fifo_req {
-	struct list_head		fr_list;
-	__u64			fr_sequence;
-};
-
-/** @} fifo */
-
-/**
- * NRS request
- *
- * Instances of this object exist embedded within ptlrpc_request; the main
- * purpose of this object is to hold references to the request's resources
- * for the lifetime of the request, and to hold properties that policies use
- * use for determining the request's scheduling priority.
- */
-struct ptlrpc_nrs_request {
-	/**
-	 * The request's resource hierarchy.
-	 */
-	struct ptlrpc_nrs_resource     *nr_res_ptrs[NRS_RES_MAX];
-	/**
-	 * Index into ptlrpc_nrs_request::nr_res_ptrs of the resource of the
-	 * policy that was used to enqueue the request.
-	 *
-	 * \see nrs_request_enqueue()
-	 */
-	unsigned			nr_res_idx;
-	unsigned			nr_initialized:1;
-	unsigned			nr_enqueued:1;
-	unsigned			nr_started:1;
-	unsigned			nr_finalized:1;
-
-	/**
-	 * Policy-specific fields, used for determining a request's scheduling
-	 * priority, and other supporting functionality.
-	 */
-	union {
-		/**
-		 * Fields for the FIFO policy
-		 */
-		struct nrs_fifo_req	fifo;
-	} nr_u;
-	/**
-	 * Externally-registering policies may want to use this to allocate
-	 * their own request properties.
-	 */
-	void			       *ext;
-};
-
-/** @} nrs */
+#include "lustre_nrs.h"
 
 /**
  * Basic request prioritization operations structure.
diff --git a/drivers/staging/lustre/lustre/include/lustre_nrs.h b/drivers/staging/lustre/lustre/include/lustre_nrs.h
new file mode 100644
index 0000000..a5028aa
--- /dev/null
+++ b/drivers/staging/lustre/lustre/include/lustre_nrs.h
@@ -0,0 +1,717 @@
+/*
+ * GPL HEADER START
+ *
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 only,
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License version 2 for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * version 2 along with this program; If not, see
+ * http://www.gnu.org/licenses/gpl-2.0.html
+ *
+ * GPL HEADER END
+ */
+/*
+ * Copyright (c) 2014, Intel Corporation.
+ *
+ * Copyright 2012 Xyratex Technology Limited
+ */
+/*
+ *
+ * Network Request Scheduler (NRS)
+ *
+ */
+
+#ifndef _LUSTRE_NRS_H
+#define _LUSTRE_NRS_H
+
+/**
+ * \defgroup nrs Network Request Scheduler
+ * @{
+ */
+struct ptlrpc_nrs_policy;
+struct ptlrpc_nrs_resource;
+struct ptlrpc_nrs_request;
+
+/**
+ * NRS control operations.
+ *
+ * These are common for all policies.
+ */
+enum ptlrpc_nrs_ctl {
+	/**
+	 * Not a valid opcode.
+	 */
+	PTLRPC_NRS_CTL_INVALID,
+	/**
+	 * Activate the policy.
+	 */
+	PTLRPC_NRS_CTL_START,
+	/**
+	 * Reserved for multiple primary policies, which may be a possibility
+	 * in the future.
+	 */
+	PTLRPC_NRS_CTL_STOP,
+	/**
+	 * Policies can start using opcodes from this value and onwards for
+	 * their own purposes; the assigned value itself is arbitrary.
+	 */
+	PTLRPC_NRS_CTL_1ST_POL_SPEC = 0x20,
+};
+
+/**
+ * NRS policy operations.
+ *
+ * These determine the behaviour of a policy, and are called in response to
+ * NRS core events.
+ */
+struct ptlrpc_nrs_pol_ops {
+	/**
+	 * Called during policy registration; this operation is optional.
+	 *
+	 * \param[in,out] policy The policy being initialized
+	 */
+	int	(*op_policy_init)(struct ptlrpc_nrs_policy *policy);
+	/**
+	 * Called during policy unregistration; this operation is optional.
+	 *
+	 * \param[in,out] policy The policy being unregistered/finalized
+	 */
+	void	(*op_policy_fini)(struct ptlrpc_nrs_policy *policy);
+	/**
+	 * Called when activating a policy via lprocfs; policies allocate and
+	 * initialize their resources here; this operation is optional.
+	 *
+	 * \param[in,out] policy The policy being started
+	 *
+	 * \see nrs_policy_start_locked()
+	 */
+	int	(*op_policy_start)(struct ptlrpc_nrs_policy *policy);
+	/**
+	 * Called when deactivating a policy via lprocfs; policies deallocate
+	 * their resources here; this operation is optional
+	 *
+	 * \param[in,out] policy The policy being stopped
+	 *
+	 * \see nrs_policy_stop0()
+	 */
+	void	(*op_policy_stop)(struct ptlrpc_nrs_policy *policy);
+	/**
+	 * Used for policy-specific operations; i.e. not generic ones like
+	 * \e PTLRPC_NRS_CTL_START and \e PTLRPC_NRS_CTL_GET_INFO; analogous
+	 * to an ioctl; this operation is optional.
+	 *
+	 * \param[in,out]	 policy The policy carrying out operation \a opc
+	 * \param[in]	  opc	 The command operation being carried out
+	 * \param[in,out] arg	 A generic buffer for communication between the
+	 *			 user and the control operation
+	 *
+	 * \retval -ve error
+	 * \retval   0 success
+	 *
+	 * \see ptlrpc_nrs_policy_control()
+	 */
+	int	(*op_policy_ctl)(struct ptlrpc_nrs_policy *policy,
+				 enum ptlrpc_nrs_ctl opc, void *arg);
+
+	/**
+	 * Called when obtaining references to the resources of the resource
+	 * hierarchy for a request that has arrived for handling at the PTLRPC
+	 * service. Policies should return -ve for requests they do not wish
+	 * to handle. This operation is mandatory.
+	 *
+	 * \param[in,out] policy  The policy we're getting resources for.
+	 * \param[in,out] nrq	  The request we are getting resources for.
+	 * \param[in]	  parent  The parent resource of the resource being
+	 *			  requested; set to NULL if none.
+	 * \param[out]	  resp	  The resource is to be returned here; the
+	 *			  fallback policy in an NRS head should
+	 *			  \e always return a non-NULL pointer value.
+	 * \param[in]  moving_req When set, signifies that this is an attempt
+	 *			  to obtain resources for a request being moved
+	 *			  to the high-priority NRS head by
+	 *			  ldlm_lock_reorder_req().
+	 *			  This implies two things:
+	 *			  1. We are under obd_export::exp_rpc_lock and
+	 *			  so should not sleep.
+	 *			  2. We should not perform non-idempotent or can
+	 *			  skip performing idempotent operations that
+	 *			  were carried out when resources were first
+	 *			  taken for the request when it was initialized
+	 *			  in ptlrpc_nrs_req_initialize().
+	 *
+	 * \retval 0, +ve The level of the returned resource in the resource
+	 *		  hierarchy; currently only 0 (for a non-leaf resource)
+	 *		  and 1 (for a leaf resource) are supported by the
+	 *		  framework.
+	 * \retval -ve	  error
+	 *
+	 * \see ptlrpc_nrs_req_initialize()
+	 * \see ptlrpc_nrs_hpreq_add_nolock()
+	 * \see ptlrpc_nrs_req_hp_move()
+	 */
+	int	(*op_res_get)(struct ptlrpc_nrs_policy *policy,
+			      struct ptlrpc_nrs_request *nrq,
+			      const struct ptlrpc_nrs_resource *parent,
+			      struct ptlrpc_nrs_resource **resp,
+			      bool moving_req);
+	/**
+	 * Called when releasing references taken for resources in the resource
+	 * hierarchy for the request; this operation is optional.
+	 *
+	 * \param[in,out] policy The policy the resource belongs to
+	 * \param[in] res	 The resource to be freed
+	 *
+	 * \see ptlrpc_nrs_req_finalize()
+	 * \see ptlrpc_nrs_hpreq_add_nolock()
+	 * \see ptlrpc_nrs_req_hp_move()
+	 */
+	void	(*op_res_put)(struct ptlrpc_nrs_policy *policy,
+			      const struct ptlrpc_nrs_resource *res);
+
+	/**
+	 * Obtains a request for handling from the policy, and optionally
+	 * removes the request from the policy; this operation is mandatory.
+	 *
+	 * \param[in,out] policy The policy to poll
+	 * \param[in]	  peek	 When set, signifies that we just want to
+	 *			 examine the request, and not handle it, so the
+	 *			 request is not removed from the policy.
+	 * \param[in]	  force  When set, it will force a policy to return a
+	 *			 request if it has one queued.
+	 *
+	 * \retval NULL No request available for handling
+	 * \retval valid-pointer The request polled for handling
+	 *
+	 * \see ptlrpc_nrs_req_get_nolock()
+	 */
+	struct ptlrpc_nrs_request *
+		(*op_req_get)(struct ptlrpc_nrs_policy *policy, bool peek,
+			      bool force);
+	/**
+	 * Called when attempting to add a request to a policy for later
+	 * handling; this operation is mandatory.
+	 *
+	 * \param[in,out] policy  The policy on which to enqueue \a nrq
+	 * \param[in,out] nrq The request to enqueue
+	 *
+	 * \retval 0	success
+	 * \retval != 0 error
+	 *
+	 * \see ptlrpc_nrs_req_add_nolock()
+	 */
+	int	(*op_req_enqueue)(struct ptlrpc_nrs_policy *policy,
+				  struct ptlrpc_nrs_request *nrq);
+	/**
+	 * Removes a request from the policy's set of pending requests. Normally
+	 * called after a request has been polled successfully from the policy
+	 * for handling; this operation is mandatory.
+	 *
+	 * \param[in,out] policy The policy the request \a nrq belongs to
+	 * \param[in,out] nrq	 The request to dequeue
+	 *
+	 * \see ptlrpc_nrs_req_del_nolock()
+	 */
+	void	(*op_req_dequeue)(struct ptlrpc_nrs_policy *policy,
+				  struct ptlrpc_nrs_request *nrq);
+	/**
+	 * Called after the request has been carried out. Could be used for
+	 * job/resource control; this operation is optional.
+	 *
+	 * \param[in,out] policy The policy that has finished handling request
+	 *			 \a nrq
+	 * \param[in,out] nrq	 The request
+	 *
+	 * \pre assert_spin_locked(&svcpt->scp_req_lock)
+	 *
+	 * \see ptlrpc_nrs_req_stop_nolock()
+	 */
+	void	(*op_req_stop)(struct ptlrpc_nrs_policy *policy,
+			       struct ptlrpc_nrs_request *nrq);
+	/**
+	 * Registers the policy's lprocfs interface with a PTLRPC service.
+	 *
+	 * \param[in] svc The service
+	 *
+	 * \retval 0	success
+	 * \retval != 0 error
+	 */
+	int	(*op_lprocfs_init)(struct ptlrpc_service *svc);
+	/**
+	 * Unregisters the policy's lprocfs interface with a PTLRPC service.
+	 *
+	 * In cases of failed policy registration in
+	 * \e ptlrpc_nrs_policy_register(), this function may be called for a
+	 * service which has not registered the policy successfully, so
+	 * implementations of this method should make sure their operations are
+	 * safe in such cases.
+	 *
+	 * \param[in] svc The service
+	 */
+	void	(*op_lprocfs_fini)(struct ptlrpc_service *svc);
+};
+
+/**
+ * Policy flags
+ */
+enum nrs_policy_flags {
+	/**
+	 * Fallback policy, use this flag only on a single supported policy per
+	 * service. The flag cannot be used on policies that use
+	 * \e PTLRPC_NRS_FL_REG_EXTERN
+	 */
+	PTLRPC_NRS_FL_FALLBACK		= BIT(0),
+	/**
+	 * Start policy immediately after registering.
+	 */
+	PTLRPC_NRS_FL_REG_START		= BIT(1),
+	/**
+	 * This is a policy registering from a module different to the one NRS
+	 * core ships in (currently ptlrpc).
+	 */
+	PTLRPC_NRS_FL_REG_EXTERN	= BIT(2),
+};
+
+/**
+ * NRS queue type.
+ *
+ * Denotes whether an NRS instance is for handling normal or high-priority
+ * RPCs, or whether an operation pertains to one or both of the NRS instances
+ * in a service.
+ */
+enum ptlrpc_nrs_queue_type {
+	PTLRPC_NRS_QUEUE_REG	= BIT(0),
+	PTLRPC_NRS_QUEUE_HP	= BIT(1),
+	PTLRPC_NRS_QUEUE_BOTH	= (PTLRPC_NRS_QUEUE_REG | PTLRPC_NRS_QUEUE_HP)
+};
+
+/**
+ * NRS head
+ *
+ * A PTLRPC service has at least one NRS head instance for handling normal
+ * priority RPCs, and may optionally have a second NRS head instance for
+ * handling high-priority RPCs. Each NRS head maintains a list of available
+ * policies, of which one and only one policy is acting as the fallback policy,
+ * and optionally a different policy may be acting as the primary policy. For
+ * all RPCs handled by this NRS head instance, NRS core will first attempt to
+ * enqueue the RPC using the primary policy (if any). The fallback policy is
+ * used in the following cases:
+ * - when there was no primary policy in the
+ *   ptlrpc_nrs_pol_state::NRS_POL_STATE_STARTED state at the time the request
+ *   was initialized.
+ * - when the primary policy that was in the
+ *   ptlrpc_nrs_pol_state::NRS_POL_STATE_STARTED state at the time the RPC
+ *   was initialized indicated that it did not wish to, or for some other
+ *   reason was not able to, handle the request, by returning a non-valid
+ *   NRS resource reference.
+ * - when the primary policy that was in the
+ *   ptlrpc_nrs_pol_state::NRS_POL_STATE_STARTED state at the time the RPC
+ *   was initialized fails later during the request enqueueing stage.
+ *
+ * \see nrs_resource_get_safe()
+ * \see nrs_request_enqueue()
+ */
+struct ptlrpc_nrs {
+	spinlock_t			nrs_lock;
+	/** XXX Possibly replace svcpt->scp_req_lock with another lock here. */
+	/**
+	 * List of registered policies
+	 */
+	struct list_head		nrs_policy_list;
+	/**
+	 * List of policies with queued requests. Policies that have any
+	 * outstanding requests are queued here, and this list is queried
+	 * in a round-robin manner from NRS core when obtaining a request
+	 * for handling. This ensures that requests from policies that at some
+	 * point transition away from the
+	 * ptlrpc_nrs_pol_state::NRS_POL_STATE_STARTED state are drained.
+	 */
+	struct list_head		nrs_policy_queued;
+	/**
+	 * Service partition for this NRS head
+	 */
+	struct ptlrpc_service_part     *nrs_svcpt;
+	/**
+	 * Primary policy, which is the preferred policy for handling RPCs
+	 */
+	struct ptlrpc_nrs_policy       *nrs_policy_primary;
+	/**
+	 * Fallback policy, which is the backup policy for handling RPCs
+	 */
+	struct ptlrpc_nrs_policy       *nrs_policy_fallback;
+	/**
+	 * This NRS head handles either HP or regular requests
+	 */
+	enum ptlrpc_nrs_queue_type	nrs_queue_type;
+	/**
+	 * # queued requests from all policies in this NRS head
+	 */
+	unsigned long			nrs_req_queued;
+	/**
+	 * # scheduled requests from all policies in this NRS head
+	 */
+	unsigned long			nrs_req_started;
+	/**
+	 * # policies on this NRS
+	 */
+	unsigned int			nrs_num_pols;
+	/**
+	 * This NRS head is in progress of starting a policy
+	 */
+	unsigned int			nrs_policy_starting:1;
+	/**
+	 * In progress of shutting down the whole NRS head; used during
+	 * unregistration
+	 */
+	unsigned int			nrs_stopping:1;
+	/**
+	 * NRS policy is throttling request
+	 */
+	unsigned int			nrs_throttling:1;
+};
+
+#define NRS_POL_NAME_MAX		16
+#define NRS_POL_ARG_MAX			16
+
+struct ptlrpc_nrs_pol_desc;
+
+/**
+ * Service compatibility predicate; this determines whether a policy is adequate
+ * for handling RPCs of a particular PTLRPC service.
+ *
+ * XXX:This should give the same result during policy registration and
+ * unregistration, and for all partitions of a service; so the result should not
+ * depend on temporal service or other properties, that may influence the
+ * result.
+ */
+typedef bool (*nrs_pol_desc_compat_t)(const struct ptlrpc_service *svc,
+				      const struct ptlrpc_nrs_pol_desc *desc);
+
+struct ptlrpc_nrs_pol_conf {
+	/**
+	 * Human-readable policy name
+	 */
+	char				   nc_name[NRS_POL_NAME_MAX];
+	/**
+	 * NRS operations for this policy
+	 */
+	const struct ptlrpc_nrs_pol_ops   *nc_ops;
+	/**
+	 * Service compatibility predicate
+	 */
+	nrs_pol_desc_compat_t		   nc_compat;
+	/**
+	 * Set for policies that support a single ptlrpc service, i.e. ones that
+	 * have \a pd_compat set to nrs_policy_compat_one(). The variable value
+	 * depicts the name of the single service that such policies are
+	 * compatible with.
+	 */
+	const char			  *nc_compat_svc_name;
+	/**
+	 * Owner module for this policy descriptor; policies registering from a
+	 * different module to the one the NRS framework is held within
+	 * (currently ptlrpc), should set this field to THIS_MODULE.
+	 */
+	struct module			  *nc_owner;
+	/**
+	 * Policy registration flags; a bitmask of \e nrs_policy_flags
+	 */
+	unsigned int			   nc_flags;
+};
+
+/**
+ * NRS policy registering descriptor
+ *
+ * Is used to hold a description of a policy that can be passed to NRS core in
+ * order to register the policy with NRS heads in different PTLRPC services.
+ */
+struct ptlrpc_nrs_pol_desc {
+	/**
+	 * Human-readable policy name
+	 */
+	char					pd_name[NRS_POL_NAME_MAX];
+	/**
+	 * Link into nrs_core::nrs_policies
+	 */
+	struct list_head			pd_list;
+	/**
+	 * NRS operations for this policy
+	 */
+	const struct ptlrpc_nrs_pol_ops        *pd_ops;
+	/**
+	 * Service compatibility predicate
+	 */
+	nrs_pol_desc_compat_t			pd_compat;
+	/**
+	 * Set for policies that are compatible with only one PTLRPC service.
+	 *
+	 * \see ptlrpc_nrs_pol_conf::nc_compat_svc_name
+	 */
+	const char			       *pd_compat_svc_name;
+	/**
+	 * Owner module for this policy descriptor.
+	 *
+	 * We need to hold a reference to the module whenever we might make use
+	 * of any of the module's contents, i.e.
+	 * - If one or more instances of the policy are at a state where they
+	 *   might be handling a request, i.e.
+	 *   ptlrpc_nrs_pol_state::NRS_POL_STATE_STARTED or
+	 *   ptlrpc_nrs_pol_state::NRS_POL_STATE_STOPPING as we will have to
+	 *   call into the policy's ptlrpc_nrs_pol_ops() handlers. A reference
+	 *   is taken on the module when
+	 *   \e ptlrpc_nrs_pol_desc::pd_refs becomes 1, and released when it
+	 *   becomes 0, so that we hold only one reference to the module maximum
+	 *   at any time.
+	 *
+	 *   We do not need to hold a reference to the module, even though we
+	 *   might use code and data from the module, in the following cases:
+	 * - During external policy registration, because this should happen in
+	 *   the module's init() function, in which case the module is safe from
+	 *   removal because a reference is being held on the module by the
+	 *   kernel, and iirc kmod (and I guess module-init-tools also) will
+	 *   serialize any racing processes properly anyway.
+	 * - During external policy unregistration, because this should happen
+	 *   in a module's exit() function, and any attempts to start a policy
+	 *   instance would need to take a reference on the module, and this is
+	 *   not possible once we have reached the point where the exit()
+	 *   handler is called.
+	 * - During service registration and unregistration, as service setup
+	 *   and cleanup, and policy registration, unregistration and policy
+	 *   instance starting, are serialized by \e nrs_core::nrs_mutex, so
+	 *   as long as users adhere to the convention of registering policies
+	 *   in init() and unregistering them in module exit() functions, there
+	 *   should not be a race between these operations.
+	 * - During any policy-specific lprocfs operations, because a reference
+	 *   is held by the kernel on a proc entry that has been entered by a
+	 *   syscall, so as long as proc entries are removed during
+	 *   unregistration time, then unregistration and lprocfs operations
+	 *   will be properly serialized.
+	 */
+	struct module			       *pd_owner;
+	/**
+	 * Bitmask of \e nrs_policy_flags
+	 */
+	unsigned int				pd_flags;
+	/**
+	 * # of references on this descriptor
+	 */
+	atomic_t				pd_refs;
+};
+
+/**
+ * NRS policy state
+ *
+ * Policies transition from one state to the other during their lifetime
+ */
+enum ptlrpc_nrs_pol_state {
+	/**
+	 * Not a valid policy state.
+	 */
+	NRS_POL_STATE_INVALID,
+	/**
+	 * Policies are at this state either at the start of their life, or
+	 * transition here when the user selects a different policy to act
+	 * as the primary one.
+	 */
+	NRS_POL_STATE_STOPPED,
+	/**
+	 * Policy is in progress of stopping
+	 */
+	NRS_POL_STATE_STOPPING,
+	/**
+	 * Policy is in progress of starting
+	 */
+	NRS_POL_STATE_STARTING,
+	/**
+	 * A policy is in this state in two cases:
+	 * - it is the fallback policy, which is always in this state.
+	 * - it has been activated by the user; i.e. it is the primary policy.
+	 */
+	NRS_POL_STATE_STARTED,
+};
+
+/**
+ * NRS policy information
+ *
+ * Used for obtaining information for the status of a policy via lprocfs
+ */
+struct ptlrpc_nrs_pol_info {
+	/**
+	 * Policy name
+	 */
+	char				pi_name[NRS_POL_NAME_MAX];
+	/**
+	 * Policy argument
+	 */
+	char				pi_arg[NRS_POL_ARG_MAX];
+	/**
+	 * Current policy state
+	 */
+	enum ptlrpc_nrs_pol_state	pi_state;
+	/**
+	 * # RPCs enqueued for later dispatching by the policy
+	 */
+	long				pi_req_queued;
+	/**
+	 * # RPCs started for dispatch by the policy
+	 */
+	long				pi_req_started;
+	/**
+	 * Is this a fallback policy?
+	 */
+	unsigned			pi_fallback:1;
+};
+
+/**
+ * NRS policy
+ *
+ * There is one instance of this for each policy in each NRS head of each
+ * PTLRPC service partition.
+ */
+struct ptlrpc_nrs_policy {
+	/**
+	 * Linkage into the NRS head's list of policies,
+	 * ptlrpc_nrs:nrs_policy_list
+	 */
+	struct list_head		pol_list;
+	/**
+	 * Linkage into the NRS head's list of policies with enqueued
+	 * requests ptlrpc_nrs:nrs_policy_queued
+	 */
+	struct list_head		pol_list_queued;
+	/**
+	 * Current state of this policy
+	 */
+	enum ptlrpc_nrs_pol_state	pol_state;
+	/**
+	 * Bitmask of nrs_policy_flags
+	 */
+	unsigned int			pol_flags;
+	/**
+	 * # RPCs enqueued for later dispatching by the policy
+	 */
+	long				pol_req_queued;
+	/**
+	 * # RPCs started for dispatch by the policy
+	 */
+	long				pol_req_started;
+	/**
+	 * Usage Reference count taken on the policy instance
+	 */
+	long				pol_ref;
+	/**
+	 * Human-readable policy argument
+	 */
+	char				pol_arg[NRS_POL_ARG_MAX];
+	/**
+	 * The NRS head this policy has been created at
+	 */
+	struct ptlrpc_nrs	       *pol_nrs;
+	/**
+	 * Private policy data; varies by policy type
+	 */
+	void			       *pol_private;
+	/**
+	 * Policy descriptor for this policy instance.
+	 */
+	struct ptlrpc_nrs_pol_desc     *pol_desc;
+};
+
+/**
+ * NRS resource
+ *
+ * Resources are embedded into two types of NRS entities:
+ * - Inside NRS policies, in the policy's private data in
+ *   ptlrpc_nrs_policy::pol_private
+ * - In objects that act as prime-level scheduling entities in different NRS
+ *   policies; e.g. on a policy that performs round robin or similar order
+ *   scheduling across client NIDs, there would be one NRS resource per unique
+ *   client NID. On a policy which performs round robin scheduling across
+ *   backend filesystem objects, there would be one resource associated with
+ *   each of the backend filesystem objects partaking in the scheduling
+ *   performed by the policy.
+ *
+ * NRS resources share a parent-child relationship, in which resources embedded
+ * in policy instances are the parent entities, with all scheduling entities
+ * a policy schedules across being the children, thus forming a simple resource
+ * hierarchy. This hierarchy may be extended with one or more levels in the
+ * future if the ability to have more than one primary policy is added.
+ *
+ * Upon request initialization, references to the then active NRS policies are
+ * taken and used to later handle the dispatching of the request with one of
+ * these policies.
+ *
+ * \see nrs_resource_get_safe()
+ * \see ptlrpc_nrs_req_add()
+ */
+struct ptlrpc_nrs_resource {
+	/**
+	 * This NRS resource's parent; is NULL for resources embedded in NRS
+	 * policy instances; i.e. those are top-level ones.
+	 */
+	struct ptlrpc_nrs_resource     *res_parent;
+	/**
+	 * The policy associated with this resource.
+	 */
+	struct ptlrpc_nrs_policy       *res_policy;
+};
+
+enum {
+	NRS_RES_FALLBACK,
+	NRS_RES_PRIMARY,
+	NRS_RES_MAX
+};
+
+#include "lustre_nrs_fifo.h"
+
+/**
+ * NRS request
+ *
+ * Instances of this object exist embedded within ptlrpc_request; the main
+ * purpose of this object is to hold references to the request's resources
+ * for the lifetime of the request, and to hold properties that policies use
+ * for determining the request's scheduling priority.
+ **/
+struct ptlrpc_nrs_request {
+	/**
+	 * The request's resource hierarchy.
+	 */
+	struct ptlrpc_nrs_resource     *nr_res_ptrs[NRS_RES_MAX];
+	/**
+	 * Index into ptlrpc_nrs_request::nr_res_ptrs of the resource of the
+	 * policy that was used to enqueue the request.
+	 *
+	 * \see nrs_request_enqueue()
+	 */
+	unsigned int			nr_res_idx;
+	unsigned int			nr_initialized:1;
+	unsigned int			nr_enqueued:1;
+	unsigned int			nr_started:1;
+	unsigned int			nr_finalized:1;
+
+	/**
+	 * Policy-specific fields, used for determining a request's scheduling
+	 * priority, and other supporting functionality.
+	 */
+	union {
+		/**
+		 * Fields for the FIFO policy
+		 */
+		struct nrs_fifo_req	fifo;
+	} nr_u;
+	/**
+	 * Externally-registering policies may want to use this to allocate
+	 * their own request properties.
+	 */
+	void			       *ext;
+};
+
+/** @} nrs */
+#endif
diff --git a/drivers/staging/lustre/lustre/include/lustre_nrs_fifo.h b/drivers/staging/lustre/lustre/include/lustre_nrs_fifo.h
new file mode 100644
index 0000000..3b5418e
--- /dev/null
+++ b/drivers/staging/lustre/lustre/include/lustre_nrs_fifo.h
@@ -0,0 +1,70 @@
+/*
+ * GPL HEADER START
+ *
+ * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 only,
+ * as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License version 2 for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * version 2 along with this program; If not, see
+ * http://www.gnu.org/licenses/gpl-2.0.html
+ *
+ * GPL HEADER END
+ */
+/*
+ * Copyright (c) 2014, Intel Corporation.
+ *
+ * Copyright 2012 Xyratex Technology Limited
+ */
+/*
+ *
+ * Network Request Scheduler (NRS) First-in First-out (FIFO) policy
+ *
+ */
+
+#ifndef _LUSTRE_NRS_FIFO_H
+#define _LUSTRE_NRS_FIFO_H
+
+/* \name fifo
+ *
+ * FIFO policy
+ *
+ * This policy is a logical wrapper around previous, non-NRS functionality.
+ * It dispatches RPCs in the same order as they arrive from the network. This
+ * policy is currently used as the fallback policy, and the only enabled policy
+ * on all NRS heads of all PTLRPC service partitions.
+ * @{
+ */
+
+/**
+ * Private data structure for the FIFO policy
+ */
+struct nrs_fifo_head {
+	/**
+	 * Resource object for policy instance.
+	 */
+	struct ptlrpc_nrs_resource	fh_res;
+	/**
+	 * List of queued requests.
+	 */
+	struct list_head		fh_list;
+	/**
+	 * For debugging purposes.
+	 */
+	__u64				fh_sequence;
+};
+
+struct nrs_fifo_req {
+	struct list_head	fr_list;
+	__u64			fr_sequence;
+};
+
+/** @} fifo */
+#endif
-- 
1.7.1
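
The FIFO comment in lustre_nrs_fifo.h says the policy dispatches RPCs in
arrival order and keeps a sequence counter for debugging. A minimal
user-space sketch of that behaviour follows. All toy_* names are
hypothetical stand-ins, not the kernel types: a singly linked queue replaces
struct list_head, and removal is folded into the "get" step, whereas the
header splits it into op_req_get and op_req_dequeue.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct nrs_fifo_req. */
struct toy_fifo_req {
	struct toy_fifo_req *next;
	uint64_t seq;			/* plays the role of fr_sequence */
};

/* Hypothetical stand-in for struct nrs_fifo_head. */
struct toy_fifo_head {
	struct toy_fifo_req *first;	/* oldest queued request */
	struct toy_fifo_req *last;	/* newest queued request */
	uint64_t sequence;		/* plays the role of fh_sequence */
};

/* Stamp the request with the next sequence number and append it. */
static void toy_fifo_enqueue(struct toy_fifo_head *h, struct toy_fifo_req *r)
{
	r->seq = h->sequence++;
	r->next = NULL;
	if (h->last)
		h->last->next = r;
	else
		h->first = r;
	h->last = r;
}

/* peek == true: return the oldest request but leave it queued. */
static struct toy_fifo_req *toy_fifo_get(struct toy_fifo_head *h, bool peek)
{
	struct toy_fifo_req *r = h->first;

	if (r && !peek) {
		h->first = r->next;
		if (!h->first)
			h->last = NULL;
		r->next = NULL;
	}
	return r;
}

/* Returns 1 if three requests come back in arrival order, else 0. */
static int toy_fifo_demo(void)
{
	struct toy_fifo_head h = { NULL, NULL, 0 };
	struct toy_fifo_req a, b, c;

	toy_fifo_enqueue(&h, &a);
	toy_fifo_enqueue(&h, &b);
	toy_fifo_enqueue(&h, &c);

	if (toy_fifo_get(&h, true) != &a)	/* peeking must not remove */
		return 0;

	return toy_fifo_get(&h, false) == &a &&
	       toy_fifo_get(&h, false) == &b &&
	       toy_fifo_get(&h, false) == &c &&
	       toy_fifo_get(&h, false) == NULL &&
	       a.seq == 0 && b.seq == 1 && c.seq == 2;
}
```

Calling toy_fifo_demo() exercises enqueue, peek and drain in one pass; the
sequence numbers only record arrival order and play no part in scheduling.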

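The "NRS head" comment in lustre_nrs.h describes trying the primary policy
first and falling back when there is no started primary or when the primary
declines the request. The sketch below models only that ordering in
user-space C. Every toy_* name is an assumption made for illustration, and
resource lookup and enqueueing are collapsed into a single step; it is not
the ptlrpc implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the NRS_RES_FALLBACK / NRS_RES_PRIMARY / NRS_RES_MAX ordering. */
enum { TOY_RES_FALLBACK, TOY_RES_PRIMARY, TOY_RES_MAX };

/* Hypothetical, heavily reduced policy instance. */
struct toy_policy {
	bool started;	/* stands in for NRS_POL_STATE_STARTED */
	bool accepts;	/* whether the policy is willing to take the request */
	int queued;	/* number of requests enqueued so far */
};

/* Returns 0 on success, negative if the policy declines. */
static int toy_enqueue(struct toy_policy *pol)
{
	if (!pol->accepts)
		return -1;
	pol->queued++;
	return 0;
}

/* Returns the index of the policy that took the request. */
static int toy_request_enqueue(struct toy_policy *primary,
			       struct toy_policy *fallback)
{
	struct toy_policy *slot[TOY_RES_MAX] = { NULL, NULL };
	int i;

	/* The fallback policy is always started and never declines. */
	assert(fallback && fallback->started && fallback->accepts);

	slot[TOY_RES_FALLBACK] = fallback;
	if (primary && primary->started)
		slot[TOY_RES_PRIMARY] = primary;

	/* Highest index first: primary (if usable), then fallback. */
	for (i = TOY_RES_MAX - 1; i >= 0; i--)
		if (slot[i] && toy_enqueue(slot[i]) == 0)
			return i;

	return -1;	/* not reached while the fallback accepts */
}

/* Builds one head configuration and reports which policy was used. */
static int toy_demo(bool have_primary, bool started, bool accepts)
{
	struct toy_policy fifo = { true, true, 0 };
	struct toy_policy prim = { started, accepts, 0 };

	return toy_request_enqueue(have_primary ? &prim : NULL, &fifo);
}
```

Only toy_demo(true, true, true) selects the primary slot; a missing, stopped
or declining primary all end up on the fallback, which matches the three
cases listed in the comment.
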
* [PATCH 30/41] staging: lustre: quota: remove obsolete quota code
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Niu Yawei,
	James Simmons

From: Niu Yawei <yawei.niu@intel.com>

Remove the obsolete quotacheck, quotaon and quotaoff code, which
was retained for interoperability with old (< 2.4) clients and
servers.

Other obsolete quota code related to LL_IOC_QUOTACTL_18,
Q_INVALIDATE and Q_FINVALIDATE is removed as well.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5975
Reviewed-on: http://review.whamcloud.com/14705
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
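Note that the opcode enumerators are only annotated "not used since 2.4",
not deleted. The commit message does not give the reason; one likely
explanation (an inference, not something the patch states) is that
enumerators without an explicit value would be renumbered. The sketch below
copies enum obd_cmd from the lustre_idl.h hunk and compares it with a
hypothetical pruned variant; the PRUNED_* names and obd_idx_read_shift()
exist only for this illustration.

```c
#include <assert.h>

/* Enumerators as they appear in the lustre_idl.h hunk. */
enum obd_cmd {
	OBD_PING = 400,
	OBD_LOG_CANCEL,
	OBD_QC_CALLBACK,	/* not used since 2.4 */
	OBD_IDX_READ,
	OBD_LAST_OPC
};

/* Hypothetical variant with the obsolete enumerator deleted outright. */
enum obd_cmd_pruned {
	PRUNED_OBD_PING = 400,
	PRUNED_OBD_LOG_CANCEL,
	PRUNED_OBD_IDX_READ,
	PRUNED_OBD_LAST_OPC
};

/* How far OBD_IDX_READ would move if OBD_QC_CALLBACK were deleted. */
static int obd_idx_read_shift(void)
{
	return OBD_IDX_READ - PRUNED_OBD_IDX_READ;
}
```

With the placeholder kept, OBD_IDX_READ stays at 403; in the pruned variant
it would become 402, so peers built from the two headers would disagree on
the opcode.
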
 .../lustre/lustre/include/lustre/lustre_idl.h      |    6 +-
 .../lustre/lustre/include/lustre/lustre_ioctl.h    |    4 +-
 .../lustre/lustre/include/lustre/lustre_user.h     |   13 +---
 .../lustre/lustre/include/lustre_req_layout.h      |    3 -
 drivers/staging/lustre/lustre/include/obd.h        |    9 ---
 drivers/staging/lustre/lustre/include/obd_class.h  |   12 ----
 .../staging/lustre/lustre/include/obd_support.h    |    6 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_lib.c      |    2 -
 drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c    |   24 -------
 drivers/staging/lustre/lustre/llite/dir.c          |   65 --------------------
 drivers/staging/lustre/lustre/lmv/lmv_obd.c        |   31 +---------
 drivers/staging/lustre/lustre/lov/lov_obd.c        |   54 +----------------
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |   50 ---------------
 drivers/staging/lustre/lustre/osc/osc_internal.h   |    3 -
 drivers/staging/lustre/lustre/osc/osc_quota.c      |   44 -------------
 drivers/staging/lustre/lustre/osc/osc_request.c    |    4 -
 drivers/staging/lustre/lustre/ptlrpc/layout.c      |   15 -----
 17 files changed, 15 insertions(+), 330 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index 17feb71..7645ed9 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -1406,7 +1406,7 @@ enum ost_cmd {
 	OST_STATFS     = 13,
 	OST_SYNC       = 16,
 	OST_SET_INFO   = 17,
-	OST_QUOTACHECK = 18,
+	OST_QUOTACHECK = 18, /* not used since 2.4 */
 	OST_QUOTACTL   = 19,
 	OST_QUOTA_ADJUST_QUNIT = 20, /* not used since 2.4 */
 	OST_LAST_OPC
@@ -1925,7 +1925,7 @@ enum mds_cmd {
 	MDS_SYNC		= 44,
 	MDS_DONE_WRITING	= 45, /* obsolete since 2.8.0 */
 	MDS_SET_INFO		= 46,
-	MDS_QUOTACHECK		= 47,
+	MDS_QUOTACHECK		= 47, /* not used since 2.4 */
 	MDS_QUOTACTL		= 48,
 	MDS_GETXATTR		= 49,
 	MDS_SETXATTR		= 50, /* obsolete, now it's MDS_REINT op */
@@ -2889,7 +2889,7 @@ void lustre_swab_cfg_marker(struct cfg_marker *marker, int swab, int size);
 enum obd_cmd {
 	OBD_PING = 400,
 	OBD_LOG_CANCEL,
-	OBD_QC_CALLBACK,
+	OBD_QC_CALLBACK, /* not used since 2.4 */
 	OBD_IDX_READ,
 	OBD_LAST_OPC
 };
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_ioctl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_ioctl.h
index f3d7c94..eb08df3 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_ioctl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_ioctl.h
@@ -363,8 +363,8 @@ obd_ioctl_unpack(struct obd_ioctl_data *data, char *pbuf, int max_len)
 /*	OBD_IOC_LOV_GETSTRIPE	155 LL_IOC_LOV_GETSTRIPE */
 /*	OBD_IOC_LOV_SETEA	156 LL_IOC_LOV_SETEA */
 /*	lustre/lustre_user.h	157-159 */
-#define	OBD_IOC_QUOTACHECK	_IOW('f', 160, int)
-#define	OBD_IOC_POLL_QUOTACHECK	_IOR('f', 161, struct if_quotacheck *)
+/*	OBD_IOC_QUOTACHECK	_IOW('f', 160, int) */
+/*	OBD_IOC_POLL_QUOTACHECK	_IOR('f', 161, struct if_quotacheck *) */
 #define OBD_IOC_QUOTACTL	_IOWR('f', 162, struct if_quotactl)
 /*	lustre/lustre_user.h	163-176 */
 #define OBD_IOC_CHANGELOG_REG	_IOW('f', 177, struct obd_ioctl_data)
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index 80fecba..856e2f9 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -557,23 +557,18 @@ static inline void obd_uuid2fsname(char *buf, char *uuid, int buflen)
 #define Q_FINVALIDATE  0x800104 /* deprecated as of 2.4 */
 
 /* these must be explicitly translated into linux Q_* in ll_dir_ioctl */
-#define LUSTRE_Q_QUOTAON    0x800002     /* turn quotas on */
-#define LUSTRE_Q_QUOTAOFF   0x800003     /* turn quotas off */
+#define LUSTRE_Q_QUOTAON    0x800002	/* deprecated as of 2.4 */
+#define LUSTRE_Q_QUOTAOFF   0x800003	/* deprecated as of 2.4 */
 #define LUSTRE_Q_GETINFO    0x800005     /* get information about quota files */
 #define LUSTRE_Q_SETINFO    0x800006     /* set information about quota files */
 #define LUSTRE_Q_GETQUOTA   0x800007     /* get user quota structure */
 #define LUSTRE_Q_SETQUOTA   0x800008     /* set user quota structure */
 /* lustre-specific control commands */
-#define LUSTRE_Q_INVALIDATE  0x80000b     /* invalidate quota data */
-#define LUSTRE_Q_FINVALIDATE 0x80000c     /* invalidate filter quota data */
+#define LUSTRE_Q_INVALIDATE  0x80000b	/* deprecated as of 2.4 */
+#define LUSTRE_Q_FINVALIDATE 0x80000c	/* deprecated as of 2.4 */
 
 #define UGQUOTA 2       /* set both USRQUOTA and GRPQUOTA */
 
-struct if_quotacheck {
-	char		    obd_type[16];
-	struct obd_uuid	 obd_uuid;
-};
-
 #define IDENTITY_DOWNCALL_MAGIC 0x6d6dd629
 
 /* permission */
diff --git a/drivers/staging/lustre/lustre/include/lustre_req_layout.h b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
index dd8717e..78857b3 100644
--- a/drivers/staging/lustre/lustre/include/lustre_req_layout.h
+++ b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
@@ -165,9 +165,7 @@ extern struct req_format RQF_MDS_REINT_LINK;
 extern struct req_format RQF_MDS_REINT_RENAME;
 extern struct req_format RQF_MDS_REINT_SETATTR;
 extern struct req_format RQF_MDS_REINT_SETXATTR;
-extern struct req_format RQF_MDS_QUOTACHECK;
 extern struct req_format RQF_MDS_QUOTACTL;
-extern struct req_format RQF_QC_CALLBACK;
 extern struct req_format RQF_MDS_SWAP_LAYOUTS;
 /* MDS hsm formats */
 extern struct req_format RQF_MDS_HSM_STATE_GET;
@@ -180,7 +178,6 @@ extern struct req_format RQF_MDS_HSM_REQUEST;
 /* OST req_format */
 extern struct req_format RQF_OST_CONNECT;
 extern struct req_format RQF_OST_DISCONNECT;
-extern struct req_format RQF_OST_QUOTACHECK;
 extern struct req_format RQF_OST_QUOTACTL;
 extern struct req_format RQF_OST_GETATTR;
 extern struct req_format RQF_OST_SETATTR;
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index 8372deb..2811901 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -345,13 +345,6 @@ struct client_obd {
 	/* also protected by the poorly named _loi_list_lock lock above */
 	struct osc_async_rc      cl_ar;
 
-	/* used by quotacheck when the servers are older than 2.4 */
-	int		      cl_qchk_stat; /* quotacheck stat of the peer */
-#define CL_NOT_QUOTACHECKED 1   /* client->cl_qchk_stat init value */
-#if OBD_OCD_VERSION(2, 7, 53, 0) < LUSTRE_VERSION_CODE
-#warning "please consider removing quotacheck compatibility code"
-#endif
-
 	/* sequence manager */
 	struct lu_client_seq    *cl_seq;
 
@@ -930,8 +923,6 @@ struct obd_ops {
 	struct obd_uuid *(*get_uuid)(struct obd_export *exp);
 
 	/* quota methods */
-	int (*quotacheck)(struct obd_device *, struct obd_export *,
-			  struct obd_quotactl *);
 	int (*quotactl)(struct obd_device *, struct obd_export *,
 			struct obd_quotactl *);
 
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index cb7160e..8f1d681 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -1163,18 +1163,6 @@ static inline int obd_notify_observer(struct obd_device *observer,
 	return rc1 ? rc1 : rc2;
 }
 
-static inline int obd_quotacheck(struct obd_export *exp,
-				 struct obd_quotactl *oqctl)
-{
-	int rc;
-
-	EXP_CHECK_DT_OP(exp, quotacheck);
-	EXP_COUNTER_INCREMENT(exp, quotacheck);
-
-	rc = OBP(exp->exp_obd, quotacheck)(exp->exp_obd, exp, oqctl);
-	return rc;
-}
-
 static inline int obd_quotactl(struct obd_export *exp,
 			       struct obd_quotactl *oqctl)
 {
diff --git a/drivers/staging/lustre/lustre/include/obd_support.h b/drivers/staging/lustre/lustre/include/obd_support.h
index 9d2d6f8..1233c34 100644
--- a/drivers/staging/lustre/lustre/include/obd_support.h
+++ b/drivers/staging/lustre/lustre/include/obd_support.h
@@ -179,7 +179,7 @@ extern char obd_jobid_var[];
 #define OBD_FAIL_MDS_STATFS_LCW_SLEEP    0x12a
 #define OBD_FAIL_MDS_OPEN_CREATE	 0x12b
 #define OBD_FAIL_MDS_OST_SETATTR	 0x12c
-#define OBD_FAIL_MDS_QUOTACHECK_NET      0x12d
+/*	OBD_FAIL_MDS_QUOTACHECK_NET      0x12d obsolete since 2.4 */
 #define OBD_FAIL_MDS_QUOTACTL_NET	0x12e
 #define OBD_FAIL_MDS_CLIENT_ADD	  0x12f
 #define OBD_FAIL_MDS_GETXATTR_NET	0x130
@@ -264,7 +264,7 @@ extern char obd_jobid_var[];
 #define OBD_FAIL_OST_ENOSPC	      0x215
 #define OBD_FAIL_OST_EROFS	       0x216
 #define OBD_FAIL_OST_ENOENT	      0x217
-#define OBD_FAIL_OST_QUOTACHECK_NET      0x218
+/*	OBD_FAIL_OST_QUOTACHECK_NET      0x218 obsolete since 2.4 */
 #define OBD_FAIL_OST_QUOTACTL_NET	0x219
 #define OBD_FAIL_OST_CHECKSUM_RECEIVE    0x21a
 #define OBD_FAIL_OST_CHECKSUM_SEND       0x21b
@@ -373,7 +373,7 @@ extern char obd_jobid_var[];
 #define OBD_FAIL_OBD_PING_NET	    0x600
 #define OBD_FAIL_OBD_LOG_CANCEL_NET      0x601
 #define OBD_FAIL_OBD_LOGD_NET	    0x602
-#define OBD_FAIL_OBD_QC_CALLBACK_NET     0x603
+/*	OBD_FAIL_OBD_QC_CALLBACK_NET     0x603 obsolete since 2.4 */
 #define OBD_FAIL_OBD_DQACQ	       0x604
 #define OBD_FAIL_OBD_LLOG_SETUP	  0x605
 #define OBD_FAIL_OBD_LOG_CANCEL_REP      0x606
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
index 153e990..f3128b6 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
@@ -425,8 +425,6 @@ int client_obd_setup(struct obd_device *obddev, struct lustre_cfg *lcfg)
 		goto err_import;
 	}
 
-	cli->cl_qchk_stat = CL_NOT_QUOTACHECKED;
-
 	return rc;
 
 err_import:
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
index c32b414..12647af 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
@@ -511,23 +511,6 @@ static inline void ldlm_callback_errmsg(struct ptlrpc_request *req,
 		CWARN("Send reply failed, maybe cause bug 21636.\n");
 }
 
-static int ldlm_handle_qc_callback(struct ptlrpc_request *req)
-{
-	struct obd_quotactl *oqctl;
-	struct client_obd *cli = &req->rq_export->exp_obd->u.cli;
-
-	oqctl = req_capsule_client_get(&req->rq_pill, &RMF_OBD_QUOTACTL);
-	if (!oqctl) {
-		CERROR("Can't unpack obd_quotactl\n");
-		return -EPROTO;
-	}
-
-	oqctl->qc_stat = ptlrpc_status_ntoh(oqctl->qc_stat);
-
-	cli->cl_qchk_stat = oqctl->qc_stat;
-	return 0;
-}
-
 /* TODO: handle requests in a similar way as MDT: see mdt_handle_common() */
 static int ldlm_callback_handler(struct ptlrpc_request *req)
 {
@@ -577,13 +560,6 @@ static int ldlm_callback_handler(struct ptlrpc_request *req)
 		rc = ldlm_handle_setinfo(req);
 		ldlm_callback_reply(req, rc);
 		return 0;
-	case OBD_QC_CALLBACK:
-		req_capsule_set(&req->rq_pill, &RQF_QC_CALLBACK);
-		if (OBD_FAIL_CHECK(OBD_FAIL_OBD_QC_CALLBACK_NET))
-			return 0;
-		rc = ldlm_handle_qc_callback(req);
-		ldlm_callback_reply(req, rc);
-		return 0;
 	default:
 		CERROR("unknown opcode %u\n",
 		       lustre_msg_get_opc(req->rq_reqmsg));
diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index 360d97f..929c32c 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -861,10 +861,6 @@ static int quotactl_ioctl(struct ll_sb_info *sbi, struct if_quotactl *qctl)
 	int rc = 0;
 
 	switch (cmd) {
-	case LUSTRE_Q_INVALIDATE:
-	case LUSTRE_Q_FINVALIDATE:
-	case Q_QUOTAON:
-	case Q_QUOTAOFF:
 	case Q_SETQUOTA:
 	case Q_SETINFO:
 		if (!capable(CFS_CAP_SYS_ADMIN))
@@ -929,10 +925,6 @@ static int quotactl_ioctl(struct ll_sb_info *sbi, struct if_quotactl *qctl)
 		QCTL_COPY(oqctl, qctl);
 		rc = obd_quotactl(sbi->ll_md_exp, oqctl);
 		if (rc) {
-			if (rc != -EALREADY && cmd == Q_QUOTAON) {
-				oqctl->qc_cmd = Q_QUOTAOFF;
-				obd_quotactl(sbi->ll_md_exp, oqctl);
-			}
 			kfree(oqctl);
 			return rc;
 		}
@@ -1369,63 +1361,6 @@ out_req:
 			ll_putname(filename);
 		return rc;
 	}
-	case OBD_IOC_QUOTACHECK: {
-		struct obd_quotactl *oqctl;
-		int error = 0;
-
-		if (!capable(CFS_CAP_SYS_ADMIN))
-			return -EPERM;
-
-		oqctl = kzalloc(sizeof(*oqctl), GFP_NOFS);
-		if (!oqctl)
-			return -ENOMEM;
-		oqctl->qc_type = arg;
-		rc = obd_quotacheck(sbi->ll_md_exp, oqctl);
-		if (rc < 0) {
-			CDEBUG(D_INFO, "md_quotacheck failed: rc %d\n", rc);
-			error = rc;
-		}
-
-		rc = obd_quotacheck(sbi->ll_dt_exp, oqctl);
-		if (rc < 0)
-			CDEBUG(D_INFO, "obd_quotacheck failed: rc %d\n", rc);
-
-		kfree(oqctl);
-		return error ?: rc;
-	}
-	case OBD_IOC_POLL_QUOTACHECK: {
-		struct if_quotacheck *check;
-
-		if (!capable(CFS_CAP_SYS_ADMIN))
-			return -EPERM;
-
-		check = kzalloc(sizeof(*check), GFP_NOFS);
-		if (!check)
-			return -ENOMEM;
-
-		rc = obd_iocontrol(cmd, sbi->ll_md_exp, 0, (void *)check,
-				   NULL);
-		if (rc) {
-			CDEBUG(D_QUOTA, "mdc ioctl %d failed: %d\n", cmd, rc);
-			if (copy_to_user((void __user *)arg, check,
-					 sizeof(*check)))
-				CDEBUG(D_QUOTA, "copy_to_user failed\n");
-			goto out_poll;
-		}
-
-		rc = obd_iocontrol(cmd, sbi->ll_dt_exp, 0, (void *)check,
-				   NULL);
-		if (rc) {
-			CDEBUG(D_QUOTA, "osc ioctl %d failed: %d\n", cmd, rc);
-			if (copy_to_user((void __user *)arg, check,
-					 sizeof(*check)))
-				CDEBUG(D_QUOTA, "copy_to_user failed\n");
-			goto out_poll;
-		}
-out_poll:
-		kfree(check);
-		return rc;
-	}
 	case OBD_IOC_QUOTACTL: {
 		struct if_quotactl *qctl;
 
diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index 7bd7e15..67969a8 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -1128,9 +1128,7 @@ static int lmv_iocontrol(unsigned int cmd, struct obd_export *exp,
 			mdc_obd = class_exp2obd(tgt->ltd_exp);
 			mdc_obd->obd_force = obddev->obd_force;
 			err = obd_iocontrol(cmd, tgt->ltd_exp, len, karg, uarg);
-			if (err == -ENODATA && cmd == OBD_IOC_POLL_QUOTACHECK) {
-				return err;
-			} else if (err) {
+			if (err) {
 				if (tgt->ltd_active) {
 					CERROR("error: iocontrol MDC %s on MDTidx %d cmd %x: err = %d\n",
 					       tgt->ltd_uuid.uuid, i, cmd, err);
@@ -3243,32 +3241,6 @@ static int lmv_quotactl(struct obd_device *unused, struct obd_export *exp,
 	return rc;
 }
 
-static int lmv_quotacheck(struct obd_device *unused, struct obd_export *exp,
-			  struct obd_quotactl *oqctl)
-{
-	struct obd_device   *obd = class_exp2obd(exp);
-	struct lmv_obd      *lmv = &obd->u.lmv;
-	struct lmv_tgt_desc *tgt;
-	int rc = 0;
-	u32 i;
-
-	for (i = 0; i < lmv->desc.ld_tgt_count; i++) {
-		int err;
-
-		tgt = lmv->tgts[i];
-		if (!tgt || !tgt->ltd_exp || !tgt->ltd_active) {
-			CERROR("lmv idx %d inactive\n", i);
-			return -EIO;
-		}
-
-		err = obd_quotacheck(tgt->ltd_exp, oqctl);
-		if (err && !rc)
-			rc = err;
-	}
-
-	return rc;
-}
-
 static int lmv_merge_attr(struct obd_export *exp,
 			  const struct lmv_stripe_md *lsm,
 			  struct cl_attr *attr,
@@ -3326,7 +3298,6 @@ static struct obd_ops lmv_obd_ops = {
 	.notify		= lmv_notify,
 	.get_uuid	= lmv_get_uuid,
 	.iocontrol	= lmv_iocontrol,
-	.quotacheck	= lmv_quotacheck,
 	.quotactl	= lmv_quotactl
 };
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index 019fe95..c2a853c 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -1217,8 +1217,6 @@ static int lov_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 			osc_obd->obd_force = obddev->obd_force;
 			err = obd_iocontrol(cmd, lov->lov_tgts[i]->ltd_exp,
 					    len, karg, uarg);
-			if (err == -ENODATA && cmd == OBD_IOC_POLL_QUOTACHECK)
-				return err;
 			if (err) {
 				if (lov->lov_tgts[i]->ltd_active) {
 					CDEBUG(err == -ENOTTY ?
@@ -1355,12 +1353,8 @@ static int lov_quotactl(struct obd_device *obd, struct obd_export *exp,
 	__u64		bhardlimit = 0;
 	int		  i, rc = 0;
 
-	if (oqctl->qc_cmd != LUSTRE_Q_QUOTAON &&
-	    oqctl->qc_cmd != LUSTRE_Q_QUOTAOFF &&
-	    oqctl->qc_cmd != Q_GETOQUOTA &&
-	    oqctl->qc_cmd != Q_INITQUOTA &&
-	    oqctl->qc_cmd != LUSTRE_Q_SETQUOTA &&
-	    oqctl->qc_cmd != Q_FINVALIDATE) {
+	if (oqctl->qc_cmd != Q_GETOQUOTA &&
+	    oqctl->qc_cmd != LUSTRE_Q_SETQUOTA) {
 		CERROR("bad quota opc %x for lov obd\n", oqctl->qc_cmd);
 		return -EFAULT;
 	}
@@ -1407,49 +1401,6 @@ static int lov_quotactl(struct obd_device *obd, struct obd_export *exp,
 	return rc;
 }
 
-static int lov_quotacheck(struct obd_device *obd, struct obd_export *exp,
-			  struct obd_quotactl *oqctl)
-{
-	struct lov_obd *lov = &obd->u.lov;
-	int	     i, rc = 0;
-
-	obd_getref(obd);
-
-	for (i = 0; i < lov->desc.ld_tgt_count; i++) {
-		if (!lov->lov_tgts[i])
-			continue;
-
-		/* Skip quota check on the administratively disabled OSTs. */
-		if (!lov->lov_tgts[i]->ltd_activate) {
-			CWARN("lov idx %d was administratively disabled, skip quotacheck on it.\n",
-			      i);
-			continue;
-		}
-
-		if (!lov->lov_tgts[i]->ltd_active) {
-			CERROR("lov idx %d inactive\n", i);
-			rc = -EIO;
-			goto out;
-		}
-	}
-
-	for (i = 0; i < lov->desc.ld_tgt_count; i++) {
-		int err;
-
-		if (!lov->lov_tgts[i] || !lov->lov_tgts[i]->ltd_activate)
-			continue;
-
-		err = obd_quotacheck(lov->lov_tgts[i]->ltd_exp, oqctl);
-		if (err && !rc)
-			rc = err;
-	}
-
-out:
-	obd_putref(obd);
-
-	return rc;
-}
-
 static struct obd_ops lov_obd_ops = {
 	.owner          = THIS_MODULE,
 	.setup          = lov_setup,
@@ -1473,7 +1424,6 @@ static struct obd_ops lov_obd_ops = {
 	.getref         = lov_getref,
 	.putref         = lov_putref,
 	.quotactl       = lov_quotactl,
-	.quotacheck     = lov_quotacheck,
 };
 
 struct kmem_cache *lov_oinfo_slab;
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index 1d1eaa5..b62b29f 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -1929,52 +1929,6 @@ static int mdc_ioc_changelog_send(struct obd_device *obd,
 static int mdc_ioc_hsm_ct_start(struct obd_export *exp,
 				struct lustre_kernelcomm *lk);
 
-static int mdc_quotacheck(struct obd_device *unused, struct obd_export *exp,
-			  struct obd_quotactl *oqctl)
-{
-	struct client_obd       *cli = &exp->exp_obd->u.cli;
-	struct ptlrpc_request   *req;
-	struct obd_quotactl     *body;
-	int		      rc;
-
-	req = ptlrpc_request_alloc_pack(class_exp2cliimp(exp),
-					&RQF_MDS_QUOTACHECK, LUSTRE_MDS_VERSION,
-					MDS_QUOTACHECK);
-	if (!req)
-		return -ENOMEM;
-
-	body = req_capsule_client_get(&req->rq_pill, &RMF_OBD_QUOTACTL);
-	*body = *oqctl;
-
-	ptlrpc_request_set_replen(req);
-
-	/* the next poll will find -ENODATA, that means quotacheck is
-	 * going on
-	 */
-	cli->cl_qchk_stat = -ENODATA;
-	rc = ptlrpc_queue_wait(req);
-	if (rc)
-		cli->cl_qchk_stat = rc;
-	ptlrpc_req_finished(req);
-	return rc;
-}
-
-static int mdc_quota_poll_check(struct obd_export *exp,
-				struct if_quotacheck *qchk)
-{
-	struct client_obd *cli = &exp->exp_obd->u.cli;
-	int rc;
-
-	qchk->obd_uuid = cli->cl_target_uuid;
-	memcpy(qchk->obd_type, LUSTRE_MDS_NAME, strlen(LUSTRE_MDS_NAME));
-
-	rc = cli->cl_qchk_stat;
-	/* the client is not the previous one */
-	if (rc == CL_NOT_QUOTACHECKED)
-		rc = -EINTR;
-	return rc;
-}
-
 static int mdc_quotactl(struct obd_device *unused, struct obd_export *exp,
 			struct obd_quotactl *oqctl)
 {
@@ -2129,9 +2083,6 @@ static int mdc_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 	case IOC_OSC_SET_ACTIVE:
 		rc = ptlrpc_set_import_active(imp, data->ioc_offset);
 		goto out;
-	case OBD_IOC_POLL_QUOTACHECK:
-		rc = mdc_quota_poll_check(exp, (struct if_quotacheck *)karg);
-		goto out;
 	case OBD_IOC_PING_TARGET:
 		rc = ptlrpc_obd_ping(obd);
 		goto out;
@@ -2794,7 +2745,6 @@ static struct obd_ops mdc_obd_ops = {
 	.process_config = mdc_process_config,
 	.get_uuid       = mdc_get_uuid,
 	.quotactl       = mdc_quotactl,
-	.quotacheck     = mdc_quotacheck
 };
 
 static struct md_ops mdc_md_ops = {
diff --git a/drivers/staging/lustre/lustre/osc/osc_internal.h b/drivers/staging/lustre/lustre/osc/osc_internal.h
index 64684c4..61bfacb 100644
--- a/drivers/staging/lustre/lustre/osc/osc_internal.h
+++ b/drivers/staging/lustre/lustre/osc/osc_internal.h
@@ -190,9 +190,6 @@ int osc_quota_setdq(struct client_obd *cli, const unsigned int qid[],
 int osc_quota_chkdq(struct client_obd *cli, const unsigned int qid[]);
 int osc_quotactl(struct obd_device *unused, struct obd_export *exp,
 		 struct obd_quotactl *oqctl);
-int osc_quotacheck(struct obd_device *unused, struct obd_export *exp,
-		   struct obd_quotactl *oqctl);
-int osc_quota_poll_check(struct obd_export *exp, struct if_quotacheck *qchk);
 void osc_inc_unstable_pages(struct ptlrpc_request *req);
 void osc_dec_unstable_pages(struct ptlrpc_request *req);
 bool osc_over_unstable_soft_limit(struct client_obd *cli);
diff --git a/drivers/staging/lustre/lustre/osc/osc_quota.c b/drivers/staging/lustre/lustre/osc/osc_quota.c
index 194d8ed..acdd91a 100644
--- a/drivers/staging/lustre/lustre/osc/osc_quota.c
+++ b/drivers/staging/lustre/lustre/osc/osc_quota.c
@@ -281,47 +281,3 @@ int osc_quotactl(struct obd_device *unused, struct obd_export *exp,
 
 	return rc;
 }
-
-int osc_quotacheck(struct obd_device *unused, struct obd_export *exp,
-		   struct obd_quotactl *oqctl)
-{
-	struct client_obd *cli = &exp->exp_obd->u.cli;
-	struct ptlrpc_request *req;
-	struct obd_quotactl *body;
-	int rc;
-
-	req = ptlrpc_request_alloc_pack(class_exp2cliimp(exp),
-					&RQF_OST_QUOTACHECK, LUSTRE_OST_VERSION,
-					OST_QUOTACHECK);
-	if (!req)
-		return -ENOMEM;
-
-	body = req_capsule_client_get(&req->rq_pill, &RMF_OBD_QUOTACTL);
-	*body = *oqctl;
-
-	ptlrpc_request_set_replen(req);
-
-	/* the next poll will find -ENODATA, that means quotacheck is going on
-	 */
-	cli->cl_qchk_stat = -ENODATA;
-	rc = ptlrpc_queue_wait(req);
-	if (rc)
-		cli->cl_qchk_stat = rc;
-	ptlrpc_req_finished(req);
-	return rc;
-}
-
-int osc_quota_poll_check(struct obd_export *exp, struct if_quotacheck *qchk)
-{
-	struct client_obd *cli = &exp->exp_obd->u.cli;
-	int rc;
-
-	qchk->obd_uuid = cli->cl_target_uuid;
-	memcpy(qchk->obd_type, LUSTRE_OST_NAME, strlen(LUSTRE_OST_NAME));
-
-	rc = cli->cl_qchk_stat;
-	/* the client is not the previous one */
-	if (rc == CL_NOT_QUOTACHECKED)
-		rc = -EINTR;
-	return rc;
-}
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index 2e4d2d5..ab7f82d 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -2443,9 +2443,6 @@ static int osc_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 		err = ptlrpc_set_import_active(obd->u.cli.cl_import,
 					       data->ioc_offset);
 		goto out;
-	case OBD_IOC_POLL_QUOTACHECK:
-		err = osc_quota_poll_check(exp, karg);
-		goto out;
 	case OBD_IOC_PING_TARGET:
 		err = ptlrpc_obd_ping(obd);
 		goto out;
@@ -2934,7 +2931,6 @@ static struct obd_ops osc_obd_ops = {
 	.import_event   = osc_import_event,
 	.process_config = osc_process_config,
 	.quotactl       = osc_quotactl,
-	.quotacheck     = osc_quotacheck,
 };
 
 extern struct lu_kmem_descr osc_caches[];
diff --git a/drivers/staging/lustre/lustre/ptlrpc/layout.c b/drivers/staging/lustre/lustre/ptlrpc/layout.c
index 358c124..f0e0448 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/layout.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/layout.c
@@ -680,7 +680,6 @@ static struct req_format *req_formats[] = {
 	&RQF_MDS_REINT_RENAME,
 	&RQF_MDS_REINT_SETATTR,
 	&RQF_MDS_REINT_SETXATTR,
-	&RQF_MDS_QUOTACHECK,
 	&RQF_MDS_QUOTACTL,
 	&RQF_MDS_HSM_PROGRESS,
 	&RQF_MDS_HSM_CT_REGISTER,
@@ -690,10 +689,8 @@ static struct req_format *req_formats[] = {
 	&RQF_MDS_HSM_ACTION,
 	&RQF_MDS_HSM_REQUEST,
 	&RQF_MDS_SWAP_LAYOUTS,
-	&RQF_QC_CALLBACK,
 	&RQF_OST_CONNECT,
 	&RQF_OST_DISCONNECT,
-	&RQF_OST_QUOTACHECK,
 	&RQF_OST_QUOTACTL,
 	&RQF_OST_GETATTR,
 	&RQF_OST_SETATTR,
@@ -1179,14 +1176,6 @@ struct req_format RQF_LOG_CANCEL =
 	DEFINE_REQ_FMT0("OBD_LOG_CANCEL", log_cancel_client, empty);
 EXPORT_SYMBOL(RQF_LOG_CANCEL);
 
-struct req_format RQF_MDS_QUOTACHECK =
-	DEFINE_REQ_FMT0("MDS_QUOTACHECK", quotactl_only, empty);
-EXPORT_SYMBOL(RQF_MDS_QUOTACHECK);
-
-struct req_format RQF_OST_QUOTACHECK =
-	DEFINE_REQ_FMT0("OST_QUOTACHECK", quotactl_only, empty);
-EXPORT_SYMBOL(RQF_OST_QUOTACHECK);
-
 struct req_format RQF_MDS_QUOTACTL =
 	DEFINE_REQ_FMT0("MDS_QUOTACTL", quotactl_only, quotactl_only);
 EXPORT_SYMBOL(RQF_MDS_QUOTACTL);
@@ -1195,10 +1184,6 @@ struct req_format RQF_OST_QUOTACTL =
 	DEFINE_REQ_FMT0("OST_QUOTACTL", quotactl_only, quotactl_only);
 EXPORT_SYMBOL(RQF_OST_QUOTACTL);
 
-struct req_format RQF_QC_CALLBACK =
-	DEFINE_REQ_FMT0("QC_CALLBACK", quotactl_only, empty);
-EXPORT_SYMBOL(RQF_QC_CALLBACK);
-
 struct req_format RQF_MDS_GETSTATUS =
 	DEFINE_REQ_FMT0("MDS_GETSTATUS", mdt_body_only, mdt_body_capa);
 EXPORT_SYMBOL(RQF_MDS_GETSTATUS);
-- 
1.7.1

* [lustre-devel] [PATCH 30/41] staging: lustre: quota: remove obsolete quota code
@ 2016-10-03  2:28   ` James Simmons
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Niu Yawei,
	James Simmons

From: Niu Yawei <yawei.niu@intel.com>

Remove the obsolete quotacheck, quotaon and quotaoff code, which
was retained only for interoperability with old (< 2.4) clients
and servers.

Other obsolete quota code related to LL_IOC_QUOTACTL_18,
Q_INVALIDATE and Q_FINVALIDATE is removed as well.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5975
Reviewed-on: http://review.whamcloud.com/14705
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
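Note for reviewers (not part of the commit): the behavioural effect
on lov_quotactl() can be sketched in user space as below. This is
only an illustration of the narrowed command check in the lov_obd.c
hunk, not the kernel code itself. The helper name
lov_quota_cmd_allowed() is hypothetical, and the value used for
Q_GETOQUOTA is an assumption because this patch does not show it;
the other constants are copied from the lustre_user.h hunk.

```c
#include <errno.h>

/* Constants visible in the lustre_user.h hunk of this patch. */
#define LUSTRE_Q_QUOTAON	0x800002	/* deprecated as of 2.4 */
#define LUSTRE_Q_QUOTAOFF	0x800003	/* deprecated as of 2.4 */
#define LUSTRE_Q_SETQUOTA	0x800008	/* set user quota structure */
#define Q_FINVALIDATE		0x800104	/* deprecated as of 2.4 */

/* ASSUMPTION: value not shown in this patch, chosen for illustration. */
#define Q_GETOQUOTA		0x800102

/*
 * Hypothetical stand-in for the check at the top of lov_quotactl()
 * after this patch: only Q_GETOQUOTA and LUSTRE_Q_SETQUOTA pass,
 * every other command is rejected with -EFAULT.
 */
int lov_quota_cmd_allowed(unsigned int qc_cmd)
{
	if (qc_cmd != Q_GETOQUOTA && qc_cmd != LUSTRE_Q_SETQUOTA)
		return -EFAULT;

	return 0;
}
```

With this check, LUSTRE_Q_QUOTAON, LUSTRE_Q_QUOTAOFF and
Q_FINVALIDATE, which the old condition let through, are now
rejected at the LOV layer.
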
 .../lustre/lustre/include/lustre/lustre_idl.h      |    6 +-
 .../lustre/lustre/include/lustre/lustre_ioctl.h    |    4 +-
 .../lustre/lustre/include/lustre/lustre_user.h     |   13 +---
 .../lustre/lustre/include/lustre_req_layout.h      |    3 -
 drivers/staging/lustre/lustre/include/obd.h        |    9 ---
 drivers/staging/lustre/lustre/include/obd_class.h  |   12 ----
 .../staging/lustre/lustre/include/obd_support.h    |    6 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_lib.c      |    2 -
 drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c    |   24 -------
 drivers/staging/lustre/lustre/llite/dir.c          |   65 --------------------
 drivers/staging/lustre/lustre/lmv/lmv_obd.c        |   31 +---------
 drivers/staging/lustre/lustre/lov/lov_obd.c        |   54 +----------------
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |   50 ---------------
 drivers/staging/lustre/lustre/osc/osc_internal.h   |    3 -
 drivers/staging/lustre/lustre/osc/osc_quota.c      |   44 -------------
 drivers/staging/lustre/lustre/osc/osc_request.c    |    4 -
 drivers/staging/lustre/lustre/ptlrpc/layout.c      |   15 -----
 17 files changed, 15 insertions(+), 330 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index 17feb71..7645ed9 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -1406,7 +1406,7 @@ enum ost_cmd {
 	OST_STATFS     = 13,
 	OST_SYNC       = 16,
 	OST_SET_INFO   = 17,
-	OST_QUOTACHECK = 18,
+	OST_QUOTACHECK = 18, /* not used since 2.4 */
 	OST_QUOTACTL   = 19,
 	OST_QUOTA_ADJUST_QUNIT = 20, /* not used since 2.4 */
 	OST_LAST_OPC
@@ -1925,7 +1925,7 @@ enum mds_cmd {
 	MDS_SYNC		= 44,
 	MDS_DONE_WRITING	= 45, /* obsolete since 2.8.0 */
 	MDS_SET_INFO		= 46,
-	MDS_QUOTACHECK		= 47,
+	MDS_QUOTACHECK		= 47, /* not used since 2.4 */
 	MDS_QUOTACTL		= 48,
 	MDS_GETXATTR		= 49,
 	MDS_SETXATTR		= 50, /* obsolete, now it's MDS_REINT op */
@@ -2889,7 +2889,7 @@ void lustre_swab_cfg_marker(struct cfg_marker *marker, int swab, int size);
 enum obd_cmd {
 	OBD_PING = 400,
 	OBD_LOG_CANCEL,
-	OBD_QC_CALLBACK,
+	OBD_QC_CALLBACK, /* not used since 2.4 */
 	OBD_IDX_READ,
 	OBD_LAST_OPC
 };
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_ioctl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_ioctl.h
index f3d7c94..eb08df3 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_ioctl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_ioctl.h
@@ -363,8 +363,8 @@ obd_ioctl_unpack(struct obd_ioctl_data *data, char *pbuf, int max_len)
 /*	OBD_IOC_LOV_GETSTRIPE	155 LL_IOC_LOV_GETSTRIPE */
 /*	OBD_IOC_LOV_SETEA	156 LL_IOC_LOV_SETEA */
 /*	lustre/lustre_user.h	157-159 */
-#define	OBD_IOC_QUOTACHECK	_IOW('f', 160, int)
-#define	OBD_IOC_POLL_QUOTACHECK	_IOR('f', 161, struct if_quotacheck *)
+/*	OBD_IOC_QUOTACHECK	_IOW('f', 160, int) */
+/*	OBD_IOC_POLL_QUOTACHECK	_IOR('f', 161, struct if_quotacheck *) */
 #define OBD_IOC_QUOTACTL	_IOWR('f', 162, struct if_quotactl)
 /*	lustre/lustre_user.h	163-176 */
 #define OBD_IOC_CHANGELOG_REG	_IOW('f', 177, struct obd_ioctl_data)
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index 80fecba..856e2f9 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -557,23 +557,18 @@ static inline void obd_uuid2fsname(char *buf, char *uuid, int buflen)
 #define Q_FINVALIDATE  0x800104 /* deprecated as of 2.4 */
 
 /* these must be explicitly translated into linux Q_* in ll_dir_ioctl */
-#define LUSTRE_Q_QUOTAON    0x800002     /* turn quotas on */
-#define LUSTRE_Q_QUOTAOFF   0x800003     /* turn quotas off */
+#define LUSTRE_Q_QUOTAON    0x800002	/* deprecated as of 2.4 */
+#define LUSTRE_Q_QUOTAOFF   0x800003	/* deprecated as of 2.4 */
 #define LUSTRE_Q_GETINFO    0x800005     /* get information about quota files */
 #define LUSTRE_Q_SETINFO    0x800006     /* set information about quota files */
 #define LUSTRE_Q_GETQUOTA   0x800007     /* get user quota structure */
 #define LUSTRE_Q_SETQUOTA   0x800008     /* set user quota structure */
 /* lustre-specific control commands */
-#define LUSTRE_Q_INVALIDATE  0x80000b     /* invalidate quota data */
-#define LUSTRE_Q_FINVALIDATE 0x80000c     /* invalidate filter quota data */
+#define LUSTRE_Q_INVALIDATE  0x80000b	/* deprecated as of 2.4 */
+#define LUSTRE_Q_FINVALIDATE 0x80000c	/* deprecated as of 2.4 */
 
 #define UGQUOTA 2       /* set both USRQUOTA and GRPQUOTA */
 
-struct if_quotacheck {
-	char		    obd_type[16];
-	struct obd_uuid	 obd_uuid;
-};
-
 #define IDENTITY_DOWNCALL_MAGIC 0x6d6dd629
 
 /* permission */
diff --git a/drivers/staging/lustre/lustre/include/lustre_req_layout.h b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
index dd8717e..78857b3 100644
--- a/drivers/staging/lustre/lustre/include/lustre_req_layout.h
+++ b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
@@ -165,9 +165,7 @@ extern struct req_format RQF_MDS_REINT_LINK;
 extern struct req_format RQF_MDS_REINT_RENAME;
 extern struct req_format RQF_MDS_REINT_SETATTR;
 extern struct req_format RQF_MDS_REINT_SETXATTR;
-extern struct req_format RQF_MDS_QUOTACHECK;
 extern struct req_format RQF_MDS_QUOTACTL;
-extern struct req_format RQF_QC_CALLBACK;
 extern struct req_format RQF_MDS_SWAP_LAYOUTS;
 /* MDS hsm formats */
 extern struct req_format RQF_MDS_HSM_STATE_GET;
@@ -180,7 +178,6 @@ extern struct req_format RQF_MDS_HSM_REQUEST;
 /* OST req_format */
 extern struct req_format RQF_OST_CONNECT;
 extern struct req_format RQF_OST_DISCONNECT;
-extern struct req_format RQF_OST_QUOTACHECK;
 extern struct req_format RQF_OST_QUOTACTL;
 extern struct req_format RQF_OST_GETATTR;
 extern struct req_format RQF_OST_SETATTR;
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index 8372deb..2811901 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -345,13 +345,6 @@ struct client_obd {
 	/* also protected by the poorly named _loi_list_lock lock above */
 	struct osc_async_rc      cl_ar;
 
-	/* used by quotacheck when the servers are older than 2.4 */
-	int		      cl_qchk_stat; /* quotacheck stat of the peer */
-#define CL_NOT_QUOTACHECKED 1   /* client->cl_qchk_stat init value */
-#if OBD_OCD_VERSION(2, 7, 53, 0) < LUSTRE_VERSION_CODE
-#warning "please consider removing quotacheck compatibility code"
-#endif
-
 	/* sequence manager */
 	struct lu_client_seq    *cl_seq;
 
@@ -930,8 +923,6 @@ struct obd_ops {
 	struct obd_uuid *(*get_uuid)(struct obd_export *exp);
 
 	/* quota methods */
-	int (*quotacheck)(struct obd_device *, struct obd_export *,
-			  struct obd_quotactl *);
 	int (*quotactl)(struct obd_device *, struct obd_export *,
 			struct obd_quotactl *);
 
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index cb7160e..8f1d681 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -1163,18 +1163,6 @@ static inline int obd_notify_observer(struct obd_device *observer,
 	return rc1 ? rc1 : rc2;
 }
 
-static inline int obd_quotacheck(struct obd_export *exp,
-				 struct obd_quotactl *oqctl)
-{
-	int rc;
-
-	EXP_CHECK_DT_OP(exp, quotacheck);
-	EXP_COUNTER_INCREMENT(exp, quotacheck);
-
-	rc = OBP(exp->exp_obd, quotacheck)(exp->exp_obd, exp, oqctl);
-	return rc;
-}
-
 static inline int obd_quotactl(struct obd_export *exp,
 			       struct obd_quotactl *oqctl)
 {
diff --git a/drivers/staging/lustre/lustre/include/obd_support.h b/drivers/staging/lustre/lustre/include/obd_support.h
index 9d2d6f8..1233c34 100644
--- a/drivers/staging/lustre/lustre/include/obd_support.h
+++ b/drivers/staging/lustre/lustre/include/obd_support.h
@@ -179,7 +179,7 @@ extern char obd_jobid_var[];
 #define OBD_FAIL_MDS_STATFS_LCW_SLEEP    0x12a
 #define OBD_FAIL_MDS_OPEN_CREATE	 0x12b
 #define OBD_FAIL_MDS_OST_SETATTR	 0x12c
-#define OBD_FAIL_MDS_QUOTACHECK_NET      0x12d
+/*	OBD_FAIL_MDS_QUOTACHECK_NET      0x12d obsolete since 2.4 */
 #define OBD_FAIL_MDS_QUOTACTL_NET	0x12e
 #define OBD_FAIL_MDS_CLIENT_ADD	  0x12f
 #define OBD_FAIL_MDS_GETXATTR_NET	0x130
@@ -264,7 +264,7 @@ extern char obd_jobid_var[];
 #define OBD_FAIL_OST_ENOSPC	      0x215
 #define OBD_FAIL_OST_EROFS	       0x216
 #define OBD_FAIL_OST_ENOENT	      0x217
-#define OBD_FAIL_OST_QUOTACHECK_NET      0x218
+/*	OBD_FAIL_OST_QUOTACHECK_NET      0x218 obsolete since 2.4 */
 #define OBD_FAIL_OST_QUOTACTL_NET	0x219
 #define OBD_FAIL_OST_CHECKSUM_RECEIVE    0x21a
 #define OBD_FAIL_OST_CHECKSUM_SEND       0x21b
@@ -373,7 +373,7 @@ extern char obd_jobid_var[];
 #define OBD_FAIL_OBD_PING_NET	    0x600
 #define OBD_FAIL_OBD_LOG_CANCEL_NET      0x601
 #define OBD_FAIL_OBD_LOGD_NET	    0x602
-#define OBD_FAIL_OBD_QC_CALLBACK_NET     0x603
+/*	OBD_FAIL_OBD_QC_CALLBACK_NET     0x603 obsolete since 2.4 */
 #define OBD_FAIL_OBD_DQACQ	       0x604
 #define OBD_FAIL_OBD_LLOG_SETUP	  0x605
 #define OBD_FAIL_OBD_LOG_CANCEL_REP      0x606
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
index 153e990..f3128b6 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
@@ -425,8 +425,6 @@ int client_obd_setup(struct obd_device *obddev, struct lustre_cfg *lcfg)
 		goto err_import;
 	}
 
-	cli->cl_qchk_stat = CL_NOT_QUOTACHECKED;
-
 	return rc;
 
 err_import:
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
index c32b414..12647af 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lockd.c
@@ -511,23 +511,6 @@ static inline void ldlm_callback_errmsg(struct ptlrpc_request *req,
 		CWARN("Send reply failed, maybe cause bug 21636.\n");
 }
 
-static int ldlm_handle_qc_callback(struct ptlrpc_request *req)
-{
-	struct obd_quotactl *oqctl;
-	struct client_obd *cli = &req->rq_export->exp_obd->u.cli;
-
-	oqctl = req_capsule_client_get(&req->rq_pill, &RMF_OBD_QUOTACTL);
-	if (!oqctl) {
-		CERROR("Can't unpack obd_quotactl\n");
-		return -EPROTO;
-	}
-
-	oqctl->qc_stat = ptlrpc_status_ntoh(oqctl->qc_stat);
-
-	cli->cl_qchk_stat = oqctl->qc_stat;
-	return 0;
-}
-
 /* TODO: handle requests in a similar way as MDT: see mdt_handle_common() */
 static int ldlm_callback_handler(struct ptlrpc_request *req)
 {
@@ -577,13 +560,6 @@ static int ldlm_callback_handler(struct ptlrpc_request *req)
 		rc = ldlm_handle_setinfo(req);
 		ldlm_callback_reply(req, rc);
 		return 0;
-	case OBD_QC_CALLBACK:
-		req_capsule_set(&req->rq_pill, &RQF_QC_CALLBACK);
-		if (OBD_FAIL_CHECK(OBD_FAIL_OBD_QC_CALLBACK_NET))
-			return 0;
-		rc = ldlm_handle_qc_callback(req);
-		ldlm_callback_reply(req, rc);
-		return 0;
 	default:
 		CERROR("unknown opcode %u\n",
 		       lustre_msg_get_opc(req->rq_reqmsg));
diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index 360d97f..929c32c 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -861,10 +861,6 @@ static int quotactl_ioctl(struct ll_sb_info *sbi, struct if_quotactl *qctl)
 	int rc = 0;
 
 	switch (cmd) {
-	case LUSTRE_Q_INVALIDATE:
-	case LUSTRE_Q_FINVALIDATE:
-	case Q_QUOTAON:
-	case Q_QUOTAOFF:
 	case Q_SETQUOTA:
 	case Q_SETINFO:
 		if (!capable(CFS_CAP_SYS_ADMIN))
@@ -929,10 +925,6 @@ static int quotactl_ioctl(struct ll_sb_info *sbi, struct if_quotactl *qctl)
 		QCTL_COPY(oqctl, qctl);
 		rc = obd_quotactl(sbi->ll_md_exp, oqctl);
 		if (rc) {
-			if (rc != -EALREADY && cmd == Q_QUOTAON) {
-				oqctl->qc_cmd = Q_QUOTAOFF;
-				obd_quotactl(sbi->ll_md_exp, oqctl);
-			}
 			kfree(oqctl);
 			return rc;
 		}
@@ -1369,63 +1361,6 @@ out_req:
 			ll_putname(filename);
 		return rc;
 	}
-	case OBD_IOC_QUOTACHECK: {
-		struct obd_quotactl *oqctl;
-		int error = 0;
-
-		if (!capable(CFS_CAP_SYS_ADMIN))
-			return -EPERM;
-
-		oqctl = kzalloc(sizeof(*oqctl), GFP_NOFS);
-		if (!oqctl)
-			return -ENOMEM;
-		oqctl->qc_type = arg;
-		rc = obd_quotacheck(sbi->ll_md_exp, oqctl);
-		if (rc < 0) {
-			CDEBUG(D_INFO, "md_quotacheck failed: rc %d\n", rc);
-			error = rc;
-		}
-
-		rc = obd_quotacheck(sbi->ll_dt_exp, oqctl);
-		if (rc < 0)
-			CDEBUG(D_INFO, "obd_quotacheck failed: rc %d\n", rc);
-
-		kfree(oqctl);
-		return error ?: rc;
-	}
-	case OBD_IOC_POLL_QUOTACHECK: {
-		struct if_quotacheck *check;
-
-		if (!capable(CFS_CAP_SYS_ADMIN))
-			return -EPERM;
-
-		check = kzalloc(sizeof(*check), GFP_NOFS);
-		if (!check)
-			return -ENOMEM;
-
-		rc = obd_iocontrol(cmd, sbi->ll_md_exp, 0, (void *)check,
-				   NULL);
-		if (rc) {
-			CDEBUG(D_QUOTA, "mdc ioctl %d failed: %d\n", cmd, rc);
-			if (copy_to_user((void __user *)arg, check,
-					 sizeof(*check)))
-				CDEBUG(D_QUOTA, "copy_to_user failed\n");
-			goto out_poll;
-		}
-
-		rc = obd_iocontrol(cmd, sbi->ll_dt_exp, 0, (void *)check,
-				   NULL);
-		if (rc) {
-			CDEBUG(D_QUOTA, "osc ioctl %d failed: %d\n", cmd, rc);
-			if (copy_to_user((void __user *)arg, check,
-					 sizeof(*check)))
-				CDEBUG(D_QUOTA, "copy_to_user failed\n");
-			goto out_poll;
-		}
-out_poll:
-		kfree(check);
-		return rc;
-	}
 	case OBD_IOC_QUOTACTL: {
 		struct if_quotactl *qctl;
 
diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index 7bd7e15..67969a8 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -1128,9 +1128,7 @@ static int lmv_iocontrol(unsigned int cmd, struct obd_export *exp,
 			mdc_obd = class_exp2obd(tgt->ltd_exp);
 			mdc_obd->obd_force = obddev->obd_force;
 			err = obd_iocontrol(cmd, tgt->ltd_exp, len, karg, uarg);
-			if (err == -ENODATA && cmd == OBD_IOC_POLL_QUOTACHECK) {
-				return err;
-			} else if (err) {
+			if (err) {
 				if (tgt->ltd_active) {
 					CERROR("error: iocontrol MDC %s on MDTidx %d cmd %x: err = %d\n",
 					       tgt->ltd_uuid.uuid, i, cmd, err);
@@ -3243,32 +3241,6 @@ static int lmv_quotactl(struct obd_device *unused, struct obd_export *exp,
 	return rc;
 }
 
-static int lmv_quotacheck(struct obd_device *unused, struct obd_export *exp,
-			  struct obd_quotactl *oqctl)
-{
-	struct obd_device   *obd = class_exp2obd(exp);
-	struct lmv_obd      *lmv = &obd->u.lmv;
-	struct lmv_tgt_desc *tgt;
-	int rc = 0;
-	u32 i;
-
-	for (i = 0; i < lmv->desc.ld_tgt_count; i++) {
-		int err;
-
-		tgt = lmv->tgts[i];
-		if (!tgt || !tgt->ltd_exp || !tgt->ltd_active) {
-			CERROR("lmv idx %d inactive\n", i);
-			return -EIO;
-		}
-
-		err = obd_quotacheck(tgt->ltd_exp, oqctl);
-		if (err && !rc)
-			rc = err;
-	}
-
-	return rc;
-}
-
 static int lmv_merge_attr(struct obd_export *exp,
 			  const struct lmv_stripe_md *lsm,
 			  struct cl_attr *attr,
@@ -3326,7 +3298,6 @@ static struct obd_ops lmv_obd_ops = {
 	.notify		= lmv_notify,
 	.get_uuid	= lmv_get_uuid,
 	.iocontrol	= lmv_iocontrol,
-	.quotacheck	= lmv_quotacheck,
 	.quotactl	= lmv_quotactl
 };
 
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index 019fe95..c2a853c 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -1217,8 +1217,6 @@ static int lov_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 			osc_obd->obd_force = obddev->obd_force;
 			err = obd_iocontrol(cmd, lov->lov_tgts[i]->ltd_exp,
 					    len, karg, uarg);
-			if (err == -ENODATA && cmd == OBD_IOC_POLL_QUOTACHECK)
-				return err;
 			if (err) {
 				if (lov->lov_tgts[i]->ltd_active) {
 					CDEBUG(err == -ENOTTY ?
@@ -1355,12 +1353,8 @@ static int lov_quotactl(struct obd_device *obd, struct obd_export *exp,
 	__u64		bhardlimit = 0;
 	int		  i, rc = 0;
 
-	if (oqctl->qc_cmd != LUSTRE_Q_QUOTAON &&
-	    oqctl->qc_cmd != LUSTRE_Q_QUOTAOFF &&
-	    oqctl->qc_cmd != Q_GETOQUOTA &&
-	    oqctl->qc_cmd != Q_INITQUOTA &&
-	    oqctl->qc_cmd != LUSTRE_Q_SETQUOTA &&
-	    oqctl->qc_cmd != Q_FINVALIDATE) {
+	if (oqctl->qc_cmd != Q_GETOQUOTA &&
+	    oqctl->qc_cmd != LUSTRE_Q_SETQUOTA) {
 		CERROR("bad quota opc %x for lov obd\n", oqctl->qc_cmd);
 		return -EFAULT;
 	}
@@ -1407,49 +1401,6 @@ static int lov_quotactl(struct obd_device *obd, struct obd_export *exp,
 	return rc;
 }
 
-static int lov_quotacheck(struct obd_device *obd, struct obd_export *exp,
-			  struct obd_quotactl *oqctl)
-{
-	struct lov_obd *lov = &obd->u.lov;
-	int	     i, rc = 0;
-
-	obd_getref(obd);
-
-	for (i = 0; i < lov->desc.ld_tgt_count; i++) {
-		if (!lov->lov_tgts[i])
-			continue;
-
-		/* Skip quota check on the administratively disabled OSTs. */
-		if (!lov->lov_tgts[i]->ltd_activate) {
-			CWARN("lov idx %d was administratively disabled, skip quotacheck on it.\n",
-			      i);
-			continue;
-		}
-
-		if (!lov->lov_tgts[i]->ltd_active) {
-			CERROR("lov idx %d inactive\n", i);
-			rc = -EIO;
-			goto out;
-		}
-	}
-
-	for (i = 0; i < lov->desc.ld_tgt_count; i++) {
-		int err;
-
-		if (!lov->lov_tgts[i] || !lov->lov_tgts[i]->ltd_activate)
-			continue;
-
-		err = obd_quotacheck(lov->lov_tgts[i]->ltd_exp, oqctl);
-		if (err && !rc)
-			rc = err;
-	}
-
-out:
-	obd_putref(obd);
-
-	return rc;
-}
-
 static struct obd_ops lov_obd_ops = {
 	.owner          = THIS_MODULE,
 	.setup          = lov_setup,
@@ -1473,7 +1424,6 @@ static struct obd_ops lov_obd_ops = {
 	.getref         = lov_getref,
 	.putref         = lov_putref,
 	.quotactl       = lov_quotactl,
-	.quotacheck     = lov_quotacheck,
 };
 
 struct kmem_cache *lov_oinfo_slab;
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index 1d1eaa5..b62b29f 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -1929,52 +1929,6 @@ static int mdc_ioc_changelog_send(struct obd_device *obd,
 static int mdc_ioc_hsm_ct_start(struct obd_export *exp,
 				struct lustre_kernelcomm *lk);
 
-static int mdc_quotacheck(struct obd_device *unused, struct obd_export *exp,
-			  struct obd_quotactl *oqctl)
-{
-	struct client_obd       *cli = &exp->exp_obd->u.cli;
-	struct ptlrpc_request   *req;
-	struct obd_quotactl     *body;
-	int		      rc;
-
-	req = ptlrpc_request_alloc_pack(class_exp2cliimp(exp),
-					&RQF_MDS_QUOTACHECK, LUSTRE_MDS_VERSION,
-					MDS_QUOTACHECK);
-	if (!req)
-		return -ENOMEM;
-
-	body = req_capsule_client_get(&req->rq_pill, &RMF_OBD_QUOTACTL);
-	*body = *oqctl;
-
-	ptlrpc_request_set_replen(req);
-
-	/* the next poll will find -ENODATA, that means quotacheck is
-	 * going on
-	 */
-	cli->cl_qchk_stat = -ENODATA;
-	rc = ptlrpc_queue_wait(req);
-	if (rc)
-		cli->cl_qchk_stat = rc;
-	ptlrpc_req_finished(req);
-	return rc;
-}
-
-static int mdc_quota_poll_check(struct obd_export *exp,
-				struct if_quotacheck *qchk)
-{
-	struct client_obd *cli = &exp->exp_obd->u.cli;
-	int rc;
-
-	qchk->obd_uuid = cli->cl_target_uuid;
-	memcpy(qchk->obd_type, LUSTRE_MDS_NAME, strlen(LUSTRE_MDS_NAME));
-
-	rc = cli->cl_qchk_stat;
-	/* the client is not the previous one */
-	if (rc == CL_NOT_QUOTACHECKED)
-		rc = -EINTR;
-	return rc;
-}
-
 static int mdc_quotactl(struct obd_device *unused, struct obd_export *exp,
 			struct obd_quotactl *oqctl)
 {
@@ -2129,9 +2083,6 @@ static int mdc_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 	case IOC_OSC_SET_ACTIVE:
 		rc = ptlrpc_set_import_active(imp, data->ioc_offset);
 		goto out;
-	case OBD_IOC_POLL_QUOTACHECK:
-		rc = mdc_quota_poll_check(exp, (struct if_quotacheck *)karg);
-		goto out;
 	case OBD_IOC_PING_TARGET:
 		rc = ptlrpc_obd_ping(obd);
 		goto out;
@@ -2794,7 +2745,6 @@ static struct obd_ops mdc_obd_ops = {
 	.process_config = mdc_process_config,
 	.get_uuid       = mdc_get_uuid,
 	.quotactl       = mdc_quotactl,
-	.quotacheck     = mdc_quotacheck
 };
 
 static struct md_ops mdc_md_ops = {
diff --git a/drivers/staging/lustre/lustre/osc/osc_internal.h b/drivers/staging/lustre/lustre/osc/osc_internal.h
index 64684c4..61bfacb 100644
--- a/drivers/staging/lustre/lustre/osc/osc_internal.h
+++ b/drivers/staging/lustre/lustre/osc/osc_internal.h
@@ -190,9 +190,6 @@ int osc_quota_setdq(struct client_obd *cli, const unsigned int qid[],
 int osc_quota_chkdq(struct client_obd *cli, const unsigned int qid[]);
 int osc_quotactl(struct obd_device *unused, struct obd_export *exp,
 		 struct obd_quotactl *oqctl);
-int osc_quotacheck(struct obd_device *unused, struct obd_export *exp,
-		   struct obd_quotactl *oqctl);
-int osc_quota_poll_check(struct obd_export *exp, struct if_quotacheck *qchk);
 void osc_inc_unstable_pages(struct ptlrpc_request *req);
 void osc_dec_unstable_pages(struct ptlrpc_request *req);
 bool osc_over_unstable_soft_limit(struct client_obd *cli);
diff --git a/drivers/staging/lustre/lustre/osc/osc_quota.c b/drivers/staging/lustre/lustre/osc/osc_quota.c
index 194d8ed..acdd91a 100644
--- a/drivers/staging/lustre/lustre/osc/osc_quota.c
+++ b/drivers/staging/lustre/lustre/osc/osc_quota.c
@@ -281,47 +281,3 @@ int osc_quotactl(struct obd_device *unused, struct obd_export *exp,
 
 	return rc;
 }
-
-int osc_quotacheck(struct obd_device *unused, struct obd_export *exp,
-		   struct obd_quotactl *oqctl)
-{
-	struct client_obd *cli = &exp->exp_obd->u.cli;
-	struct ptlrpc_request *req;
-	struct obd_quotactl *body;
-	int rc;
-
-	req = ptlrpc_request_alloc_pack(class_exp2cliimp(exp),
-					&RQF_OST_QUOTACHECK, LUSTRE_OST_VERSION,
-					OST_QUOTACHECK);
-	if (!req)
-		return -ENOMEM;
-
-	body = req_capsule_client_get(&req->rq_pill, &RMF_OBD_QUOTACTL);
-	*body = *oqctl;
-
-	ptlrpc_request_set_replen(req);
-
-	/* the next poll will find -ENODATA, that means quotacheck is going on
-	 */
-	cli->cl_qchk_stat = -ENODATA;
-	rc = ptlrpc_queue_wait(req);
-	if (rc)
-		cli->cl_qchk_stat = rc;
-	ptlrpc_req_finished(req);
-	return rc;
-}
-
-int osc_quota_poll_check(struct obd_export *exp, struct if_quotacheck *qchk)
-{
-	struct client_obd *cli = &exp->exp_obd->u.cli;
-	int rc;
-
-	qchk->obd_uuid = cli->cl_target_uuid;
-	memcpy(qchk->obd_type, LUSTRE_OST_NAME, strlen(LUSTRE_OST_NAME));
-
-	rc = cli->cl_qchk_stat;
-	/* the client is not the previous one */
-	if (rc == CL_NOT_QUOTACHECKED)
-		rc = -EINTR;
-	return rc;
-}
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index 2e4d2d5..ab7f82d 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -2443,9 +2443,6 @@ static int osc_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 		err = ptlrpc_set_import_active(obd->u.cli.cl_import,
 					       data->ioc_offset);
 		goto out;
-	case OBD_IOC_POLL_QUOTACHECK:
-		err = osc_quota_poll_check(exp, karg);
-		goto out;
 	case OBD_IOC_PING_TARGET:
 		err = ptlrpc_obd_ping(obd);
 		goto out;
@@ -2934,7 +2931,6 @@ static struct obd_ops osc_obd_ops = {
 	.import_event   = osc_import_event,
 	.process_config = osc_process_config,
 	.quotactl       = osc_quotactl,
-	.quotacheck     = osc_quotacheck,
 };
 
 extern struct lu_kmem_descr osc_caches[];
diff --git a/drivers/staging/lustre/lustre/ptlrpc/layout.c b/drivers/staging/lustre/lustre/ptlrpc/layout.c
index 358c124..f0e0448 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/layout.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/layout.c
@@ -680,7 +680,6 @@ static struct req_format *req_formats[] = {
 	&RQF_MDS_REINT_RENAME,
 	&RQF_MDS_REINT_SETATTR,
 	&RQF_MDS_REINT_SETXATTR,
-	&RQF_MDS_QUOTACHECK,
 	&RQF_MDS_QUOTACTL,
 	&RQF_MDS_HSM_PROGRESS,
 	&RQF_MDS_HSM_CT_REGISTER,
@@ -690,10 +689,8 @@ static struct req_format *req_formats[] = {
 	&RQF_MDS_HSM_ACTION,
 	&RQF_MDS_HSM_REQUEST,
 	&RQF_MDS_SWAP_LAYOUTS,
-	&RQF_QC_CALLBACK,
 	&RQF_OST_CONNECT,
 	&RQF_OST_DISCONNECT,
-	&RQF_OST_QUOTACHECK,
 	&RQF_OST_QUOTACTL,
 	&RQF_OST_GETATTR,
 	&RQF_OST_SETATTR,
@@ -1179,14 +1176,6 @@ struct req_format RQF_LOG_CANCEL =
 	DEFINE_REQ_FMT0("OBD_LOG_CANCEL", log_cancel_client, empty);
 EXPORT_SYMBOL(RQF_LOG_CANCEL);
 
-struct req_format RQF_MDS_QUOTACHECK =
-	DEFINE_REQ_FMT0("MDS_QUOTACHECK", quotactl_only, empty);
-EXPORT_SYMBOL(RQF_MDS_QUOTACHECK);
-
-struct req_format RQF_OST_QUOTACHECK =
-	DEFINE_REQ_FMT0("OST_QUOTACHECK", quotactl_only, empty);
-EXPORT_SYMBOL(RQF_OST_QUOTACHECK);
-
 struct req_format RQF_MDS_QUOTACTL =
 	DEFINE_REQ_FMT0("MDS_QUOTACTL", quotactl_only, quotactl_only);
 EXPORT_SYMBOL(RQF_MDS_QUOTACTL);
@@ -1195,10 +1184,6 @@ struct req_format RQF_OST_QUOTACTL =
 	DEFINE_REQ_FMT0("OST_QUOTACTL", quotactl_only, quotactl_only);
 EXPORT_SYMBOL(RQF_OST_QUOTACTL);
 
-struct req_format RQF_QC_CALLBACK =
-	DEFINE_REQ_FMT0("QC_CALLBACK", quotactl_only, empty);
-EXPORT_SYMBOL(RQF_QC_CALLBACK);
-
 struct req_format RQF_MDS_GETSTATUS =
 	DEFINE_REQ_FMT0("MDS_GETSTATUS", mdt_body_only, mdt_body_capa);
 EXPORT_SYMBOL(RQF_MDS_GETSTATUS);
-- 
1.7.1


* [PATCH 31/41] staging: lustre: obd: remove destroy cookie handling
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, James Simmons

From: John L. Hammond <john.hammond@intel.com>

Clients no longer need to track the max and default MDS cookiesizes
so remove
 the obsolete obdo o_valid flag OBD_MD_FLCOOKIE,
 the struct client_obd members cl_{default,max}_mds_cookiesize,
 the struct obd_trans_info and parameters of this type,
 the cookiesize parameters from md_init_ea_size(),
 the files llite/*/{default,max}_cookiesize, and
 any code that needlessly handled these values.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6017
Reviewed-on: http://review.whamcloud.com/12922
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../lustre/lustre/include/lustre/lustre_idl.h      |    7 +-
 drivers/staging/lustre/lustre/include/lustre_mdc.h |   20 ++-----
 drivers/staging/lustre/lustre/include/obd.h        |   46 ++------------
 drivers/staging/lustre/lustre/include/obd_class.h  |   31 ++++------
 drivers/staging/lustre/lustre/ldlm/ldlm_lib.c      |    3 +-
 drivers/staging/lustre/lustre/llite/lcommon_misc.c |   13 +---
 drivers/staging/lustre/lustre/lmv/lmv_obd.c        |   23 ++-----
 .../staging/lustre/lustre/lov/lov_cl_internal.h    |    1 -
 drivers/staging/lustre/lustre/lov/lov_internal.h   |    2 -
 drivers/staging/lustre/lustre/lov/lov_request.c    |    1 -
 drivers/staging/lustre/lustre/mdc/mdc_locks.c      |    2 -
 drivers/staging/lustre/lustre/mdc/mdc_reint.c      |    4 -
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |   15 +----
 .../staging/lustre/lustre/obdecho/echo_client.c    |   58 ++++++-------------
 drivers/staging/lustre/lustre/osc/osc_request.c    |   62 ++++---------------
 .../staging/lustre/lustre/ptlrpc/pack_generic.c    |    4 +-
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c    |   10 +--
 17 files changed, 79 insertions(+), 223 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index 7645ed9..5d2f845 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -1664,7 +1664,7 @@ lov_mds_md_max_stripe_count(size_t buf_size, __u32 lmm_magic)
 #define OBD_MD_FLCKSUM     (0x00100000ULL) /* bulk data checksum */
 #define OBD_MD_FLQOS       (0x00200000ULL) /* quality of service stats */
 /*#define OBD_MD_FLOSCOPQ    (0x00400000ULL) osc opaque data, never used */
-#define OBD_MD_FLCOOKIE    (0x00800000ULL) /* log cancellation cookie */
+/*	OBD_MD_FLCOOKIE    (0x00800000ULL) obsolete in 2.8 */
 #define OBD_MD_FLGROUP     (0x01000000ULL) /* group */
 #define OBD_MD_FLFID       (0x02000000ULL) /* ->ost write inline fid */
 #define OBD_MD_FLEPOCH     (0x04000000ULL) /* ->ost write with ioepoch */
@@ -2091,7 +2091,7 @@ struct mdt_body {
 	__u32	mbo_eadatasize;
 	__u32	mbo_aclsize;
 	__u32	mbo_max_mdsize;
-	__u32	mbo_max_cookiesize;
+	__u32	mbo_unused3;	/* was max_cookiesize until 2.8 */
 	__u32	mbo_uid_h;	/* high 32-bits of uid, for FUID */
 	__u32	mbo_gid_h;	/* high 32-bits of gid, for FUID */
 	__u32	mbo_padding_5;	/* also fix lustre_swab_mdt_body */
@@ -3226,7 +3226,8 @@ struct obdo {
 	__u32		   o_parent_ver;
 	struct lustre_handle    o_handle;  /* brw: lock handle to prolong locks
 					    */
-	struct llog_cookie      o_lcookie; /* destroy: unlink cookie from MDS
+	struct llog_cookie      o_lcookie; /* destroy: unlink cookie from MDS,
+					    * obsolete in 2.8, reused in OSP
 					    */
 	__u32			o_uid_h;
 	__u32			o_gid_h;
diff --git a/drivers/staging/lustre/lustre/include/lustre_mdc.h b/drivers/staging/lustre/lustre/include/lustre_mdc.h
index 8fc2d3f..92a5c0f 100644
--- a/drivers/staging/lustre/lustre/include/lustre_mdc.h
+++ b/drivers/staging/lustre/lustre/include/lustre_mdc.h
@@ -157,15 +157,14 @@ static inline void mdc_put_rpc_lock(struct mdc_rpc_lock *lck,
 }
 
 /**
- * Update the maximum possible easize and cookiesize.
+ * Update the maximum possible easize.
  *
- * The values are learned from ptlrpc replies sent by the MDT.  The
- * default easize and cookiesize is initialized to the minimum value but
- * allowed to grow up to a single page in size if required to handle the
+ * This value is learned from ptlrpc replies sent by the MDT. The
+ * default easize is initialized to the minimum value but allowed
+ * to grow up to a single page in size if required to handle the
  * common case.
  *
- * \see client_obd::cl_default_mds_easize and
- * client_obd::cl_default_mds_cookiesize
+ * \see client_obd::cl_default_mds_easize
  *
  * \param[in] exp	export for MDC device
  * \param[in] body	body of ptlrpc reply from MDT
@@ -176,7 +175,7 @@ static inline void mdc_update_max_ea_from_body(struct obd_export *exp,
 {
 	if (body->mbo_valid & OBD_MD_FLMODEASIZE) {
 		struct client_obd *cli = &exp->exp_obd->u.cli;
-		u32 def_cookiesize, def_easize;
+		u32 def_easize;
 
 		if (cli->cl_max_mds_easize < body->mbo_max_mdsize)
 			cli->cl_max_mds_easize = body->mbo_max_mdsize;
@@ -184,13 +183,6 @@ static inline void mdc_update_max_ea_from_body(struct obd_export *exp,
 		def_easize = min_t(__u32, body->mbo_max_mdsize,
 				   OBD_MAX_DEFAULT_EA_SIZE);
 		cli->cl_default_mds_easize = def_easize;
-
-		if (cli->cl_max_mds_cookiesize < body->mbo_max_cookiesize)
-			cli->cl_max_mds_cookiesize = body->mbo_max_cookiesize;
-
-		def_cookiesize = min_t(__u32, body->mbo_max_cookiesize,
-				       OBD_MAX_DEFAULT_COOKIE_SIZE);
-		cli->cl_default_mds_cookiesize = def_cookiesize;
 	}
 }
 
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index 2811901..4691121 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -204,7 +204,6 @@ enum obd_cl_sem_lock_class {
  * on the MDS.
  */
 #define OBD_MAX_DEFAULT_EA_SIZE		4096
-#define OBD_MAX_DEFAULT_COOKIE_SIZE	4096
 
 struct mdc_rpc_lock;
 struct obd_import;
@@ -214,7 +213,7 @@ struct client_obd {
 	struct obd_import       *cl_import; /* ptlrpc connection state */
 	size_t			 cl_conn_count;
 	/*
-	 * Cache maximum and default values for easize and cookiesize. This is
+	 * Cache maximum and default values for easize. This is
 	 * strictly a performance optimization to minimize calls to
 	 * obd_size_diskmd(). The default values are used to calculate the
 	 * initial size of a request buffer. The ptlrpc layer will resize the
@@ -235,18 +234,6 @@ struct client_obd {
 	 * run-time if a larger observed size is advertised by the MDT.
 	 */
 	u32			 cl_max_mds_easize;
-	/* Default cookie size for llog cookies (see struct llog_cookie). It is
-	 * initialized to zero at mount-time, then it tracks the largest
-	 * observed cookie size advertised by the MDT, up to a maximum value of
-	 * OBD_MAX_DEFAULT_COOKIE_SIZE. Note that llog_cookies are not
-	 * used by clients communicating with MDS versions 2.4.0 and later.
-	 */
-	u32			 cl_default_mds_cookiesize;
-	/* Maximum possible cookie size computed at mount-time based on
-	 * the number of OSTs in the filesystem. May be increased at
-	 * run-time if a larger observed size is advertised by the MDT.
-	 */
-	u32			 cl_max_mds_cookiesize;
 
 	enum lustre_sec_part     cl_sp_me;
 	enum lustre_sec_part     cl_sp_to;
@@ -447,8 +434,6 @@ struct lmv_obd {
 	int			connected;
 	int			max_easize;
 	int			max_def_easize;
-	int			max_cookiesize;
-	int			max_def_cookiesize;
 
 	u32			tgts_size; /* size of tgts array */
 	struct lmv_tgt_desc	**tgts;
@@ -505,21 +490,6 @@ struct niobuf_local {
 /* Don't conflict with on-wire flags OBD_BRW_WRITE, etc */
 #define N_LOCAL_TEMP_PAGE 0x10000000
 
-struct obd_trans_info {
-	__u64		    oti_xid;
-	/* Only used on the server side for tracking acks. */
-	struct oti_req_ack_lock {
-		struct lustre_handle lock;
-		__u32		mode;
-	}			oti_ack_locks[4];
-	void		    *oti_handle;
-	struct llog_cookie       oti_onecookie;
-	struct llog_cookie      *oti_logcookies;
-
-	/** VBR: versions */
-	__u64		    oti_pre_version;
-};
-
 /*
  * Events signalled through obd_notify() upcall-chain.
  */
@@ -891,24 +861,22 @@ struct obd_ops {
 			struct lov_stripe_md **mem_tgt,
 			struct lov_mds_md *disk_src, int disk_len);
 	int (*create)(const struct lu_env *env, struct obd_export *exp,
-		      struct obdo *oa, struct obd_trans_info *oti);
+		      struct obdo *oa);
 	int (*destroy)(const struct lu_env *env, struct obd_export *exp,
-		       struct obdo *oa, struct obd_trans_info *oti);
+		       struct obdo *oa);
 	int (*setattr)(const struct lu_env *, struct obd_export *exp,
-		       struct obd_info *oinfo, struct obd_trans_info *oti);
+		       struct obd_info *oinfo);
 	int (*getattr)(const struct lu_env *env, struct obd_export *exp,
 		       struct obd_info *oinfo);
 	int (*preprw)(const struct lu_env *env, int cmd,
 		      struct obd_export *exp, struct obdo *oa, int objcount,
 		      struct obd_ioobj *obj, struct niobuf_remote *remote,
-		      int *nr_pages, struct niobuf_local *local,
-		      struct obd_trans_info *oti);
+		      int *nr_pages, struct niobuf_local *local);
 	int (*commitrw)(const struct lu_env *env, int cmd,
 			struct obd_export *exp, struct obdo *oa,
 			int objcount, struct obd_ioobj *obj,
 			struct niobuf_remote *remote, int pages,
-			struct niobuf_local *local,
-			struct obd_trans_info *oti, int rc);
+			struct niobuf_local *local, int rc);
 	int (*init_export)(struct obd_export *exp);
 	int (*destroy_export)(struct obd_export *exp);
 
@@ -1018,7 +986,7 @@ struct md_ops {
 			u64, const char *, const char *, int, int, int,
 			struct ptlrpc_request **);
 
-	int (*init_ea_size)(struct obd_export *, u32, u32, u32, u32);
+	int (*init_ea_size)(struct obd_export *, u32, u32);
 
 	int (*get_lustre_md)(struct obd_export *, struct ptlrpc_request *,
 			     struct obd_export *, struct obd_export *,
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index 8f1d681..a27dbc8 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -686,26 +686,26 @@ static inline int obd_free_memmd(struct obd_export *exp,
 }
 
 static inline int obd_create(const struct lu_env *env, struct obd_export *exp,
-			     struct obdo *obdo, struct obd_trans_info *oti)
+			     struct obdo *obdo)
 {
 	int rc;
 
 	EXP_CHECK_DT_OP(exp, create);
 	EXP_COUNTER_INCREMENT(exp, create);
 
-	rc = OBP(exp->exp_obd, create)(env, exp, obdo, oti);
+	rc = OBP(exp->exp_obd, create)(env, exp, obdo);
 	return rc;
 }
 
 static inline int obd_destroy(const struct lu_env *env, struct obd_export *exp,
-			      struct obdo *obdo, struct obd_trans_info *oti)
+			      struct obdo *obdo)
 {
 	int rc;
 
 	EXP_CHECK_DT_OP(exp, destroy);
 	EXP_COUNTER_INCREMENT(exp, destroy);
 
-	rc = OBP(exp->exp_obd, destroy)(env, exp, obdo, oti);
+	rc = OBP(exp->exp_obd, destroy)(env, exp, obdo);
 	return rc;
 }
 
@@ -722,15 +722,14 @@ static inline int obd_getattr(const struct lu_env *env, struct obd_export *exp,
 }
 
 static inline int obd_setattr(const struct lu_env *env, struct obd_export *exp,
-			      struct obd_info *oinfo,
-			      struct obd_trans_info *oti)
+			      struct obd_info *oinfo)
 {
 	int rc;
 
 	EXP_CHECK_DT_OP(exp, setattr);
 	EXP_COUNTER_INCREMENT(exp, setattr);
 
-	rc = OBP(exp->exp_obd, setattr)(env, exp, oinfo, oti);
+	rc = OBP(exp->exp_obd, setattr)(env, exp, oinfo);
 	return rc;
 }
 
@@ -1056,8 +1055,7 @@ static inline int obd_preprw(const struct lu_env *env, int cmd,
 			     struct obd_export *exp, struct obdo *oa,
 			     int objcount, struct obd_ioobj *obj,
 			     struct niobuf_remote *remote, int *pages,
-			     struct niobuf_local *local,
-			     struct obd_trans_info *oti)
+			     struct niobuf_local *local)
 {
 	int rc;
 
@@ -1065,7 +1063,7 @@ static inline int obd_preprw(const struct lu_env *env, int cmd,
 	EXP_COUNTER_INCREMENT(exp, preprw);
 
 	rc = OBP(exp->exp_obd, preprw)(env, cmd, exp, oa, objcount, obj, remote,
-				       pages, local, oti);
+				       pages, local);
 	return rc;
 }
 
@@ -1073,14 +1071,13 @@ static inline int obd_commitrw(const struct lu_env *env, int cmd,
 			       struct obd_export *exp, struct obdo *oa,
 			       int objcount, struct obd_ioobj *obj,
 			       struct niobuf_remote *rnb, int pages,
-			       struct niobuf_local *local,
-			       struct obd_trans_info *oti, int rc)
+			       struct niobuf_local *local, int rc)
 {
 	EXP_CHECK_DT_OP(exp, commitrw);
 	EXP_COUNTER_INCREMENT(exp, commitrw);
 
 	rc = OBP(exp->exp_obd, commitrw)(env, cmd, exp, oa, objcount, obj,
-					 rnb, pages, local, oti, rc);
+					 rnb, pages, local, rc);
 	return rc;
 }
 
@@ -1507,14 +1504,12 @@ static inline enum ldlm_mode md_lock_match(struct obd_export *exp, __u64 flags,
 					     policy, mode, lockh);
 }
 
-static inline int md_init_ea_size(struct obd_export *exp, int easize,
-				  int def_asize, int cookiesize,
-				  int def_cookiesize)
+static inline int md_init_ea_size(struct obd_export *exp, u32 easize,
+				  u32 def_asize)
 {
 	EXP_CHECK_MD_OP(exp, init_ea_size);
 	EXP_MD_COUNTER_INCREMENT(exp, init_ea_size);
-	return MDP(exp->exp_obd, init_ea_size)(exp, easize, def_asize,
-					       cookiesize, def_cookiesize);
+	return MDP(exp->exp_obd, init_ea_size)(exp, easize, def_asize);
 }
 
 static inline int md_intent_getattr_async(struct obd_export *exp,
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
index f3128b6..4f9480e 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
@@ -399,9 +399,8 @@ int client_obd_setup(struct obd_device *obddev, struct lustre_cfg *lcfg)
 	}
 
 	cli->cl_import = imp;
-	/* cli->cl_max_mds_{easize,cookiesize} updated by mdc_init_ea_size() */
+	/* cli->cl_max_mds_easize updated by mdc_init_ea_size() */
 	cli->cl_max_mds_easize = sizeof(struct lov_mds_md_v3);
-	cli->cl_max_mds_cookiesize = sizeof(struct llog_cookie);
 
 	if (LUSTRE_CFG_BUFLEN(lcfg, 3) > 0) {
 		if (!strcmp(lustre_cfg_string(lcfg, 3), "inactive")) {
diff --git a/drivers/staging/lustre/lustre/llite/lcommon_misc.c b/drivers/staging/lustre/lustre/llite/lcommon_misc.c
index 07d38e5..4562643 100644
--- a/drivers/staging/lustre/lustre/llite/lcommon_misc.c
+++ b/drivers/staging/lustre/lustre/llite/lcommon_misc.c
@@ -49,7 +49,7 @@ int cl_init_ea_size(struct obd_export *md_exp, struct obd_export *dt_exp)
 {
 	struct lov_stripe_md lsm = { .lsm_magic = LOV_MAGIC_V3 };
 	__u32 valsize = sizeof(struct lov_desc);
-	int rc, easize, def_easize, cookiesize;
+	int rc, easize, def_easize;
 	struct lov_desc desc;
 	__u16 stripes, def_stripes;
 
@@ -67,16 +67,9 @@ int cl_init_ea_size(struct obd_export *md_exp, struct obd_export *dt_exp)
 	lsm.lsm_stripe_count = def_stripes;
 	def_easize = obd_size_diskmd(dt_exp, &lsm);
 
-	cookiesize = stripes * sizeof(struct llog_cookie);
+	CDEBUG(D_HA, "updating def/max_easize: %d/%d\n", def_easize, easize);
 
-	/* default cookiesize is 0 because from 2.4 server doesn't send
-	 * llog cookies to client.
-	 */
-	CDEBUG(D_HA,
-	       "updating def/max_easize: %d/%d def/max_cookiesize: 0/%d\n",
-	       def_easize, easize, cookiesize);
-
-	rc = md_init_ea_size(md_exp, easize, def_easize, cookiesize, 0);
+	rc = md_init_ea_size(md_exp, easize, def_easize);
 	return rc;
 }
 
diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index 67969a8..75f5958 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -245,8 +245,7 @@ static int lmv_connect(const struct lu_env *env,
 	return rc;
 }
 
-static int lmv_init_ea_size(struct obd_export *exp, u32 easize, u32 def_easize,
-			    u32 cookiesize, u32 def_cookiesize)
+static int lmv_init_ea_size(struct obd_export *exp, u32 easize, u32 def_easize)
 {
 	struct obd_device   *obd = exp->exp_obd;
 	struct lmv_obd      *lmv = &obd->u.lmv;
@@ -262,14 +261,7 @@ static int lmv_init_ea_size(struct obd_export *exp, u32 easize, u32 def_easize,
 		lmv->max_def_easize = def_easize;
 		change = 1;
 	}
-	if (lmv->max_cookiesize < cookiesize) {
-		lmv->max_cookiesize = cookiesize;
-		change = 1;
-	}
-	if (lmv->max_def_cookiesize < def_cookiesize) {
-		lmv->max_def_cookiesize = def_cookiesize;
-		change = 1;
-	}
+
 	if (change == 0)
 		return 0;
 
@@ -284,8 +276,7 @@ static int lmv_init_ea_size(struct obd_export *exp, u32 easize, u32 def_easize,
 			continue;
 		}
 
-		rc = md_init_ea_size(tgt->ltd_exp, easize, def_easize,
-				     cookiesize, def_cookiesize);
+		rc = md_init_ea_size(tgt->ltd_exp, easize, def_easize);
 		if (rc) {
 			CERROR("%s: obd_init_ea_size() failed on MDT target %d: rc = %d\n",
 			       obd->obd_name, i, rc);
@@ -368,8 +359,7 @@ static int lmv_connect_mdc(struct obd_device *obd, struct lmv_tgt_desc *tgt)
 	tgt->ltd_exp = mdc_exp;
 	lmv->desc.ld_active_tgt_count++;
 
-	md_init_ea_size(tgt->ltd_exp, lmv->max_easize, lmv->max_def_easize,
-			lmv->max_cookiesize, lmv->max_def_cookiesize);
+	md_init_ea_size(tgt->ltd_exp, lmv->max_easize, lmv->max_def_easize);
 
 	CDEBUG(D_CONFIG, "Connected to %s(%s) successfully (%d)\n",
 	       mdc_obd->obd_name, mdc_obd->obd_uuid.uuid,
@@ -483,7 +473,7 @@ static int lmv_add_target(struct obd_device *obd, struct obd_uuid *uuidp,
 		} else {
 			int easize = sizeof(struct lmv_stripe_md) +
 				lmv->desc.ld_tgt_count * sizeof(struct lu_fid);
-			lmv_init_ea_size(obd->obd_self_export, easize, 0, 0, 0);
+			lmv_init_ea_size(obd->obd_self_export, easize, 0);
 		}
 	}
 
@@ -538,7 +528,7 @@ int lmv_check_connect(struct obd_device *obd)
 	class_export_put(lmv->exp);
 	lmv->connected = 1;
 	easize = lmv_mds_md_size(lmv->desc.ld_tgt_count, LMV_MAGIC);
-	lmv_init_ea_size(obd->obd_self_export, easize, 0, 0, 0);
+	lmv_init_ea_size(obd->obd_self_export, easize, 0);
 	mutex_unlock(&lmv->lmv_init_mutex);
 	return 0;
 
@@ -1282,7 +1272,6 @@ static int lmv_setup(struct obd_device *obd, struct lustre_cfg *lcfg)
 	obd_str2uuid(&lmv->desc.ld_uuid, desc->ld_uuid.uuid);
 	lmv->desc.ld_tgt_count = 0;
 	lmv->desc.ld_active_tgt_count = 0;
-	lmv->max_cookiesize = 0;
 	lmv->max_def_easize = 0;
 	lmv->max_easize = 0;
 	lmv->lmv_placement = PLACEMENT_CHAR_POLICY;
diff --git a/drivers/staging/lustre/lustre/lov/lov_cl_internal.h b/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
index 4d2b7d3..2b03938 100644
--- a/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
@@ -412,7 +412,6 @@ struct lov_io_sub {
 	int		  sub_refcheck;
 	int		  sub_refcheck2;
 	int		  sub_reenter;
-	void		*sub_cookie;
 };
 
 /**
diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index 60397a2..bd105d9 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -110,8 +110,6 @@ struct lov_request_set {
 	atomic_t			set_completes;
 	atomic_t			set_success;
 	atomic_t			set_finish_checked;
-	struct llog_cookie		*set_cookies;
-	int				set_cookie_sent;
 	struct list_head			set_list;
 	wait_queue_head_t			set_waitq;
 };
diff --git a/drivers/staging/lustre/lustre/lov/lov_request.c b/drivers/staging/lustre/lustre/lov/lov_request.c
index 42e66d1..c8734a6 100644
--- a/drivers/staging/lustre/lustre/lov/lov_request.c
+++ b/drivers/staging/lustre/lustre/lov/lov_request.c
@@ -44,7 +44,6 @@ static void lov_init_set(struct lov_request_set *set)
 	atomic_set(&set->set_completes, 0);
 	atomic_set(&set->set_success, 0);
 	atomic_set(&set->set_finish_checked, 0);
-	set->set_cookies = NULL;
 	INIT_LIST_HEAD(&set->set_list);
 	atomic_set(&set->set_refcount, 1);
 	init_waitqueue_head(&set->set_waitq);
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_locks.c b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
index f1f6c08..5b3d0ba 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_locks.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
@@ -386,8 +386,6 @@ static struct ptlrpc_request *mdc_intent_unlink_pack(struct obd_export *exp,
 
 	req_capsule_set_size(&req->rq_pill, &RMF_MDT_MD, RCL_SERVER,
 			     obddev->u.cli.cl_default_mds_easize);
-	req_capsule_set_size(&req->rq_pill, &RMF_ACL, RCL_SERVER,
-			     obddev->u.cli.cl_default_mds_cookiesize);
 	ptlrpc_request_set_replen(req);
 	return req;
 }
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_reint.c b/drivers/staging/lustre/lustre/mdc/mdc_reint.c
index 6f62a95..1847e5a 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_reint.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_reint.c
@@ -288,8 +288,6 @@ int mdc_unlink(struct obd_export *exp, struct md_op_data *op_data,
 
 	req_capsule_set_size(&req->rq_pill, &RMF_MDT_MD, RCL_SERVER,
 			     obd->u.cli.cl_default_mds_easize);
-	req_capsule_set_size(&req->rq_pill, &RMF_LOGCOOKIES, RCL_SERVER,
-			     obd->u.cli.cl_default_mds_cookiesize);
 	ptlrpc_request_set_replen(req);
 
 	*request = req;
@@ -398,8 +396,6 @@ int mdc_rename(struct obd_export *exp, struct md_op_data *op_data,
 
 	req_capsule_set_size(&req->rq_pill, &RMF_MDT_MD, RCL_SERVER,
 			     obd->u.cli.cl_default_mds_easize);
-	req_capsule_set_size(&req->rq_pill, &RMF_LOGCOOKIES, RCL_SERVER,
-			     obd->u.cli.cl_default_mds_cookiesize);
 	ptlrpc_request_set_replen(req);
 
 	rc = mdc_reint(req, obd->u.cli.cl_rpc_lock, LUSTRE_IMP_FULL);
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index b62b29f..34ccff8 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -787,8 +787,6 @@ static int mdc_close(struct obd_export *exp, struct md_op_data *op_data,
 
 	req_capsule_set_size(&req->rq_pill, &RMF_MDT_MD, RCL_SERVER,
 			     obd->u.cli.cl_default_mds_easize);
-	req_capsule_set_size(&req->rq_pill, &RMF_LOGCOOKIES, RCL_SERVER,
-			     obd->u.cli.cl_default_mds_cookiesize);
 
 	ptlrpc_request_set_replen(req);
 
@@ -2646,16 +2644,15 @@ err_rpc_lock:
 	return rc;
 }
 
-/* Initialize the default and maximum LOV EA and cookie sizes.  This allows
+/* Initialize the default and maximum LOV EA sizes. This allows
  * us to make MDS RPCs with large enough reply buffers to hold a default
- * sized EA and cookie without having to calculate this (via a call into the
+ * sized EA without having to calculate this (via a call into the
  * LOV + OSCs) each time we make an RPC.  The maximum size is also tracked
  * but not used to avoid wastefully vmalloc()'ing large reply buffers when
  * a large number of stripes is possible.  If a larger reply buffer is
  * required it will be reallocated in the ptlrpc layer due to overflow.
  */
-static int mdc_init_ea_size(struct obd_export *exp, u32 easize, u32 def_easize,
-			    u32 cookiesize, u32 def_cookiesize)
+static int mdc_init_ea_size(struct obd_export *exp, u32 easize, u32 def_easize)
 {
 	struct obd_device *obd = exp->exp_obd;
 	struct client_obd *cli = &obd->u.cli;
@@ -2666,12 +2663,6 @@ static int mdc_init_ea_size(struct obd_export *exp, u32 easize, u32 def_easize,
 	if (cli->cl_default_mds_easize < def_easize)
 		cli->cl_default_mds_easize = def_easize;
 
-	if (cli->cl_max_mds_cookiesize < cookiesize)
-		cli->cl_max_mds_cookiesize = cookiesize;
-
-	if (cli->cl_default_mds_cookiesize < def_cookiesize)
-		cli->cl_default_mds_cookiesize = def_cookiesize;
-
 	return 0;
 }
 
diff --git a/drivers/staging/lustre/lustre/obdecho/echo_client.c b/drivers/staging/lustre/lustre/obdecho/echo_client.c
index 505582f..df6fbed 100644
--- a/drivers/staging/lustre/lustre/obdecho/echo_client.c
+++ b/drivers/staging/lustre/lustre/obdecho/echo_client.c
@@ -1100,7 +1100,7 @@ out:
 static u64 last_object_id;
 
 static int echo_create_object(const struct lu_env *env, struct echo_device *ed,
-			      struct obdo *oa, struct obd_trans_info *oti)
+			      struct obdo *oa)
 {
 	struct echo_object     *eco;
 	struct echo_client_obd *ec = ed->ed_ec;
@@ -1117,7 +1117,7 @@ static int echo_create_object(const struct lu_env *env, struct echo_device *ed,
 	if (!ostid_id(&oa->o_oi))
 		ostid_set_id(&oa->o_oi, ++last_object_id);
 
-	rc = obd_create(env, ec->ec_exp, oa, oti);
+	rc = obd_create(env, ec->ec_exp, oa);
 	if (rc != 0) {
 		CERROR("Cannot create objects: rc = %d\n", rc);
 		goto failed;
@@ -1137,7 +1137,7 @@ static int echo_create_object(const struct lu_env *env, struct echo_device *ed,
 
  failed:
 	if (created && rc)
-		obd_destroy(env, ec->ec_exp, oa, oti);
+		obd_destroy(env, ec->ec_exp, oa);
 	if (rc)
 		CERROR("create object failed with: rc = %d\n", rc);
 	return rc;
@@ -1237,8 +1237,7 @@ static int echo_client_page_debug_check(struct page *page, u64 id,
 
 static int echo_client_kbrw(struct echo_device *ed, int rw, struct obdo *oa,
 			    struct echo_object *eco, u64 offset,
-			    u64 count, int async,
-			    struct obd_trans_info *oti)
+			    u64 count, int async)
 {
 	u32	       npages;
 	struct brw_page	*pga;
@@ -1332,8 +1331,7 @@ static int echo_client_prep_commit(const struct lu_env *env,
 				   struct obd_export *exp, int rw,
 				   struct obdo *oa, struct echo_object *eco,
 				   u64 offset, u64 count,
-				   u64 batch, struct obd_trans_info *oti,
-				   int async)
+				   u64 batch, int async)
 {
 	struct obd_ioobj ioo;
 	struct niobuf_local *lnb;
@@ -1378,8 +1376,7 @@ static int echo_client_prep_commit(const struct lu_env *env,
 		ioo.ioo_bufcnt = npages;
 
 		lpages = npages;
-		ret = obd_preprw(env, rw, exp, oa, 1, &ioo, rnb, &lpages,
-				 lnb, oti);
+		ret = obd_preprw(env, rw, exp, oa, 1, &ioo, rnb, &lpages, lnb);
 		if (ret != 0)
 			goto out;
 		LASSERT(lpages == npages);
@@ -1411,14 +1408,11 @@ static int echo_client_prep_commit(const struct lu_env *env,
 							     rnb[i].rnb_len);
 		}
 
-		ret = obd_commitrw(env, rw, exp, oa, 1, &ioo,
-				   rnb, npages, lnb, oti, ret);
+		ret = obd_commitrw(env, rw, exp, oa, 1, &ioo, rnb, npages, lnb,
+				   ret);
 		if (ret != 0)
 			goto out;
 
-		/* Reset oti otherwise it would confuse ldiskfs. */
-		memset(oti, 0, sizeof(*oti));
-
 		/* Reuse env context. */
 		lu_context_exit((struct lu_context *)&env->le_ctx);
 		lu_context_enter((struct lu_context *)&env->le_ctx);
@@ -1432,8 +1426,7 @@ out:
 
 static int echo_client_brw_ioctl(const struct lu_env *env, int rw,
 				 struct obd_export *exp,
-				 struct obd_ioctl_data *data,
-				 struct obd_trans_info *dummy_oti)
+				 struct obd_ioctl_data *data)
 {
 	struct obd_device *obd = class_exp2obd(exp);
 	struct echo_device *ed = obd2echo_dev(obd);
@@ -1470,15 +1463,13 @@ static int echo_client_brw_ioctl(const struct lu_env *env, int rw,
 	case 1:
 		/* fall through */
 	case 2:
-		rc = echo_client_kbrw(ed, rw, oa,
-				      eco, data->ioc_offset,
-				      data->ioc_count, async, dummy_oti);
+		rc = echo_client_kbrw(ed, rw, oa, eco, data->ioc_offset,
+				      data->ioc_count, async);
 		break;
 	case 3:
-		rc = echo_client_prep_commit(env, ec->ec_exp, rw, oa,
-					     eco, data->ioc_offset,
-					     data->ioc_count, data->ioc_plen1,
-					     dummy_oti, async);
+		rc = echo_client_prep_commit(env, ec->ec_exp, rw, oa, eco,
+					     data->ioc_offset, data->ioc_count,
+					     data->ioc_plen1, async);
 		break;
 	default:
 		rc = -EINVAL;
@@ -1496,16 +1487,11 @@ echo_client_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 	struct echo_client_obd *ec = ed->ed_ec;
 	struct echo_object     *eco;
 	struct obd_ioctl_data  *data = karg;
-	struct obd_trans_info   dummy_oti;
 	struct lu_env	  *env;
-	struct oti_req_ack_lock *ack_lock;
 	struct obdo	    *oa;
 	struct lu_fid	   fid;
 	int		     rw = OBD_BRW_READ;
 	int		     rc = 0;
-	int		     i;
-
-	memset(&dummy_oti, 0, sizeof(dummy_oti));
 
 	oa = &data->ioc_obdo1;
 	if (!(oa->o_valid & OBD_MD_FLGROUP)) {
@@ -1535,7 +1521,7 @@ echo_client_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 			goto out;
 		}
 
-		rc = echo_create_object(env, ed, oa, &dummy_oti);
+		rc = echo_create_object(env, ed, oa);
 		goto out;
 
 	case OBD_IOC_DESTROY:
@@ -1546,7 +1532,7 @@ echo_client_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 
 		rc = echo_get_object(&eco, ed, oa);
 		if (rc == 0) {
-			rc = obd_destroy(env, ec->ec_exp, oa, &dummy_oti);
+			rc = obd_destroy(env, ec->ec_exp, oa);
 			if (rc == 0)
 				eco->eo_deleted = 1;
 			echo_put_object(eco);
@@ -1577,7 +1563,7 @@ echo_client_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 				.oi_oa = oa,
 			};
 
-			rc = obd_setattr(env, ec->ec_exp, &oinfo, NULL);
+			rc = obd_setattr(env, ec->ec_exp, &oinfo);
 			echo_put_object(eco);
 		}
 		goto out;
@@ -1591,7 +1577,7 @@ echo_client_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 		rw = OBD_BRW_WRITE;
 		/* fall through */
 	case OBD_IOC_BRW_READ:
-		rc = echo_client_brw_ioctl(env, rw, exp, data, &dummy_oti);
+		rc = echo_client_brw_ioctl(env, rw, exp, data);
 		goto out;
 
 	default:
@@ -1604,14 +1590,6 @@ out:
 	lu_env_fini(env);
 	kfree(env);
 
-	/* XXX this should be in a helper also called by target_send_reply */
-	for (ack_lock = dummy_oti.oti_ack_locks, i = 0; i < 4;
-	     i++, ack_lock++) {
-		if (!ack_lock->mode)
-			break;
-		ldlm_lock_decref(&ack_lock->lock, ack_lock->mode);
-	}
-
 	return rc;
 }
 
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index ab7f82d..64d95c1 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -221,7 +221,7 @@ static int osc_getattr(const struct lu_env *env, struct obd_export *exp,
 }
 
 static int osc_setattr(const struct lu_env *env, struct obd_export *exp,
-		       struct obd_info *oinfo, struct obd_trans_info *oti)
+		       struct obd_info *oinfo)
 {
 	struct ptlrpc_request *req;
 	struct ost_body *body;
@@ -329,7 +329,7 @@ int osc_setattr_async(struct obd_export *exp, struct obd_info *oinfo,
 }
 
 static int osc_create(const struct lu_env *env, struct obd_export *exp,
-		      struct obdo *oa, struct obd_trans_info *oti)
+		      struct obdo *oa)
 {
 	struct ptlrpc_request *req;
 	struct ost_body *body;
@@ -358,15 +358,6 @@ static int osc_create(const struct lu_env *env, struct obd_export *exp,
 
 	ptlrpc_request_set_replen(req);
 
-	if ((oa->o_valid & OBD_MD_FLFLAGS) &&
-	    oa->o_flags == OBD_FL_DELORPHAN) {
-		DEBUG_REQ(D_HA, req,
-			  "delorphan from OST integration");
-		/* Don't resend the delorphan req */
-		req->rq_no_resend = 1;
-		req->rq_no_delay = 1;
-	}
-
 	rc = ptlrpc_queue_wait(req);
 	if (rc)
 		goto out_req;
@@ -383,12 +374,6 @@ static int osc_create(const struct lu_env *env, struct obd_export *exp,
 	oa->o_blksize = cli_brw_size(exp->exp_obd);
 	oa->o_valid |= OBD_MD_FLBLKSZ;
 
-	if (oti && oa->o_valid & OBD_MD_FLCOOKIE) {
-		if (!oti->oti_logcookies)
-			oti->oti_logcookies = &oti->oti_onecookie;
-		*oti->oti_logcookies = oa->o_lcookie;
-	}
-
 	CDEBUG(D_HA, "transno: %lld\n",
 	       lustre_msg_get_transno(req->rq_repmsg));
 out_req:
@@ -569,19 +554,8 @@ static int osc_can_send_destroy(struct client_obd *cli)
 	return 0;
 }
 
-/* Destroy requests can be async always on the client, and we don't even really
- * care about the return code since the client cannot do anything at all about
- * a destroy failure.
- * When the MDS is unlinking a filename, it saves the file objects into a
- * recovery llog, and these object records are cancelled when the OST reports
- * they were destroyed and sync'd to disk (i.e. transaction committed).
- * If the client dies, or the OST is down when the object should be destroyed,
- * the records are not cancelled, and when the OST reconnects to the MDS next,
- * it will retrieve the llog unlink logs and then sends the log cancellation
- * cookies to the MDS after committing destroy transactions.
- */
 static int osc_destroy(const struct lu_env *env, struct obd_export *exp,
-		       struct obdo *oa, struct obd_trans_info *oti)
+		       struct obdo *oa)
 {
 	struct client_obd *cli = &exp->exp_obd->u.cli;
 	struct ptlrpc_request *req;
@@ -613,32 +587,22 @@ static int osc_destroy(const struct lu_env *env, struct obd_export *exp,
 	req->rq_request_portal = OST_IO_PORTAL; /* bug 7198 */
 	ptlrpc_at_set_req_timeout(req);
 
-	if (oti && oa->o_valid & OBD_MD_FLCOOKIE)
-		oa->o_lcookie = *oti->oti_logcookies;
 	body = req_capsule_client_get(&req->rq_pill, &RMF_OST_BODY);
 	LASSERT(body);
 	lustre_set_wire_obdo(&req->rq_import->imp_connect_data, &body->oa, oa);
 
 	ptlrpc_request_set_replen(req);
 
-	/* If osc_destroy is for destroying the unlink orphan,
-	 * sent from MDT to OST, which should not be blocked here,
-	 * because the process might be triggered by ptlrpcd, and
-	 * it is not good to block ptlrpcd thread (b=16006
-	 **/
-	if (!(oa->o_flags & OBD_FL_DELORPHAN)) {
-		req->rq_interpret_reply = osc_destroy_interpret;
-		if (!osc_can_send_destroy(cli)) {
-			struct l_wait_info lwi = LWI_INTR(LWI_ON_SIGNAL_NOOP,
-							  NULL);
-
-			/*
-			 * Wait until the number of on-going destroy RPCs drops
-			 * under max_rpc_in_flight
-			 */
-			l_wait_event_exclusive(cli->cl_destroy_waitq,
-					       osc_can_send_destroy(cli), &lwi);
-		}
+	req->rq_interpret_reply = osc_destroy_interpret;
+	if (!osc_can_send_destroy(cli)) {
+		struct l_wait_info lwi = LWI_INTR(LWI_ON_SIGNAL_NOOP, NULL);
+
+		/*
+		 * Wait until the number of on-going destroy RPCs drops
+		 * under max_rpc_in_flight
+		 */
+		l_wait_event_exclusive(cli->cl_destroy_waitq,
+				       osc_can_send_destroy(cli), &lwi);
 	}
 
 	/* Do not wait for response */
diff --git a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
index bca781a..1c06b4e 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
@@ -1706,7 +1706,7 @@ void lustre_swab_mdt_body(struct mdt_body *b)
 	__swab32s(&b->mbo_eadatasize);
 	__swab32s(&b->mbo_aclsize);
 	__swab32s(&b->mbo_max_mdsize);
-	__swab32s(&b->mbo_max_cookiesize);
+	CLASSERT(offsetof(typeof(*b), mbo_unused3));
 	__swab32s(&b->mbo_uid_h);
 	__swab32s(&b->mbo_gid_h);
 	CLASSERT(offsetof(typeof(*b), mbo_padding_5) != 0);
@@ -2103,8 +2103,6 @@ static void dump_obdo(struct obdo *oa)
 	if (valid & OBD_MD_FLHANDLE)
 		CDEBUG(D_RPCTRACE, "obdo: o_handle = %lld\n",
 		       oa->o_handle.cookie);
-	if (valid & OBD_MD_FLCOOKIE)
-		CDEBUG(D_RPCTRACE, "obdo: o_lcookie = (llog_cookie dumping not yet implemented)\n");
 }
 
 void dump_ost_body(struct ost_body *ob)
diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
index 391e83e..a6edc8d 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
@@ -1245,8 +1245,6 @@ void lustre_assert_wire_constants(void)
 		 OBD_MD_FLCKSUM);
 	LASSERTF(OBD_MD_FLQOS == (0x00200000ULL), "found 0x%.16llxULL\n",
 		 OBD_MD_FLQOS);
-	LASSERTF(OBD_MD_FLCOOKIE == (0x00800000ULL), "found 0x%.16llxULL\n",
-		 OBD_MD_FLCOOKIE);
 	LASSERTF(OBD_MD_FLGROUP == (0x01000000ULL), "found 0x%.16llxULL\n",
 		 OBD_MD_FLGROUP);
 	LASSERTF(OBD_MD_FLFID == (0x02000000ULL), "found 0x%.16llxULL\n",
@@ -1823,10 +1821,10 @@ void lustre_assert_wire_constants(void)
 		 (long long)(int)offsetof(struct mdt_body, mbo_max_mdsize));
 	LASSERTF((int)sizeof(((struct mdt_body *)0)->mbo_max_mdsize) == 4, "found %lld\n",
 		 (long long)(int)sizeof(((struct mdt_body *)0)->mbo_max_mdsize));
-	LASSERTF((int)offsetof(struct mdt_body, mbo_max_cookiesize) == 160, "found %lld\n",
-		 (long long)(int)offsetof(struct mdt_body, mbo_max_cookiesize));
-	LASSERTF((int)sizeof(((struct mdt_body *)0)->mbo_max_cookiesize) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct mdt_body *)0)->mbo_max_cookiesize));
+	LASSERTF((int)offsetof(struct mdt_body, mbo_unused3) == 160, "found %lld\n",
+		 (long long)(int)offsetof(struct mdt_body, mbo_unused3));
+	LASSERTF((int)sizeof(((struct mdt_body *)0)->mbo_unused3) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct mdt_body *)0)->mbo_unused3));
 	LASSERTF((int)offsetof(struct mdt_body, mbo_uid_h) == 164, "found %lld\n",
 		 (long long)(int)offsetof(struct mdt_body, mbo_uid_h));
 	LASSERTF((int)sizeof(((struct mdt_body *)0)->mbo_uid_h) == 4, "found %lld\n",
-- 
1.7.1

* [lustre-devel] [PATCH 31/41] staging: lustre: obd: remove destroy cookie handling
@ 2016-10-03  2:28   ` James Simmons
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, James Simmons

From: John L. Hammond <john.hammond@intel.com>

Clients no longer need to track the max and default MDS cookie sizes,
so remove:
 the obsolete obdo o_valid flag OBD_MD_FLCOOKIE,
 the struct client_obd members cl_{default,max}_mds_cookiesize,
 the struct obd_trans_info and parameters of this type,
 the cookiesize parameters from md_init_ea_size(),
 the files llite/*/{default,max}_cookiesize, and
 any code that needlessly handled these values.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6017
Reviewed-on: http://review.whamcloud.com/12922
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
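Note for readers: the EA-size bookkeeping that remains after this patch
can be modeled in a few lines of standalone C. This is only a sketch
under stated assumptions, not the kernel code: struct ea_sizes,
init_ea_size() and update_from_reply() are hypothetical stand-ins for
the two easize fields of struct client_obd, for mdc_init_ea_size(), and
for mdc_update_max_ea_from_body(). MAX_DEFAULT_EA_SIZE mirrors the 4096
value of OBD_MAX_DEFAULT_EA_SIZE shown in obd.h.

```c
#include <stdint.h>

/* Mirrors OBD_MAX_DEFAULT_EA_SIZE (4096) from obd.h in this patch. */
#define MAX_DEFAULT_EA_SIZE 4096u

/* Hypothetical stand-in for the easize fields kept in struct client_obd. */
struct ea_sizes {
	uint32_t max_easize;     /* models cl_max_mds_easize */
	uint32_t default_easize; /* models cl_default_mds_easize */
};

/*
 * Models mdc_init_ea_size() after the patch: both cached sizes are only
 * ever raised, and there are no cookie-size arguments left to track.
 */
static int init_ea_size(struct ea_sizes *cli, uint32_t easize,
			uint32_t def_easize)
{
	if (cli->max_easize < easize)
		cli->max_easize = easize;
	if (cli->default_easize < def_easize)
		cli->default_easize = def_easize;
	return 0;
}

/*
 * Models mdc_update_max_ea_from_body(): the maximum grows if the reply
 * advertises a larger size, while the default is set to the advertised
 * size clamped to MAX_DEFAULT_EA_SIZE.
 */
static void update_from_reply(struct ea_sizes *cli, uint32_t max_mdsize)
{
	if (cli->max_easize < max_mdsize)
		cli->max_easize = max_mdsize;
	cli->default_easize = max_mdsize < MAX_DEFAULT_EA_SIZE ?
			      max_mdsize : MAX_DEFAULT_EA_SIZE;
}
```

In this model a caller would invoke init_ea_size() once at setup and
update_from_reply() for each reply that carries a new maximum. The
cookie-size arguments and fields simply have no counterpart any more.
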
 .../lustre/lustre/include/lustre/lustre_idl.h      |    7 +-
 drivers/staging/lustre/lustre/include/lustre_mdc.h |   20 ++-----
 drivers/staging/lustre/lustre/include/obd.h        |   46 ++------------
 drivers/staging/lustre/lustre/include/obd_class.h  |   31 ++++------
 drivers/staging/lustre/lustre/ldlm/ldlm_lib.c      |    3 +-
 drivers/staging/lustre/lustre/llite/lcommon_misc.c |   13 +---
 drivers/staging/lustre/lustre/lmv/lmv_obd.c        |   23 ++-----
 .../staging/lustre/lustre/lov/lov_cl_internal.h    |    1 -
 drivers/staging/lustre/lustre/lov/lov_internal.h   |    2 -
 drivers/staging/lustre/lustre/lov/lov_request.c    |    1 -
 drivers/staging/lustre/lustre/mdc/mdc_locks.c      |    2 -
 drivers/staging/lustre/lustre/mdc/mdc_reint.c      |    4 -
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |   15 +----
 .../staging/lustre/lustre/obdecho/echo_client.c    |   58 ++++++-------------
 drivers/staging/lustre/lustre/osc/osc_request.c    |   62 ++++---------------
 .../staging/lustre/lustre/ptlrpc/pack_generic.c    |    4 +-
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c    |   10 +--
 17 files changed, 79 insertions(+), 223 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index 7645ed9..5d2f845 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -1664,7 +1664,7 @@ lov_mds_md_max_stripe_count(size_t buf_size, __u32 lmm_magic)
 #define OBD_MD_FLCKSUM     (0x00100000ULL) /* bulk data checksum */
 #define OBD_MD_FLQOS       (0x00200000ULL) /* quality of service stats */
 /*#define OBD_MD_FLOSCOPQ    (0x00400000ULL) osc opaque data, never used */
-#define OBD_MD_FLCOOKIE    (0x00800000ULL) /* log cancellation cookie */
+/*	OBD_MD_FLCOOKIE    (0x00800000ULL) obsolete in 2.8 */
 #define OBD_MD_FLGROUP     (0x01000000ULL) /* group */
 #define OBD_MD_FLFID       (0x02000000ULL) /* ->ost write inline fid */
 #define OBD_MD_FLEPOCH     (0x04000000ULL) /* ->ost write with ioepoch */
@@ -2091,7 +2091,7 @@ struct mdt_body {
 	__u32	mbo_eadatasize;
 	__u32	mbo_aclsize;
 	__u32	mbo_max_mdsize;
-	__u32	mbo_max_cookiesize;
+	__u32	mbo_unused3;	/* was max_cookiesize until 2.8 */
 	__u32	mbo_uid_h;	/* high 32-bits of uid, for FUID */
 	__u32	mbo_gid_h;	/* high 32-bits of gid, for FUID */
 	__u32	mbo_padding_5;	/* also fix lustre_swab_mdt_body */
@@ -3226,7 +3226,8 @@ struct obdo {
 	__u32		   o_parent_ver;
 	struct lustre_handle    o_handle;  /* brw: lock handle to prolong locks
 					    */
-	struct llog_cookie      o_lcookie; /* destroy: unlink cookie from MDS
+	struct llog_cookie      o_lcookie; /* destroy: unlink cookie from MDS,
+					    * obsolete in 2.8, reused in OSP
 					    */
 	__u32			o_uid_h;
 	__u32			o_gid_h;
diff --git a/drivers/staging/lustre/lustre/include/lustre_mdc.h b/drivers/staging/lustre/lustre/include/lustre_mdc.h
index 8fc2d3f..92a5c0f 100644
--- a/drivers/staging/lustre/lustre/include/lustre_mdc.h
+++ b/drivers/staging/lustre/lustre/include/lustre_mdc.h
@@ -157,15 +157,14 @@ static inline void mdc_put_rpc_lock(struct mdc_rpc_lock *lck,
 }
 
 /**
- * Update the maximum possible easize and cookiesize.
+ * Update the maximum possible easize.
  *
- * The values are learned from ptlrpc replies sent by the MDT.  The
- * default easize and cookiesize is initialized to the minimum value but
- * allowed to grow up to a single page in size if required to handle the
+ * This value is learned from ptlrpc replies sent by the MDT. The
+ * default easize is initialized to the minimum value but allowed
+ * to grow up to a single page in size if required to handle the
  * common case.
  *
- * \see client_obd::cl_default_mds_easize and
- * client_obd::cl_default_mds_cookiesize
+ * \see client_obd::cl_default_mds_easize
  *
  * \param[in] exp	export for MDC device
  * \param[in] body	body of ptlrpc reply from MDT
@@ -176,7 +175,7 @@ static inline void mdc_update_max_ea_from_body(struct obd_export *exp,
 {
 	if (body->mbo_valid & OBD_MD_FLMODEASIZE) {
 		struct client_obd *cli = &exp->exp_obd->u.cli;
-		u32 def_cookiesize, def_easize;
+		u32 def_easize;
 
 		if (cli->cl_max_mds_easize < body->mbo_max_mdsize)
 			cli->cl_max_mds_easize = body->mbo_max_mdsize;
@@ -184,13 +183,6 @@ static inline void mdc_update_max_ea_from_body(struct obd_export *exp,
 		def_easize = min_t(__u32, body->mbo_max_mdsize,
 				   OBD_MAX_DEFAULT_EA_SIZE);
 		cli->cl_default_mds_easize = def_easize;
-
-		if (cli->cl_max_mds_cookiesize < body->mbo_max_cookiesize)
-			cli->cl_max_mds_cookiesize = body->mbo_max_cookiesize;
-
-		def_cookiesize = min_t(__u32, body->mbo_max_cookiesize,
-				       OBD_MAX_DEFAULT_COOKIE_SIZE);
-		cli->cl_default_mds_cookiesize = def_cookiesize;
 	}
 }
 
diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index 2811901..4691121 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -204,7 +204,6 @@ enum obd_cl_sem_lock_class {
  * on the MDS.
  */
 #define OBD_MAX_DEFAULT_EA_SIZE		4096
-#define OBD_MAX_DEFAULT_COOKIE_SIZE	4096
 
 struct mdc_rpc_lock;
 struct obd_import;
@@ -214,7 +213,7 @@ struct client_obd {
 	struct obd_import       *cl_import; /* ptlrpc connection state */
 	size_t			 cl_conn_count;
 	/*
-	 * Cache maximum and default values for easize and cookiesize. This is
+	 * Cache maximum and default values for easize. This is
 	 * strictly a performance optimization to minimize calls to
 	 * obd_size_diskmd(). The default values are used to calculate the
 	 * initial size of a request buffer. The ptlrpc layer will resize the
@@ -235,18 +234,6 @@ struct client_obd {
 	 * run-time if a larger observed size is advertised by the MDT.
 	 */
 	u32			 cl_max_mds_easize;
-	/* Default cookie size for llog cookies (see struct llog_cookie). It is
-	 * initialized to zero at mount-time, then it tracks the largest
-	 * observed cookie size advertised by the MDT, up to a maximum value of
-	 * OBD_MAX_DEFAULT_COOKIE_SIZE. Note that llog_cookies are not
-	 * used by clients communicating with MDS versions 2.4.0 and later.
-	 */
-	u32			 cl_default_mds_cookiesize;
-	/* Maximum possible cookie size computed at mount-time based on
-	 * the number of OSTs in the filesystem. May be increased at
-	 * run-time if a larger observed size is advertised by the MDT.
-	 */
-	u32			 cl_max_mds_cookiesize;
 
 	enum lustre_sec_part     cl_sp_me;
 	enum lustre_sec_part     cl_sp_to;
@@ -447,8 +434,6 @@ struct lmv_obd {
 	int			connected;
 	int			max_easize;
 	int			max_def_easize;
-	int			max_cookiesize;
-	int			max_def_cookiesize;
 
 	u32			tgts_size; /* size of tgts array */
 	struct lmv_tgt_desc	**tgts;
@@ -505,21 +490,6 @@ struct niobuf_local {
 /* Don't conflict with on-wire flags OBD_BRW_WRITE, etc */
 #define N_LOCAL_TEMP_PAGE 0x10000000
 
-struct obd_trans_info {
-	__u64		    oti_xid;
-	/* Only used on the server side for tracking acks. */
-	struct oti_req_ack_lock {
-		struct lustre_handle lock;
-		__u32		mode;
-	}			oti_ack_locks[4];
-	void		    *oti_handle;
-	struct llog_cookie       oti_onecookie;
-	struct llog_cookie      *oti_logcookies;
-
-	/** VBR: versions */
-	__u64		    oti_pre_version;
-};
-
 /*
  * Events signalled through obd_notify() upcall-chain.
  */
@@ -891,24 +861,22 @@ struct obd_ops {
 			struct lov_stripe_md **mem_tgt,
 			struct lov_mds_md *disk_src, int disk_len);
 	int (*create)(const struct lu_env *env, struct obd_export *exp,
-		      struct obdo *oa, struct obd_trans_info *oti);
+		      struct obdo *oa);
 	int (*destroy)(const struct lu_env *env, struct obd_export *exp,
-		       struct obdo *oa, struct obd_trans_info *oti);
+		       struct obdo *oa);
 	int (*setattr)(const struct lu_env *, struct obd_export *exp,
-		       struct obd_info *oinfo, struct obd_trans_info *oti);
+		       struct obd_info *oinfo);
 	int (*getattr)(const struct lu_env *env, struct obd_export *exp,
 		       struct obd_info *oinfo);
 	int (*preprw)(const struct lu_env *env, int cmd,
 		      struct obd_export *exp, struct obdo *oa, int objcount,
 		      struct obd_ioobj *obj, struct niobuf_remote *remote,
-		      int *nr_pages, struct niobuf_local *local,
-		      struct obd_trans_info *oti);
+		      int *nr_pages, struct niobuf_local *local);
 	int (*commitrw)(const struct lu_env *env, int cmd,
 			struct obd_export *exp, struct obdo *oa,
 			int objcount, struct obd_ioobj *obj,
 			struct niobuf_remote *remote, int pages,
-			struct niobuf_local *local,
-			struct obd_trans_info *oti, int rc);
+			struct niobuf_local *local, int rc);
 	int (*init_export)(struct obd_export *exp);
 	int (*destroy_export)(struct obd_export *exp);
 
@@ -1018,7 +986,7 @@ struct md_ops {
 			u64, const char *, const char *, int, int, int,
 			struct ptlrpc_request **);
 
-	int (*init_ea_size)(struct obd_export *, u32, u32, u32, u32);
+	int (*init_ea_size)(struct obd_export *, u32, u32);
 
 	int (*get_lustre_md)(struct obd_export *, struct ptlrpc_request *,
 			     struct obd_export *, struct obd_export *,
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index 8f1d681..a27dbc8 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -686,26 +686,26 @@ static inline int obd_free_memmd(struct obd_export *exp,
 }
 
 static inline int obd_create(const struct lu_env *env, struct obd_export *exp,
-			     struct obdo *obdo, struct obd_trans_info *oti)
+			     struct obdo *obdo)
 {
 	int rc;
 
 	EXP_CHECK_DT_OP(exp, create);
 	EXP_COUNTER_INCREMENT(exp, create);
 
-	rc = OBP(exp->exp_obd, create)(env, exp, obdo, oti);
+	rc = OBP(exp->exp_obd, create)(env, exp, obdo);
 	return rc;
 }
 
 static inline int obd_destroy(const struct lu_env *env, struct obd_export *exp,
-			      struct obdo *obdo, struct obd_trans_info *oti)
+			      struct obdo *obdo)
 {
 	int rc;
 
 	EXP_CHECK_DT_OP(exp, destroy);
 	EXP_COUNTER_INCREMENT(exp, destroy);
 
-	rc = OBP(exp->exp_obd, destroy)(env, exp, obdo, oti);
+	rc = OBP(exp->exp_obd, destroy)(env, exp, obdo);
 	return rc;
 }
 
@@ -722,15 +722,14 @@ static inline int obd_getattr(const struct lu_env *env, struct obd_export *exp,
 }
 
 static inline int obd_setattr(const struct lu_env *env, struct obd_export *exp,
-			      struct obd_info *oinfo,
-			      struct obd_trans_info *oti)
+			      struct obd_info *oinfo)
 {
 	int rc;
 
 	EXP_CHECK_DT_OP(exp, setattr);
 	EXP_COUNTER_INCREMENT(exp, setattr);
 
-	rc = OBP(exp->exp_obd, setattr)(env, exp, oinfo, oti);
+	rc = OBP(exp->exp_obd, setattr)(env, exp, oinfo);
 	return rc;
 }
 
@@ -1056,8 +1055,7 @@ static inline int obd_preprw(const struct lu_env *env, int cmd,
 			     struct obd_export *exp, struct obdo *oa,
 			     int objcount, struct obd_ioobj *obj,
 			     struct niobuf_remote *remote, int *pages,
-			     struct niobuf_local *local,
-			     struct obd_trans_info *oti)
+			     struct niobuf_local *local)
 {
 	int rc;
 
@@ -1065,7 +1063,7 @@ static inline int obd_preprw(const struct lu_env *env, int cmd,
 	EXP_COUNTER_INCREMENT(exp, preprw);
 
 	rc = OBP(exp->exp_obd, preprw)(env, cmd, exp, oa, objcount, obj, remote,
-				       pages, local, oti);
+				       pages, local);
 	return rc;
 }
 
@@ -1073,14 +1071,13 @@ static inline int obd_commitrw(const struct lu_env *env, int cmd,
 			       struct obd_export *exp, struct obdo *oa,
 			       int objcount, struct obd_ioobj *obj,
 			       struct niobuf_remote *rnb, int pages,
-			       struct niobuf_local *local,
-			       struct obd_trans_info *oti, int rc)
+			       struct niobuf_local *local, int rc)
 {
 	EXP_CHECK_DT_OP(exp, commitrw);
 	EXP_COUNTER_INCREMENT(exp, commitrw);
 
 	rc = OBP(exp->exp_obd, commitrw)(env, cmd, exp, oa, objcount, obj,
-					 rnb, pages, local, oti, rc);
+					 rnb, pages, local, rc);
 	return rc;
 }
 
@@ -1507,14 +1504,12 @@ static inline enum ldlm_mode md_lock_match(struct obd_export *exp, __u64 flags,
 					     policy, mode, lockh);
 }
 
-static inline int md_init_ea_size(struct obd_export *exp, int easize,
-				  int def_asize, int cookiesize,
-				  int def_cookiesize)
+static inline int md_init_ea_size(struct obd_export *exp, u32 easize,
+				  u32 def_asize)
 {
 	EXP_CHECK_MD_OP(exp, init_ea_size);
 	EXP_MD_COUNTER_INCREMENT(exp, init_ea_size);
-	return MDP(exp->exp_obd, init_ea_size)(exp, easize, def_asize,
-					       cookiesize, def_cookiesize);
+	return MDP(exp->exp_obd, init_ea_size)(exp, easize, def_asize);
 }
 
 static inline int md_intent_getattr_async(struct obd_export *exp,
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
index f3128b6..4f9480e 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
@@ -399,9 +399,8 @@ int client_obd_setup(struct obd_device *obddev, struct lustre_cfg *lcfg)
 	}
 
 	cli->cl_import = imp;
-	/* cli->cl_max_mds_{easize,cookiesize} updated by mdc_init_ea_size() */
+	/* cli->cl_max_mds_easize updated by mdc_init_ea_size() */
 	cli->cl_max_mds_easize = sizeof(struct lov_mds_md_v3);
-	cli->cl_max_mds_cookiesize = sizeof(struct llog_cookie);
 
 	if (LUSTRE_CFG_BUFLEN(lcfg, 3) > 0) {
 		if (!strcmp(lustre_cfg_string(lcfg, 3), "inactive")) {
diff --git a/drivers/staging/lustre/lustre/llite/lcommon_misc.c b/drivers/staging/lustre/lustre/llite/lcommon_misc.c
index 07d38e5..4562643 100644
--- a/drivers/staging/lustre/lustre/llite/lcommon_misc.c
+++ b/drivers/staging/lustre/lustre/llite/lcommon_misc.c
@@ -49,7 +49,7 @@ int cl_init_ea_size(struct obd_export *md_exp, struct obd_export *dt_exp)
 {
 	struct lov_stripe_md lsm = { .lsm_magic = LOV_MAGIC_V3 };
 	__u32 valsize = sizeof(struct lov_desc);
-	int rc, easize, def_easize, cookiesize;
+	int rc, easize, def_easize;
 	struct lov_desc desc;
 	__u16 stripes, def_stripes;
 
@@ -67,16 +67,9 @@ int cl_init_ea_size(struct obd_export *md_exp, struct obd_export *dt_exp)
 	lsm.lsm_stripe_count = def_stripes;
 	def_easize = obd_size_diskmd(dt_exp, &lsm);
 
-	cookiesize = stripes * sizeof(struct llog_cookie);
+	CDEBUG(D_HA, "updating def/max_easize: %d/%d\n", def_easize, easize);
 
-	/* default cookiesize is 0 because from 2.4 server doesn't send
-	 * llog cookies to client.
-	 */
-	CDEBUG(D_HA,
-	       "updating def/max_easize: %d/%d def/max_cookiesize: 0/%d\n",
-	       def_easize, easize, cookiesize);
-
-	rc = md_init_ea_size(md_exp, easize, def_easize, cookiesize, 0);
+	rc = md_init_ea_size(md_exp, easize, def_easize);
 	return rc;
 }
 
diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index 67969a8..75f5958 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -245,8 +245,7 @@ static int lmv_connect(const struct lu_env *env,
 	return rc;
 }
 
-static int lmv_init_ea_size(struct obd_export *exp, u32 easize, u32 def_easize,
-			    u32 cookiesize, u32 def_cookiesize)
+static int lmv_init_ea_size(struct obd_export *exp, u32 easize, u32 def_easize)
 {
 	struct obd_device   *obd = exp->exp_obd;
 	struct lmv_obd      *lmv = &obd->u.lmv;
@@ -262,14 +261,7 @@ static int lmv_init_ea_size(struct obd_export *exp, u32 easize, u32 def_easize,
 		lmv->max_def_easize = def_easize;
 		change = 1;
 	}
-	if (lmv->max_cookiesize < cookiesize) {
-		lmv->max_cookiesize = cookiesize;
-		change = 1;
-	}
-	if (lmv->max_def_cookiesize < def_cookiesize) {
-		lmv->max_def_cookiesize = def_cookiesize;
-		change = 1;
-	}
+
 	if (change == 0)
 		return 0;
 
@@ -284,8 +276,7 @@ static int lmv_init_ea_size(struct obd_export *exp, u32 easize, u32 def_easize,
 			continue;
 		}
 
-		rc = md_init_ea_size(tgt->ltd_exp, easize, def_easize,
-				     cookiesize, def_cookiesize);
+		rc = md_init_ea_size(tgt->ltd_exp, easize, def_easize);
 		if (rc) {
 			CERROR("%s: obd_init_ea_size() failed on MDT target %d: rc = %d\n",
 			       obd->obd_name, i, rc);
@@ -368,8 +359,7 @@ static int lmv_connect_mdc(struct obd_device *obd, struct lmv_tgt_desc *tgt)
 	tgt->ltd_exp = mdc_exp;
 	lmv->desc.ld_active_tgt_count++;
 
-	md_init_ea_size(tgt->ltd_exp, lmv->max_easize, lmv->max_def_easize,
-			lmv->max_cookiesize, lmv->max_def_cookiesize);
+	md_init_ea_size(tgt->ltd_exp, lmv->max_easize, lmv->max_def_easize);
 
 	CDEBUG(D_CONFIG, "Connected to %s(%s) successfully (%d)\n",
 	       mdc_obd->obd_name, mdc_obd->obd_uuid.uuid,
@@ -483,7 +473,7 @@ static int lmv_add_target(struct obd_device *obd, struct obd_uuid *uuidp,
 		} else {
 			int easize = sizeof(struct lmv_stripe_md) +
 				lmv->desc.ld_tgt_count * sizeof(struct lu_fid);
-			lmv_init_ea_size(obd->obd_self_export, easize, 0, 0, 0);
+			lmv_init_ea_size(obd->obd_self_export, easize, 0);
 		}
 	}
 
@@ -538,7 +528,7 @@ int lmv_check_connect(struct obd_device *obd)
 	class_export_put(lmv->exp);
 	lmv->connected = 1;
 	easize = lmv_mds_md_size(lmv->desc.ld_tgt_count, LMV_MAGIC);
-	lmv_init_ea_size(obd->obd_self_export, easize, 0, 0, 0);
+	lmv_init_ea_size(obd->obd_self_export, easize, 0);
 	mutex_unlock(&lmv->lmv_init_mutex);
 	return 0;
 
@@ -1282,7 +1272,6 @@ static int lmv_setup(struct obd_device *obd, struct lustre_cfg *lcfg)
 	obd_str2uuid(&lmv->desc.ld_uuid, desc->ld_uuid.uuid);
 	lmv->desc.ld_tgt_count = 0;
 	lmv->desc.ld_active_tgt_count = 0;
-	lmv->max_cookiesize = 0;
 	lmv->max_def_easize = 0;
 	lmv->max_easize = 0;
 	lmv->lmv_placement = PLACEMENT_CHAR_POLICY;
diff --git a/drivers/staging/lustre/lustre/lov/lov_cl_internal.h b/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
index 4d2b7d3..2b03938 100644
--- a/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_cl_internal.h
@@ -412,7 +412,6 @@ struct lov_io_sub {
 	int		  sub_refcheck;
 	int		  sub_refcheck2;
 	int		  sub_reenter;
-	void		*sub_cookie;
 };
 
 /**
diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index 60397a2..bd105d9 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -110,8 +110,6 @@ struct lov_request_set {
 	atomic_t			set_completes;
 	atomic_t			set_success;
 	atomic_t			set_finish_checked;
-	struct llog_cookie		*set_cookies;
-	int				set_cookie_sent;
 	struct list_head			set_list;
 	wait_queue_head_t			set_waitq;
 };
diff --git a/drivers/staging/lustre/lustre/lov/lov_request.c b/drivers/staging/lustre/lustre/lov/lov_request.c
index 42e66d1..c8734a6 100644
--- a/drivers/staging/lustre/lustre/lov/lov_request.c
+++ b/drivers/staging/lustre/lustre/lov/lov_request.c
@@ -44,7 +44,6 @@ static void lov_init_set(struct lov_request_set *set)
 	atomic_set(&set->set_completes, 0);
 	atomic_set(&set->set_success, 0);
 	atomic_set(&set->set_finish_checked, 0);
-	set->set_cookies = NULL;
 	INIT_LIST_HEAD(&set->set_list);
 	atomic_set(&set->set_refcount, 1);
 	init_waitqueue_head(&set->set_waitq);
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_locks.c b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
index f1f6c08..5b3d0ba 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_locks.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
@@ -386,8 +386,6 @@ static struct ptlrpc_request *mdc_intent_unlink_pack(struct obd_export *exp,
 
 	req_capsule_set_size(&req->rq_pill, &RMF_MDT_MD, RCL_SERVER,
 			     obddev->u.cli.cl_default_mds_easize);
-	req_capsule_set_size(&req->rq_pill, &RMF_ACL, RCL_SERVER,
-			     obddev->u.cli.cl_default_mds_cookiesize);
 	ptlrpc_request_set_replen(req);
 	return req;
 }
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_reint.c b/drivers/staging/lustre/lustre/mdc/mdc_reint.c
index 6f62a95..1847e5a 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_reint.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_reint.c
@@ -288,8 +288,6 @@ int mdc_unlink(struct obd_export *exp, struct md_op_data *op_data,
 
 	req_capsule_set_size(&req->rq_pill, &RMF_MDT_MD, RCL_SERVER,
 			     obd->u.cli.cl_default_mds_easize);
-	req_capsule_set_size(&req->rq_pill, &RMF_LOGCOOKIES, RCL_SERVER,
-			     obd->u.cli.cl_default_mds_cookiesize);
 	ptlrpc_request_set_replen(req);
 
 	*request = req;
@@ -398,8 +396,6 @@ int mdc_rename(struct obd_export *exp, struct md_op_data *op_data,
 
 	req_capsule_set_size(&req->rq_pill, &RMF_MDT_MD, RCL_SERVER,
 			     obd->u.cli.cl_default_mds_easize);
-	req_capsule_set_size(&req->rq_pill, &RMF_LOGCOOKIES, RCL_SERVER,
-			     obd->u.cli.cl_default_mds_cookiesize);
 	ptlrpc_request_set_replen(req);
 
 	rc = mdc_reint(req, obd->u.cli.cl_rpc_lock, LUSTRE_IMP_FULL);
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index b62b29f..34ccff8 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -787,8 +787,6 @@ static int mdc_close(struct obd_export *exp, struct md_op_data *op_data,
 
 	req_capsule_set_size(&req->rq_pill, &RMF_MDT_MD, RCL_SERVER,
 			     obd->u.cli.cl_default_mds_easize);
-	req_capsule_set_size(&req->rq_pill, &RMF_LOGCOOKIES, RCL_SERVER,
-			     obd->u.cli.cl_default_mds_cookiesize);
 
 	ptlrpc_request_set_replen(req);
 
@@ -2646,16 +2644,15 @@ err_rpc_lock:
 	return rc;
 }
 
-/* Initialize the default and maximum LOV EA and cookie sizes.  This allows
+/* Initialize the default and maximum LOV EA sizes. This allows
  * us to make MDS RPCs with large enough reply buffers to hold a default
- * sized EA and cookie without having to calculate this (via a call into the
+ * sized EA without having to calculate this (via a call into the
  * LOV + OSCs) each time we make an RPC.  The maximum size is also tracked
  * but not used to avoid wastefully vmalloc()'ing large reply buffers when
  * a large number of stripes is possible.  If a larger reply buffer is
  * required it will be reallocated in the ptlrpc layer due to overflow.
  */
-static int mdc_init_ea_size(struct obd_export *exp, u32 easize, u32 def_easize,
-			    u32 cookiesize, u32 def_cookiesize)
+static int mdc_init_ea_size(struct obd_export *exp, u32 easize, u32 def_easize)
 {
 	struct obd_device *obd = exp->exp_obd;
 	struct client_obd *cli = &obd->u.cli;
@@ -2666,12 +2663,6 @@ static int mdc_init_ea_size(struct obd_export *exp, u32 easize, u32 def_easize,
 	if (cli->cl_default_mds_easize < def_easize)
 		cli->cl_default_mds_easize = def_easize;
 
-	if (cli->cl_max_mds_cookiesize < cookiesize)
-		cli->cl_max_mds_cookiesize = cookiesize;
-
-	if (cli->cl_default_mds_cookiesize < def_cookiesize)
-		cli->cl_default_mds_cookiesize = def_cookiesize;
-
 	return 0;
 }
 
diff --git a/drivers/staging/lustre/lustre/obdecho/echo_client.c b/drivers/staging/lustre/lustre/obdecho/echo_client.c
index 505582f..df6fbed 100644
--- a/drivers/staging/lustre/lustre/obdecho/echo_client.c
+++ b/drivers/staging/lustre/lustre/obdecho/echo_client.c
@@ -1100,7 +1100,7 @@ out:
 static u64 last_object_id;
 
 static int echo_create_object(const struct lu_env *env, struct echo_device *ed,
-			      struct obdo *oa, struct obd_trans_info *oti)
+			      struct obdo *oa)
 {
 	struct echo_object     *eco;
 	struct echo_client_obd *ec = ed->ed_ec;
@@ -1117,7 +1117,7 @@ static int echo_create_object(const struct lu_env *env, struct echo_device *ed,
 	if (!ostid_id(&oa->o_oi))
 		ostid_set_id(&oa->o_oi, ++last_object_id);
 
-	rc = obd_create(env, ec->ec_exp, oa, oti);
+	rc = obd_create(env, ec->ec_exp, oa);
 	if (rc != 0) {
 		CERROR("Cannot create objects: rc = %d\n", rc);
 		goto failed;
@@ -1137,7 +1137,7 @@ static int echo_create_object(const struct lu_env *env, struct echo_device *ed,
 
  failed:
 	if (created && rc)
-		obd_destroy(env, ec->ec_exp, oa, oti);
+		obd_destroy(env, ec->ec_exp, oa);
 	if (rc)
 		CERROR("create object failed with: rc = %d\n", rc);
 	return rc;
@@ -1237,8 +1237,7 @@ static int echo_client_page_debug_check(struct page *page, u64 id,
 
 static int echo_client_kbrw(struct echo_device *ed, int rw, struct obdo *oa,
 			    struct echo_object *eco, u64 offset,
-			    u64 count, int async,
-			    struct obd_trans_info *oti)
+			    u64 count, int async)
 {
 	u32	       npages;
 	struct brw_page	*pga;
@@ -1332,8 +1331,7 @@ static int echo_client_prep_commit(const struct lu_env *env,
 				   struct obd_export *exp, int rw,
 				   struct obdo *oa, struct echo_object *eco,
 				   u64 offset, u64 count,
-				   u64 batch, struct obd_trans_info *oti,
-				   int async)
+				   u64 batch, int async)
 {
 	struct obd_ioobj ioo;
 	struct niobuf_local *lnb;
@@ -1378,8 +1376,7 @@ static int echo_client_prep_commit(const struct lu_env *env,
 		ioo.ioo_bufcnt = npages;
 
 		lpages = npages;
-		ret = obd_preprw(env, rw, exp, oa, 1, &ioo, rnb, &lpages,
-				 lnb, oti);
+		ret = obd_preprw(env, rw, exp, oa, 1, &ioo, rnb, &lpages, lnb);
 		if (ret != 0)
 			goto out;
 		LASSERT(lpages == npages);
@@ -1411,14 +1408,11 @@ static int echo_client_prep_commit(const struct lu_env *env,
 							     rnb[i].rnb_len);
 		}
 
-		ret = obd_commitrw(env, rw, exp, oa, 1, &ioo,
-				   rnb, npages, lnb, oti, ret);
+		ret = obd_commitrw(env, rw, exp, oa, 1, &ioo, rnb, npages, lnb,
+				   ret);
 		if (ret != 0)
 			goto out;
 
-		/* Reset oti otherwise it would confuse ldiskfs. */
-		memset(oti, 0, sizeof(*oti));
-
 		/* Reuse env context. */
 		lu_context_exit((struct lu_context *)&env->le_ctx);
 		lu_context_enter((struct lu_context *)&env->le_ctx);
@@ -1432,8 +1426,7 @@ out:
 
 static int echo_client_brw_ioctl(const struct lu_env *env, int rw,
 				 struct obd_export *exp,
-				 struct obd_ioctl_data *data,
-				 struct obd_trans_info *dummy_oti)
+				 struct obd_ioctl_data *data)
 {
 	struct obd_device *obd = class_exp2obd(exp);
 	struct echo_device *ed = obd2echo_dev(obd);
@@ -1470,15 +1463,13 @@ static int echo_client_brw_ioctl(const struct lu_env *env, int rw,
 	case 1:
 		/* fall through */
 	case 2:
-		rc = echo_client_kbrw(ed, rw, oa,
-				      eco, data->ioc_offset,
-				      data->ioc_count, async, dummy_oti);
+		rc = echo_client_kbrw(ed, rw, oa, eco, data->ioc_offset,
+				      data->ioc_count, async);
 		break;
 	case 3:
-		rc = echo_client_prep_commit(env, ec->ec_exp, rw, oa,
-					     eco, data->ioc_offset,
-					     data->ioc_count, data->ioc_plen1,
-					     dummy_oti, async);
+		rc = echo_client_prep_commit(env, ec->ec_exp, rw, oa, eco,
+					     data->ioc_offset, data->ioc_count,
+					     data->ioc_plen1, async);
 		break;
 	default:
 		rc = -EINVAL;
@@ -1496,16 +1487,11 @@ echo_client_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 	struct echo_client_obd *ec = ed->ed_ec;
 	struct echo_object     *eco;
 	struct obd_ioctl_data  *data = karg;
-	struct obd_trans_info   dummy_oti;
 	struct lu_env	  *env;
-	struct oti_req_ack_lock *ack_lock;
 	struct obdo	    *oa;
 	struct lu_fid	   fid;
 	int		     rw = OBD_BRW_READ;
 	int		     rc = 0;
-	int		     i;
-
-	memset(&dummy_oti, 0, sizeof(dummy_oti));
 
 	oa = &data->ioc_obdo1;
 	if (!(oa->o_valid & OBD_MD_FLGROUP)) {
@@ -1535,7 +1521,7 @@ echo_client_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 			goto out;
 		}
 
-		rc = echo_create_object(env, ed, oa, &dummy_oti);
+		rc = echo_create_object(env, ed, oa);
 		goto out;
 
 	case OBD_IOC_DESTROY:
@@ -1546,7 +1532,7 @@ echo_client_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 
 		rc = echo_get_object(&eco, ed, oa);
 		if (rc == 0) {
-			rc = obd_destroy(env, ec->ec_exp, oa, &dummy_oti);
+			rc = obd_destroy(env, ec->ec_exp, oa);
 			if (rc == 0)
 				eco->eo_deleted = 1;
 			echo_put_object(eco);
@@ -1577,7 +1563,7 @@ echo_client_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 				.oi_oa = oa,
 			};
 
-			rc = obd_setattr(env, ec->ec_exp, &oinfo, NULL);
+			rc = obd_setattr(env, ec->ec_exp, &oinfo);
 			echo_put_object(eco);
 		}
 		goto out;
@@ -1591,7 +1577,7 @@ echo_client_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 		rw = OBD_BRW_WRITE;
 		/* fall through */
 	case OBD_IOC_BRW_READ:
-		rc = echo_client_brw_ioctl(env, rw, exp, data, &dummy_oti);
+		rc = echo_client_brw_ioctl(env, rw, exp, data);
 		goto out;
 
 	default:
@@ -1604,14 +1590,6 @@ out:
 	lu_env_fini(env);
 	kfree(env);
 
-	/* XXX this should be in a helper also called by target_send_reply */
-	for (ack_lock = dummy_oti.oti_ack_locks, i = 0; i < 4;
-	     i++, ack_lock++) {
-		if (!ack_lock->mode)
-			break;
-		ldlm_lock_decref(&ack_lock->lock, ack_lock->mode);
-	}
-
 	return rc;
 }
 
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index ab7f82d..64d95c1 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -221,7 +221,7 @@ static int osc_getattr(const struct lu_env *env, struct obd_export *exp,
 }
 
 static int osc_setattr(const struct lu_env *env, struct obd_export *exp,
-		       struct obd_info *oinfo, struct obd_trans_info *oti)
+		       struct obd_info *oinfo)
 {
 	struct ptlrpc_request *req;
 	struct ost_body *body;
@@ -329,7 +329,7 @@ int osc_setattr_async(struct obd_export *exp, struct obd_info *oinfo,
 }
 
 static int osc_create(const struct lu_env *env, struct obd_export *exp,
-		      struct obdo *oa, struct obd_trans_info *oti)
+		      struct obdo *oa)
 {
 	struct ptlrpc_request *req;
 	struct ost_body *body;
@@ -358,15 +358,6 @@ static int osc_create(const struct lu_env *env, struct obd_export *exp,
 
 	ptlrpc_request_set_replen(req);
 
-	if ((oa->o_valid & OBD_MD_FLFLAGS) &&
-	    oa->o_flags == OBD_FL_DELORPHAN) {
-		DEBUG_REQ(D_HA, req,
-			  "delorphan from OST integration");
-		/* Don't resend the delorphan req */
-		req->rq_no_resend = 1;
-		req->rq_no_delay = 1;
-	}
-
 	rc = ptlrpc_queue_wait(req);
 	if (rc)
 		goto out_req;
@@ -383,12 +374,6 @@ static int osc_create(const struct lu_env *env, struct obd_export *exp,
 	oa->o_blksize = cli_brw_size(exp->exp_obd);
 	oa->o_valid |= OBD_MD_FLBLKSZ;
 
-	if (oti && oa->o_valid & OBD_MD_FLCOOKIE) {
-		if (!oti->oti_logcookies)
-			oti->oti_logcookies = &oti->oti_onecookie;
-		*oti->oti_logcookies = oa->o_lcookie;
-	}
-
 	CDEBUG(D_HA, "transno: %lld\n",
 	       lustre_msg_get_transno(req->rq_repmsg));
 out_req:
@@ -569,19 +554,8 @@ static int osc_can_send_destroy(struct client_obd *cli)
 	return 0;
 }
 
-/* Destroy requests can be async always on the client, and we don't even really
- * care about the return code since the client cannot do anything at all about
- * a destroy failure.
- * When the MDS is unlinking a filename, it saves the file objects into a
- * recovery llog, and these object records are cancelled when the OST reports
- * they were destroyed and sync'd to disk (i.e. transaction committed).
- * If the client dies, or the OST is down when the object should be destroyed,
- * the records are not cancelled, and when the OST reconnects to the MDS next,
- * it will retrieve the llog unlink logs and then sends the log cancellation
- * cookies to the MDS after committing destroy transactions.
- */
 static int osc_destroy(const struct lu_env *env, struct obd_export *exp,
-		       struct obdo *oa, struct obd_trans_info *oti)
+		       struct obdo *oa)
 {
 	struct client_obd *cli = &exp->exp_obd->u.cli;
 	struct ptlrpc_request *req;
@@ -613,32 +587,22 @@ static int osc_destroy(const struct lu_env *env, struct obd_export *exp,
 	req->rq_request_portal = OST_IO_PORTAL; /* bug 7198 */
 	ptlrpc_at_set_req_timeout(req);
 
-	if (oti && oa->o_valid & OBD_MD_FLCOOKIE)
-		oa->o_lcookie = *oti->oti_logcookies;
 	body = req_capsule_client_get(&req->rq_pill, &RMF_OST_BODY);
 	LASSERT(body);
 	lustre_set_wire_obdo(&req->rq_import->imp_connect_data, &body->oa, oa);
 
 	ptlrpc_request_set_replen(req);
 
-	/* If osc_destroy is for destroying the unlink orphan,
-	 * sent from MDT to OST, which should not be blocked here,
-	 * because the process might be triggered by ptlrpcd, and
-	 * it is not good to block ptlrpcd thread (b=16006
-	 **/
-	if (!(oa->o_flags & OBD_FL_DELORPHAN)) {
-		req->rq_interpret_reply = osc_destroy_interpret;
-		if (!osc_can_send_destroy(cli)) {
-			struct l_wait_info lwi = LWI_INTR(LWI_ON_SIGNAL_NOOP,
-							  NULL);
-
-			/*
-			 * Wait until the number of on-going destroy RPCs drops
-			 * under max_rpc_in_flight
-			 */
-			l_wait_event_exclusive(cli->cl_destroy_waitq,
-					       osc_can_send_destroy(cli), &lwi);
-		}
+	req->rq_interpret_reply = osc_destroy_interpret;
+	if (!osc_can_send_destroy(cli)) {
+		struct l_wait_info lwi = LWI_INTR(LWI_ON_SIGNAL_NOOP, NULL);
+
+		/*
+		 * Wait until the number of on-going destroy RPCs drops
+		 * under max_rpc_in_flight
+		 */
+		l_wait_event_exclusive(cli->cl_destroy_waitq,
+				       osc_can_send_destroy(cli), &lwi);
 	}
 
 	/* Do not wait for response */
diff --git a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
index bca781a..1c06b4e 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/pack_generic.c
@@ -1706,7 +1706,7 @@ void lustre_swab_mdt_body(struct mdt_body *b)
 	__swab32s(&b->mbo_eadatasize);
 	__swab32s(&b->mbo_aclsize);
 	__swab32s(&b->mbo_max_mdsize);
-	__swab32s(&b->mbo_max_cookiesize);
+	CLASSERT(offsetof(typeof(*b), mbo_unused3));
 	__swab32s(&b->mbo_uid_h);
 	__swab32s(&b->mbo_gid_h);
 	CLASSERT(offsetof(typeof(*b), mbo_padding_5) != 0);
@@ -2103,8 +2103,6 @@ static void dump_obdo(struct obdo *oa)
 	if (valid & OBD_MD_FLHANDLE)
 		CDEBUG(D_RPCTRACE, "obdo: o_handle = %lld\n",
 		       oa->o_handle.cookie);
-	if (valid & OBD_MD_FLCOOKIE)
-		CDEBUG(D_RPCTRACE, "obdo: o_lcookie = (llog_cookie dumping not yet implemented)\n");
 }
 
 void dump_ost_body(struct ost_body *ob)
diff --git a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
index 391e83e..a6edc8d 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/wiretest.c
@@ -1245,8 +1245,6 @@ void lustre_assert_wire_constants(void)
 		 OBD_MD_FLCKSUM);
 	LASSERTF(OBD_MD_FLQOS == (0x00200000ULL), "found 0x%.16llxULL\n",
 		 OBD_MD_FLQOS);
-	LASSERTF(OBD_MD_FLCOOKIE == (0x00800000ULL), "found 0x%.16llxULL\n",
-		 OBD_MD_FLCOOKIE);
 	LASSERTF(OBD_MD_FLGROUP == (0x01000000ULL), "found 0x%.16llxULL\n",
 		 OBD_MD_FLGROUP);
 	LASSERTF(OBD_MD_FLFID == (0x02000000ULL), "found 0x%.16llxULL\n",
@@ -1823,10 +1821,10 @@ void lustre_assert_wire_constants(void)
 		 (long long)(int)offsetof(struct mdt_body, mbo_max_mdsize));
 	LASSERTF((int)sizeof(((struct mdt_body *)0)->mbo_max_mdsize) == 4, "found %lld\n",
 		 (long long)(int)sizeof(((struct mdt_body *)0)->mbo_max_mdsize));
-	LASSERTF((int)offsetof(struct mdt_body, mbo_max_cookiesize) == 160, "found %lld\n",
-		 (long long)(int)offsetof(struct mdt_body, mbo_max_cookiesize));
-	LASSERTF((int)sizeof(((struct mdt_body *)0)->mbo_max_cookiesize) == 4, "found %lld\n",
-		 (long long)(int)sizeof(((struct mdt_body *)0)->mbo_max_cookiesize));
+	LASSERTF((int)offsetof(struct mdt_body, mbo_unused3) == 160, "found %lld\n",
+		 (long long)(int)offsetof(struct mdt_body, mbo_unused3));
+	LASSERTF((int)sizeof(((struct mdt_body *)0)->mbo_unused3) == 4, "found %lld\n",
+		 (long long)(int)sizeof(((struct mdt_body *)0)->mbo_unused3));
 	LASSERTF((int)offsetof(struct mdt_body, mbo_uid_h) == 164, "found %lld\n",
 		 (long long)(int)offsetof(struct mdt_body, mbo_uid_h));
 	LASSERTF((int)sizeof(((struct mdt_body *)0)->mbo_uid_h) == 4, "found %lld\n",
-- 
1.7.1

* [PATCH 32/41] staging: lustre: llite: restart short read/write for normal IO
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Bobi Jam,
	Jinshan Xiong, James Simmons

From: Bobi Jam <bobijam.xu@intel.com>

If a normal IO returns a short read or write, restart the IO from the
point it reached, and repeat until EOF is reached or an error occurs.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6389
Reviewed-on: http://review.whamcloud.com/14123
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
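Note (not part of the commit): the restart behaviour described in the
commit message can be illustrated with a small userspace model. This is
only a sketch under stated assumptions, not the llite code: struct
demo_src, demo_io_once() and demo_io_generic() are hypothetical names,
the "need_restart" flag merely stands in for io->ci_need_restart, and
the model accumulates the transferred bytes across attempts. It shows
the two rules the patch relies on: keep restarting while the previous
attempt succeeded, bytes remain and a restart was requested, and return
the byte count if anything was transferred, otherwise the error code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>	/* ssize_t */

/* Hypothetical data source that hands out at most 'chunk' bytes per attempt. */
struct demo_src {
	const char *data;
	size_t len;		/* total size; reads stop here (EOF) */
	size_t chunk;		/* per-attempt limit, models a short read */
	size_t fail_pos;	/* attempt at this offset fails; SIZE_MAX = never */
};

/* One IO attempt: returns 0 or -errno, reports the bytes moved in *nob. */
static int demo_io_once(const struct demo_src *src, char *buf, size_t count,
			size_t pos, size_t *nob, int *need_restart)
{
	size_t avail, n;

	assert(src->chunk > 0);	/* a zero chunk would never make progress */
	*nob = 0;
	*need_restart = 0;

	if (pos == src->fail_pos)
		return -EIO;
	if (pos >= src->len)
		return 0;	/* EOF: nothing transferred, no restart */

	avail = src->len - pos;
	n = count < avail ? count : avail;
	if (n > src->chunk) {
		n = src->chunk;		/* short transfer ... */
		*need_restart = 1;	/* ... so ask the caller to restart */
	}
	memcpy(buf, src->data + pos, n);
	*nob = n;
	return 0;
}

/* Restart loop: continue from where the previous attempt stopped. */
static ssize_t demo_io_generic(const struct demo_src *src, char *buf,
			       size_t count, size_t *ppos)
{
	ssize_t result = 0;
	int need_restart;
	int rc;

	do {
		size_t nob;

		rc = demo_io_once(src, buf + result, count, *ppos, &nob,
				  &need_restart);
		if (nob > 0) {
			result += nob;
			count -= nob;
			*ppos += nob;
		}
	} while (rc == 0 && count > 0 && need_restart);

	/* Report progress if there was any, otherwise the error. */
	return result > 0 ? result : rc;
}
```

In this model the caller sees a short count only at EOF or when an
error interrupts a transfer that had already made progress.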
 drivers/staging/lustre/lnet/libcfs/fail.c          |    1 +
 .../staging/lustre/lustre/include/obd_support.h    |    2 +
 drivers/staging/lustre/lustre/llite/file.c         |   41 ++++++++++++--------
 drivers/staging/lustre/lustre/llite/vvp_io.c       |   19 ++++++++-
 4 files changed, 45 insertions(+), 18 deletions(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/fail.c b/drivers/staging/lustre/lnet/libcfs/fail.c
index e4b1a0a..3a9c8dd 100644
--- a/drivers/staging/lustre/lnet/libcfs/fail.c
+++ b/drivers/staging/lustre/lnet/libcfs/fail.c
@@ -113,6 +113,7 @@ int __cfs_fail_check_set(__u32 id, __u32 value, int set)
 		break;
 	case CFS_FAIL_LOC_RESET:
 		cfs_fail_loc = value;
+		atomic_set(&cfs_fail_count, 0);
 		break;
 	default:
 		LASSERTF(0, "called with bad set %u\n", set);
diff --git a/drivers/staging/lustre/lustre/include/obd_support.h b/drivers/staging/lustre/lustre/include/obd_support.h
index 1233c34..7f3f8cd 100644
--- a/drivers/staging/lustre/lustre/include/obd_support.h
+++ b/drivers/staging/lustre/lustre/include/obd_support.h
@@ -458,6 +458,8 @@ extern char obd_jobid_var[];
 #define OBD_FAIL_LOV_INIT			    0x1403
 #define OBD_FAIL_GLIMPSE_DELAY			    0x1404
 #define OBD_FAIL_LLITE_XATTR_ENOMEM		    0x1405
+#define OBD_FAIL_MAKE_LOVEA_HOLE		    0x1406
+#define OBD_FAIL_LLITE_LOST_LAYOUT		    0x1407
 #define OBD_FAIL_GETATTR_DELAY			    0x1409
 
 #define OBD_FAIL_FID_INDIR	0x1501
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 94caf4f..9bf50bf 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -972,9 +972,11 @@ ll_file_io_generic(const struct lu_env *env, struct vvp_io_args *args,
 {
 	struct ll_inode_info *lli = ll_i2info(file_inode(file));
 	struct ll_file_data  *fd  = LUSTRE_FPRIVATE(file);
+	struct vvp_io *vio = vvp_env_io(env);
 	struct range_lock range;
 	struct cl_io	 *io;
-	ssize_t	       result;
+	ssize_t result = 0;
+	int rc = 0;
 
 	CDEBUG(D_VFSTRACE, "file: %s, type: %d ppos: %llu, count: %zu\n",
 	       file->f_path.dentry->d_name.name, iot, *ppos, count);
@@ -1010,9 +1012,8 @@ restart:
 				CDEBUG(D_VFSTRACE, "Range lock [%llu, %llu]\n",
 				       range.rl_node.in_extent.start,
 				       range.rl_node.in_extent.end);
-				result = range_lock(&lli->lli_write_tree,
-						    &range);
-				if (result < 0)
+				rc = range_lock(&lli->lli_write_tree, &range);
+				if (rc < 0)
 					goto out;
 
 				range_locked = true;
@@ -1028,7 +1029,7 @@ restart:
 			LBUG();
 		}
 		ll_cl_add(file, env, io);
-		result = cl_io_loop(env, io);
+		rc = cl_io_loop(env, io);
 		ll_cl_remove(file, env);
 		if (args->via_io_subtype == IO_NORMAL)
 			up_read(&lli->lli_trunc_sem);
@@ -1040,24 +1041,26 @@ restart:
 		}
 	} else {
 		/* cl_io_rw_init() handled IO */
-		result = io->ci_result;
+		rc = io->ci_result;
 	}
 
 	if (io->ci_nob > 0) {
 		result = io->ci_nob;
+		count -= io->ci_nob;
 		*ppos = io->u.ci_wr.wr.crw_pos;
+
+		/* prepare IO restart */
+		if (count > 0 && args->via_io_subtype == IO_NORMAL)
+			args->u.normal.via_iter = vio->vui_iter;
 	}
-	goto out;
 out:
 	cl_io_fini(env, io);
-	/* If any bit been read/written (result != 0), we just return
-	 * short read/write instead of restart io.
-	 */
-	if ((result == 0 || result == -ENODATA) && io->ci_need_restart) {
-		CDEBUG(D_VFSTRACE, "Restart %s on %pD from %lld, count:%zu\n",
+
+	if ((!rc || rc == -ENODATA) && count > 0 && io->ci_need_restart) {
+		CDEBUG(D_VFSTRACE, "%s: restart %s from %lld, count:%zu, result: %zd\n",
+		       file_dentry(file)->d_name.name,
 		       iot == CIT_READ ? "read" : "write",
-		       file, *ppos, count);
-		LASSERTF(io->ci_nob == 0, "%zd\n", io->ci_nob);
+		       *ppos, count, result);
 		goto restart;
 	}
 
@@ -1070,13 +1073,19 @@ out:
 			ll_stats_ops_tally(ll_i2sbi(file_inode(file)),
 					   LPROC_LL_WRITE_BYTES, result);
 			fd->fd_write_failed = false;
-		} else if (result != -ERESTARTSYS) {
+		} else if (!result && !rc) {
+			rc = io->ci_result;
+			if (rc < 0)
+				fd->fd_write_failed = true;
+			else
+				fd->fd_write_failed = false;
+		} else if (rc != -ERESTARTSYS) {
 			fd->fd_write_failed = true;
 		}
 	}
 	CDEBUG(D_VFSTRACE, "iot: %d, result: %zd\n", iot, result);
 
-	return result;
+	return result > 0 ? result : rc;
 }
 
 static ssize_t ll_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
diff --git a/drivers/staging/lustre/lustre/llite/vvp_io.c b/drivers/staging/lustre/lustre/llite/vvp_io.c
index 8f1964f..5f93db8 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_io.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_io.c
@@ -84,9 +84,10 @@ static bool can_populate_pages(const struct lu_env *env, struct cl_io *io,
 		/* don't need lock here to check lli_layout_gen as we have held
 		 * extent lock and GROUP lock has to hold to swap layout
 		 */
-		if (ll_layout_version_get(lli) != vio->vui_layout_gen) {
+		if (ll_layout_version_get(lli) != vio->vui_layout_gen ||
+		    OBD_FAIL_CHECK_RESET(OBD_FAIL_LLITE_LOST_LAYOUT, 0)) {
 			io->ci_need_restart = 1;
-			/* this will return application a short read/write */
+			/* this will cause a short read/write */
 			io->ci_continue = 0;
 			rc = false;
 		}
@@ -960,6 +961,20 @@ static int vvp_io_write_start(const struct lu_env *env,
 
 	CDEBUG(D_VFSTRACE, "write: [%lli, %lli)\n", pos, pos + (long long)cnt);
 
+	/*
+	 * The maximum Lustre file size is variable, based on the OST maximum
+	 * object size and number of stripes.  This needs another check in
+	 * addition to the VFS checks earlier.
+	 */
+	if (pos + cnt > ll_file_maxbytes(inode)) {
+		CDEBUG(D_INODE,
+		       "%s: file " DFID " offset %llu > maxbytes %llu\n",
+		       ll_get_fsname(inode->i_sb, NULL, 0),
+		       PFID(ll_inode2fid(inode)), pos + cnt,
+		       ll_file_maxbytes(inode));
+		return -EFBIG;
+	}
+
 	if (!vio->vui_iter) {
 		/* from a temp io in ll_cl_init(). */
 		result = 0;
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 33/41] staging: lustre: lov: use obd_get_info() to get def/max LOV EA sizes
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, Jinshan Xiong, James Simmons

From: John L. Hammond <john.hammond@intel.com>

Use obd_get_info() to get the default and maximum LOV EA sizes (along
with the maximum cookiesize) from LOV. Remove the now-unused function
obd_size_diskmd() and the now-unused get_info key KEY_LOVDESC. When
computing the maximum LOV EA size, use the active OST count
(ld_active_tgt_count) rather than the total OST count (ld_tgt_count).

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5814
Reviewed-on: http://review.whamcloud.com/13695
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/obd.h        |    1 -
 drivers/staging/lustre/lustre/include/obd_class.h  |    6 ---
 drivers/staging/lustre/lustre/llite/lcommon_misc.c |   34 ++++++++++----------
 drivers/staging/lustre/lustre/llite/llite_lib.c    |   10 +++++-
 drivers/staging/lustre/lustre/lov/lov_obd.c        |   24 +++++++------
 5 files changed, 39 insertions(+), 36 deletions(-)
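
A note on the arithmetic behind KEY_MAX_EASIZE and KEY_DEFAULT_EASIZE,
as a minimal sketch only. It assumes a V3 layout EA is a fixed header
followed by one fixed-size entry per stripe. The 48-byte header, the
24-byte per-stripe entry and the cap of 2000 stripes are assumptions
made for illustration (check lustre_idl.h for the real struct sizes and
LOV_MAX_STRIPE_COUNT), and every sketch_* name is hypothetical rather
than part of this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants, assumed for this sketch only. */
#define SKETCH_MAX_STRIPE_COUNT	2000u	/* stand-in for LOV_MAX_STRIPE_COUNT */
#define SKETCH_V3_HEADER_BYTES	48u	/* assumed fixed EA header size */
#define SKETCH_STRIPE_BYTES	24u	/* assumed per-stripe entry size */

/* Size of a layout EA describing stripe_count stripes, after clamping. */
static uint32_t sketch_ea_size(uint32_t stripe_count)
{
	if (stripe_count > SKETCH_MAX_STRIPE_COUNT)
		stripe_count = SKETCH_MAX_STRIPE_COUNT;

	return SKETCH_V3_HEADER_BYTES + stripe_count * SKETCH_STRIPE_BYTES;
}

/* Maximum EA size: driven by the number of currently active targets. */
static uint32_t sketch_max_easize(uint32_t active_tgt_count)
{
	return sketch_ea_size(active_tgt_count);
}

/* Default EA size: driven by the default stripe count. */
static uint32_t sketch_def_easize(uint32_t default_stripe_count)
{
	return sketch_ea_size(default_stripe_count);
}

/* Usage: 4 active OSTs, default stripe count of 1. */
void sketch_easize_demo(void)
{
	assert(sketch_max_easize(4) == 48 + 4 * 24);
	assert(sketch_def_easize(1) == 48 + 1 * 24);
	/* Counts above the cap are clamped, so the size stops growing. */
	assert(sketch_max_easize(5000) == sketch_max_easize(2000));
}
```

Under these assumptions the maximum tracks the active target count, so
a client with inactive OSTs would report a smaller maximum than one
computed from ld_tgt_count.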

diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index 4691121..5fa5838 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -667,7 +667,6 @@ enum obd_cleanup_stage {
 #define KEY_INTERMDS	    "inter_mds"
 #define KEY_LAST_ID	     "last_id"
 #define KEY_LAST_FID		"last_fid"
-#define KEY_LOVDESC	     "lovdesc"
 #define KEY_MAX_EASIZE		"max_easize"
 #define KEY_DEFAULT_EASIZE	"default_easize"
 #define KEY_MGSSEC	      "mgssec"
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index a27dbc8..e6ae4a0 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -628,12 +628,6 @@ static inline int obd_packmd(struct obd_export *exp,
 	return rc;
 }
 
-static inline int obd_size_diskmd(struct obd_export *exp,
-				  struct lov_stripe_md *mem_src)
-{
-	return obd_packmd(exp, NULL, mem_src);
-}
-
 static inline int obd_free_diskmd(struct obd_export *exp,
 				  struct lov_mds_md **disk_tgt)
 {
diff --git a/drivers/staging/lustre/lustre/llite/lcommon_misc.c b/drivers/staging/lustre/lustre/llite/lcommon_misc.c
index 4562643..1558b55 100644
--- a/drivers/staging/lustre/lustre/llite/lcommon_misc.c
+++ b/drivers/staging/lustre/lustre/llite/lcommon_misc.c
@@ -47,29 +47,29 @@
  */
 int cl_init_ea_size(struct obd_export *md_exp, struct obd_export *dt_exp)
 {
-	struct lov_stripe_md lsm = { .lsm_magic = LOV_MAGIC_V3 };
-	__u32 valsize = sizeof(struct lov_desc);
-	int rc, easize, def_easize;
-	struct lov_desc desc;
-	__u16 stripes, def_stripes;
-
-	rc = obd_get_info(NULL, dt_exp, sizeof(KEY_LOVDESC), KEY_LOVDESC,
-			  &valsize, &desc);
+	u32 val_size, max_easize, def_easize;
+	int rc;
+
+	val_size = sizeof(max_easize);
+	rc = obd_get_info(NULL, dt_exp, sizeof(KEY_MAX_EASIZE), KEY_MAX_EASIZE,
+			  &val_size, &max_easize);
 	if (rc)
 		return rc;
 
-	stripes = min_t(__u32, desc.ld_tgt_count, LOV_MAX_STRIPE_COUNT);
-	lsm.lsm_stripe_count = stripes;
-	easize = obd_size_diskmd(dt_exp, &lsm);
-
-	def_stripes = min_t(__u32, desc.ld_default_stripe_count,
-			    LOV_MAX_STRIPE_COUNT);
-	lsm.lsm_stripe_count = def_stripes;
-	def_easize = obd_size_diskmd(dt_exp, &lsm);
+	val_size = sizeof(def_easize);
+	rc = obd_get_info(NULL, dt_exp, sizeof(KEY_DEFAULT_EASIZE),
+			  KEY_DEFAULT_EASIZE, &val_size, &def_easize);
+	if (rc)
+		return rc;
 
-	CDEBUG(D_HA, "updating def/max_easize: %d/%d\n", def_easize, easize);
+	/*
+	 * The default cookiesize is 0 because, since 2.4, the server no
+	 * longer sends llog cookies to the client.
+	 */
+	CDEBUG(D_HA, "updating def/max_easize: %u/%u\n",
+	       def_easize, max_easize);
 
-	rc = md_init_ea_size(md_exp, easize, def_easize);
+	rc = md_init_ea_size(md_exp, max_easize, def_easize);
 	return rc;
 }
 
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 25a06f8..84d5556 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -560,7 +560,15 @@ int ll_get_max_mdsize(struct ll_sb_info *sbi, int *lmmsize)
 {
 	int size, rc;
 
-	*lmmsize = obd_size_diskmd(sbi->ll_dt_exp, NULL);
+	size = sizeof(*lmmsize);
+	rc = obd_get_info(NULL, sbi->ll_dt_exp, sizeof(KEY_MAX_EASIZE),
+			  KEY_MAX_EASIZE, &size, lmmsize);
+	if (rc) {
+		CERROR("%s: cannot get max LOV EA size: rc = %d\n",
+		       sbi->ll_dt_exp->exp_obd->obd_name, rc);
+		return rc;
+	}
+
 	size = sizeof(int);
 	rc = obd_get_info(NULL, sbi->ll_md_exp, sizeof(KEY_MAX_EASIZE),
 			  KEY_MAX_EASIZE, &size, lmmsize);
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index c2a853c..473071c 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -1244,28 +1244,30 @@ static int lov_get_info(const struct lu_env *env, struct obd_export *exp,
 {
 	struct obd_device *obddev = class_exp2obd(exp);
 	struct lov_obd *lov = &obddev->u.lov;
-	int rc;
+	struct lov_desc *ld = &lov->desc;
+	int rc = 0;
 
 	if (!vallen || !val)
 		return -EFAULT;
 
 	obd_getref(obddev);
 
-	if (KEY_IS(KEY_LOVDESC)) {
-		struct lov_desc *desc_ret = val;
-		*desc_ret = lov->desc;
+	if (KEY_IS(KEY_MAX_EASIZE)) {
+		u32 max_stripe_count = min_t(u32, ld->ld_active_tgt_count,
+					     LOV_MAX_STRIPE_COUNT);
 
-		rc = 0;
-		goto out;
+		*((u32 *)val) = lov_mds_md_size(max_stripe_count, LOV_MAGIC_V3);
+	} else if (KEY_IS(KEY_DEFAULT_EASIZE)) {
+		u32 def_stripe_count = min_t(u32, ld->ld_default_stripe_count,
+					     LOV_MAX_STRIPE_COUNT);
+
+		*((u32 *)val) = lov_mds_md_size(def_stripe_count, LOV_MAGIC_V3);
 	} else if (KEY_IS(KEY_TGT_COUNT)) {
 		*((int *)val) = lov->desc.ld_tgt_count;
-		rc = 0;
-		goto out;
+	} else {
+		rc = -EINVAL;
 	}
 
-	rc = -EINVAL;
-
-out:
 	obd_putref(obddev);
 	return rc;
 }
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 34/41] staging: lustre: ldlm: cancel aged locks for LRUR
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Niu Yawei,
	James Simmons

From: Niu Yawei <yawei.niu@intel.com>

It doesn't make sense to keep very old locks, even under the LRUR
policy. This patch decreases the default ns_max_age from 10 hours to
65 minutes and changes the LRUR policy to cancel locks that have been
unused for longer than ns_max_age.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6529
Reviewed-on: http://review.whamcloud.com/14856
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/lustre_dlm.h |    2 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_request.c  |    8 ++++++++
 2 files changed, 9 insertions(+), 1 deletions(-)
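
A note on the age test, as a minimal sketch only. It assumes timestamps
are free-running unsigned tick counters that may wrap, so "after" is
decided by the sign of the difference rather than by a plain
comparison. The sketch_* names and the 64-bit tick type are
hypothetical; the patch itself uses cfs_time_after() and cfs_time_add().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical verdicts; the real ones are LDLM_POLICY_* values. */
enum sketch_verdict {
	SKETCH_CANCEL_LOCK,	/* lock is too old, cancel it outright */
	SKETCH_CHECK_LV,	/* not aged, fall through to the LV test */
};

/* True if tick a is later than tick b, even across a counter wrap. */
static bool sketch_time_after(uint64_t a, uint64_t b)
{
	return (int64_t)(b - a) < 0;
}

/* True once the lock has been unused for longer than max_age ticks. */
static bool sketch_lock_is_aged(uint64_t now, uint64_t last_used,
				uint64_t max_age)
{
	return sketch_time_after(now, last_used + max_age);
}

/* The age test runs first; only younger locks reach the LV check. */
static enum sketch_verdict sketch_lrur_age_check(uint64_t now,
						 uint64_t last_used,
						 uint64_t max_age)
{
	if (sketch_lock_is_aged(now, last_used, max_age))
		return SKETCH_CANCEL_LOCK;

	return SKETCH_CHECK_LV;
}

/* Usage: with one tick per second, 65 minutes is 3900 ticks. */
void sketch_age_demo(void)
{
	assert(sketch_lrur_age_check(3900, 0, 3900) == SKETCH_CHECK_LV);
	assert(sketch_lrur_age_check(3901, 0, 3900) == SKETCH_CANCEL_LOCK);
}
```

The comparison is strict, so a lock whose idle time equals the limit
exactly is still handed to the LV computation.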

diff --git a/drivers/staging/lustre/lustre/include/lustre_dlm.h b/drivers/staging/lustre/lustre/include/lustre_dlm.h
index d035344..1c6b7b8 100644
--- a/drivers/staging/lustre/lustre/include/lustre_dlm.h
+++ b/drivers/staging/lustre/lustre/include/lustre_dlm.h
@@ -59,7 +59,7 @@ struct obd_device;
 #define OBD_LDLM_DEVICENAME  "ldlm"
 
 #define LDLM_DEFAULT_LRU_SIZE (100 * num_online_cpus())
-#define LDLM_DEFAULT_MAX_ALIVE (cfs_time_seconds(36000))
+#define LDLM_DEFAULT_MAX_ALIVE (cfs_time_seconds(3900)) /* 65 min */
 #define LDLM_DEFAULT_PARALLEL_AST_LIMIT 1024
 
 /**
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
index 98730a3..ac1927c 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_request.c
@@ -1182,6 +1182,14 @@ static enum ldlm_policy_res ldlm_cancel_lrur_policy(struct ldlm_namespace *ns,
 	if (count && added >= count)
 		return LDLM_POLICY_KEEP_LOCK;
 
+	/*
+	 * Regardless of the LV, it doesn't make sense to keep a lock that
+	 * has been unused for longer than ns_max_age.
+	 */
+	if (cfs_time_after(cfs_time_current(),
+			   cfs_time_add(lock->l_last_used, ns->ns_max_age)))
+		return LDLM_POLICY_CANCEL_LOCK;
+
 	slv = ldlm_pool_get_slv(pl);
 	lvf = ldlm_pool_get_lvf(pl);
 	la = cfs_duration_sec(cfs_time_sub(cur, lock->l_last_used));
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 35/41] staging: lustre: hsm: Use file lease to implement migration
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Henri Doreau,
	Jinshan Xiong, James Simmons

From: Henri Doreau <henri.doreau@cea.fr>

Implement non-blocking migration based on exclusive open instead of a
group lock. Implement an exclusive close operation that atomically
puts a lease, swaps two layouts, and closes a file. This allows
race-free migrations.

Make the caller responsible for retrying on failure (EBUSY, EAGAIN)
in non-blocking mode.

In blocking mode, allow applications to trigger layout swaps using a
grouplock they already own, to prevent race conditions between the
actual data copy and the layout swap. Updated lfs accordingly. File
leases are also taken in blocking mode, so that lfs migrate can issue
a warning if an application attempts to open a file that is being
migrated and gets blocked.

Timestamps (atime/mtime) are set from userland, after the layout swap
is performed, to prevent conflicts with the grouplock.

lli_trunc_sem is taken/released in the vvp_io layer, under the DLM
lock. This re-ordering fixes the original issue between truncate and
migrate.

Signed-off-by: Henri Doreau <henri.doreau@cea.fr>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4840
Reviewed-on: http://review.whamcloud.com/10013
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: frank zago <fzago@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../lustre/lustre/include/lustre/lustre_idl.h      |    5 +-
 .../lustre/lustre/include/lustre/lustre_user.h     |    1 +
 .../lustre/lustre/include/lustre_req_layout.h      |    2 +-
 drivers/staging/lustre/lustre/llite/file.c         |  231 ++++++++++++--------
 drivers/staging/lustre/lustre/llite/llite_lib.c    |    4 -
 drivers/staging/lustre/lustre/llite/vvp_io.c       |   82 +++++---
 drivers/staging/lustre/lustre/mdc/mdc_lib.c        |   34 ++--
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |    7 +-
 drivers/staging/lustre/lustre/ptlrpc/layout.c      |   10 +-
 9 files changed, 235 insertions(+), 141 deletions(-)
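
A note on the non-blocking mode, where the caller is responsible for
retrying on EBUSY or EAGAIN. The following is a minimal sketch of what
such a caller-side loop could look like, not the lfs implementation:
the callback type, the bounded retry count and every sketch_* name are
assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical single migration attempt: returns 0 or a negative errno. */
typedef int (*sketch_migrate_fn)(void *ctx);

/*
 * Call try_once() until it succeeds, fails with an error other than
 * -EBUSY/-EAGAIN, or max_tries attempts have been made. Returns the
 * result of the last attempt.
 */
static int sketch_migrate_retry(sketch_migrate_fn try_once, void *ctx,
				int max_tries)
{
	int rc = -EINVAL;
	int i;

	assert(max_tries > 0);

	for (i = 0; i < max_tries; i++) {
		rc = try_once(ctx);
		if (rc != -EBUSY && rc != -EAGAIN)
			break;	/* success, or an error not worth retrying */
	}

	return rc;
}

/* Test stub: reports -EBUSY on the first two calls, then succeeds. */
static int sketch_busy_twice(void *ctx)
{
	int *calls = ctx;

	return ++(*calls) <= 2 ? -EBUSY : 0;
}

/* Usage: the third attempt succeeds, so five tries are enough. */
int sketch_retry_demo(void)
{
	int calls = 0;
	int rc = sketch_migrate_retry(sketch_busy_twice, &calls, 5);

	assert(rc == 0 && calls == 3);
	return rc;
}
```

A real caller would probably also want a delay between attempts; that
is left out here to keep the sketch short.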

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index 5d2f845..7de8098 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -1703,7 +1703,9 @@ lov_mds_md_max_stripe_count(size_t buf_size, __u32 lmm_magic)
 /*	OBD_MD_FLRMTRGETFACL (0x0008000000000000ULL) lfs rgetfacl, obsolete */
 
 #define OBD_MD_FLDATAVERSION (0x0010000000000000ULL) /* iversion sum */
-#define OBD_MD_FLRELEASED    (0x0020000000000000ULL) /* file released */
+#define OBD_MD_CLOSE_INTENT_EXECED (0x0020000000000000ULL) /* close intent
+							    * executed
+							    */
 
 #define OBD_MD_DEFAULT_MEA   (0x0040000000000000ULL) /* default MEA */
 
@@ -2235,6 +2237,7 @@ enum mds_op_bias {
 	MDS_OWNEROVERRIDE	= 1 << 11,
 	MDS_HSM_RELEASE		= 1 << 12,
 	MDS_RENAME_MIGRATE	= BIT(13),
+	MDS_CLOSE_LAYOUT_SWAP   = BIT(14),
 };
 
 /* instance of mdt_reint_rec */
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index 856e2f9..579ef14 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -650,6 +650,7 @@ struct if_quotactl {
 #define SWAP_LAYOUTS_CHECK_DV2		(1 << 1)
 #define SWAP_LAYOUTS_KEEP_MTIME		(1 << 2)
 #define SWAP_LAYOUTS_KEEP_ATIME		(1 << 3)
+#define SWAP_LAYOUTS_CLOSE		BIT(4)
 
 /* Swap XATTR_NAME_HSM as well, only on the MDT so far */
 #define SWAP_LAYOUTS_MDS_HSM		(1 << 31)
diff --git a/drivers/staging/lustre/lustre/include/lustre_req_layout.h b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
index 78857b3..7657132 100644
--- a/drivers/staging/lustre/lustre/include/lustre_req_layout.h
+++ b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
@@ -148,7 +148,7 @@ extern struct req_format RQF_MDS_GETATTR;
  */
 extern struct req_format RQF_MDS_GETATTR_NAME;
 extern struct req_format RQF_MDS_CLOSE;
-extern struct req_format RQF_MDS_RELEASE_CLOSE;
+extern struct req_format RQF_MDS_INTENT_CLOSE;
 extern struct req_format RQF_MDS_CONNECT;
 extern struct req_format RQF_MDS_DISCONNECT;
 extern struct req_format RQF_MDS_GET_INFO;
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 9bf50bf..b9cadd9 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -113,10 +113,19 @@ out:
 			   0, 0, LUSTRE_OPC_ANY, NULL);
 }
 
+/**
+ * Perform a close, possibly with a bias.
+ * The meaning of "data" depends on the value of "bias".
+ *
+ * If \a bias is MDS_HSM_RELEASE then \a data is a pointer to the data version.
+ * If \a bias is MDS_CLOSE_LAYOUT_SWAP then \a data is a pointer to the inode to
+ * swap layouts with.
+ */
 static int ll_close_inode_openhandle(struct obd_export *md_exp,
-				     struct inode *inode,
 				     struct obd_client_handle *och,
-				     const __u64 *data_version)
+				     struct inode *inode,
+				     enum mds_op_bias bias,
+				     void *data)
 {
 	struct obd_export *exp = ll_i2mdexp(inode);
 	struct md_op_data *op_data;
@@ -143,12 +152,26 @@ static int ll_close_inode_openhandle(struct obd_export *md_exp,
 	}
 
 	ll_prepare_close(inode, op_data, och);
-	if (data_version) {
-		/* Pass in data_version implies release. */
+	switch (bias) {
+	case MDS_CLOSE_LAYOUT_SWAP:
+		LASSERT(data);
+		op_data->op_bias |= MDS_CLOSE_LAYOUT_SWAP;
+		op_data->op_data_version = 0;
+		op_data->op_lease_handle = och->och_lease_handle;
+		op_data->op_fid2 = *ll_inode2fid(data);
+		break;
+
+	case MDS_HSM_RELEASE:
+		LASSERT(data);
 		op_data->op_bias |= MDS_HSM_RELEASE;
-		op_data->op_data_version = *data_version;
+		op_data->op_data_version = *(__u64 *)data;
 		op_data->op_lease_handle = och->och_lease_handle;
 		op_data->op_attr.ia_valid |= ATTR_SIZE | ATTR_BLOCKS;
+		break;
+
+	default:
+		LASSERT(!data);
+		break;
 	}
 
 	rc = md_close(md_exp, op_data, och->och_mod, &req);
@@ -169,11 +192,12 @@ static int ll_close_inode_openhandle(struct obd_export *md_exp,
 		spin_unlock(&lli->lli_lock);
 	}
 
-	if (rc == 0 && op_data->op_bias & MDS_HSM_RELEASE) {
+	if (op_data->op_bias & (MDS_HSM_RELEASE | MDS_CLOSE_LAYOUT_SWAP) &&
+	    !rc) {
 		struct mdt_body *body;
 
 		body = req_capsule_server_get(&req->rq_pill, &RMF_MDT_BODY);
-		if (!(body->mbo_valid & OBD_MD_FLRELEASED))
+		if (!(body->mbo_valid & OBD_MD_CLOSE_INTENT_EXECED))
 			rc = -EBUSY;
 	}
 
@@ -227,7 +251,7 @@ int ll_md_real_close(struct inode *inode, fmode_t fmode)
 		 * be closed.
 		 */
 		rc = ll_close_inode_openhandle(ll_i2sbi(inode)->ll_md_exp,
-					       inode, och, NULL);
+					       och, inode, 0, NULL);
 	}
 
 	return rc;
@@ -263,7 +287,8 @@ static int ll_md_close(struct obd_export *md_exp, struct inode *inode,
 	}
 
 	if (fd->fd_och) {
-		rc = ll_close_inode_openhandle(md_exp, inode, fd->fd_och, NULL);
+		rc = ll_close_inode_openhandle(md_exp, fd->fd_och, inode, 0,
+					       NULL);
 		fd->fd_och = NULL;
 		goto out;
 	}
@@ -816,7 +841,7 @@ out_close:
 		it.it_lock_mode = 0;
 		och->och_lease_handle.cookie = 0ULL;
 	}
-	rc2 = ll_close_inode_openhandle(sbi->ll_md_exp, inode, och, NULL);
+	rc2 = ll_close_inode_openhandle(sbi->ll_md_exp, och, inode, 0, NULL);
 	if (rc2 < 0)
 		CERROR("%s: error closing file "DFID": %d\n",
 		       ll_get_fsname(inode->i_sb, NULL, 0),
@@ -830,6 +855,69 @@ out:
 }
 
 /**
+ * Check whether a layout swap can be done between two inodes.
+ *
+ * \param[in] inode1  First inode to check
+ * \param[in] inode2  Second inode to check
+ *
+ * \retval 0 on success, layout swap can be performed between both inodes
+ * \retval negative error code if requirements are not met
+ */
+static int ll_check_swap_layouts_validity(struct inode *inode1,
+					  struct inode *inode2)
+{
+	if (!S_ISREG(inode1->i_mode) || !S_ISREG(inode2->i_mode))
+		return -EINVAL;
+
+	if (inode_permission(inode1, MAY_WRITE) ||
+	    inode_permission(inode2, MAY_WRITE))
+		return -EPERM;
+
+	if (inode1->i_sb != inode2->i_sb)
+		return -EXDEV;
+
+	return 0;
+}
+
+static int ll_swap_layouts_close(struct obd_client_handle *och,
+				 struct inode *inode, struct inode *inode2)
+{
+	const struct lu_fid *fid1 = ll_inode2fid(inode);
+	const struct lu_fid *fid2;
+	int rc;
+
+	CDEBUG(D_INODE, "%s: biased close of file " DFID "\n",
+	       ll_get_fsname(inode->i_sb, NULL, 0), PFID(fid1));
+
+	rc = ll_check_swap_layouts_validity(inode, inode2);
+	if (rc < 0)
+		goto out_free_och;
+
+	/* We now know that inode2 is a lustre inode */
+	fid2 = ll_inode2fid(inode2);
+
+	rc = lu_fid_cmp(fid1, fid2);
+	if (!rc) {
+		rc = -EINVAL;
+		goto out_free_och;
+	}
+
+	/*
+	 * Close the file and swap layouts between inode & inode2.
+	 * NB: lease lock handle is released in mdc_intent_close_pack()
+	 * because we still need it to pack l_remote_handle to MDT.
+	 */
+	rc = ll_close_inode_openhandle(ll_i2sbi(inode)->ll_md_exp, och, inode,
+				       MDS_CLOSE_LAYOUT_SWAP, inode2);
+
+	och = NULL; /* freed in ll_close_inode_openhandle() */
+
+out_free_och:
+	kfree(och);
+	return rc;
+}
+
+/**
  * Release lease and close the file.
  * It will check if the lease has ever broken.
  */
@@ -856,7 +944,7 @@ static int ll_lease_close(struct obd_client_handle *och, struct inode *inode,
 		*lease_broken = cancelled;
 
 	return ll_close_inode_openhandle(ll_i2sbi(inode)->ll_md_exp,
-					 inode, och, NULL);
+					 och, inode, 0, NULL);
 }
 
 int ll_merge_attr(const struct lu_env *env, struct inode *inode)
@@ -1018,7 +1106,6 @@ restart:
 
 				range_locked = true;
 			}
-			down_read(&lli->lli_trunc_sem);
 			break;
 		case IO_SPLICE:
 			vio->u.splice.vui_pipe = args->u.splice.via_pipe;
@@ -1031,8 +1118,6 @@ restart:
 		ll_cl_add(file, env, io);
 		rc = cl_io_loop(env, io);
 		ll_cl_remove(file, env);
-		if (args->via_io_subtype == IO_NORMAL)
-			up_read(&lli->lli_trunc_sem);
 		if (range_locked) {
 			CDEBUG(D_VFSTRACE, "Range unlock [%llu, %llu]\n",
 			       range.rl_node.in_extent.start,
@@ -1454,7 +1539,7 @@ int ll_release_openhandle(struct inode *inode, struct lookup_intent *it)
 	ll_och_fill(ll_i2sbi(inode)->ll_md_exp, it, och);
 
 	rc = ll_close_inode_openhandle(ll_i2sbi(inode)->ll_md_exp,
-				       inode, och, NULL);
+				       och, inode, 0, NULL);
 out:
 	/* this one is in place of ll_file_open */
 	if (it_disposition(it, DISP_ENQ_OPEN_REF)) {
@@ -1657,8 +1742,8 @@ int ll_hsm_release(struct inode *inode)
 	 * NB: lease lock handle is released in mdc_hsm_release_pack() because
 	 * we still need it to pack l_remote_handle to MDT.
 	 */
-	rc = ll_close_inode_openhandle(ll_i2sbi(inode)->ll_md_exp, inode, och,
-				       &data_version);
+	rc = ll_close_inode_openhandle(ll_i2sbi(inode)->ll_md_exp, och, inode,
+				       MDS_HSM_RELEASE, &data_version);
 	och = NULL;
 
 out:
@@ -1669,10 +1754,12 @@ out:
 }
 
 struct ll_swap_stack {
-	struct iattr		 ia1, ia2;
-	__u64			 dv1, dv2;
-	struct inode		*inode1, *inode2;
-	bool			 check_dv1, check_dv2;
+	u64		dv1;
+	u64		dv2;
+	struct inode   *inode1;
+	struct inode   *inode2;
+	bool		check_dv1;
+	bool		check_dv2;
 };
 
 static int ll_swap_layouts(struct file *file1, struct file *file2,
@@ -1692,21 +1779,9 @@ static int ll_swap_layouts(struct file *file1, struct file *file2,
 	llss->inode1 = file_inode(file1);
 	llss->inode2 = file_inode(file2);
 
-	if (!S_ISREG(llss->inode2->i_mode)) {
-		rc = -EINVAL;
-		goto free;
-	}
-
-	if (inode_permission(llss->inode1, MAY_WRITE) ||
-	    inode_permission(llss->inode2, MAY_WRITE)) {
-		rc = -EPERM;
-		goto free;
-	}
-
-	if (llss->inode2->i_sb != llss->inode1->i_sb) {
-		rc = -EXDEV;
+	rc = ll_check_swap_layouts_validity(llss->inode1, llss->inode2);
+	if (rc < 0)
 		goto free;
-	}
 
 	/* we use 2 bool because it is easier to swap than 2 bits */
 	if (lsl->sl_flags & SWAP_LAYOUTS_CHECK_DV1)
@@ -1720,10 +1795,8 @@ static int ll_swap_layouts(struct file *file1, struct file *file2,
 	llss->dv2 = lsl->sl_dv2;
 
 	rc = lu_fid_cmp(ll_inode2fid(llss->inode1), ll_inode2fid(llss->inode2));
-	if (rc == 0) /* same file, done! */ {
-		rc = 0;
+	if (!rc) /* same file, done! */
 		goto free;
-	}
 
 	if (rc < 0) { /* sequentialize it */
 		swap(llss->inode1, llss->inode2);
@@ -1745,19 +1818,6 @@ static int ll_swap_layouts(struct file *file1, struct file *file2,
 		}
 	}
 
-	/* to be able to restore mtime and atime after swap
-	 * we need to first save them
-	 */
-	if (lsl->sl_flags &
-	    (SWAP_LAYOUTS_KEEP_MTIME | SWAP_LAYOUTS_KEEP_ATIME)) {
-		llss->ia1.ia_mtime = llss->inode1->i_mtime;
-		llss->ia1.ia_atime = llss->inode1->i_atime;
-		llss->ia1.ia_valid = ATTR_MTIME | ATTR_ATIME;
-		llss->ia2.ia_mtime = llss->inode2->i_mtime;
-		llss->ia2.ia_atime = llss->inode2->i_atime;
-		llss->ia2.ia_valid = ATTR_MTIME | ATTR_ATIME;
-	}
-
 	/* ultimate check, before swapping the layouts we check if
 	 * dataversion has changed (if requested)
 	 */
@@ -1807,39 +1867,6 @@ putgl:
 		ll_put_grouplock(llss->inode1, file1, gid);
 	}
 
-	/* rc can be set from obd_iocontrol() or from a GOTO(putgl, ...) */
-	if (rc != 0)
-		goto free;
-
-	/* clear useless flags */
-	if (!(lsl->sl_flags & SWAP_LAYOUTS_KEEP_MTIME)) {
-		llss->ia1.ia_valid &= ~ATTR_MTIME;
-		llss->ia2.ia_valid &= ~ATTR_MTIME;
-	}
-
-	if (!(lsl->sl_flags & SWAP_LAYOUTS_KEEP_ATIME)) {
-		llss->ia1.ia_valid &= ~ATTR_ATIME;
-		llss->ia2.ia_valid &= ~ATTR_ATIME;
-	}
-
-	/* update time if requested */
-	rc = 0;
-	if (llss->ia2.ia_valid != 0) {
-		inode_lock(llss->inode1);
-		rc = ll_setattr(file1->f_path.dentry, &llss->ia2);
-		inode_unlock(llss->inode1);
-	}
-
-	if (llss->ia1.ia_valid != 0) {
-		int rc1;
-
-		inode_lock(llss->inode2);
-		rc1 = ll_setattr(file2->f_path.dentry, &llss->ia1);
-		inode_unlock(llss->inode2);
-		if (rc == 0)
-			rc = rc1;
-	}
-
 free:
 	kfree(llss);
 
@@ -1996,16 +2023,46 @@ ll_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 				   sizeof(struct lustre_swap_layouts)))
 			return -EFAULT;
 
-		if ((file->f_flags & O_ACCMODE) == 0) /* O_RDONLY */
+		if ((file->f_flags & O_ACCMODE) == O_RDONLY)
 			return -EPERM;
 
 		file2 = fget(lsl.sl_fd);
 		if (!file2)
 			return -EBADF;
 
-		rc = -EPERM;
-		if ((file2->f_flags & O_ACCMODE) != 0) /* O_WRONLY or O_RDWR */
+		/* file2 must be opened O_WRONLY or O_RDWR */
+		if ((file2->f_flags & O_ACCMODE) == O_RDONLY) {
+			rc = -EPERM;
+			goto out;
+		}
+
+		if (lsl.sl_flags & SWAP_LAYOUTS_CLOSE) {
+			struct obd_client_handle *och = NULL;
+			struct ll_inode_info *lli;
+			struct inode *inode2;
+
+			if (lsl.sl_flags != SWAP_LAYOUTS_CLOSE) {
+				rc = -EINVAL;
+				goto out;
+			}
+
+			lli = ll_i2info(inode);
+			mutex_lock(&lli->lli_och_mutex);
+			if (fd->fd_lease_och) {
+				och = fd->fd_lease_och;
+				fd->fd_lease_och = NULL;
+			}
+			mutex_unlock(&lli->lli_och_mutex);
+			if (!och) {
+				rc = -ENOLCK;
+				goto out;
+			}
+			inode2 = file_inode(file2);
+			rc = ll_swap_layouts_close(och, inode, inode2);
+		} else {
 			rc = ll_swap_layouts(file, file2, &lsl);
+		}
+out:
 		fput(file2);
 		return rc;
 	}
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 84d5556..dbb7bd7 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -1524,11 +1524,7 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import)
 		 * setting times to past, but it is necessary due to possible
 		 * time de-synchronization between MDT inode and OST objects
 		 */
-		if (attr->ia_valid & ATTR_SIZE)
-			down_write(&lli->lli_trunc_sem);
 		rc = cl_setattr_ost(ll_i2info(inode)->lli_clob, attr, 0);
-		if (attr->ia_valid & ATTR_SIZE)
-			up_write(&lli->lli_trunc_sem);
 	}
 out:
 	if (op_data)
diff --git a/drivers/staging/lustre/lustre/llite/vvp_io.c b/drivers/staging/lustre/lustre/llite/vvp_io.c
index 5f93db8..b43e3a3 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_io.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_io.c
@@ -608,14 +608,6 @@ static int vvp_do_vmtruncate(struct inode *inode, size_t size)
 	return result;
 }
 
-static int vvp_io_setattr_trunc(const struct lu_env *env,
-				const struct cl_io_slice *ios,
-				struct inode *inode, loff_t size)
-{
-	inode_dio_wait(inode);
-	return 0;
-}
-
 static int vvp_io_setattr_time(const struct lu_env *env,
 			       const struct cl_io_slice *ios)
 {
@@ -646,15 +638,20 @@ static int vvp_io_setattr_start(const struct lu_env *env,
 {
 	struct cl_io	*io    = ios->cis_io;
 	struct inode	*inode = vvp_object_inode(io->ci_obj);
-	int result = 0;
+	struct ll_inode_info *lli = ll_i2info(inode);
 
-	inode_lock(inode);
-	if (cl_io_is_trunc(io))
-		result = vvp_io_setattr_trunc(env, ios, inode,
-					io->u.ci_setattr.sa_attr.lvb_size);
-	if (!result && io->u.ci_setattr.sa_valid & TIMES_SET_FLAGS)
-		result = vvp_io_setattr_time(env, ios);
-	return result;
+	if (cl_io_is_trunc(io)) {
+		down_write(&lli->lli_trunc_sem);
+		inode_lock(inode);
+		inode_dio_wait(inode);
+	} else {
+		inode_lock(inode);
+	}
+
+	if (io->u.ci_setattr.sa_valid & TIMES_SET_FLAGS)
+		return vvp_io_setattr_time(env, ios);
+
+	return 0;
 }
 
 static void vvp_io_setattr_end(const struct lu_env *env,
@@ -662,14 +659,18 @@ static void vvp_io_setattr_end(const struct lu_env *env,
 {
 	struct cl_io *io    = ios->cis_io;
 	struct inode *inode = vvp_object_inode(io->ci_obj);
+	struct ll_inode_info *lli = ll_i2info(inode);
 
-	if (cl_io_is_trunc(io))
+	if (cl_io_is_trunc(io)) {
 		/* Truncate in memory pages - they must be clean pages
 		 * because osc has already notified to destroy osc_extents.
 		 */
 		vvp_do_vmtruncate(inode, io->u.ci_setattr.sa_attr.lvb_size);
-
-	inode_unlock(inode);
+		inode_unlock(inode);
+		up_write(&lli->lli_trunc_sem);
+	} else {
+		inode_unlock(inode);
+	}
 }
 
 static void vvp_io_setattr_fini(const struct lu_env *env,
@@ -685,6 +686,7 @@ static int vvp_io_read_start(const struct lu_env *env,
 	struct cl_io      *io    = ios->cis_io;
 	struct cl_object  *obj   = io->ci_obj;
 	struct inode      *inode = vvp_object_inode(obj);
+	struct ll_inode_info *lli = ll_i2info(inode);
 	struct file       *file  = vio->vui_fd->fd_file;
 
 	int     result;
@@ -697,6 +699,9 @@ static int vvp_io_read_start(const struct lu_env *env,
 
 	CDEBUG(D_VFSTRACE, "read: -> [%lli, %lli)\n", pos, pos + cnt);
 
+	if (vio->vui_io_subtype == IO_NORMAL)
+		down_read(&lli->lli_trunc_sem);
+
 	if (!can_populate_pages(env, io, inode))
 		return 0;
 
@@ -939,10 +944,14 @@ static int vvp_io_write_start(const struct lu_env *env,
 	struct cl_io       *io    = ios->cis_io;
 	struct cl_object   *obj   = io->ci_obj;
 	struct inode       *inode = vvp_object_inode(obj);
+	struct ll_inode_info *lli = ll_i2info(inode);
 	ssize_t result = 0;
 	loff_t pos = io->u.ci_wr.wr.crw_pos;
 	size_t cnt = io->u.ci_wr.wr.crw_count;
 
+	if (vio->vui_io_subtype == IO_NORMAL)
+		down_read(&lli->lli_trunc_sem);
+
 	if (!can_populate_pages(env, io, inode))
 		return 0;
 
@@ -1026,6 +1035,17 @@ static int vvp_io_write_start(const struct lu_env *env,
 	return result;
 }
 
+static void vvp_io_rw_end(const struct lu_env *env,
+			  const struct cl_io_slice *ios)
+{
+	struct vvp_io *vio = cl2vvp_io(env, ios);
+	struct inode *inode = vvp_object_inode(ios->cis_obj);
+	struct ll_inode_info *lli = ll_i2info(inode);
+
+	if (vio->vui_io_subtype == IO_NORMAL)
+		up_read(&lli->lli_trunc_sem);
+}
+
 static int vvp_io_kernel_fault(struct vvp_fault_io *cfio)
 {
 	struct vm_fault *vmf = cfio->ft_vmf;
@@ -1078,6 +1098,7 @@ static int vvp_io_fault_start(const struct lu_env *env,
 	struct cl_io	*io      = ios->cis_io;
 	struct cl_object    *obj     = io->ci_obj;
 	struct inode        *inode   = vvp_object_inode(obj);
+	struct ll_inode_info *lli = ll_i2info(inode);
 	struct cl_fault_io  *fio     = &io->u.ci_fault;
 	struct vvp_fault_io *cfio    = &vio->u.fault;
 	loff_t	       offset;
@@ -1093,6 +1114,8 @@ static int vvp_io_fault_start(const struct lu_env *env,
 		      " changed while waiting for the page fault lock\n",
 		      PFID(lu_object_fid(&obj->co_lu)));
 
+	down_read(&lli->lli_trunc_sem);
+
 	/* offset of the last byte on the page */
 	offset = cl_offset(obj, fio->ft_index + 1) - 1;
 	LASSERT(cl_index(obj, offset) == fio->ft_index);
@@ -1240,6 +1263,17 @@ out:
 	return result;
 }
 
+static void vvp_io_fault_end(const struct lu_env *env,
+			     const struct cl_io_slice *ios)
+{
+	struct inode *inode = vvp_object_inode(ios->cis_obj);
+	struct ll_inode_info *lli = ll_i2info(inode);
+
+	CLOBINVRNT(env, ios->cis_io->ci_obj,
+		   vvp_object_invariant(ios->cis_io->ci_obj));
+	up_read(&lli->lli_trunc_sem);
+}
+
 static int vvp_io_fsync_start(const struct lu_env *env,
 			      const struct cl_io_slice *ios)
 {
@@ -1269,18 +1303,13 @@ static int vvp_io_read_ahead(const struct lu_env *env,
 	return result;
 }
 
-static void vvp_io_end(const struct lu_env *env, const struct cl_io_slice *ios)
-{
-	CLOBINVRNT(env, ios->cis_io->ci_obj,
-		   vvp_object_invariant(ios->cis_io->ci_obj));
-}
-
 static const struct cl_io_operations vvp_io_ops = {
 	.op = {
 		[CIT_READ] = {
 			.cio_fini	= vvp_io_fini,
 			.cio_lock      = vvp_io_read_lock,
 			.cio_start     = vvp_io_read_start,
+			.cio_end	= vvp_io_rw_end,
 			.cio_advance	= vvp_io_advance,
 		},
 		[CIT_WRITE] = {
@@ -1289,6 +1318,7 @@ static const struct cl_io_operations vvp_io_ops = {
 			.cio_iter_fini = vvp_io_write_iter_fini,
 			.cio_lock      = vvp_io_write_lock,
 			.cio_start     = vvp_io_write_start,
+			.cio_end	= vvp_io_rw_end,
 			.cio_advance   = vvp_io_advance,
 		},
 		[CIT_SETATTR] = {
@@ -1303,7 +1333,7 @@ static const struct cl_io_operations vvp_io_ops = {
 			.cio_iter_init = vvp_io_fault_iter_init,
 			.cio_lock      = vvp_io_fault_lock,
 			.cio_start     = vvp_io_fault_start,
-			.cio_end       = vvp_io_end,
+			.cio_end       = vvp_io_fault_end,
 		},
 		[CIT_FSYNC] = {
 			.cio_start  = vvp_io_fsync_start,
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_lib.c b/drivers/staging/lustre/lustre/mdc/mdc_lib.c
index 1925072..c1990f0 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_lib.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_lib.c
@@ -430,25 +430,29 @@ void mdc_getattr_pack(struct ptlrpc_request *req, __u64 valid, u32 flags,
 			      op_data->op_namelen);
 }
 
-static void mdc_hsm_release_pack(struct ptlrpc_request *req,
-				 struct md_op_data *op_data)
+static void mdc_intent_close_pack(struct ptlrpc_request *req,
+				  struct md_op_data *op_data)
 {
-	if (op_data->op_bias & MDS_HSM_RELEASE) {
-		struct close_data *data;
-		struct ldlm_lock *lock;
+	enum mds_op_bias bias = op_data->op_bias;
+	struct close_data *data;
+	struct ldlm_lock *lock;
 
-		data = req_capsule_client_get(&req->rq_pill, &RMF_CLOSE_DATA);
+	if (!(bias & (MDS_HSM_RELEASE | MDS_CLOSE_LAYOUT_SWAP |
+		      MDS_RENAME_MIGRATE)))
+		return;
 
-		lock = ldlm_handle2lock(&op_data->op_lease_handle);
-		if (lock) {
-			data->cd_handle = lock->l_remote_handle;
-			LDLM_LOCK_PUT(lock);
-		}
-		ldlm_cli_cancel(&op_data->op_lease_handle, LCF_LOCAL);
+	data = req_capsule_client_get(&req->rq_pill, &RMF_CLOSE_DATA);
+	LASSERT(data);
 
-		data->cd_data_version = op_data->op_data_version;
-		data->cd_fid = op_data->op_fid2;
+	lock = ldlm_handle2lock(&op_data->op_lease_handle);
+	if (lock) {
+		data->cd_handle = lock->l_remote_handle;
+		LDLM_LOCK_PUT(lock);
 	}
+	ldlm_cli_cancel(&op_data->op_lease_handle, LCF_LOCAL);
+
+	data->cd_data_version = op_data->op_data_version;
+	data->cd_fid = op_data->op_fid2;
 }
 
 void mdc_close_pack(struct ptlrpc_request *req, struct md_op_data *op_data)
@@ -473,5 +477,5 @@ void mdc_close_pack(struct ptlrpc_request *req, struct md_op_data *op_data)
 		rec->sa_valid &= ~MDS_ATTR_ATIME;
 
 	mdc_ioepoch_pack(epoch, op_data);
-	mdc_hsm_release_pack(req, op_data);
+	mdc_intent_close_pack(req, op_data);
 }
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index 34ccff8..ac04bf3 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -720,9 +720,8 @@ static int mdc_close(struct obd_export *exp, struct md_op_data *op_data,
 	int                    rc;
 	int		       saved_rc = 0;
 
-	req_fmt = &RQF_MDS_CLOSE;
 	if (op_data->op_bias & MDS_HSM_RELEASE) {
-		req_fmt = &RQF_MDS_RELEASE_CLOSE;
+		req_fmt = &RQF_MDS_INTENT_CLOSE;
 
 		/* allocate a FID for volatile file */
 		rc = mdc_fid_alloc(NULL, exp, &op_data->op_fid2, op_data);
@@ -732,6 +731,10 @@ static int mdc_close(struct obd_export *exp, struct md_op_data *op_data,
 			/* save the errcode and proceed to close */
 			saved_rc = rc;
 		}
+	} else if (op_data->op_bias & MDS_CLOSE_LAYOUT_SWAP) {
+		req_fmt = &RQF_MDS_INTENT_CLOSE;
+	} else {
+		req_fmt = &RQF_MDS_CLOSE;
 	}
 
 	*request = NULL;
diff --git a/drivers/staging/lustre/lustre/ptlrpc/layout.c b/drivers/staging/lustre/lustre/ptlrpc/layout.c
index f0e0448..31aa58e 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/layout.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/layout.c
@@ -121,7 +121,7 @@ static const struct req_msg_field *mdt_close_client[] = {
 	&RMF_CAPA1
 };
 
-static const struct req_msg_field *mdt_release_close_client[] = {
+static const struct req_msg_field *mdt_intent_close_client[] = {
 	&RMF_PTLRPC_BODY,
 	&RMF_MDT_EPOCH,
 	&RMF_REC_REINT,
@@ -666,7 +666,7 @@ static struct req_format *req_formats[] = {
 	&RQF_MDS_GETXATTR,
 	&RQF_MDS_SYNC,
 	&RQF_MDS_CLOSE,
-	&RQF_MDS_RELEASE_CLOSE,
+	&RQF_MDS_INTENT_CLOSE,
 	&RQF_MDS_READPAGE,
 	&RQF_MDS_WRITEPAGE,
 	&RQF_MDS_REINT,
@@ -1365,10 +1365,10 @@ struct req_format RQF_MDS_CLOSE =
 			mdt_close_client, mds_last_unlink_server);
 EXPORT_SYMBOL(RQF_MDS_CLOSE);
 
-struct req_format RQF_MDS_RELEASE_CLOSE =
+struct req_format RQF_MDS_INTENT_CLOSE =
 	DEFINE_REQ_FMT0("MDS_CLOSE",
-			mdt_release_close_client, mds_last_unlink_server);
-EXPORT_SYMBOL(RQF_MDS_RELEASE_CLOSE);
+			mdt_intent_close_client, mds_last_unlink_server);
+EXPORT_SYMBOL(RQF_MDS_INTENT_CLOSE);
 
 struct req_format RQF_MDS_READPAGE =
 	DEFINE_REQ_FMT0("MDS_READPAGE",
-- 
1.7.1


* [lustre-devel] [PATCH 35/41] staging: lustre: hsm: Use file lease to implement migration
@ 2016-10-03  2:28   ` James Simmons
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List, Henri Doreau,
	Jinshan Xiong, James Simmons

From: Henri Doreau <henri.doreau@cea.fr>

Implement non-blocking migration based on exclusive open (a file
lease) instead of a group lock. Add an exclusive close operation that
atomically releases the lease, swaps the two layouts and closes the
file. This allows race-free migrations.
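
As an illustration only, the checks that the LL_IOC_LOV_SWAP_LAYOUTS
handler in this patch applies before choosing between the plain swap
and the new close-and-swap path can be modelled in userspace roughly
as below. The function name swap_layouts_pick_path() and the enum are
hypothetical; the flag values and error codes are taken from the diff.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>

/* Flag values as defined in lustre_user.h by this patch. */
#define SWAP_LAYOUTS_CHECK_DV2		(1U << 1)
#define SWAP_LAYOUTS_KEEP_MTIME		(1U << 2)
#define SWAP_LAYOUTS_KEEP_ATIME		(1U << 3)
#define SWAP_LAYOUTS_CLOSE		(1U << 4)

/* Hypothetical names for the two code paths of the ioctl. */
enum swap_path {
	SWAP_PLAIN	= 0,	/* ll_swap_layouts() */
	SWAP_ON_CLOSE	= 1,	/* ll_swap_layouts_close() */
};

/*
 * Userspace model of the ioctl checks: returns a negative errno or
 * the path that would be taken. Not the kernel implementation.
 */
static int swap_layouts_pick_path(unsigned int sl_flags, int f_flags1,
				  int f_flags2, bool has_lease)
{
	/* Both files must be open for writing. */
	if ((f_flags1 & O_ACCMODE) == O_RDONLY ||
	    (f_flags2 & O_ACCMODE) == O_RDONLY)
		return -EPERM;

	if (!(sl_flags & SWAP_LAYOUTS_CLOSE))
		return SWAP_PLAIN;

	/* SWAP_LAYOUTS_CLOSE cannot be combined with other flags. */
	if (sl_flags != SWAP_LAYOUTS_CLOSE)
		return -EINVAL;

	/* The close-and-swap path needs a lease on the first file. */
	if (!has_lease)
		return -ENOLCK;

	return SWAP_ON_CLOSE;
}
```

In this model, SWAP_LAYOUTS_CLOSE combined with any other flag yields
-EINVAL, and a missing lease yields -ENOLCK, which matches the order
of the checks in the ioctl hunk.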

Make the caller responsible for retrying on failure (EBUSY, EAGAIN)
in non-blocking mode.
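
A caller-side retry loop could look roughly like the sketch below.
This is not the lfs code; retry_on_busy(), migrate_step_fn and
busy_then_ok() are hypothetical names, and the step callback stands in
for one complete non-blocking migration attempt.

```c
#include <errno.h>

/* Hypothetical callback: one migration attempt, 0 or negative errno. */
typedef int (*migrate_step_fn)(void *ctx);

/*
 * Retry a non-blocking attempt while it reports -EBUSY or -EAGAIN,
 * up to max_tries attempts. Any other result is returned at once.
 */
static int retry_on_busy(migrate_step_fn step, void *ctx, int max_tries)
{
	int rc = -EINVAL;	/* returned if max_tries <= 0 */
	int i;

	for (i = 0; i < max_tries; i++) {
		rc = step(ctx);
		if (rc != -EBUSY && rc != -EAGAIN)
			break;
	}
	return rc;
}

/* Example step: fails with -EBUSY until *remaining drops to zero. */
static int busy_then_ok(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -EBUSY;
	}
	return 0;
}
```

A real caller would probably also sleep or back off between attempts;
that policy is left out here because the patch does not specify one.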

In blocking mode, allow applications to trigger layout swaps using a
grouplock they already own, to prevent race conditions between the
actual data copy and the layout swap. Update lfs accordingly. File
leases are also taken in blocking mode, so that lfs migrate can issue
a warning if an application attempts to open a file that is being
migrated and gets blocked.

Timestamps (atime/mtime) are set from userland, after the layout swap
is performed, to prevent conflicts with the grouplock.
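
For illustration, saving and restoring the timestamps from userland
can be done with plain POSIX calls, as sketched below. This is an
assumption about how a tool might do it, not the actual lfs code;
save_times() and restore_times() are hypothetical helpers.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdbool.h>
#include <sys/stat.h>
#include <time.h>

/* Timestamps captured before the layout swap. */
struct saved_times {
	struct timespec atime;
	struct timespec mtime;
};

/* Record the current atime/mtime of an open file. */
static int save_times(int fd, struct saved_times *out)
{
	struct stat st;

	if (fstat(fd, &st) < 0)
		return -1;
	out->atime = st.st_atim;
	out->mtime = st.st_mtim;
	return 0;
}

/*
 * Put the saved timestamps back once the swap is done. A timestamp
 * that should not be kept is marked UTIME_OMIT and left untouched.
 */
static int restore_times(int fd, const struct saved_times *saved,
			 bool keep_atime, bool keep_mtime)
{
	struct timespec ts[2];

	ts[0] = saved->atime;
	ts[1] = saved->mtime;
	if (!keep_atime)
		ts[0].tv_nsec = UTIME_OMIT;
	if (!keep_mtime)
		ts[1].tv_nsec = UTIME_OMIT;
	return futimens(fd, ts);
}
```

The two booleans play the role that SWAP_LAYOUTS_KEEP_ATIME and
SWAP_LAYOUTS_KEEP_MTIME played in the kernel code removed by this
patch.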

lli_trunc_sem is now taken and released in the vvp_io layer, under the
DLM lock, instead of in llite/file.c and llite/llite_lib.c. This
re-ordering fixes the original issue between truncate and migrate.

Signed-off-by: Henri Doreau <henri.doreau@cea.fr>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4840
Reviewed-on: http://review.whamcloud.com/10013
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: frank zago <fzago@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../lustre/lustre/include/lustre/lustre_idl.h      |    5 +-
 .../lustre/lustre/include/lustre/lustre_user.h     |    1 +
 .../lustre/lustre/include/lustre_req_layout.h      |    2 +-
 drivers/staging/lustre/lustre/llite/file.c         |  231 ++++++++++++--------
 drivers/staging/lustre/lustre/llite/llite_lib.c    |    4 -
 drivers/staging/lustre/lustre/llite/vvp_io.c       |   82 +++++---
 drivers/staging/lustre/lustre/mdc/mdc_lib.c        |   34 ++--
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |    7 +-
 drivers/staging/lustre/lustre/ptlrpc/layout.c      |   10 +-
 9 files changed, 235 insertions(+), 141 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index 5d2f845..7de8098 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -1703,7 +1703,9 @@ lov_mds_md_max_stripe_count(size_t buf_size, __u32 lmm_magic)
 /*	OBD_MD_FLRMTRGETFACL (0x0008000000000000ULL) lfs rgetfacl, obsolete */
 
 #define OBD_MD_FLDATAVERSION (0x0010000000000000ULL) /* iversion sum */
-#define OBD_MD_FLRELEASED    (0x0020000000000000ULL) /* file released */
+#define OBD_MD_CLOSE_INTENT_EXECED (0x0020000000000000ULL) /* close intent
+							    * executed
+							    */
 
 #define OBD_MD_DEFAULT_MEA   (0x0040000000000000ULL) /* default MEA */
 
@@ -2235,6 +2237,7 @@ enum mds_op_bias {
 	MDS_OWNEROVERRIDE	= 1 << 11,
 	MDS_HSM_RELEASE		= 1 << 12,
 	MDS_RENAME_MIGRATE	= BIT(13),
+	MDS_CLOSE_LAYOUT_SWAP   = BIT(14),
 };
 
 /* instance of mdt_reint_rec */
diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
index 856e2f9..579ef14 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_user.h
@@ -650,6 +650,7 @@ struct if_quotactl {
 #define SWAP_LAYOUTS_CHECK_DV2		(1 << 1)
 #define SWAP_LAYOUTS_KEEP_MTIME		(1 << 2)
 #define SWAP_LAYOUTS_KEEP_ATIME		(1 << 3)
+#define SWAP_LAYOUTS_CLOSE		BIT(4)
 
 /* Swap XATTR_NAME_HSM as well, only on the MDT so far */
 #define SWAP_LAYOUTS_MDS_HSM		(1 << 31)
diff --git a/drivers/staging/lustre/lustre/include/lustre_req_layout.h b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
index 78857b3..7657132 100644
--- a/drivers/staging/lustre/lustre/include/lustre_req_layout.h
+++ b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
@@ -148,7 +148,7 @@ extern struct req_format RQF_MDS_GETATTR;
  */
 extern struct req_format RQF_MDS_GETATTR_NAME;
 extern struct req_format RQF_MDS_CLOSE;
-extern struct req_format RQF_MDS_RELEASE_CLOSE;
+extern struct req_format RQF_MDS_INTENT_CLOSE;
 extern struct req_format RQF_MDS_CONNECT;
 extern struct req_format RQF_MDS_DISCONNECT;
 extern struct req_format RQF_MDS_GET_INFO;
diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 9bf50bf..b9cadd9 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -113,10 +113,19 @@ out:
 			   0, 0, LUSTRE_OPC_ANY, NULL);
 }
 
+/**
+ * Perform a close, possibly with a bias.
+ * The meaning of "data" depends on the value of "bias".
+ *
+ * If \a bias is MDS_HSM_RELEASE then \a data is a pointer to the data version.
+ * If \a bias is MDS_CLOSE_LAYOUT_SWAP then \a data is a pointer to the inode to
+ * swap layouts with.
+ */
 static int ll_close_inode_openhandle(struct obd_export *md_exp,
-				     struct inode *inode,
 				     struct obd_client_handle *och,
-				     const __u64 *data_version)
+				     struct inode *inode,
+				     enum mds_op_bias bias,
+				     void *data)
 {
 	struct obd_export *exp = ll_i2mdexp(inode);
 	struct md_op_data *op_data;
@@ -143,12 +152,26 @@ static int ll_close_inode_openhandle(struct obd_export *md_exp,
 	}
 
 	ll_prepare_close(inode, op_data, och);
-	if (data_version) {
-		/* Pass in data_version implies release. */
+	switch (bias) {
+	case MDS_CLOSE_LAYOUT_SWAP:
+		LASSERT(data);
+		op_data->op_bias |= MDS_CLOSE_LAYOUT_SWAP;
+		op_data->op_data_version = 0;
+		op_data->op_lease_handle = och->och_lease_handle;
+		op_data->op_fid2 = *ll_inode2fid(data);
+		break;
+
+	case MDS_HSM_RELEASE:
+		LASSERT(data);
 		op_data->op_bias |= MDS_HSM_RELEASE;
-		op_data->op_data_version = *data_version;
+		op_data->op_data_version = *(__u64 *)data;
 		op_data->op_lease_handle = och->och_lease_handle;
 		op_data->op_attr.ia_valid |= ATTR_SIZE | ATTR_BLOCKS;
+		break;
+
+	default:
+		LASSERT(!data);
+		break;
 	}
 
 	rc = md_close(md_exp, op_data, och->och_mod, &req);
@@ -169,11 +192,12 @@ static int ll_close_inode_openhandle(struct obd_export *md_exp,
 		spin_unlock(&lli->lli_lock);
 	}
 
-	if (rc == 0 && op_data->op_bias & MDS_HSM_RELEASE) {
+	if (op_data->op_bias & (MDS_HSM_RELEASE | MDS_CLOSE_LAYOUT_SWAP) &&
+	    !rc) {
 		struct mdt_body *body;
 
 		body = req_capsule_server_get(&req->rq_pill, &RMF_MDT_BODY);
-		if (!(body->mbo_valid & OBD_MD_FLRELEASED))
+		if (!(body->mbo_valid & OBD_MD_CLOSE_INTENT_EXECED))
 			rc = -EBUSY;
 	}
 
@@ -227,7 +251,7 @@ int ll_md_real_close(struct inode *inode, fmode_t fmode)
 		 * be closed.
 		 */
 		rc = ll_close_inode_openhandle(ll_i2sbi(inode)->ll_md_exp,
-					       inode, och, NULL);
+					       och, inode, 0, NULL);
 	}
 
 	return rc;
@@ -263,7 +287,8 @@ static int ll_md_close(struct obd_export *md_exp, struct inode *inode,
 	}
 
 	if (fd->fd_och) {
-		rc = ll_close_inode_openhandle(md_exp, inode, fd->fd_och, NULL);
+		rc = ll_close_inode_openhandle(md_exp, fd->fd_och, inode, 0,
+					       NULL);
 		fd->fd_och = NULL;
 		goto out;
 	}
@@ -816,7 +841,7 @@ out_close:
 		it.it_lock_mode = 0;
 		och->och_lease_handle.cookie = 0ULL;
 	}
-	rc2 = ll_close_inode_openhandle(sbi->ll_md_exp, inode, och, NULL);
+	rc2 = ll_close_inode_openhandle(sbi->ll_md_exp, och, inode, 0, NULL);
 	if (rc2 < 0)
 		CERROR("%s: error closing file "DFID": %d\n",
 		       ll_get_fsname(inode->i_sb, NULL, 0),
@@ -830,6 +855,69 @@ out:
 }
 
 /**
+ * Check whether a layout swap can be done between two inodes.
+ *
+ * \param[in] inode1  First inode to check
+ * \param[in] inode2  Second inode to check
+ *
+ * \retval 0 on success, layout swap can be performed between both inodes
+ * \retval negative error code if requirements are not met
+ */
+static int ll_check_swap_layouts_validity(struct inode *inode1,
+					  struct inode *inode2)
+{
+	if (!S_ISREG(inode1->i_mode) || !S_ISREG(inode2->i_mode))
+		return -EINVAL;
+
+	if (inode_permission(inode1, MAY_WRITE) ||
+	    inode_permission(inode2, MAY_WRITE))
+		return -EPERM;
+
+	if (inode1->i_sb != inode2->i_sb)
+		return -EXDEV;
+
+	return 0;
+}
+
+static int ll_swap_layouts_close(struct obd_client_handle *och,
+				 struct inode *inode, struct inode *inode2)
+{
+	const struct lu_fid *fid1 = ll_inode2fid(inode);
+	const struct lu_fid *fid2;
+	int rc;
+
+	CDEBUG(D_INODE, "%s: biased close of file " DFID "\n",
+	       ll_get_fsname(inode->i_sb, NULL, 0), PFID(fid1));
+
+	rc = ll_check_swap_layouts_validity(inode, inode2);
+	if (rc < 0)
+		goto out_free_och;
+
+	/* We now know that inode2 is a lustre inode */
+	fid2 = ll_inode2fid(inode2);
+
+	rc = lu_fid_cmp(fid1, fid2);
+	if (!rc) {
+		rc = -EINVAL;
+		goto out_free_och;
+	}
+
+	/*
+	 * Close the file and swap layouts between inode & inode2.
+	 * NB: lease lock handle is released in mdc_close_layout_swap_pack()
+	 * because we still need it to pack l_remote_handle to MDT.
+	 */
+	rc = ll_close_inode_openhandle(ll_i2sbi(inode)->ll_md_exp, och, inode,
+				       MDS_CLOSE_LAYOUT_SWAP, inode2);
+
+	och = NULL; /* freed in ll_close_inode_openhandle() */
+
+out_free_och:
+	kfree(och);
+	return rc;
+}
+
+/**
  * Release lease and close the file.
  * It will check if the lease has ever broken.
  */
@@ -856,7 +944,7 @@ static int ll_lease_close(struct obd_client_handle *och, struct inode *inode,
 		*lease_broken = cancelled;
 
 	return ll_close_inode_openhandle(ll_i2sbi(inode)->ll_md_exp,
-					 inode, och, NULL);
+					 och, inode, 0, NULL);
 }
 
 int ll_merge_attr(const struct lu_env *env, struct inode *inode)
@@ -1018,7 +1106,6 @@ restart:
 
 				range_locked = true;
 			}
-			down_read(&lli->lli_trunc_sem);
 			break;
 		case IO_SPLICE:
 			vio->u.splice.vui_pipe = args->u.splice.via_pipe;
@@ -1031,8 +1118,6 @@ restart:
 		ll_cl_add(file, env, io);
 		rc = cl_io_loop(env, io);
 		ll_cl_remove(file, env);
-		if (args->via_io_subtype == IO_NORMAL)
-			up_read(&lli->lli_trunc_sem);
 		if (range_locked) {
 			CDEBUG(D_VFSTRACE, "Range unlock [%llu, %llu]\n",
 			       range.rl_node.in_extent.start,
@@ -1454,7 +1539,7 @@ int ll_release_openhandle(struct inode *inode, struct lookup_intent *it)
 	ll_och_fill(ll_i2sbi(inode)->ll_md_exp, it, och);
 
 	rc = ll_close_inode_openhandle(ll_i2sbi(inode)->ll_md_exp,
-				       inode, och, NULL);
+				       och, inode, 0, NULL);
 out:
 	/* this one is in place of ll_file_open */
 	if (it_disposition(it, DISP_ENQ_OPEN_REF)) {
@@ -1657,8 +1742,8 @@ int ll_hsm_release(struct inode *inode)
 	 * NB: lease lock handle is released in mdc_hsm_release_pack() because
 	 * we still need it to pack l_remote_handle to MDT.
 	 */
-	rc = ll_close_inode_openhandle(ll_i2sbi(inode)->ll_md_exp, inode, och,
-				       &data_version);
+	rc = ll_close_inode_openhandle(ll_i2sbi(inode)->ll_md_exp, och, inode,
+				       MDS_HSM_RELEASE, &data_version);
 	och = NULL;
 
 out:
@@ -1669,10 +1754,12 @@ out:
 }
 
 struct ll_swap_stack {
-	struct iattr		 ia1, ia2;
-	__u64			 dv1, dv2;
-	struct inode		*inode1, *inode2;
-	bool			 check_dv1, check_dv2;
+	u64		dv1;
+	u64		dv2;
+	struct inode   *inode1;
+	struct inode   *inode2;
+	bool		check_dv1;
+	bool		check_dv2;
 };
 
 static int ll_swap_layouts(struct file *file1, struct file *file2,
@@ -1692,21 +1779,9 @@ static int ll_swap_layouts(struct file *file1, struct file *file2,
 	llss->inode1 = file_inode(file1);
 	llss->inode2 = file_inode(file2);
 
-	if (!S_ISREG(llss->inode2->i_mode)) {
-		rc = -EINVAL;
-		goto free;
-	}
-
-	if (inode_permission(llss->inode1, MAY_WRITE) ||
-	    inode_permission(llss->inode2, MAY_WRITE)) {
-		rc = -EPERM;
-		goto free;
-	}
-
-	if (llss->inode2->i_sb != llss->inode1->i_sb) {
-		rc = -EXDEV;
+	rc = ll_check_swap_layouts_validity(llss->inode1, llss->inode2);
+	if (rc < 0)
 		goto free;
-	}
 
 	/* we use 2 bool because it is easier to swap than 2 bits */
 	if (lsl->sl_flags & SWAP_LAYOUTS_CHECK_DV1)
@@ -1720,10 +1795,8 @@ static int ll_swap_layouts(struct file *file1, struct file *file2,
 	llss->dv2 = lsl->sl_dv2;
 
 	rc = lu_fid_cmp(ll_inode2fid(llss->inode1), ll_inode2fid(llss->inode2));
-	if (rc == 0) /* same file, done! */ {
-		rc = 0;
+	if (!rc) /* same file, done! */
 		goto free;
-	}
 
 	if (rc < 0) { /* sequentialize it */
 		swap(llss->inode1, llss->inode2);
@@ -1745,19 +1818,6 @@ static int ll_swap_layouts(struct file *file1, struct file *file2,
 		}
 	}
 
-	/* to be able to restore mtime and atime after swap
-	 * we need to first save them
-	 */
-	if (lsl->sl_flags &
-	    (SWAP_LAYOUTS_KEEP_MTIME | SWAP_LAYOUTS_KEEP_ATIME)) {
-		llss->ia1.ia_mtime = llss->inode1->i_mtime;
-		llss->ia1.ia_atime = llss->inode1->i_atime;
-		llss->ia1.ia_valid = ATTR_MTIME | ATTR_ATIME;
-		llss->ia2.ia_mtime = llss->inode2->i_mtime;
-		llss->ia2.ia_atime = llss->inode2->i_atime;
-		llss->ia2.ia_valid = ATTR_MTIME | ATTR_ATIME;
-	}
-
 	/* ultimate check, before swapping the layouts we check if
 	 * dataversion has changed (if requested)
 	 */
@@ -1807,39 +1867,6 @@ putgl:
 		ll_put_grouplock(llss->inode1, file1, gid);
 	}
 
-	/* rc can be set from obd_iocontrol() or from a GOTO(putgl, ...) */
-	if (rc != 0)
-		goto free;
-
-	/* clear useless flags */
-	if (!(lsl->sl_flags & SWAP_LAYOUTS_KEEP_MTIME)) {
-		llss->ia1.ia_valid &= ~ATTR_MTIME;
-		llss->ia2.ia_valid &= ~ATTR_MTIME;
-	}
-
-	if (!(lsl->sl_flags & SWAP_LAYOUTS_KEEP_ATIME)) {
-		llss->ia1.ia_valid &= ~ATTR_ATIME;
-		llss->ia2.ia_valid &= ~ATTR_ATIME;
-	}
-
-	/* update time if requested */
-	rc = 0;
-	if (llss->ia2.ia_valid != 0) {
-		inode_lock(llss->inode1);
-		rc = ll_setattr(file1->f_path.dentry, &llss->ia2);
-		inode_unlock(llss->inode1);
-	}
-
-	if (llss->ia1.ia_valid != 0) {
-		int rc1;
-
-		inode_lock(llss->inode2);
-		rc1 = ll_setattr(file2->f_path.dentry, &llss->ia1);
-		inode_unlock(llss->inode2);
-		if (rc == 0)
-			rc = rc1;
-	}
-
 free:
 	kfree(llss);
 
@@ -1996,16 +2023,46 @@ ll_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 				   sizeof(struct lustre_swap_layouts)))
 			return -EFAULT;
 
-		if ((file->f_flags & O_ACCMODE) == 0) /* O_RDONLY */
+		if ((file->f_flags & O_ACCMODE) == O_RDONLY)
 			return -EPERM;
 
 		file2 = fget(lsl.sl_fd);
 		if (!file2)
 			return -EBADF;
 
-		rc = -EPERM;
-		if ((file2->f_flags & O_ACCMODE) != 0) /* O_WRONLY or O_RDWR */
+		/* file2 must be opened O_WRONLY or O_RDWR */
+		if ((file2->f_flags & O_ACCMODE) == O_RDONLY) {
+			rc = -EPERM;
+			goto out;
+		}
+
+		if (lsl.sl_flags & SWAP_LAYOUTS_CLOSE) {
+			struct obd_client_handle *och = NULL;
+			struct ll_inode_info *lli;
+			struct inode *inode2;
+
+			if (lsl.sl_flags != SWAP_LAYOUTS_CLOSE) {
+				rc = -EINVAL;
+				goto out;
+			}
+
+			lli = ll_i2info(inode);
+			mutex_lock(&lli->lli_och_mutex);
+			if (fd->fd_lease_och) {
+				och = fd->fd_lease_och;
+				fd->fd_lease_och = NULL;
+			}
+			mutex_unlock(&lli->lli_och_mutex);
+			if (!och) {
+				rc = -ENOLCK;
+				goto out;
+			}
+			inode2 = file_inode(file2);
+			rc = ll_swap_layouts_close(och, inode, inode2);
+		} else {
 			rc = ll_swap_layouts(file, file2, &lsl);
+		}
+out:
 		fput(file2);
 		return rc;
 	}
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 84d5556..dbb7bd7 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -1524,11 +1524,7 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr *attr, bool hsm_import)
 		 * setting times to past, but it is necessary due to possible
 		 * time de-synchronization between MDT inode and OST objects
 		 */
-		if (attr->ia_valid & ATTR_SIZE)
-			down_write(&lli->lli_trunc_sem);
 		rc = cl_setattr_ost(ll_i2info(inode)->lli_clob, attr, 0);
-		if (attr->ia_valid & ATTR_SIZE)
-			up_write(&lli->lli_trunc_sem);
 	}
 out:
 	if (op_data)
diff --git a/drivers/staging/lustre/lustre/llite/vvp_io.c b/drivers/staging/lustre/lustre/llite/vvp_io.c
index 5f93db8..b43e3a3 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_io.c
+++ b/drivers/staging/lustre/lustre/llite/vvp_io.c
@@ -608,14 +608,6 @@ static int vvp_do_vmtruncate(struct inode *inode, size_t size)
 	return result;
 }
 
-static int vvp_io_setattr_trunc(const struct lu_env *env,
-				const struct cl_io_slice *ios,
-				struct inode *inode, loff_t size)
-{
-	inode_dio_wait(inode);
-	return 0;
-}
-
 static int vvp_io_setattr_time(const struct lu_env *env,
 			       const struct cl_io_slice *ios)
 {
@@ -646,15 +638,20 @@ static int vvp_io_setattr_start(const struct lu_env *env,
 {
 	struct cl_io	*io    = ios->cis_io;
 	struct inode	*inode = vvp_object_inode(io->ci_obj);
-	int result = 0;
+	struct ll_inode_info *lli = ll_i2info(inode);
 
-	inode_lock(inode);
-	if (cl_io_is_trunc(io))
-		result = vvp_io_setattr_trunc(env, ios, inode,
-					io->u.ci_setattr.sa_attr.lvb_size);
-	if (!result && io->u.ci_setattr.sa_valid & TIMES_SET_FLAGS)
-		result = vvp_io_setattr_time(env, ios);
-	return result;
+	if (cl_io_is_trunc(io)) {
+		down_write(&lli->lli_trunc_sem);
+		inode_lock(inode);
+		inode_dio_wait(inode);
+	} else {
+		inode_lock(inode);
+	}
+
+	if (io->u.ci_setattr.sa_valid & TIMES_SET_FLAGS)
+		return vvp_io_setattr_time(env, ios);
+
+	return 0;
 }
 
 static void vvp_io_setattr_end(const struct lu_env *env,
@@ -662,14 +659,18 @@ static void vvp_io_setattr_end(const struct lu_env *env,
 {
 	struct cl_io *io    = ios->cis_io;
 	struct inode *inode = vvp_object_inode(io->ci_obj);
+	struct ll_inode_info *lli = ll_i2info(inode);
 
-	if (cl_io_is_trunc(io))
+	if (cl_io_is_trunc(io)) {
 		/* Truncate in memory pages - they must be clean pages
 		 * because osc has already notified to destroy osc_extents.
 		 */
 		vvp_do_vmtruncate(inode, io->u.ci_setattr.sa_attr.lvb_size);
-
-	inode_unlock(inode);
+		inode_unlock(inode);
+		up_write(&lli->lli_trunc_sem);
+	} else {
+		inode_unlock(inode);
+	}
 }
 
 static void vvp_io_setattr_fini(const struct lu_env *env,
@@ -685,6 +686,7 @@ static int vvp_io_read_start(const struct lu_env *env,
 	struct cl_io      *io    = ios->cis_io;
 	struct cl_object  *obj   = io->ci_obj;
 	struct inode      *inode = vvp_object_inode(obj);
+	struct ll_inode_info *lli = ll_i2info(inode);
 	struct file       *file  = vio->vui_fd->fd_file;
 
 	int     result;
@@ -697,6 +699,9 @@ static int vvp_io_read_start(const struct lu_env *env,
 
 	CDEBUG(D_VFSTRACE, "read: -> [%lli, %lli)\n", pos, pos + cnt);
 
+	if (vio->vui_io_subtype == IO_NORMAL)
+		down_read(&lli->lli_trunc_sem);
+
 	if (!can_populate_pages(env, io, inode))
 		return 0;
 
@@ -939,10 +944,14 @@ static int vvp_io_write_start(const struct lu_env *env,
 	struct cl_io       *io    = ios->cis_io;
 	struct cl_object   *obj   = io->ci_obj;
 	struct inode       *inode = vvp_object_inode(obj);
+	struct ll_inode_info *lli = ll_i2info(inode);
 	ssize_t result = 0;
 	loff_t pos = io->u.ci_wr.wr.crw_pos;
 	size_t cnt = io->u.ci_wr.wr.crw_count;
 
+	if (vio->vui_io_subtype == IO_NORMAL)
+		down_read(&lli->lli_trunc_sem);
+
 	if (!can_populate_pages(env, io, inode))
 		return 0;
 
@@ -1026,6 +1035,17 @@ static int vvp_io_write_start(const struct lu_env *env,
 	return result;
 }
 
+static void vvp_io_rw_end(const struct lu_env *env,
+			  const struct cl_io_slice *ios)
+{
+	struct vvp_io *vio = cl2vvp_io(env, ios);
+	struct inode *inode = vvp_object_inode(ios->cis_obj);
+	struct ll_inode_info *lli = ll_i2info(inode);
+
+	if (vio->vui_io_subtype == IO_NORMAL)
+		up_read(&lli->lli_trunc_sem);
+}
+
 static int vvp_io_kernel_fault(struct vvp_fault_io *cfio)
 {
 	struct vm_fault *vmf = cfio->ft_vmf;
@@ -1078,6 +1098,7 @@ static int vvp_io_fault_start(const struct lu_env *env,
 	struct cl_io	*io      = ios->cis_io;
 	struct cl_object    *obj     = io->ci_obj;
 	struct inode        *inode   = vvp_object_inode(obj);
+	struct ll_inode_info *lli = ll_i2info(inode);
 	struct cl_fault_io  *fio     = &io->u.ci_fault;
 	struct vvp_fault_io *cfio    = &vio->u.fault;
 	loff_t	       offset;
@@ -1093,6 +1114,8 @@ static int vvp_io_fault_start(const struct lu_env *env,
 		      " changed while waiting for the page fault lock\n",
 		      PFID(lu_object_fid(&obj->co_lu)));
 
+	down_read(&lli->lli_trunc_sem);
+
 	/* offset of the last byte on the page */
 	offset = cl_offset(obj, fio->ft_index + 1) - 1;
 	LASSERT(cl_index(obj, offset) == fio->ft_index);
@@ -1240,6 +1263,17 @@ out:
 	return result;
 }
 
+static void vvp_io_fault_end(const struct lu_env *env,
+			     const struct cl_io_slice *ios)
+{
+	struct inode *inode = vvp_object_inode(ios->cis_obj);
+	struct ll_inode_info *lli = ll_i2info(inode);
+
+	CLOBINVRNT(env, ios->cis_io->ci_obj,
+		   vvp_object_invariant(ios->cis_io->ci_obj));
+	up_read(&lli->lli_trunc_sem);
+}
+
 static int vvp_io_fsync_start(const struct lu_env *env,
 			      const struct cl_io_slice *ios)
 {
@@ -1269,18 +1303,13 @@ static int vvp_io_read_ahead(const struct lu_env *env,
 	return result;
 }
 
-static void vvp_io_end(const struct lu_env *env, const struct cl_io_slice *ios)
-{
-	CLOBINVRNT(env, ios->cis_io->ci_obj,
-		   vvp_object_invariant(ios->cis_io->ci_obj));
-}
-
 static const struct cl_io_operations vvp_io_ops = {
 	.op = {
 		[CIT_READ] = {
 			.cio_fini	= vvp_io_fini,
 			.cio_lock      = vvp_io_read_lock,
 			.cio_start     = vvp_io_read_start,
+			.cio_end	= vvp_io_rw_end,
 			.cio_advance	= vvp_io_advance,
 		},
 		[CIT_WRITE] = {
@@ -1289,6 +1318,7 @@ static const struct cl_io_operations vvp_io_ops = {
 			.cio_iter_fini = vvp_io_write_iter_fini,
 			.cio_lock      = vvp_io_write_lock,
 			.cio_start     = vvp_io_write_start,
+			.cio_end	= vvp_io_rw_end,
 			.cio_advance   = vvp_io_advance,
 		},
 		[CIT_SETATTR] = {
@@ -1303,7 +1333,7 @@ static const struct cl_io_operations vvp_io_ops = {
 			.cio_iter_init = vvp_io_fault_iter_init,
 			.cio_lock      = vvp_io_fault_lock,
 			.cio_start     = vvp_io_fault_start,
-			.cio_end       = vvp_io_end,
+			.cio_end       = vvp_io_fault_end,
 		},
 		[CIT_FSYNC] = {
 			.cio_start  = vvp_io_fsync_start,
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_lib.c b/drivers/staging/lustre/lustre/mdc/mdc_lib.c
index 1925072..c1990f0 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_lib.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_lib.c
@@ -430,25 +430,29 @@ void mdc_getattr_pack(struct ptlrpc_request *req, __u64 valid, u32 flags,
 			      op_data->op_namelen);
 }
 
-static void mdc_hsm_release_pack(struct ptlrpc_request *req,
-				 struct md_op_data *op_data)
+static void mdc_intent_close_pack(struct ptlrpc_request *req,
+				  struct md_op_data *op_data)
 {
-	if (op_data->op_bias & MDS_HSM_RELEASE) {
-		struct close_data *data;
-		struct ldlm_lock *lock;
+	enum mds_op_bias bias = op_data->op_bias;
+	struct close_data *data;
+	struct ldlm_lock *lock;
 
-		data = req_capsule_client_get(&req->rq_pill, &RMF_CLOSE_DATA);
+	if (!(bias & (MDS_HSM_RELEASE | MDS_CLOSE_LAYOUT_SWAP |
+		      MDS_RENAME_MIGRATE)))
+		return;
 
-		lock = ldlm_handle2lock(&op_data->op_lease_handle);
-		if (lock) {
-			data->cd_handle = lock->l_remote_handle;
-			LDLM_LOCK_PUT(lock);
-		}
-		ldlm_cli_cancel(&op_data->op_lease_handle, LCF_LOCAL);
+	data = req_capsule_client_get(&req->rq_pill, &RMF_CLOSE_DATA);
+	LASSERT(data);
 
-		data->cd_data_version = op_data->op_data_version;
-		data->cd_fid = op_data->op_fid2;
+	lock = ldlm_handle2lock(&op_data->op_lease_handle);
+	if (lock) {
+		data->cd_handle = lock->l_remote_handle;
+		LDLM_LOCK_PUT(lock);
 	}
+	ldlm_cli_cancel(&op_data->op_lease_handle, LCF_LOCAL);
+
+	data->cd_data_version = op_data->op_data_version;
+	data->cd_fid = op_data->op_fid2;
 }
 
 void mdc_close_pack(struct ptlrpc_request *req, struct md_op_data *op_data)
@@ -473,5 +477,5 @@ void mdc_close_pack(struct ptlrpc_request *req, struct md_op_data *op_data)
 		rec->sa_valid &= ~MDS_ATTR_ATIME;
 
 	mdc_ioepoch_pack(epoch, op_data);
-	mdc_hsm_release_pack(req, op_data);
+	mdc_intent_close_pack(req, op_data);
 }
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index 34ccff8..ac04bf3 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -720,9 +720,8 @@ static int mdc_close(struct obd_export *exp, struct md_op_data *op_data,
 	int                    rc;
 	int		       saved_rc = 0;
 
-	req_fmt = &RQF_MDS_CLOSE;
 	if (op_data->op_bias & MDS_HSM_RELEASE) {
-		req_fmt = &RQF_MDS_RELEASE_CLOSE;
+		req_fmt = &RQF_MDS_INTENT_CLOSE;
 
 		/* allocate a FID for volatile file */
 		rc = mdc_fid_alloc(NULL, exp, &op_data->op_fid2, op_data);
@@ -732,6 +731,10 @@ static int mdc_close(struct obd_export *exp, struct md_op_data *op_data,
 			/* save the errcode and proceed to close */
 			saved_rc = rc;
 		}
+	} else if (op_data->op_bias & MDS_CLOSE_LAYOUT_SWAP) {
+		req_fmt = &RQF_MDS_INTENT_CLOSE;
+	} else {
+		req_fmt = &RQF_MDS_CLOSE;
 	}
 
 	*request = NULL;
diff --git a/drivers/staging/lustre/lustre/ptlrpc/layout.c b/drivers/staging/lustre/lustre/ptlrpc/layout.c
index f0e0448..31aa58e 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/layout.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/layout.c
@@ -121,7 +121,7 @@ static const struct req_msg_field *mdt_close_client[] = {
 	&RMF_CAPA1
 };
 
-static const struct req_msg_field *mdt_release_close_client[] = {
+static const struct req_msg_field *mdt_intent_close_client[] = {
 	&RMF_PTLRPC_BODY,
 	&RMF_MDT_EPOCH,
 	&RMF_REC_REINT,
@@ -666,7 +666,7 @@ static struct req_format *req_formats[] = {
 	&RQF_MDS_GETXATTR,
 	&RQF_MDS_SYNC,
 	&RQF_MDS_CLOSE,
-	&RQF_MDS_RELEASE_CLOSE,
+	&RQF_MDS_INTENT_CLOSE,
 	&RQF_MDS_READPAGE,
 	&RQF_MDS_WRITEPAGE,
 	&RQF_MDS_REINT,
@@ -1365,10 +1365,10 @@ struct req_format RQF_MDS_CLOSE =
 			mdt_close_client, mds_last_unlink_server);
 EXPORT_SYMBOL(RQF_MDS_CLOSE);
 
-struct req_format RQF_MDS_RELEASE_CLOSE =
+struct req_format RQF_MDS_INTENT_CLOSE =
 	DEFINE_REQ_FMT0("MDS_CLOSE",
-			mdt_release_close_client, mds_last_unlink_server);
-EXPORT_SYMBOL(RQF_MDS_RELEASE_CLOSE);
+			mdt_intent_close_client, mds_last_unlink_server);
+EXPORT_SYMBOL(RQF_MDS_INTENT_CLOSE);
 
 struct req_format RQF_MDS_READPAGE =
 	DEFINE_REQ_FMT0("MDS_READPAGE",
-- 
1.7.1
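
The SWAP_LAYOUTS ioctl hunk above replaces the bare "== 0" and "!= 0"
tests on the access mode with explicit comparisons against O_RDONLY.
A minimal userspace sketch of the same test follows; the helper name
flags_allow_write() is hypothetical and is not part of the patch.

```c
#include <fcntl.h>
#include <stdbool.h>

/*
 * Return true when the open flags grant write access, i.e. the access
 * mode is O_WRONLY or O_RDWR. O_ACCMODE masks out every flag bit that
 * is not part of the access mode, so extra flags such as O_APPEND do
 * not affect the result.
 */
static bool flags_allow_write(int f_flags)
{
	return (f_flags & O_ACCMODE) != O_RDONLY;
}
```

Comparing against O_RDONLY by name keeps the intent readable and does
not depend on the reader remembering that O_RDONLY is 0 on Linux.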

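ll_swap_layouts() compares the two FIDs and swaps the inodes when the
result is negative ("sequentialize it"), so both files are always
handled in the same order regardless of argument order. The sketch
below illustrates that idea in isolation. struct fid, fid_cmp() and
order_pair() are hypothetical stand-ins, and the field-by-field
comparison order is an assumption rather than a copy of lu_fid_cmp().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a FID: sequence, object id, version. */
struct fid {
	uint64_t seq;
	uint32_t oid;
	uint32_t ver;
};

/* Three-way compare: negative, zero or positive, like memcmp(). */
static int fid_cmp(const struct fid *a, const struct fid *b)
{
	if (a->seq != b->seq)
		return a->seq < b->seq ? -1 : 1;
	if (a->oid != b->oid)
		return a->oid < b->oid ? -1 : 1;
	if (a->ver != b->ver)
		return a->ver < b->ver ? -1 : 1;
	return 0;
}

/*
 * Put the pair into one canonical order. As in the hunk above, the
 * two entries are swapped when the first compares lower.
 * Returns false when both point at the same object, true otherwise.
 */
static bool order_pair(const struct fid **first, const struct fid **second)
{
	int rc = fid_cmp(*first, *second);

	if (rc == 0)
		return false;

	if (rc < 0) {
		const struct fid *tmp = *first;

		*first = *second;
		*second = tmp;
	}
	return true;
}
```

In the patch the equal case is treated differently by the two callers:
ll_swap_layouts() returns 0 ("same file, done!") while
ll_swap_layouts_close() returns -EINVAL.
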

* [PATCH 36/41] staging: lustre: ldlm: interval tree search in ldlm_lock_match()
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Vitaly Fertman, James Simmons

From: Vitaly Fertman <vitaly_fertman@xyratex.com>

Replace the linear search of the granted list in ldlm_lock_match()
with an interval tree search for extent locks.

Signed-off-by: Vitaly Fertman <vitaly_fertman@xyratex.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5739
Xyratex-bug-id: MRP-2089
Reviewed-on: http://review.whamcloud.com/12294
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/ldlm/ldlm_lock.c |  249 ++++++++++++++++--------
 1 files changed, 171 insertions(+), 78 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
index 22b4a52..f2044ec 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
@@ -1043,88 +1043,173 @@ void ldlm_grant_lock(struct ldlm_lock *lock, struct list_head *work_list)
 }
 
 /**
- * Search for a lock with given properties in a queue.
+ * Search parameters and result for a lock match; itree_overlap_cb() data.
+ */
+struct lock_match_data {
+	struct ldlm_lock	*lmd_old;
+	struct ldlm_lock	*lmd_lock;
+	enum ldlm_mode		*lmd_mode;
+	ldlm_policy_data_t	*lmd_policy;
+	__u64			 lmd_flags;
+	int			 lmd_unref;
+};
+
+/**
+ * Check if the given @lock meets the criteria for a match.
+ * A reference on the lock is taken if matched.
  *
- * \retval a referenced lock or NULL.  See the flag descriptions below, in the
- * comment above ldlm_lock_match
+ * \param lock	test-against this lock
+ * \param data	parameters
  */
-static struct ldlm_lock *search_queue(struct list_head *queue,
-				      enum ldlm_mode *mode,
-				      ldlm_policy_data_t *policy,
-				      struct ldlm_lock *old_lock,
-				      __u64 flags, int unref)
+static int lock_matches(struct ldlm_lock *lock, struct lock_match_data *data)
 {
-	struct ldlm_lock *lock;
-	struct list_head       *tmp;
+	ldlm_policy_data_t *lpol = &lock->l_policy_data;
+	enum ldlm_mode match;
 
-	list_for_each(tmp, queue) {
-		enum ldlm_mode match;
+	if (lock == data->lmd_old)
+		return INTERVAL_ITER_STOP;
 
-		lock = list_entry(tmp, struct ldlm_lock, l_res_link);
+	/*
+	 * Check if this lock can be matched.
+	 * Used by LU-2919(exclusive open) for open lease lock
+	 */
+	if (ldlm_is_excl(lock))
+		return INTERVAL_ITER_CONT;
 
-		if (lock == old_lock)
-			break;
+	/*
+	 * llite sometimes wants to match locks that will be
+	 * canceled when their users drop, but we allow it to match
+	 * if it passes in CBPENDING and the lock still has users.
+	 * This is generally only going to be used by children
+	 * whose parents already hold a lock so forward progress
+	 * can still happen.
+	 */
+	if (ldlm_is_cbpending(lock) &&
+	    !(data->lmd_flags & LDLM_FL_CBPENDING))
+		return INTERVAL_ITER_CONT;
 
-		/* Check if this lock can be matched.
-		 * Used by LU-2919(exclusive open) for open lease lock
-		 */
-		if (ldlm_is_excl(lock))
-			continue;
+	if (!data->lmd_unref && ldlm_is_cbpending(lock) &&
+	    !lock->l_readers && !lock->l_writers)
+		return INTERVAL_ITER_CONT;
 
-		/* llite sometimes wants to match locks that will be
-		 * canceled when their users drop, but we allow it to match
-		 * if it passes in CBPENDING and the lock still has users.
-		 * this is generally only going to be used by children
-		 * whose parents already hold a lock so forward progress
-		 * can still happen.
-		 */
-		if (ldlm_is_cbpending(lock) && !(flags & LDLM_FL_CBPENDING))
-			continue;
-		if (!unref && ldlm_is_cbpending(lock) &&
-		    lock->l_readers == 0 && lock->l_writers == 0)
-			continue;
+	if (!(lock->l_req_mode & *data->lmd_mode))
+		return INTERVAL_ITER_CONT;
+	match = lock->l_req_mode;
 
-		if (!(lock->l_req_mode & *mode))
-			continue;
-		match = lock->l_req_mode;
-
-		if (lock->l_resource->lr_type == LDLM_EXTENT &&
-		    (lock->l_policy_data.l_extent.start >
-		     policy->l_extent.start ||
-		     lock->l_policy_data.l_extent.end < policy->l_extent.end))
-			continue;
+	switch (lock->l_resource->lr_type) {
+	case LDLM_EXTENT:
+		if (lpol->l_extent.start > data->lmd_policy->l_extent.start ||
+		    lpol->l_extent.end < data->lmd_policy->l_extent.end)
+			return INTERVAL_ITER_CONT;
 
 		if (unlikely(match == LCK_GROUP) &&
-		    lock->l_resource->lr_type == LDLM_EXTENT &&
-		    policy->l_extent.gid != LDLM_GID_ANY &&
-		    lock->l_policy_data.l_extent.gid != policy->l_extent.gid)
-			continue;
-
-		/* We match if we have existing lock with same or wider set
+		    data->lmd_policy->l_extent.gid != LDLM_GID_ANY &&
+		    lpol->l_extent.gid != data->lmd_policy->l_extent.gid)
+			return INTERVAL_ITER_CONT;
+		break;
+	case LDLM_IBITS:
+		/*
+		 * We match if we have existing lock with same or wider set
 		 * of bits.
 		 */
-		if (lock->l_resource->lr_type == LDLM_IBITS &&
-		    ((lock->l_policy_data.l_inodebits.bits &
-		      policy->l_inodebits.bits) !=
-		      policy->l_inodebits.bits))
-			continue;
+		if ((lpol->l_inodebits.bits &
+		     data->lmd_policy->l_inodebits.bits) !=
+		    data->lmd_policy->l_inodebits.bits)
+			return INTERVAL_ITER_CONT;
+		break;
+	default:
+		break;
+	}
+	/*
+	 * Skip locks already flagged as gone unless the caller
+	 * set lmd_unref.
+	 */
+	if (!data->lmd_unref && LDLM_HAVE_MASK(lock, GONE))
+		return INTERVAL_ITER_CONT;
+
+	if ((data->lmd_flags & LDLM_FL_LOCAL_ONLY) &&
+	    !ldlm_is_local(lock))
+		return INTERVAL_ITER_CONT;
 
-		if (!unref && LDLM_HAVE_MASK(lock, GONE))
+	if (data->lmd_flags & LDLM_FL_TEST_LOCK) {
+		LDLM_LOCK_GET(lock);
+		ldlm_lock_touch_in_lru(lock);
+	} else {
+		ldlm_lock_addref_internal_nolock(lock, match);
+	}
+
+	*data->lmd_mode = match;
+	data->lmd_lock = lock;
+
+	return INTERVAL_ITER_STOP;
+}
+
+static unsigned int itree_overlap_cb(struct interval_node *in, void *args)
+{
+	struct ldlm_interval *node = to_ldlm_interval(in);
+	struct lock_match_data *data = args;
+	struct ldlm_lock *lock;
+	int rc;
+
+	list_for_each_entry(lock, &node->li_group, l_sl_policy) {
+		rc = lock_matches(lock, data);
+		if (rc == INTERVAL_ITER_STOP)
+			return INTERVAL_ITER_STOP;
+	}
+	return INTERVAL_ITER_CONT;
+}
+
+/**
+ * Search for a lock with given parameters in interval trees.
+ *
+ * \param res	search for a lock in this resource
+ * \param data	parameters
+ *
+ * \retval	a referenced lock or NULL.
+ */
+static struct ldlm_lock *search_itree(struct ldlm_resource *res,
+				      struct lock_match_data *data)
+{
+	struct interval_node_extent ext = {
+		.start	= data->lmd_policy->l_extent.start,
+		.end	= data->lmd_policy->l_extent.end
+	};
+	int idx;
+
+	for (idx = 0; idx < LCK_MODE_NUM; idx++) {
+		struct ldlm_interval_tree *tree = &res->lr_itree[idx];
+
+		if (!tree->lit_root)
 			continue;
 
-		if ((flags & LDLM_FL_LOCAL_ONLY) && !ldlm_is_local(lock))
+		if (!(tree->lit_mode & *data->lmd_mode))
 			continue;
 
-		if (flags & LDLM_FL_TEST_LOCK) {
-			LDLM_LOCK_GET(lock);
-			ldlm_lock_touch_in_lru(lock);
-		} else {
-			ldlm_lock_addref_internal_nolock(lock, match);
-		}
-		*mode = match;
-		return lock;
+		interval_search(tree->lit_root, &ext,
+				itree_overlap_cb, data);
 	}
+	return data->lmd_lock;
+}
 
+/**
+ * Search for a lock with given properties in a queue.
+ *
+ * \param queue	search for a lock in this queue
+ * \param data	parameters
+ *
+ * \retval	a referenced lock or NULL.
+ */
+static struct ldlm_lock *search_queue(struct list_head *queue,
+				      struct lock_match_data *data)
+{
+	struct ldlm_lock *lock;
+	int rc;
+
+	list_for_each_entry(lock, queue, l_res_link) {
+		rc = lock_matches(lock, data);
+		if (rc == INTERVAL_ITER_STOP)
+			return data->lmd_lock;
+	}
 	return NULL;
 }
 
@@ -1199,31 +1284,41 @@ enum ldlm_mode ldlm_lock_match(struct ldlm_namespace *ns, __u64 flags,
 			       enum ldlm_mode mode,
 			       struct lustre_handle *lockh, int unref)
 {
+	struct lock_match_data data = {
+		.lmd_old	= NULL,
+		.lmd_lock	= NULL,
+		.lmd_mode	= &mode,
+		.lmd_policy	= policy,
+		.lmd_flags	= flags,
+		.lmd_unref	= unref,
+	};
 	struct ldlm_resource *res;
-	struct ldlm_lock *lock, *old_lock = NULL;
+	struct ldlm_lock *lock;
 	int rc = 0;
 
 	if (!ns) {
-		old_lock = ldlm_handle2lock(lockh);
-		LASSERT(old_lock);
+		data.lmd_old = ldlm_handle2lock(lockh);
+		LASSERT(data.lmd_old);
 
-		ns = ldlm_lock_to_ns(old_lock);
-		res_id = &old_lock->l_resource->lr_name;
-		type = old_lock->l_resource->lr_type;
-		mode = old_lock->l_req_mode;
+		ns = ldlm_lock_to_ns(data.lmd_old);
+		res_id = &data.lmd_old->l_resource->lr_name;
+		type = data.lmd_old->l_resource->lr_type;
+		*data.lmd_mode = data.lmd_old->l_req_mode;
 	}
 
 	res = ldlm_resource_get(ns, NULL, res_id, type, 0);
 	if (IS_ERR(res)) {
-		LASSERT(!old_lock);
+		LASSERT(!data.lmd_old);
 		return 0;
 	}
 
 	LDLM_RESOURCE_ADDREF(res);
 	lock_res(res);
 
-	lock = search_queue(&res->lr_granted, &mode, policy, old_lock,
-			    flags, unref);
+	if (res->lr_type == LDLM_EXTENT)
+		lock = search_itree(res, &data);
+	else
+		lock = search_queue(&res->lr_granted, &data);
 	if (lock) {
 		rc = 1;
 		goto out;
@@ -1232,14 +1327,12 @@ enum ldlm_mode ldlm_lock_match(struct ldlm_namespace *ns, __u64 flags,
 		rc = 0;
 		goto out;
 	}
-	lock = search_queue(&res->lr_waiting, &mode, policy, old_lock,
-			    flags, unref);
+	lock = search_queue(&res->lr_waiting, &data);
 	if (lock) {
 		rc = 1;
 		goto out;
 	}
-
- out:
+out:
 	unlock_res(res);
 	LDLM_RESOURCE_DELREF(res);
 	ldlm_resource_putref(res);
@@ -1311,8 +1404,8 @@ enum ldlm_mode ldlm_lock_match(struct ldlm_namespace *ns, __u64 flags,
 				  (type == LDLM_PLAIN || type == LDLM_IBITS) ?
 					res_id->name[3] : policy->l_extent.end);
 	}
-	if (old_lock)
-		LDLM_LOCK_PUT(old_lock);
+	if (data.lmd_old)
+		LDLM_LOCK_PUT(data.lmd_old);
 
 	return rc ? mode : 0;
 }
-- 
1.7.1
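
lock_matches() rejects an extent lock unless its range fully contains
the requested range, and rejects an inodebits lock unless it holds at
least the requested bits. The two predicates can be sketched on their
own as below; struct extent, extent_covers() and ibits_cover() are
hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Inclusive byte range, standing in for the l_extent policy data. */
struct extent {
	uint64_t start;
	uint64_t end;
};

/* True when the held extent fully contains the wanted extent. */
static bool extent_covers(const struct extent *have,
			  const struct extent *want)
{
	return have->start <= want->start && have->end >= want->end;
}

/* True when the held bit set is the same as or wider than the wanted one. */
static bool ibits_cover(uint64_t have, uint64_t want)
{
	return (have & want) == want;
}
```

Note that containment is stricter than overlap: an interval tree query
returns every overlapping node, so the callback still has to apply the
containment test to each candidate.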

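The point of search_itree() is that the tree lets the search skip
whole groups of locks that cannot overlap the request, and the callback
can stop the walk as soon as a lock matches. The following is a
simplified sketch of that pattern, not the kernel's interval tree
code: it uses a plain binary tree keyed by start offset, with each node
caching the largest end offset in its subtree. All names (itnode,
itree_search, covers_cb, ITER_CONT, ITER_STOP) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

enum iter_rc { ITER_CONT = 1, ITER_STOP = 2 };

struct extent {
	uint64_t start;
	uint64_t end;		/* inclusive */
};

struct itnode {
	struct extent ext;
	uint64_t max_end;	/* largest ext.end in this subtree */
	struct itnode *left;	/* nodes with smaller or equal start */
	struct itnode *right;	/* nodes with larger or equal start */
};

typedef enum iter_rc (*visit_fn)(struct itnode *n, void *arg);

/* Call cb on every node overlapping q until cb returns ITER_STOP. */
static enum iter_rc itree_search(struct itnode *n, const struct extent *q,
				 visit_fn cb, void *arg)
{
	/* Nothing in this subtree reaches the start of the query. */
	if (!n || n->max_end < q->start)
		return ITER_CONT;

	if (itree_search(n->left, q, cb, arg) == ITER_STOP)
		return ITER_STOP;

	if (n->ext.start <= q->end && q->start <= n->ext.end &&
	    cb(n, arg) == ITER_STOP)
		return ITER_STOP;

	/* Everything to the right starts even later than this node. */
	if (n->ext.start > q->end)
		return ITER_CONT;

	return itree_search(n->right, q, cb, arg);
}

/* Search state: the wanted range, the result and a visit counter. */
struct match {
	struct extent want;
	struct itnode *found;
	unsigned int visited;
};

/* Stop at the first node whose extent fully contains the wanted one. */
static enum iter_rc covers_cb(struct itnode *n, void *arg)
{
	struct match *m = arg;

	m->visited++;
	if (n->ext.start > m->want.start || n->ext.end < m->want.end)
		return ITER_CONT;

	m->found = n;
	return ITER_STOP;
}
```

A caller would fill a struct match, pass its want field as the query
and the struct itself as arg, then read found afterwards. This mirrors
how the patch passes lock_match_data through the search and reads
lmd_lock when it returns.
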

* [lustre-devel] [PATCH 36/41] staging: lustre: ldlm: interval tree search in ldlm_lock_match()
@ 2016-10-03  2:28   ` James Simmons
  0 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Vitaly Fertman, James Simmons

From: Vitaly Fertman <vitaly_fertman@xyratex.com>

Replace the linear search of the granted list in ldlm_lock_match()
with an interval tree search for extent locks.

Signed-off-by: Vitaly Fertman <vitaly_fertman@xyratex.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5739
Xyratex-bug-id: MRP-2089
Reviewed-on: http://review.whamcloud.com/12294
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/ldlm/ldlm_lock.c |  249 ++++++++++++++++--------
 1 files changed, 171 insertions(+), 78 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
index 22b4a52..f2044ec 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lock.c
@@ -1043,88 +1043,173 @@ void ldlm_grant_lock(struct ldlm_lock *lock, struct list_head *work_list)
 }
 
 /**
- * Search for a lock with given properties in a queue.
+ * Search parameters and result for a lock match; itree_overlap_cb() data.
+ */
+struct lock_match_data {
+	struct ldlm_lock	*lmd_old;
+	struct ldlm_lock	*lmd_lock;
+	enum ldlm_mode		*lmd_mode;
+	ldlm_policy_data_t	*lmd_policy;
+	__u64			 lmd_flags;
+	int			 lmd_unref;
+};
+
+/**
+ * Check if the given @lock meets the criteria for a match.
+ * A reference on the lock is taken if matched.
  *
- * \retval a referenced lock or NULL.  See the flag descriptions below, in the
- * comment above ldlm_lock_match
+ * \param lock	test-against this lock
+ * \param data	parameters
  */
-static struct ldlm_lock *search_queue(struct list_head *queue,
-				      enum ldlm_mode *mode,
-				      ldlm_policy_data_t *policy,
-				      struct ldlm_lock *old_lock,
-				      __u64 flags, int unref)
+static int lock_matches(struct ldlm_lock *lock, struct lock_match_data *data)
 {
-	struct ldlm_lock *lock;
-	struct list_head       *tmp;
+	ldlm_policy_data_t *lpol = &lock->l_policy_data;
+	enum ldlm_mode match;
 
-	list_for_each(tmp, queue) {
-		enum ldlm_mode match;
+	if (lock == data->lmd_old)
+		return INTERVAL_ITER_STOP;
 
-		lock = list_entry(tmp, struct ldlm_lock, l_res_link);
+	/*
+	 * Check if this lock can be matched.
+	 * Used by LU-2919(exclusive open) for open lease lock
+	 */
+	if (ldlm_is_excl(lock))
+		return INTERVAL_ITER_CONT;
 
-		if (lock == old_lock)
-			break;
+	/*
+	 * llite sometimes wants to match locks that will be
+	 * canceled when their users drop, but we allow it to match
+	 * if it passes in CBPENDING and the lock still has users.
+	 * This is generally only going to be used by children
+	 * whose parents already hold a lock so forward progress
+	 * can still happen.
+	 */
+	if (ldlm_is_cbpending(lock) &&
+	    !(data->lmd_flags & LDLM_FL_CBPENDING))
+		return INTERVAL_ITER_CONT;
 
-		/* Check if this lock can be matched.
-		 * Used by LU-2919(exclusive open) for open lease lock
-		 */
-		if (ldlm_is_excl(lock))
-			continue;
+	if (!data->lmd_unref && ldlm_is_cbpending(lock) &&
+	    !lock->l_readers && !lock->l_writers)
+		return INTERVAL_ITER_CONT;
 
-		/* llite sometimes wants to match locks that will be
-		 * canceled when their users drop, but we allow it to match
-		 * if it passes in CBPENDING and the lock still has users.
-		 * this is generally only going to be used by children
-		 * whose parents already hold a lock so forward progress
-		 * can still happen.
-		 */
-		if (ldlm_is_cbpending(lock) && !(flags & LDLM_FL_CBPENDING))
-			continue;
-		if (!unref && ldlm_is_cbpending(lock) &&
-		    lock->l_readers == 0 && lock->l_writers == 0)
-			continue;
+	if (!(lock->l_req_mode & *data->lmd_mode))
+		return INTERVAL_ITER_CONT;
+	match = lock->l_req_mode;
 
-		if (!(lock->l_req_mode & *mode))
-			continue;
-		match = lock->l_req_mode;
-
-		if (lock->l_resource->lr_type == LDLM_EXTENT &&
-		    (lock->l_policy_data.l_extent.start >
-		     policy->l_extent.start ||
-		     lock->l_policy_data.l_extent.end < policy->l_extent.end))
-			continue;
+	switch (lock->l_resource->lr_type) {
+	case LDLM_EXTENT:
+		if (lpol->l_extent.start > data->lmd_policy->l_extent.start ||
+		    lpol->l_extent.end < data->lmd_policy->l_extent.end)
+			return INTERVAL_ITER_CONT;
 
 		if (unlikely(match == LCK_GROUP) &&
-		    lock->l_resource->lr_type == LDLM_EXTENT &&
-		    policy->l_extent.gid != LDLM_GID_ANY &&
-		    lock->l_policy_data.l_extent.gid != policy->l_extent.gid)
-			continue;
-
-		/* We match if we have existing lock with same or wider set
+		    data->lmd_policy->l_extent.gid != LDLM_GID_ANY &&
+		    lpol->l_extent.gid != data->lmd_policy->l_extent.gid)
+			return INTERVAL_ITER_CONT;
+		break;
+	case LDLM_IBITS:
+		/*
+		 * We match if we have existing lock with same or wider set
 		 * of bits.
 		 */
-		if (lock->l_resource->lr_type == LDLM_IBITS &&
-		    ((lock->l_policy_data.l_inodebits.bits &
-		      policy->l_inodebits.bits) !=
-		      policy->l_inodebits.bits))
-			continue;
+		if ((lpol->l_inodebits.bits &
+		     data->lmd_policy->l_inodebits.bits) !=
+		    data->lmd_policy->l_inodebits.bits)
+			return INTERVAL_ITER_CONT;
+		break;
+	default:
+		break;
+	}
+	/*
+	 * We match if we have existing lock with same or wider set
+	 * of bits.
+	 */
+	if (!data->lmd_unref && LDLM_HAVE_MASK(lock, GONE))
+		return INTERVAL_ITER_CONT;
+
+	if ((data->lmd_flags & LDLM_FL_LOCAL_ONLY) &&
+	    !ldlm_is_local(lock))
+		return INTERVAL_ITER_CONT;
 
-		if (!unref && LDLM_HAVE_MASK(lock, GONE))
+	if (data->lmd_flags & LDLM_FL_TEST_LOCK) {
+		LDLM_LOCK_GET(lock);
+		ldlm_lock_touch_in_lru(lock);
+	} else {
+		ldlm_lock_addref_internal_nolock(lock, match);
+	}
+
+	*data->lmd_mode = match;
+	data->lmd_lock = lock;
+
+	return INTERVAL_ITER_STOP;
+}
+
+static unsigned int itree_overlap_cb(struct interval_node *in, void *args)
+{
+	struct ldlm_interval *node = to_ldlm_interval(in);
+	struct lock_match_data *data = args;
+	struct ldlm_lock *lock;
+	int rc;
+
+	list_for_each_entry(lock, &node->li_group, l_sl_policy) {
+		rc = lock_matches(lock, data);
+		if (rc == INTERVAL_ITER_STOP)
+			return INTERVAL_ITER_STOP;
+	}
+	return INTERVAL_ITER_CONT;
+}
+
+/**
+ * Search for a lock with given parameters in interval trees.
+ *
+ * \param res	search for a lock in this resource
+ * \param data	parameters
+ *
+ * \retval	a referenced lock or NULL.
+ */
+static struct ldlm_lock *search_itree(struct ldlm_resource *res,
+				      struct lock_match_data *data)
+{
+	struct interval_node_extent ext = {
+		.start	= data->lmd_policy->l_extent.start,
+		.end	= data->lmd_policy->l_extent.end
+	};
+	int idx;
+
+	for (idx = 0; idx < LCK_MODE_NUM; idx++) {
+		struct ldlm_interval_tree *tree = &res->lr_itree[idx];
+
+		if (!tree->lit_root)
 			continue;
 
-		if ((flags & LDLM_FL_LOCAL_ONLY) && !ldlm_is_local(lock))
+		if (!(tree->lit_mode & *data->lmd_mode))
 			continue;
 
-		if (flags & LDLM_FL_TEST_LOCK) {
-			LDLM_LOCK_GET(lock);
-			ldlm_lock_touch_in_lru(lock);
-		} else {
-			ldlm_lock_addref_internal_nolock(lock, match);
-		}
-		*mode = match;
-		return lock;
+		interval_search(tree->lit_root, &ext,
+				itree_overlap_cb, data);
 	}
+	return data->lmd_lock;
+}
 
+/**
+ * Search for a lock with given properties in a queue.
+ *
+ * \param queue	search for a lock in this queue
+ * \param data	parameters
+ *
+ * \retval	a referenced lock or NULL.
+ */
+static struct ldlm_lock *search_queue(struct list_head *queue,
+				      struct lock_match_data *data)
+{
+	struct ldlm_lock *lock;
+	int rc;
+
+	list_for_each_entry(lock, queue, l_res_link) {
+		rc = lock_matches(lock, data);
+		if (rc == INTERVAL_ITER_STOP)
+			return data->lmd_lock;
+	}
 	return NULL;
 }
 
@@ -1199,31 +1284,41 @@ enum ldlm_mode ldlm_lock_match(struct ldlm_namespace *ns, __u64 flags,
 			       enum ldlm_mode mode,
 			       struct lustre_handle *lockh, int unref)
 {
+	struct lock_match_data data = {
+		.lmd_old	= NULL,
+		.lmd_lock	= NULL,
+		.lmd_mode	= &mode,
+		.lmd_policy	= policy,
+		.lmd_flags	= flags,
+		.lmd_unref	= unref,
+	};
 	struct ldlm_resource *res;
-	struct ldlm_lock *lock, *old_lock = NULL;
+	struct ldlm_lock *lock;
 	int rc = 0;
 
 	if (!ns) {
-		old_lock = ldlm_handle2lock(lockh);
-		LASSERT(old_lock);
+		data.lmd_old = ldlm_handle2lock(lockh);
+		LASSERT(data.lmd_old);
 
-		ns = ldlm_lock_to_ns(old_lock);
-		res_id = &old_lock->l_resource->lr_name;
-		type = old_lock->l_resource->lr_type;
-		mode = old_lock->l_req_mode;
+		ns = ldlm_lock_to_ns(data.lmd_old);
+		res_id = &data.lmd_old->l_resource->lr_name;
+		type = data.lmd_old->l_resource->lr_type;
+		*data.lmd_mode = data.lmd_old->l_req_mode;
 	}
 
 	res = ldlm_resource_get(ns, NULL, res_id, type, 0);
 	if (IS_ERR(res)) {
-		LASSERT(!old_lock);
+		LASSERT(!data.lmd_old);
 		return 0;
 	}
 
 	LDLM_RESOURCE_ADDREF(res);
 	lock_res(res);
 
-	lock = search_queue(&res->lr_granted, &mode, policy, old_lock,
-			    flags, unref);
+	if (res->lr_type == LDLM_EXTENT)
+		lock = search_itree(res, &data);
+	else
+		lock = search_queue(&res->lr_granted, &data);
 	if (lock) {
 		rc = 1;
 		goto out;
@@ -1232,14 +1327,12 @@ enum ldlm_mode ldlm_lock_match(struct ldlm_namespace *ns, __u64 flags,
 		rc = 0;
 		goto out;
 	}
-	lock = search_queue(&res->lr_waiting, &mode, policy, old_lock,
-			    flags, unref);
+	lock = search_queue(&res->lr_waiting, &data);
 	if (lock) {
 		rc = 1;
 		goto out;
 	}
-
- out:
+out:
 	unlock_res(res);
 	LDLM_RESOURCE_DELREF(res);
 	ldlm_resource_putref(res);
@@ -1311,8 +1404,8 @@ enum ldlm_mode ldlm_lock_match(struct ldlm_namespace *ns, __u64 flags,
 				  (type == LDLM_PLAIN || type == LDLM_IBITS) ?
 					res_id->name[3] : policy->l_extent.end);
 	}
-	if (old_lock)
-		LDLM_LOCK_PUT(old_lock);
+	if (data.lmd_old)
+		LDLM_LOCK_PUT(data.lmd_old);
 
 	return rc ? mode : 0;
 }
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 37/41] staging: lustre: lov: copy_to_user uses wrong casting
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	James Simmons, James Simmons

With certain versions of gcc, lov_obd.c fails to compile
with the following error:

In function copy_to_user,
inlined from lov_iocontrol at
lustre/lustre/lov/lov_obd.c:1168:
./arch/x86/include/asm/uaccess.h:735: error: call to
__copy_to_user_overflow declared with attribute warning:
copy_to_user() buffer size is not provably correct

In lov_iocontrol() the length was being cast to int instead
of the required unsigned long. This patch changes the cast
to the type copy_to_user() expects.

Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6302
Reviewed-on: http://review.whamcloud.com/14613
Reviewed-by: frank zago <fzago@cray.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/lov/lov_obd.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)
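
For illustration only, a minimal userspace sketch of one plausible reason
the int casts defeat the size check. It assumes ioc_plen2 is a 32-bit
unsigned field (an assumption; the structure definition is not part of this
patch). clamp_len_int(), clamp_len_ulong() and the local min_t() macro are
hypothetical stand-ins, not kernel code; the real min_t() evaluates its
arguments only once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's min_t(): compare both as 'type'. */
#define min_t(type, a, b) ((type)(a) < (type)(b) ? (type)(a) : (type)(b))

/* Old pattern: both operands are squeezed through int before comparing. */
static unsigned long clamp_len_int(uint32_t user_len, size_t obj_size)
{
	int a = (int)user_len;	/* large values become negative on gcc */
	int b = (int)obj_size;

	/* A negative minimum turns into a huge unsigned length here. */
	return (unsigned long)(a < b ? a : b);
}

/* New pattern: compare as unsigned long, the type copy_to_user() takes. */
static unsigned long clamp_len_ulong(uint32_t user_len, size_t obj_size)
{
	unsigned long n = min_t(unsigned long, user_len, obj_size);

	/* The result can never exceed the size of the source object. */
	assert(n <= obj_size);
	return n;
}
```

With a user-supplied length of 0x80000000 and a 40-byte object, the int
variant yields a length far larger than the object on common compilers,
while the unsigned long variant is bounded by the object size.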

diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index 473071c..18cb92d 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -1085,8 +1085,8 @@ static int lov_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 
 		/* copy UUID */
 		if (copy_to_user(data->ioc_pbuf2, obd2cli_tgt(osc_obd),
-				 min((int)data->ioc_plen2,
-				     (int)sizeof(struct obd_uuid))))
+				 min_t(unsigned long, data->ioc_plen2,
+				       sizeof(struct obd_uuid))))
 			return -EFAULT;
 
 		memcpy(&flags, data->ioc_inlbuf1, sizeof(__u32));
@@ -1099,8 +1099,8 @@ static int lov_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 		if (rc)
 			return rc;
 		if (copy_to_user(data->ioc_pbuf1, &stat_buf,
-				 min((int)data->ioc_plen1,
-				     (int)sizeof(stat_buf))))
+				 min_t(unsigned long, data->ioc_plen1,
+				       sizeof(stat_buf))))
 			return -EFAULT;
 		break;
 	}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 38/41] staging: lustre: mdc: add max modify RPCs in flight variable
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Gregoire Pichon, James Simmons

From: Gregoire Pichon <gregoire.pichon@bull.net>

This patch introduces the maximum modify RPCs in flight variable of
an MDC client obd device. Its value is set from the connection flag
and connection data. It can later be tuned through the
max_mod_rpcs_in_flight procfs file.

Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5319
Reviewed-on: http://review.whamcloud.com/14153
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/obd.h       |    7 +++
 drivers/staging/lustre/lustre/include/obd_class.h |    1 +
 drivers/staging/lustre/lustre/mdc/lproc_mdc.c     |   38 ++++++++++++++
 drivers/staging/lustre/lustre/mdc/mdc_request.c   |    4 ++
 drivers/staging/lustre/lustre/obdclass/genops.c   |   57 +++++++++++++++++++++
 drivers/staging/lustre/lustre/ptlrpc/import.c     |   11 ++++
 6 files changed, 118 insertions(+), 0 deletions(-)
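
For illustration only, a minimal userspace sketch of the limits this patch
enforces: the modify-RPC limit must stay strictly below max_rpcs_in_flight
and must not exceed what the server advertised at connect time (1 if the
server did not set OBD_CONNECT_MULTIMODRPCS). struct client_limits, RIF_MAX
and both function names are hypothetical stand-ins for struct client_obd,
OBD_MAX_RIF_MAX and the kernel helpers; locking is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define RIF_MAX 256	/* hypothetical cap standing in for OBD_MAX_RIF_MAX */

/* Hypothetical, trimmed-down stand-in for struct client_obd. */
struct client_limits {
	uint32_t max_rpcs;		/* cl_max_rpcs_in_flight */
	uint16_t max_mod_rpcs;		/* cl_max_mod_rpcs_in_flight */
	bool	 multi_mod;		/* OBD_CONNECT_MULTIMODRPCS negotiated */
	uint16_t server_max_mod;	/* ocd_maxmodrpcs from the connect reply */
};

/* Mirrors the checks in obd_set_max_mod_rpcs_in_flight(). */
static int set_max_mod_rpcs(struct client_limits *cli, uint16_t max)
{
	uint16_t server_limit;

	assert(cli);
	if (max < 1 || max > RIF_MAX)
		return -ERANGE;
	/* must stay strictly below max_rpcs_in_flight */
	if (max >= cli->max_rpcs)
		return -ERANGE;
	/* must not exceed what the server supports */
	server_limit = cli->multi_mod ? cli->server_max_mod : 1;
	if (max > server_limit)
		return -ERANGE;
	cli->max_mod_rpcs = max;
	return 0;
}

/* Mirrors the adjustment made in ptlrpc_connect_set_flags(). */
static void apply_connect_reply(struct client_limits *cli)
{
	if (!cli->multi_mod)
		cli->max_mod_rpcs = 1;
	else if (cli->server_max_mod < cli->max_mod_rpcs)
		cli->max_mod_rpcs = cli->server_max_mod;
}
```

Starting from max_rpcs = 8 and max_mod_rpcs = 7 with a server limit of 4,
the connect step lowers the value to 4; a later request for 5 or 8 is
rejected with -ERANGE, while 3 is accepted.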

diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index 5fa5838..a977388 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -318,6 +318,13 @@ struct client_obd {
 	struct mdc_rpc_lock     *cl_rpc_lock;
 	struct mdc_rpc_lock     *cl_close_lock;
 
+	/* modify rpcs in flight
+	 * currently used for metadata only
+	 */
+	spinlock_t		 cl_mod_rpcs_lock;
+	u16			 cl_max_mod_rpcs_in_flight;
+
+
 	/* mgc datastruct */
 	atomic_t	     cl_mgc_refcount;
 	struct obd_export       *cl_mgc_mgsexp;
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index e6ae4a0..7d8f062 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -100,6 +100,7 @@ int obd_get_request_slot(struct client_obd *cli);
 void obd_put_request_slot(struct client_obd *cli);
 __u32 obd_get_max_rpcs_in_flight(struct client_obd *cli);
 int obd_set_max_rpcs_in_flight(struct client_obd *cli, __u32 max);
+int obd_set_max_mod_rpcs_in_flight(struct client_obd *cli, u16 max);
 
 struct llog_handle;
 struct llog_rec_hdr;
diff --git a/drivers/staging/lustre/lustre/mdc/lproc_mdc.c b/drivers/staging/lustre/lustre/mdc/lproc_mdc.c
index fca9450..5fdee9e 100644
--- a/drivers/staging/lustre/lustre/mdc/lproc_mdc.c
+++ b/drivers/staging/lustre/lustre/mdc/lproc_mdc.c
@@ -73,6 +73,43 @@ static ssize_t max_rpcs_in_flight_store(struct kobject *kobj,
 }
 LUSTRE_RW_ATTR(max_rpcs_in_flight);
 
+static ssize_t max_mod_rpcs_in_flight_show(struct kobject *kobj,
+					   struct attribute *attr,
+					   char *buf)
+{
+	struct obd_device *dev = container_of(kobj, struct obd_device,
+					      obd_kobj);
+	u16 max;
+	int len;
+
+	max = dev->u.cli.cl_max_mod_rpcs_in_flight;
+	len = sprintf(buf, "%hu\n", max);
+
+	return len;
+}
+
+static ssize_t max_mod_rpcs_in_flight_store(struct kobject *kobj,
+					    struct attribute *attr,
+					    const char *buffer,
+					    size_t count)
+{
+	struct obd_device *dev = container_of(kobj, struct obd_device,
+					      obd_kobj);
+	u16 val;
+	int rc;
+
+	rc = kstrtou16(buffer, 10, &val);
+	if (rc)
+		return rc;
+
+	rc = obd_set_max_mod_rpcs_in_flight(&dev->u.cli, val);
+	if (rc)
+		count = rc;
+
+	return count;
+}
+LUSTRE_RW_ATTR(max_mod_rpcs_in_flight);
+
 LPROC_SEQ_FOPS_WR_ONLY(mdc, ping);
 
 LPROC_SEQ_FOPS_RO_TYPE(mdc, connect_flags);
@@ -117,6 +154,7 @@ static struct lprocfs_vars lprocfs_mdc_obd_vars[] = {
 
 static struct attribute *mdc_attrs[] = {
 	&lustre_attr_max_rpcs_in_flight.attr,
+	&lustre_attr_max_mod_rpcs_in_flight.attr,
 	&lustre_attr_max_pages_per_rpc.attr,
 	NULL,
 };
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index ac04bf3..af373af 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -2634,8 +2634,12 @@ static int mdc_setup(struct obd_device *obd, struct lustre_cfg *cfg)
 	if (rc) {
 		mdc_cleanup(obd);
 		CERROR("failed to setup llogging subsystems\n");
+		return rc;
 	}
 
+	spin_lock_init(&cli->cl_mod_rpcs_lock);
+	cli->cl_max_mod_rpcs_in_flight = OBD_MAX_RIF_DEFAULT - 1;
+
 	return rc;
 
 err_close_lock:
diff --git a/drivers/staging/lustre/lustre/obdclass/genops.c b/drivers/staging/lustre/lustre/obdclass/genops.c
index cf8bb2a..62e6636 100644
--- a/drivers/staging/lustre/lustre/obdclass/genops.c
+++ b/drivers/staging/lustre/lustre/obdclass/genops.c
@@ -1408,13 +1408,33 @@ EXPORT_SYMBOL(obd_get_max_rpcs_in_flight);
 int obd_set_max_rpcs_in_flight(struct client_obd *cli, __u32 max)
 {
 	struct obd_request_slot_waiter *orsw;
+	const char *typ_name;
 	__u32 old;
 	int diff;
+	int rc;
 	int i;
 
 	if (max > OBD_MAX_RIF_MAX || max < 1)
 		return -ERANGE;
 
+	typ_name = cli->cl_import->imp_obd->obd_type->typ_name;
+	if (!strcmp(typ_name, LUSTRE_MDC_NAME)) {
+		/*
+		 * adjust max_mod_rpcs_in_flight to ensure it is always
+		 * strictly lower than max_rpcs_in_flight
+		 */
+		if (max < 2) {
+			CERROR("%s: cannot set max_rpcs_in_flight to 1 because it must be higher than max_mod_rpcs_in_flight value\n",
+			       cli->cl_import->imp_obd->obd_name);
+			return -ERANGE;
+		}
+		if (max <= cli->cl_max_mod_rpcs_in_flight) {
+			rc = obd_set_max_mod_rpcs_in_flight(cli, max - 1);
+			if (rc)
+				return rc;
+		}
+	}
+
 	spin_lock(&cli->cl_loi_list_lock);
 	old = cli->cl_max_rpcs_in_flight;
 	cli->cl_max_rpcs_in_flight = max;
@@ -1436,3 +1456,40 @@ int obd_set_max_rpcs_in_flight(struct client_obd *cli, __u32 max)
 	return 0;
 }
 EXPORT_SYMBOL(obd_set_max_rpcs_in_flight);
+
+int obd_set_max_mod_rpcs_in_flight(struct client_obd *cli, __u16 max)
+{
+	struct obd_connect_data *ocd;
+	u16 maxmodrpcs;
+
+	if (max > OBD_MAX_RIF_MAX || max < 1)
+		return -ERANGE;
+
+	/* cannot exceed or equal max_rpcs_in_flight */
+	if (max >= cli->cl_max_rpcs_in_flight) {
+		CERROR("%s: can't set max_mod_rpcs_in_flight to a value (%hu) higher or equal to max_rpcs_in_flight value (%u)\n",
+		       cli->cl_import->imp_obd->obd_name,
+		       max, cli->cl_max_rpcs_in_flight);
+		return -ERANGE;
+	}
+
+	/* cannot exceed max modify RPCs in flight supported by the server */
+	ocd = &cli->cl_import->imp_connect_data;
+	if (ocd->ocd_connect_flags & OBD_CONNECT_MULTIMODRPCS)
+		maxmodrpcs = ocd->ocd_maxmodrpcs;
+	else
+		maxmodrpcs = 1;
+	if (max > maxmodrpcs) {
+		CERROR("%s: can't set max_mod_rpcs_in_flight to a value (%hu) higher than max_mod_rpcs_per_client value (%hu) returned by the server at connection\n",
+		       cli->cl_import->imp_obd->obd_name,
+		       max, maxmodrpcs);
+		return -ERANGE;
+	}
+
+	cli->cl_max_mod_rpcs_in_flight = max;
+
+	/* will have to wakeup waiters if max has been increased */
+
+	return 0;
+}
+EXPORT_SYMBOL(obd_set_max_mod_rpcs_in_flight);
diff --git a/drivers/staging/lustre/lustre/ptlrpc/import.c b/drivers/staging/lustre/lustre/ptlrpc/import.c
index 2bdaf2b..46ba5a4 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/import.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/import.c
@@ -858,6 +858,17 @@ static int ptlrpc_connect_set_flags(struct obd_import *imp,
 	client_adjust_max_dirty(cli);
 
 	/*
+	 * Update client max modify RPCs in flight with value returned
+	 * by the server
+	 */
+	if (ocd->ocd_connect_flags & OBD_CONNECT_MULTIMODRPCS)
+		cli->cl_max_mod_rpcs_in_flight = min(
+					cli->cl_max_mod_rpcs_in_flight,
+					ocd->ocd_maxmodrpcs);
+	else
+		cli->cl_max_mod_rpcs_in_flight = 1;
+
+	/*
 	 * Reset ns_connect_flags only for initial connect. It might be
 	 * changed in while using FS and if we reset it in reconnect
 	 * this leads to losing user settings done before such as
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread

* [PATCH 39/41] staging: lustre: osc: remove remaining bits for capa support
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, James Simmons

From: John L. Hammond <john.hammond@intel.com>

With capa support removed from the OSC layer a few more bits
can be cleaned up. Convert the OBD getattr and setattr paths
to use struct obdo rather than struct obd_info. Remove
the oi_policy, oi_oa, and oi_capa members from struct obd_info.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3105
Reviewed-on: http://review.whamcloud.com/14640
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/obd.h        |   13 +----
 drivers/staging/lustre/lustre/include/obd_class.h  |   17 ++++----
 drivers/staging/lustre/lustre/llite/vvp_internal.h |    1 -
 drivers/staging/lustre/lustre/lov/lov_obd.c        |    7 ++-
 drivers/staging/lustre/lustre/lov/lov_request.c    |    2 -
 .../staging/lustre/lustre/obdecho/echo_client.c    |   12 +----
 .../staging/lustre/lustre/osc/osc_cl_internal.h    |    1 -
 drivers/staging/lustre/lustre/osc/osc_internal.h   |    6 +-
 drivers/staging/lustre/lustre/osc/osc_io.c         |   11 +----
 drivers/staging/lustre/lustre/osc/osc_request.c    |   45 ++++++++++----------
 10 files changed, 46 insertions(+), 69 deletions(-)
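
For illustration only, a minimal userspace sketch of the calling convention
after this change: the getattr method and its inline wrapper take the
struct obdo directly instead of reaching it through a struct obd_info
wrapper. All types here are hypothetical, heavily trimmed stand-ins; the
real methods also take a const struct lu_env *, and demo_getattr() is an
invented backend used only to exercise the dispatch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed stand-ins for the Lustre types. */
struct obdo {
	uint64_t o_size;
};

struct obd_export;

struct obd_ops {
	/* After the patch the method receives the obdo itself. */
	int (*getattr)(struct obd_export *exp, struct obdo *oa);
};

struct obd_export {
	const struct obd_ops *ops;
};

/* Wrapper in the style of obd_getattr(): check the method, then dispatch. */
static inline int obd_getattr(struct obd_export *exp, struct obdo *oa)
{
	if (!exp || !exp->ops || !exp->ops->getattr)
		return -EOPNOTSUPP;
	return exp->ops->getattr(exp, oa);
}

/* Invented backend: fills in a fixed size so the call is observable. */
static int demo_getattr(struct obd_export *exp, struct obdo *oa)
{
	(void)exp;
	assert(oa);
	oa->o_size = 4096;
	return 0;
}
```

A caller now passes the obdo it already owns, so there is no intermediate
structure to fill in before the call or to unpack afterwards.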

diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index a977388..f63336f 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -126,17 +126,10 @@ typedef int (*obd_enqueue_update_f)(void *cookie, int rc);
 
 /* obd info for a particular level (lov, osc). */
 struct obd_info {
-	/* Flags used for set request specific flags:
-	   - while lock handling, the flags obtained on the enqueue
-	   request are set here.
-	   - while stats, the flags used for control delay/resend.
-	   - while setattr, the flags used for distinguish punch operation
-	 */
+	/* OBD_STATFS_* flags */
 	__u64		   oi_flags;
 	/* lsm data specific for every OSC. */
 	struct lov_stripe_md   *oi_md;
-	/* obdo data specific for every OSC, if needed at all. */
-	struct obdo	    *oi_oa;
 	/* statfs data specific for every OSC, if needed at all. */
 	struct obd_statfs      *oi_osfs;
 	/* An update callback which is called to update some data on upper
@@ -871,9 +864,9 @@ struct obd_ops {
 	int (*destroy)(const struct lu_env *env, struct obd_export *exp,
 		       struct obdo *oa);
 	int (*setattr)(const struct lu_env *, struct obd_export *exp,
-		       struct obd_info *oinfo);
+		       struct obdo *oa);
 	int (*getattr)(const struct lu_env *env, struct obd_export *exp,
-		       struct obd_info *oinfo);
+		       struct obdo *oa);
 	int (*preprw)(const struct lu_env *env, int cmd,
 		      struct obd_export *exp, struct obdo *oa, int objcount,
 		      struct obd_ioobj *obj, struct niobuf_remote *remote,
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index 7d8f062..0eaea54 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -705,26 +705,26 @@ static inline int obd_destroy(const struct lu_env *env, struct obd_export *exp,
 }
 
 static inline int obd_getattr(const struct lu_env *env, struct obd_export *exp,
-			      struct obd_info *oinfo)
+			      struct obdo *oa)
 {
 	int rc;
 
 	EXP_CHECK_DT_OP(exp, getattr);
 	EXP_COUNTER_INCREMENT(exp, getattr);
 
-	rc = OBP(exp->exp_obd, getattr)(env, exp, oinfo);
+	rc = OBP(exp->exp_obd, getattr)(env, exp, oa);
 	return rc;
 }
 
 static inline int obd_setattr(const struct lu_env *env, struct obd_export *exp,
-			      struct obd_info *oinfo)
+			      struct obdo *oa)
 {
 	int rc;
 
 	EXP_CHECK_DT_OP(exp, setattr);
 	EXP_COUNTER_INCREMENT(exp, setattr);
 
-	rc = OBP(exp->exp_obd, setattr)(env, exp, oinfo);
+	rc = OBP(exp->exp_obd, setattr)(env, exp, oa);
 	return rc;
 }
 
@@ -991,15 +991,16 @@ static inline int obd_statfs_rqset(struct obd_export *exp,
 				   __u32 flags)
 {
 	struct ptlrpc_request_set *set = NULL;
-	struct obd_info oinfo = { };
+	struct obd_info oinfo = {
+		.oi_osfs = osfs,
+		.oi_flags = flags,
+	};
 	int rc = 0;
 
-	set =  ptlrpc_prep_set();
+	set = ptlrpc_prep_set();
 	if (!set)
 		return -ENOMEM;
 
-	oinfo.oi_osfs = osfs;
-	oinfo.oi_flags = flags;
 	rc = obd_statfs_async(exp, &oinfo, max_age, set);
 	if (rc == 0)
 		rc = ptlrpc_set_wait(set);
diff --git a/drivers/staging/lustre/lustre/llite/vvp_internal.h b/drivers/staging/lustre/lustre/llite/vvp_internal.h
index a025b35..09fa357 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_internal.h
+++ b/drivers/staging/lustre/lustre/llite/vvp_internal.h
@@ -44,7 +44,6 @@ enum obd_notify_event;
 struct inode;
 struct lov_stripe_md;
 struct lustre_md;
-struct obd_capa;
 struct obd_device;
 struct obd_export;
 struct page;
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index 18cb92d..6530187 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -1033,7 +1033,10 @@ static int lov_statfs(const struct lu_env *env, struct obd_export *exp,
 		      struct obd_statfs *osfs, __u64 max_age, __u32 flags)
 {
 	struct ptlrpc_request_set *set = NULL;
-	struct obd_info oinfo = { };
+	struct obd_info oinfo = {
+		.oi_osfs = osfs,
+		.oi_flags = flags,
+	};
 	int rc = 0;
 
 	/* for obdclass we forbid using obd_statfs_rqset, but prefer using async
@@ -1043,8 +1046,6 @@ static int lov_statfs(const struct lu_env *env, struct obd_export *exp,
 	if (!set)
 		return -ENOMEM;
 
-	oinfo.oi_osfs = osfs;
-	oinfo.oi_flags = flags;
 	rc = lov_statfs_async(exp, &oinfo, max_age, set);
 	if (rc == 0)
 		rc = ptlrpc_set_wait(set);
diff --git a/drivers/staging/lustre/lustre/lov/lov_request.c b/drivers/staging/lustre/lustre/lov/lov_request.c
index c8734a6..d43cc88 100644
--- a/drivers/staging/lustre/lustre/lov/lov_request.c
+++ b/drivers/staging/lustre/lustre/lov/lov_request.c
@@ -60,8 +60,6 @@ void lov_finish_set(struct lov_request_set *set)
 							 rq_link);
 		list_del_init(&req->rq_link);
 
-		if (req->rq_oi.oi_oa)
-			kmem_cache_free(obdo_cachep, req->rq_oi.oi_oa);
 		kfree(req->rq_oi.oi_osfs);
 		kfree(req);
 	}
diff --git a/drivers/staging/lustre/lustre/obdecho/echo_client.c b/drivers/staging/lustre/lustre/obdecho/echo_client.c
index df6fbed..d8e3e96 100644
--- a/drivers/staging/lustre/lustre/obdecho/echo_client.c
+++ b/drivers/staging/lustre/lustre/obdecho/echo_client.c
@@ -1542,11 +1542,7 @@ echo_client_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 	case OBD_IOC_GETATTR:
 		rc = echo_get_object(&eco, ed, oa);
 		if (rc == 0) {
-			struct obd_info oinfo = {
-				.oi_oa = oa,
-			};
-
-			rc = obd_getattr(env, ec->ec_exp, &oinfo);
+			rc = obd_getattr(env, ec->ec_exp, oa);
 			echo_put_object(eco);
 		}
 		goto out;
@@ -1559,11 +1555,7 @@ echo_client_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 
 		rc = echo_get_object(&eco, ed, oa);
 		if (rc == 0) {
-			struct obd_info oinfo = {
-				.oi_oa = oa,
-			};
-
-			rc = obd_setattr(env, ec->ec_exp, &oinfo);
+			rc = obd_setattr(env, ec->ec_exp, oa);
 			echo_put_object(eco);
 		}
 		goto out;
diff --git a/drivers/staging/lustre/lustre/osc/osc_cl_internal.h b/drivers/staging/lustre/lustre/osc/osc_cl_internal.h
index 9c8de15..8a55412 100644
--- a/drivers/staging/lustre/lustre/osc/osc_cl_internal.h
+++ b/drivers/staging/lustre/lustre/osc/osc_cl_internal.h
@@ -77,7 +77,6 @@ struct osc_io {
 
 	/** write osc_lock for this IO, used by osc_extent_find(). */
 	struct osc_lock   *oi_write_osclock;
-	struct obd_info    oi_info;
 	struct obdo	oi_oa;
 	struct osc_async_cbargs {
 		bool		  opc_rpc_sent;
diff --git a/drivers/staging/lustre/lustre/osc/osc_internal.h b/drivers/staging/lustre/lustre/osc/osc_internal.h
index 61bfacb..dc708ea 100644
--- a/drivers/staging/lustre/lustre/osc/osc_internal.h
+++ b/drivers/staging/lustre/lustre/osc/osc_internal.h
@@ -118,13 +118,13 @@ int osc_match_base(struct obd_export *exp, struct ldlm_res_id *res_id,
 		   __u64 *flags, void *data, struct lustre_handle *lockh,
 		   int unref);
 
-int osc_setattr_async(struct obd_export *exp, struct obd_info *oinfo,
+int osc_setattr_async(struct obd_export *exp, struct obdo *oa,
 		      obd_enqueue_update_f upcall, void *cookie,
 		      struct ptlrpc_request_set *rqset);
-int osc_punch_base(struct obd_export *exp, struct obd_info *oinfo,
+int osc_punch_base(struct obd_export *exp, struct obdo *oa,
 		   obd_enqueue_update_f upcall, void *cookie,
 		   struct ptlrpc_request_set *rqset);
-int osc_sync_base(struct obd_export *exp, struct obd_info *oinfo,
+int osc_sync_base(struct obd_export *exp, struct obdo *oa,
 		  obd_enqueue_update_f upcall, void *cookie,
 		  struct ptlrpc_request_set *rqset);
 
diff --git a/drivers/staging/lustre/lustre/osc/osc_io.c b/drivers/staging/lustre/lustre/osc/osc_io.c
index b4e062d..8eb4275 100644
--- a/drivers/staging/lustre/lustre/osc/osc_io.c
+++ b/drivers/staging/lustre/lustre/osc/osc_io.c
@@ -484,7 +484,6 @@ static int osc_io_setattr_start(const struct lu_env *env,
 	__u64 size = io->u.ci_setattr.sa_attr.lvb_size;
 	unsigned int ia_valid = io->u.ci_setattr.sa_valid;
 	int result = 0;
-	struct obd_info oinfo = { };
 
 	/* truncate cache dirty pages first */
 	if (cl_io_is_trunc(io))
@@ -554,16 +553,15 @@ static int osc_io_setattr_start(const struct lu_env *env,
 			oa->o_valid |= OBD_MD_FLFLAGS;
 		}
 
-		oinfo.oi_oa = oa;
 		init_completion(&cbargs->opc_sync);
 
 		if (ia_valid & ATTR_SIZE)
 			result = osc_punch_base(osc_export(cl2osc(obj)),
-						&oinfo, osc_async_upcall,
+						oa, osc_async_upcall,
 						cbargs, PTLRPCD_SET);
 		else
 			result = osc_setattr_async(osc_export(cl2osc(obj)),
-						   &oinfo, osc_async_upcall,
+						   oa, osc_async_upcall,
 						   cbargs, PTLRPCD_SET);
 		cbargs->opc_rpc_sent = result == 0;
 	}
@@ -745,7 +743,6 @@ static int osc_fsync_ost(const struct lu_env *env, struct osc_object *obj,
 {
 	struct osc_io *oio = osc_env_io(env);
 	struct obdo *oa = &oio->oi_oa;
-	struct obd_info *oinfo = &oio->oi_info;
 	struct lov_oinfo *loi = obj->oo_oinfo;
 	struct osc_async_cbargs *cbargs = &oio->oi_cbarg;
 	int rc = 0;
@@ -761,11 +758,9 @@ static int osc_fsync_ost(const struct lu_env *env, struct osc_object *obj,
 
 	obdo_set_parent_fid(oa, fio->fi_fid);
 
-	memset(oinfo, 0, sizeof(*oinfo));
-	oinfo->oi_oa = oa;
 	init_completion(&cbargs->opc_sync);
 
-	rc = osc_sync_base(osc_export(obj), oinfo, osc_async_upcall, cbargs,
+	rc = osc_sync_base(osc_export(obj), oa, osc_async_upcall, cbargs,
 			   PTLRPCD_SET);
 	return rc;
 }
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index 64d95c1..0985bda 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -82,7 +82,7 @@ struct osc_setattr_args {
 };
 
 struct osc_fsync_args {
-	struct obd_info     *fa_oi;
+	struct obdo		*fa_oa;
 	obd_enqueue_update_f fa_upcall;
 	void		*fa_cookie;
 };
@@ -166,19 +166,18 @@ static int osc_unpackmd(struct obd_export *exp, struct lov_stripe_md **lsmp,
 }
 
 static inline void osc_pack_req_body(struct ptlrpc_request *req,
-				     struct obd_info *oinfo)
+				     struct obdo *oa)
 {
 	struct ost_body *body;
 
 	body = req_capsule_client_get(&req->rq_pill, &RMF_OST_BODY);
 	LASSERT(body);
 
-	lustre_set_wire_obdo(&req->rq_import->imp_connect_data, &body->oa,
-			     oinfo->oi_oa);
+	lustre_set_wire_obdo(&req->rq_import->imp_connect_data, &body->oa, oa);
 }
 
 static int osc_getattr(const struct lu_env *env, struct obd_export *exp,
-		       struct obd_info *oinfo)
+		       struct obdo *oa)
 {
 	struct ptlrpc_request *req;
 	struct ost_body *body;
@@ -194,7 +193,7 @@ static int osc_getattr(const struct lu_env *env, struct obd_export *exp,
 		return rc;
 	}
 
-	osc_pack_req_body(req, oinfo);
+	osc_pack_req_body(req, oa);
 
 	ptlrpc_request_set_replen(req);
 
@@ -209,11 +208,11 @@ static int osc_getattr(const struct lu_env *env, struct obd_export *exp,
 	}
 
 	CDEBUG(D_INODE, "mode: %o\n", body->oa.o_mode);
-	lustre_get_wire_obdo(&req->rq_import->imp_connect_data, oinfo->oi_oa,
+	lustre_get_wire_obdo(&req->rq_import->imp_connect_data, oa,
 			     &body->oa);
 
-	oinfo->oi_oa->o_blksize = cli_brw_size(exp->exp_obd);
-	oinfo->oi_oa->o_valid |= OBD_MD_FLBLKSZ;
+	oa->o_blksize = cli_brw_size(exp->exp_obd);
+	oa->o_valid |= OBD_MD_FLBLKSZ;
 
  out:
 	ptlrpc_req_finished(req);
@@ -221,13 +220,13 @@ static int osc_getattr(const struct lu_env *env, struct obd_export *exp,
 }
 
 static int osc_setattr(const struct lu_env *env, struct obd_export *exp,
-		       struct obd_info *oinfo)
+		       struct obdo *oa)
 {
 	struct ptlrpc_request *req;
 	struct ost_body *body;
 	int rc;
 
-	LASSERT(oinfo->oi_oa->o_valid & OBD_MD_FLGROUP);
+	LASSERT(oa->o_valid & OBD_MD_FLGROUP);
 
 	req = ptlrpc_request_alloc(class_exp2cliimp(exp), &RQF_OST_SETATTR);
 	if (!req)
@@ -239,7 +238,7 @@ static int osc_setattr(const struct lu_env *env, struct obd_export *exp,
 		return rc;
 	}
 
-	osc_pack_req_body(req, oinfo);
+	osc_pack_req_body(req, oa);
 
 	ptlrpc_request_set_replen(req);
 
@@ -253,7 +252,7 @@ static int osc_setattr(const struct lu_env *env, struct obd_export *exp,
 		goto out;
 	}
 
-	lustre_get_wire_obdo(&req->rq_import->imp_connect_data, oinfo->oi_oa,
+	lustre_get_wire_obdo(&req->rq_import->imp_connect_data, oa,
 			     &body->oa);
 
 out:
@@ -283,7 +282,7 @@ out:
 	return rc;
 }
 
-int osc_setattr_async(struct obd_export *exp, struct obd_info *oinfo,
+int osc_setattr_async(struct obd_export *exp, struct obdo *oa,
 		      obd_enqueue_update_f upcall, void *cookie,
 		      struct ptlrpc_request_set *rqset)
 {
@@ -301,7 +300,7 @@ int osc_setattr_async(struct obd_export *exp, struct obd_info *oinfo,
 		return rc;
 	}
 
-	osc_pack_req_body(req, oinfo);
+	osc_pack_req_body(req, oa);
 
 	ptlrpc_request_set_replen(req);
 
@@ -315,7 +314,7 @@ int osc_setattr_async(struct obd_export *exp, struct obd_info *oinfo,
 
 		CLASSERT(sizeof(*sa) <= sizeof(req->rq_async_args));
 		sa = ptlrpc_req_async_args(req);
-		sa->sa_oa = oinfo->oi_oa;
+		sa->sa_oa = oa;
 		sa->sa_upcall = upcall;
 		sa->sa_cookie = cookie;
 
@@ -382,7 +381,7 @@ out:
 	return rc;
 }
 
-int osc_punch_base(struct obd_export *exp, struct obd_info *oinfo,
+int osc_punch_base(struct obd_export *exp, struct obdo *oa,
 		   obd_enqueue_update_f upcall, void *cookie,
 		   struct ptlrpc_request_set *rqset)
 {
@@ -406,14 +405,14 @@ int osc_punch_base(struct obd_export *exp, struct obd_info *oinfo,
 	body = req_capsule_client_get(&req->rq_pill, &RMF_OST_BODY);
 	LASSERT(body);
 	lustre_set_wire_obdo(&req->rq_import->imp_connect_data, &body->oa,
-			     oinfo->oi_oa);
+			     oa);
 
 	ptlrpc_request_set_replen(req);
 
 	req->rq_interpret_reply = (ptlrpc_interpterer_t)osc_setattr_interpret;
 	CLASSERT(sizeof(*sa) <= sizeof(req->rq_async_args));
 	sa = ptlrpc_req_async_args(req);
-	sa->sa_oa = oinfo->oi_oa;
+	sa->sa_oa = oa;
 	sa->sa_upcall = upcall;
 	sa->sa_cookie = cookie;
 	if (rqset == PTLRPCD_SET)
@@ -441,13 +440,13 @@ static int osc_sync_interpret(const struct lu_env *env,
 		goto out;
 	}
 
-	*fa->fa_oi->oi_oa = body->oa;
+	*fa->fa_oa = body->oa;
 out:
 	rc = fa->fa_upcall(fa->fa_cookie, rc);
 	return rc;
 }
 
-int osc_sync_base(struct obd_export *exp, struct obd_info *oinfo,
+int osc_sync_base(struct obd_export *exp, struct obdo *oa,
 		  obd_enqueue_update_f upcall, void *cookie,
 		  struct ptlrpc_request_set *rqset)
 {
@@ -470,14 +469,14 @@ int osc_sync_base(struct obd_export *exp, struct obd_info *oinfo,
 	body = req_capsule_client_get(&req->rq_pill, &RMF_OST_BODY);
 	LASSERT(body);
 	lustre_set_wire_obdo(&req->rq_import->imp_connect_data, &body->oa,
-			     oinfo->oi_oa);
+			     oa);
 
 	ptlrpc_request_set_replen(req);
 	req->rq_interpret_reply = osc_sync_interpret;
 
 	CLASSERT(sizeof(*fa) <= sizeof(req->rq_async_args));
 	fa = ptlrpc_req_async_args(req);
-	fa->fa_oi = oinfo;
+	fa->fa_oa = oa;
 	fa->fa_upcall = upcall;
 	fa->fa_cookie = cookie;
 
-- 
1.7.1
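
Note on the obd_statfs_rqset() and lov_statfs() hunks above: both replace
an empty initializer followed by two assignments with a single designated
initializer. The following is a minimal sketch of why the two forms are
equivalent, assuming simplified stand-in types; every sketch_* name is
hypothetical and none of this is the real Lustre definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct obd_statfs and struct obd_info. */
struct sketch_statfs {
	uint64_t os_blocks;
};

struct sketch_info {
	uint64_t oi_flags;
	void *oi_md;
	struct sketch_statfs *oi_osfs;
};

/* Old style: zero the struct first, then assign members one by one. */
static struct sketch_info sketch_init_old(struct sketch_statfs *osfs,
					  uint32_t flags)
{
	struct sketch_info oinfo = { 0 };

	oinfo.oi_osfs = osfs;
	oinfo.oi_flags = flags;
	return oinfo;
}

/* New style: designated initializer; unnamed members are zeroed (C99). */
static struct sketch_info sketch_init_new(struct sketch_statfs *osfs,
					  uint32_t flags)
{
	struct sketch_info oinfo = {
		.oi_osfs = osfs,
		.oi_flags = flags,
	};

	return oinfo;
}

/* Returns 1 when both styles produce the same member values. */
static int sketch_same(void)
{
	struct sketch_statfs osfs = { .os_blocks = 1 };
	struct sketch_info a = sketch_init_old(&osfs, 2);
	struct sketch_info b = sketch_init_new(&osfs, 2);

	assert(b.oi_md == NULL);	/* member not named in the initializer */
	return a.oi_flags == b.oi_flags &&
	       a.oi_osfs == b.oi_osfs &&
	       a.oi_md == b.oi_md;
}
```

The designated form keeps the whole setup in the declaration, which is
presumably why the separate assignments could be dropped from both callers.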


* [lustre-devel] [PATCH 39/41] staging: lustre: osc: remove remaining bits for capa support
@ 2016-10-03  2:28   ` James Simmons
  0 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, James Simmons

From: John L. Hammond <john.hammond@intel.com>

With capa support removed from the OSC layer, a few more bits
can be cleaned up. Convert the OBD getattr and setattr paths
to use struct obdo rather than struct obd_info. Remove
the oi_policy, oi_oa, and oi_capa members from struct obd_info.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3105
Reviewed-on: http://review.whamcloud.com/14640
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/obd.h        |   13 +----
 drivers/staging/lustre/lustre/include/obd_class.h  |   17 ++++----
 drivers/staging/lustre/lustre/llite/vvp_internal.h |    1 -
 drivers/staging/lustre/lustre/lov/lov_obd.c        |    7 ++-
 drivers/staging/lustre/lustre/lov/lov_request.c    |    2 -
 .../staging/lustre/lustre/obdecho/echo_client.c    |   12 +----
 .../staging/lustre/lustre/osc/osc_cl_internal.h    |    1 -
 drivers/staging/lustre/lustre/osc/osc_internal.h   |    6 +-
 drivers/staging/lustre/lustre/osc/osc_io.c         |   11 +----
 drivers/staging/lustre/lustre/osc/osc_request.c    |   45 ++++++++++----------
 10 files changed, 46 insertions(+), 69 deletions(-)
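
Illustration only, not part of the patch: the conversion described in the
commit message passes the struct obdo pointer straight through the ops
table instead of wrapping it in a struct obd_info. The following is a
rough sketch of that calling convention, assuming simplified stand-in
types. Every sketch_* name and SKETCH_MD_FLBLKSZ are hypothetical and are
not the real Lustre definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MD_FLBLKSZ 0x1ULL	/* hypothetical "o_blksize is valid" bit */

/* Hypothetical stand-in for struct obdo. */
struct sketch_obdo {
	uint64_t o_valid;	/* bitmask of members holding valid data */
	uint32_t o_blksize;
};

struct sketch_export;

/* Ops table: the getattr method takes the obdo directly, no wrapper. */
struct sketch_ops {
	int (*getattr)(struct sketch_export *exp, struct sketch_obdo *oa);
};

/* Hypothetical stand-in for struct obd_export. */
struct sketch_export {
	const struct sketch_ops *ops;
	uint32_t brw_size;
};

/* Backend method: fills in the block size and marks it valid. */
static int sketch_osc_getattr(struct sketch_export *exp,
			      struct sketch_obdo *oa)
{
	oa->o_blksize = exp->brw_size;
	oa->o_valid |= SKETCH_MD_FLBLKSZ;
	return 0;
}

static const struct sketch_ops sketch_osc_ops = {
	.getattr = sketch_osc_getattr,
};

/* Generic wrapper: dispatches through the ops table of the export. */
static int sketch_obd_getattr(struct sketch_export *exp,
			      struct sketch_obdo *oa)
{
	assert(exp && exp->ops && exp->ops->getattr);
	return exp->ops->getattr(exp, oa);
}

/* Caller: no temporary wrapper struct is needed any more. */
static uint32_t sketch_demo(void)
{
	struct sketch_export exp = { .ops = &sketch_osc_ops, .brw_size = 4096 };
	struct sketch_obdo oa = { 0 };
	int rc = sketch_obd_getattr(&exp, &oa);

	return (rc == 0 && (oa.o_valid & SKETCH_MD_FLBLKSZ)) ? oa.o_blksize : 0;
}
```

In this sketch the caller only has to own the obdo, which mirrors how the
echo client hunks in this patch drop their local struct obd_info.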

diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index a977388..f63336f 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -126,17 +126,10 @@ typedef int (*obd_enqueue_update_f)(void *cookie, int rc);
 
 /* obd info for a particular level (lov, osc). */
 struct obd_info {
-	/* Flags used for set request specific flags:
-	   - while lock handling, the flags obtained on the enqueue
-	   request are set here.
-	   - while stats, the flags used for control delay/resend.
-	   - while setattr, the flags used for distinguish punch operation
-	 */
+	/* OBD_STATFS_* flags */
 	__u64		   oi_flags;
 	/* lsm data specific for every OSC. */
 	struct lov_stripe_md   *oi_md;
-	/* obdo data specific for every OSC, if needed at all. */
-	struct obdo	    *oi_oa;
 	/* statfs data specific for every OSC, if needed at all. */
 	struct obd_statfs      *oi_osfs;
 	/* An update callback which is called to update some data on upper
@@ -871,9 +864,9 @@ struct obd_ops {
 	int (*destroy)(const struct lu_env *env, struct obd_export *exp,
 		       struct obdo *oa);
 	int (*setattr)(const struct lu_env *, struct obd_export *exp,
-		       struct obd_info *oinfo);
+		       struct obdo *oa);
 	int (*getattr)(const struct lu_env *env, struct obd_export *exp,
-		       struct obd_info *oinfo);
+		       struct obdo *oa);
 	int (*preprw)(const struct lu_env *env, int cmd,
 		      struct obd_export *exp, struct obdo *oa, int objcount,
 		      struct obd_ioobj *obj, struct niobuf_remote *remote,
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index 7d8f062..0eaea54 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -705,26 +705,26 @@ static inline int obd_destroy(const struct lu_env *env, struct obd_export *exp,
 }
 
 static inline int obd_getattr(const struct lu_env *env, struct obd_export *exp,
-			      struct obd_info *oinfo)
+			      struct obdo *oa)
 {
 	int rc;
 
 	EXP_CHECK_DT_OP(exp, getattr);
 	EXP_COUNTER_INCREMENT(exp, getattr);
 
-	rc = OBP(exp->exp_obd, getattr)(env, exp, oinfo);
+	rc = OBP(exp->exp_obd, getattr)(env, exp, oa);
 	return rc;
 }
 
 static inline int obd_setattr(const struct lu_env *env, struct obd_export *exp,
-			      struct obd_info *oinfo)
+			      struct obdo *oa)
 {
 	int rc;
 
 	EXP_CHECK_DT_OP(exp, setattr);
 	EXP_COUNTER_INCREMENT(exp, setattr);
 
-	rc = OBP(exp->exp_obd, setattr)(env, exp, oinfo);
+	rc = OBP(exp->exp_obd, setattr)(env, exp, oa);
 	return rc;
 }
 
@@ -991,15 +991,16 @@ static inline int obd_statfs_rqset(struct obd_export *exp,
 				   __u32 flags)
 {
 	struct ptlrpc_request_set *set = NULL;
-	struct obd_info oinfo = { };
+	struct obd_info oinfo = {
+		.oi_osfs = osfs,
+		.oi_flags = flags,
+	};
 	int rc = 0;
 
-	set =  ptlrpc_prep_set();
+	set = ptlrpc_prep_set();
 	if (!set)
 		return -ENOMEM;
 
-	oinfo.oi_osfs = osfs;
-	oinfo.oi_flags = flags;
 	rc = obd_statfs_async(exp, &oinfo, max_age, set);
 	if (rc == 0)
 		rc = ptlrpc_set_wait(set);
diff --git a/drivers/staging/lustre/lustre/llite/vvp_internal.h b/drivers/staging/lustre/lustre/llite/vvp_internal.h
index a025b35..09fa357 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_internal.h
+++ b/drivers/staging/lustre/lustre/llite/vvp_internal.h
@@ -44,7 +44,6 @@ enum obd_notify_event;
 struct inode;
 struct lov_stripe_md;
 struct lustre_md;
-struct obd_capa;
 struct obd_device;
 struct obd_export;
 struct page;
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index 18cb92d..6530187 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -1033,7 +1033,10 @@ static int lov_statfs(const struct lu_env *env, struct obd_export *exp,
 		      struct obd_statfs *osfs, __u64 max_age, __u32 flags)
 {
 	struct ptlrpc_request_set *set = NULL;
-	struct obd_info oinfo = { };
+	struct obd_info oinfo = {
+		.oi_osfs = osfs,
+		.oi_flags = flags,
+	};
 	int rc = 0;
 
 	/* for obdclass we forbid using obd_statfs_rqset, but prefer using async
@@ -1043,8 +1046,6 @@ static int lov_statfs(const struct lu_env *env, struct obd_export *exp,
 	if (!set)
 		return -ENOMEM;
 
-	oinfo.oi_osfs = osfs;
-	oinfo.oi_flags = flags;
 	rc = lov_statfs_async(exp, &oinfo, max_age, set);
 	if (rc == 0)
 		rc = ptlrpc_set_wait(set);
diff --git a/drivers/staging/lustre/lustre/lov/lov_request.c b/drivers/staging/lustre/lustre/lov/lov_request.c
index c8734a6..d43cc88 100644
--- a/drivers/staging/lustre/lustre/lov/lov_request.c
+++ b/drivers/staging/lustre/lustre/lov/lov_request.c
@@ -60,8 +60,6 @@ void lov_finish_set(struct lov_request_set *set)
 							 rq_link);
 		list_del_init(&req->rq_link);
 
-		if (req->rq_oi.oi_oa)
-			kmem_cache_free(obdo_cachep, req->rq_oi.oi_oa);
 		kfree(req->rq_oi.oi_osfs);
 		kfree(req);
 	}
diff --git a/drivers/staging/lustre/lustre/obdecho/echo_client.c b/drivers/staging/lustre/lustre/obdecho/echo_client.c
index df6fbed..d8e3e96 100644
--- a/drivers/staging/lustre/lustre/obdecho/echo_client.c
+++ b/drivers/staging/lustre/lustre/obdecho/echo_client.c
@@ -1542,11 +1542,7 @@ echo_client_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 	case OBD_IOC_GETATTR:
 		rc = echo_get_object(&eco, ed, oa);
 		if (rc == 0) {
-			struct obd_info oinfo = {
-				.oi_oa = oa,
-			};
-
-			rc = obd_getattr(env, ec->ec_exp, &oinfo);
+			rc = obd_getattr(env, ec->ec_exp, oa);
 			echo_put_object(eco);
 		}
 		goto out;
@@ -1559,11 +1555,7 @@ echo_client_iocontrol(unsigned int cmd, struct obd_export *exp, int len,
 
 		rc = echo_get_object(&eco, ed, oa);
 		if (rc == 0) {
-			struct obd_info oinfo = {
-				.oi_oa = oa,
-			};
-
-			rc = obd_setattr(env, ec->ec_exp, &oinfo);
+			rc = obd_setattr(env, ec->ec_exp, oa);
 			echo_put_object(eco);
 		}
 		goto out;
diff --git a/drivers/staging/lustre/lustre/osc/osc_cl_internal.h b/drivers/staging/lustre/lustre/osc/osc_cl_internal.h
index 9c8de15..8a55412 100644
--- a/drivers/staging/lustre/lustre/osc/osc_cl_internal.h
+++ b/drivers/staging/lustre/lustre/osc/osc_cl_internal.h
@@ -77,7 +77,6 @@ struct osc_io {
 
 	/** write osc_lock for this IO, used by osc_extent_find(). */
 	struct osc_lock   *oi_write_osclock;
-	struct obd_info    oi_info;
 	struct obdo	oi_oa;
 	struct osc_async_cbargs {
 		bool		  opc_rpc_sent;
diff --git a/drivers/staging/lustre/lustre/osc/osc_internal.h b/drivers/staging/lustre/lustre/osc/osc_internal.h
index 61bfacb..dc708ea 100644
--- a/drivers/staging/lustre/lustre/osc/osc_internal.h
+++ b/drivers/staging/lustre/lustre/osc/osc_internal.h
@@ -118,13 +118,13 @@ int osc_match_base(struct obd_export *exp, struct ldlm_res_id *res_id,
 		   __u64 *flags, void *data, struct lustre_handle *lockh,
 		   int unref);
 
-int osc_setattr_async(struct obd_export *exp, struct obd_info *oinfo,
+int osc_setattr_async(struct obd_export *exp, struct obdo *oa,
 		      obd_enqueue_update_f upcall, void *cookie,
 		      struct ptlrpc_request_set *rqset);
-int osc_punch_base(struct obd_export *exp, struct obd_info *oinfo,
+int osc_punch_base(struct obd_export *exp, struct obdo *oa,
 		   obd_enqueue_update_f upcall, void *cookie,
 		   struct ptlrpc_request_set *rqset);
-int osc_sync_base(struct obd_export *exp, struct obd_info *oinfo,
+int osc_sync_base(struct obd_export *exp, struct obdo *oa,
 		  obd_enqueue_update_f upcall, void *cookie,
 		  struct ptlrpc_request_set *rqset);
 
diff --git a/drivers/staging/lustre/lustre/osc/osc_io.c b/drivers/staging/lustre/lustre/osc/osc_io.c
index b4e062d..8eb4275 100644
--- a/drivers/staging/lustre/lustre/osc/osc_io.c
+++ b/drivers/staging/lustre/lustre/osc/osc_io.c
@@ -484,7 +484,6 @@ static int osc_io_setattr_start(const struct lu_env *env,
 	__u64 size = io->u.ci_setattr.sa_attr.lvb_size;
 	unsigned int ia_valid = io->u.ci_setattr.sa_valid;
 	int result = 0;
-	struct obd_info oinfo = { };
 
 	/* truncate cache dirty pages first */
 	if (cl_io_is_trunc(io))
@@ -554,16 +553,15 @@ static int osc_io_setattr_start(const struct lu_env *env,
 			oa->o_valid |= OBD_MD_FLFLAGS;
 		}
 
-		oinfo.oi_oa = oa;
 		init_completion(&cbargs->opc_sync);
 
 		if (ia_valid & ATTR_SIZE)
 			result = osc_punch_base(osc_export(cl2osc(obj)),
-						&oinfo, osc_async_upcall,
+						oa, osc_async_upcall,
 						cbargs, PTLRPCD_SET);
 		else
 			result = osc_setattr_async(osc_export(cl2osc(obj)),
-						   &oinfo, osc_async_upcall,
+						   oa, osc_async_upcall,
 						   cbargs, PTLRPCD_SET);
 		cbargs->opc_rpc_sent = result == 0;
 	}
@@ -745,7 +743,6 @@ static int osc_fsync_ost(const struct lu_env *env, struct osc_object *obj,
 {
 	struct osc_io *oio = osc_env_io(env);
 	struct obdo *oa = &oio->oi_oa;
-	struct obd_info *oinfo = &oio->oi_info;
 	struct lov_oinfo *loi = obj->oo_oinfo;
 	struct osc_async_cbargs *cbargs = &oio->oi_cbarg;
 	int rc = 0;
@@ -761,11 +758,9 @@ static int osc_fsync_ost(const struct lu_env *env, struct osc_object *obj,
 
 	obdo_set_parent_fid(oa, fio->fi_fid);
 
-	memset(oinfo, 0, sizeof(*oinfo));
-	oinfo->oi_oa = oa;
 	init_completion(&cbargs->opc_sync);
 
-	rc = osc_sync_base(osc_export(obj), oinfo, osc_async_upcall, cbargs,
+	rc = osc_sync_base(osc_export(obj), oa, osc_async_upcall, cbargs,
 			   PTLRPCD_SET);
 	return rc;
 }
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index 64d95c1..0985bda 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -82,7 +82,7 @@ struct osc_setattr_args {
 };
 
 struct osc_fsync_args {
-	struct obd_info     *fa_oi;
+	struct obdo		*fa_oa;
 	obd_enqueue_update_f fa_upcall;
 	void		*fa_cookie;
 };
@@ -166,19 +166,18 @@ static int osc_unpackmd(struct obd_export *exp, struct lov_stripe_md **lsmp,
 }
 
 static inline void osc_pack_req_body(struct ptlrpc_request *req,
-				     struct obd_info *oinfo)
+				     struct obdo *oa)
 {
 	struct ost_body *body;
 
 	body = req_capsule_client_get(&req->rq_pill, &RMF_OST_BODY);
 	LASSERT(body);
 
-	lustre_set_wire_obdo(&req->rq_import->imp_connect_data, &body->oa,
-			     oinfo->oi_oa);
+	lustre_set_wire_obdo(&req->rq_import->imp_connect_data, &body->oa, oa);
 }
 
 static int osc_getattr(const struct lu_env *env, struct obd_export *exp,
-		       struct obd_info *oinfo)
+		       struct obdo *oa)
 {
 	struct ptlrpc_request *req;
 	struct ost_body *body;
@@ -194,7 +193,7 @@ static int osc_getattr(const struct lu_env *env, struct obd_export *exp,
 		return rc;
 	}
 
-	osc_pack_req_body(req, oinfo);
+	osc_pack_req_body(req, oa);
 
 	ptlrpc_request_set_replen(req);
 
@@ -209,11 +208,11 @@ static int osc_getattr(const struct lu_env *env, struct obd_export *exp,
 	}
 
 	CDEBUG(D_INODE, "mode: %o\n", body->oa.o_mode);
-	lustre_get_wire_obdo(&req->rq_import->imp_connect_data, oinfo->oi_oa,
+	lustre_get_wire_obdo(&req->rq_import->imp_connect_data, oa,
 			     &body->oa);
 
-	oinfo->oi_oa->o_blksize = cli_brw_size(exp->exp_obd);
-	oinfo->oi_oa->o_valid |= OBD_MD_FLBLKSZ;
+	oa->o_blksize = cli_brw_size(exp->exp_obd);
+	oa->o_valid |= OBD_MD_FLBLKSZ;
 
  out:
 	ptlrpc_req_finished(req);
@@ -221,13 +220,13 @@ static int osc_getattr(const struct lu_env *env, struct obd_export *exp,
 }
 
 static int osc_setattr(const struct lu_env *env, struct obd_export *exp,
-		       struct obd_info *oinfo)
+		       struct obdo *oa)
 {
 	struct ptlrpc_request *req;
 	struct ost_body *body;
 	int rc;
 
-	LASSERT(oinfo->oi_oa->o_valid & OBD_MD_FLGROUP);
+	LASSERT(oa->o_valid & OBD_MD_FLGROUP);
 
 	req = ptlrpc_request_alloc(class_exp2cliimp(exp), &RQF_OST_SETATTR);
 	if (!req)
@@ -239,7 +238,7 @@ static int osc_setattr(const struct lu_env *env, struct obd_export *exp,
 		return rc;
 	}
 
-	osc_pack_req_body(req, oinfo);
+	osc_pack_req_body(req, oa);
 
 	ptlrpc_request_set_replen(req);
 
@@ -253,7 +252,7 @@ static int osc_setattr(const struct lu_env *env, struct obd_export *exp,
 		goto out;
 	}
 
-	lustre_get_wire_obdo(&req->rq_import->imp_connect_data, oinfo->oi_oa,
+	lustre_get_wire_obdo(&req->rq_import->imp_connect_data, oa,
 			     &body->oa);
 
 out:
@@ -283,7 +282,7 @@ out:
 	return rc;
 }
 
-int osc_setattr_async(struct obd_export *exp, struct obd_info *oinfo,
+int osc_setattr_async(struct obd_export *exp, struct obdo *oa,
 		      obd_enqueue_update_f upcall, void *cookie,
 		      struct ptlrpc_request_set *rqset)
 {
@@ -301,7 +300,7 @@ int osc_setattr_async(struct obd_export *exp, struct obd_info *oinfo,
 		return rc;
 	}
 
-	osc_pack_req_body(req, oinfo);
+	osc_pack_req_body(req, oa);
 
 	ptlrpc_request_set_replen(req);
 
@@ -315,7 +314,7 @@ int osc_setattr_async(struct obd_export *exp, struct obd_info *oinfo,
 
 		CLASSERT(sizeof(*sa) <= sizeof(req->rq_async_args));
 		sa = ptlrpc_req_async_args(req);
-		sa->sa_oa = oinfo->oi_oa;
+		sa->sa_oa = oa;
 		sa->sa_upcall = upcall;
 		sa->sa_cookie = cookie;
 
@@ -382,7 +381,7 @@ out:
 	return rc;
 }
 
-int osc_punch_base(struct obd_export *exp, struct obd_info *oinfo,
+int osc_punch_base(struct obd_export *exp, struct obdo *oa,
 		   obd_enqueue_update_f upcall, void *cookie,
 		   struct ptlrpc_request_set *rqset)
 {
@@ -406,14 +405,14 @@ int osc_punch_base(struct obd_export *exp, struct obd_info *oinfo,
 	body = req_capsule_client_get(&req->rq_pill, &RMF_OST_BODY);
 	LASSERT(body);
 	lustre_set_wire_obdo(&req->rq_import->imp_connect_data, &body->oa,
-			     oinfo->oi_oa);
+			     oa);
 
 	ptlrpc_request_set_replen(req);
 
 	req->rq_interpret_reply = (ptlrpc_interpterer_t)osc_setattr_interpret;
 	CLASSERT(sizeof(*sa) <= sizeof(req->rq_async_args));
 	sa = ptlrpc_req_async_args(req);
-	sa->sa_oa = oinfo->oi_oa;
+	sa->sa_oa = oa;
 	sa->sa_upcall = upcall;
 	sa->sa_cookie = cookie;
 	if (rqset == PTLRPCD_SET)
@@ -441,13 +440,13 @@ static int osc_sync_interpret(const struct lu_env *env,
 		goto out;
 	}
 
-	*fa->fa_oi->oi_oa = body->oa;
+	*fa->fa_oa = body->oa;
 out:
 	rc = fa->fa_upcall(fa->fa_cookie, rc);
 	return rc;
 }
 
-int osc_sync_base(struct obd_export *exp, struct obd_info *oinfo,
+int osc_sync_base(struct obd_export *exp, struct obdo *oa,
 		  obd_enqueue_update_f upcall, void *cookie,
 		  struct ptlrpc_request_set *rqset)
 {
@@ -470,14 +469,14 @@ int osc_sync_base(struct obd_export *exp, struct obd_info *oinfo,
 	body = req_capsule_client_get(&req->rq_pill, &RMF_OST_BODY);
 	LASSERT(body);
 	lustre_set_wire_obdo(&req->rq_import->imp_connect_data, &body->oa,
-			     oinfo->oi_oa);
+			     oa);
 
 	ptlrpc_request_set_replen(req);
 	req->rq_interpret_reply = osc_sync_interpret;
 
 	CLASSERT(sizeof(*fa) <= sizeof(req->rq_async_args));
 	fa = ptlrpc_req_async_args(req);
-	fa->fa_oi = oinfo;
+	fa->fa_oa = oa;
 	fa->fa_upcall = upcall;
 	fa->fa_cookie = cookie;
 
-- 
1.7.1


* [PATCH 40/41] staging: lustre: lov: move LSM to LOV layer
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, Jinshan Xiong, James Simmons

From: John L. Hammond <john.hammond@intel.com>

Move the definition of struct lov_stripe_md along with supporting
functions from obd.h to lov_internal.h. Remove the unused functions
obd_packmd() and obd_free_diskmd(). Simplify lov_obd_packmd()
to match the reduced use cases and rename it to lov_packmd().

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5814
Reviewed-on: http://review.whamcloud.com/13696
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 drivers/staging/lustre/lustre/include/obd.h        |   77 +----------
 drivers/staging/lustre/lustre/include/obd_class.h  |   38 -----
 .../staging/lustre/lustre/llite/llite_internal.h   |    2 -
 drivers/staging/lustre/lustre/llite/vvp_internal.h |    1 -
 drivers/staging/lustre/lustre/lmv/lmv_obd.c        |  105 --------------
 drivers/staging/lustre/lustre/lov/lov_ea.c         |    4 +-
 drivers/staging/lustre/lustre/lov/lov_internal.h   |   80 +++++++++++-
 drivers/staging/lustre/lustre/lov/lov_obd.c        |    9 --
 drivers/staging/lustre/lustre/lov/lov_pack.c       |  145 +++++---------------
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |    8 -
 drivers/staging/lustre/lustre/osc/osc_request.c    |   63 ---------
 11 files changed, 118 insertions(+), 414 deletions(-)
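
Illustration only, not part of the patch: moving a struct definition out
of a shared header and leaving just a forward declaration makes the type
opaque to other layers. The following is a minimal single-file sketch of
that pattern, assuming simplified stand-in types. Every sketch_* name is
hypothetical, and the sketch uses a C99 flexible array member where the
original struct lov_stripe_md uses a zero-length array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* "Shared header" part: other layers only see an incomplete type. */
struct sketch_lsm;
struct sketch_lsm *sketch_lsm_alloc(uint16_t stripe_count);
void sketch_lsm_free(struct sketch_lsm *lsm);
uint16_t sketch_lsm_stripe_count(const struct sketch_lsm *lsm);

/* "Internal header" part: the full layout is private to one layer. */
struct sketch_oinfo {
	uint64_t loi_id;
};

struct sketch_lsm {
	uint32_t lsm_magic;
	uint16_t lsm_stripe_count;
	struct sketch_oinfo *lsm_oinfo[];	/* one pointer per stripe */
};

/* Size of the header plus the trailing per-stripe pointer array. */
static size_t sketch_lsm_size(uint16_t stripe_count)
{
	return sizeof(struct sketch_lsm) +
	       stripe_count * sizeof(struct sketch_oinfo *);
}

struct sketch_lsm *sketch_lsm_alloc(uint16_t stripe_count)
{
	/* calloc() zeroes the header and every stripe pointer. */
	struct sketch_lsm *lsm = calloc(1, sketch_lsm_size(stripe_count));

	if (lsm)
		lsm->lsm_stripe_count = stripe_count;
	return lsm;
}

void sketch_lsm_free(struct sketch_lsm *lsm)
{
	free(lsm);
}

/* Accessor: the only way outside code can read the stripe count. */
uint16_t sketch_lsm_stripe_count(const struct sketch_lsm *lsm)
{
	assert(lsm != NULL);
	return lsm->lsm_stripe_count;
}

/* Returns 1 when allocation, size computation and accessor agree. */
static int sketch_lsm_demo(void)
{
	struct sketch_lsm *lsm = sketch_lsm_alloc(4);
	int ok;

	if (!lsm)
		return 0;
	ok = sketch_lsm_stripe_count(lsm) == 4 &&
	     sketch_lsm_size(4) == sizeof(struct sketch_lsm) +
				   4 * sizeof(struct sketch_oinfo *);
	sketch_lsm_free(lsm);
	return ok;
}
```

With the layout private, code outside the owning layer can hold and pass
the pointer but cannot touch its members, which seems to be the point of
leaving only "struct lov_stripe_md;" in obd.h.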

diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index f63336f..ebb3012 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -73,53 +73,7 @@ static inline void loi_init(struct lov_oinfo *loi)
 {
 }
 
-/*
- * If we are unable to get the maximum object size from the OST in
- * ocd_maxbytes using OBD_CONNECT_MAXBYTES, then we fall back to using
- * the old maximum object size from ext3.
- */
-#define LUSTRE_EXT3_STRIPE_MAXBYTES 0x1fffffff000ULL
-
-struct lov_stripe_md {
-	atomic_t     lsm_refc;
-	spinlock_t	lsm_lock;
-	pid_t	    lsm_lock_owner; /* debugging */
-
-	/* maximum possible file size, might change as OSTs status changes,
-	 * e.g. disconnected, deactivated
-	 */
-	loff_t		lsm_maxbytes;
-	struct ost_id	lsm_oi;
-	__u32		lsm_magic;
-	__u32		lsm_stripe_size;
-	__u32		lsm_pattern;	/* striping pattern (RAID0, RAID1) */
-	__u16		lsm_stripe_count;
-	__u16		lsm_layout_gen;
-	char		lsm_pool_name[LOV_MAXPOOLNAME + 1];
-	struct lov_oinfo *lsm_oinfo[0];
-};
-
-static inline bool lsm_is_released(struct lov_stripe_md *lsm)
-{
-	return !!(lsm->lsm_pattern & LOV_PATTERN_F_RELEASED);
-}
-
-static inline bool lsm_has_objects(struct lov_stripe_md *lsm)
-{
-	if (!lsm)
-		return false;
-	if (lsm_is_released(lsm))
-		return false;
-	return true;
-}
-
-static inline int lov_stripe_md_size(unsigned int stripe_count)
-{
-	struct lov_stripe_md lsm;
-
-	return sizeof(lsm) + stripe_count * sizeof(lsm.lsm_oinfo[0]);
-}
-
+struct lov_stripe_md;
 struct obd_info;
 
 typedef int (*obd_enqueue_update_f)(void *cookie, int rc);
@@ -854,8 +808,6 @@ struct obd_ops {
 		      struct obd_statfs *osfs, __u64 max_age, __u32 flags);
 	int (*statfs_async)(struct obd_export *exp, struct obd_info *oinfo,
 			    __u64 max_age, struct ptlrpc_request_set *set);
-	int (*packmd)(struct obd_export *exp, struct lov_mds_md **disk_tgt,
-		      struct lov_stripe_md *mem_src);
 	int (*unpackmd)(struct obd_export *exp,
 			struct lov_stripe_md **mem_tgt,
 			struct lov_mds_md *disk_src, int disk_len);
@@ -1033,33 +985,6 @@ struct md_ops {
 	 */
 };
 
-struct lsm_operations {
-	void (*lsm_free)(struct lov_stripe_md *);
-	void (*lsm_stripe_by_index)(struct lov_stripe_md *, int *, u64 *,
-				    u64 *);
-	void (*lsm_stripe_by_offset)(struct lov_stripe_md *, int *, u64 *,
-				     u64 *);
-	int (*lsm_lmm_verify)(struct lov_mds_md *lmm, int lmm_bytes,
-			      __u16 *stripe_count);
-	int (*lsm_unpackmd)(struct lov_obd *lov, struct lov_stripe_md *lsm,
-			    struct lov_mds_md *lmm);
-};
-
-extern const struct lsm_operations lsm_v1_ops;
-extern const struct lsm_operations lsm_v3_ops;
-static inline const struct lsm_operations *lsm_op_find(int magic)
-{
-	switch (magic) {
-	case LOV_MAGIC_V1:
-	       return &lsm_v1_ops;
-	case LOV_MAGIC_V3:
-	       return &lsm_v3_ops;
-	default:
-	       CERROR("Cannot recognize lsm_magic %08x\n", magic);
-	       return NULL;
-	}
-}
-
 static inline struct md_open_data *obd_mod_alloc(void)
 {
 	struct md_open_data *mod;
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index 0eaea54..aba96c3 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -609,44 +609,6 @@ obd_process_config(struct obd_device *obd, int datalen, void *data)
 	return rc;
 }
 
-/* Pack an in-memory MD struct for storage on disk.
- * Returns +ve size of packed MD (0 for free), or -ve error.
- *
- * If @disk_tgt == NULL, MD size is returned (max size if @mem_src == NULL).
- * If @*disk_tgt != NULL and @mem_src == NULL, @*disk_tgt will be freed.
- * If @*disk_tgt == NULL, it will be allocated
- */
-static inline int obd_packmd(struct obd_export *exp,
-			     struct lov_mds_md **disk_tgt,
-			     struct lov_stripe_md *mem_src)
-{
-	int rc;
-
-	EXP_CHECK_DT_OP(exp, packmd);
-	EXP_COUNTER_INCREMENT(exp, packmd);
-
-	rc = OBP(exp->exp_obd, packmd)(exp, disk_tgt, mem_src);
-	return rc;
-}
-
-static inline int obd_free_diskmd(struct obd_export *exp,
-				  struct lov_mds_md **disk_tgt)
-{
-	LASSERT(disk_tgt);
-	LASSERT(*disk_tgt);
-	/*
-	 * LU-2590, for caller's convenience, *disk_tgt could be host
-	 * endianness, it needs swab to LE if necessary, while just
-	 * lov_mds_md header needs it for figuring out how much memory
-	 * needs to be freed.
-	 */
-	if ((cpu_to_le32(LOV_MAGIC) != LOV_MAGIC) &&
-	    (((*disk_tgt)->lmm_magic == LOV_MAGIC_V1) ||
-	     ((*disk_tgt)->lmm_magic == LOV_MAGIC_V3)))
-		lustre_swab_lov_mds_md(*disk_tgt);
-	return obd_packmd(exp, disk_tgt, NULL);
-}
-
 /* Unpack an MD struct from disk to in-memory format.
  * Returns +ve size of unpacked MD (0 for free), or -ve error.
  *
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index cf95a72..24ce243 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -609,8 +609,6 @@ struct ll_file_data {
 	struct list_head fd_lccs; /* list of ll_cl_context */
 };
 
-struct lov_stripe_md;
-
 extern struct dentry *llite_root;
 extern struct kset *llite_kset;
 
diff --git a/drivers/staging/lustre/lustre/llite/vvp_internal.h b/drivers/staging/lustre/lustre/llite/vvp_internal.h
index 09fa357..43e19da 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_internal.h
+++ b/drivers/staging/lustre/lustre/llite/vvp_internal.h
@@ -42,7 +42,6 @@
 
 enum obd_notify_event;
 struct inode;
-struct lov_stripe_md;
 struct lustre_md;
 struct obd_device;
 struct obd_export;
diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index 75f5958..679cd87 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -2736,90 +2736,6 @@ static int lmv_set_info_async(const struct lu_env *env, struct obd_export *exp,
 	return -EINVAL;
 }
 
-static int lmv_pack_md_v1(const struct lmv_stripe_md *lsm,
-			  struct lmv_mds_md_v1 *lmm1)
-{
-	int cplen;
-	int i;
-
-	lmm1->lmv_magic = cpu_to_le32(lsm->lsm_md_magic);
-	lmm1->lmv_stripe_count = cpu_to_le32(lsm->lsm_md_stripe_count);
-	lmm1->lmv_master_mdt_index = cpu_to_le32(lsm->lsm_md_master_mdt_index);
-	lmm1->lmv_hash_type = cpu_to_le32(lsm->lsm_md_hash_type);
-	cplen = strlcpy(lmm1->lmv_pool_name, lsm->lsm_md_pool_name,
-			sizeof(lmm1->lmv_pool_name));
-	if (cplen >= sizeof(lmm1->lmv_pool_name))
-		return -E2BIG;
-
-	for (i = 0; i < lsm->lsm_md_stripe_count; i++)
-		fid_cpu_to_le(&lmm1->lmv_stripe_fids[i],
-			      &lsm->lsm_md_oinfo[i].lmo_fid);
-	return 0;
-}
-
-static int
-lmv_pack_md(union lmv_mds_md **lmmp, const struct lmv_stripe_md *lsm,
-	    int stripe_count)
-{
-	int lmm_size = 0, rc = 0;
-	bool allocated = false;
-
-	LASSERT(lmmp);
-
-	/* Free lmm */
-	if (*lmmp && !lsm) {
-		int stripe_cnt;
-
-		stripe_cnt = lmv_mds_md_stripe_count_get(*lmmp);
-		lmm_size = lmv_mds_md_size(stripe_cnt,
-					   le32_to_cpu((*lmmp)->lmv_magic));
-		if (!lmm_size)
-			return -EINVAL;
-		kvfree(*lmmp);
-		*lmmp = NULL;
-		return 0;
-	}
-
-	/* Alloc lmm */
-	if (!*lmmp && !lsm) {
-		lmm_size = lmv_mds_md_size(stripe_count, LMV_MAGIC);
-		LASSERT(lmm_size > 0);
-		*lmmp = libcfs_kvzalloc(lmm_size, GFP_NOFS);
-		if (!*lmmp)
-			return -ENOMEM;
-		lmv_mds_md_stripe_count_set(*lmmp, stripe_count);
-		(*lmmp)->lmv_magic = cpu_to_le32(LMV_MAGIC);
-		return lmm_size;
-	}
-
-	/* pack lmm */
-	LASSERT(lsm);
-	lmm_size = lmv_mds_md_size(lsm->lsm_md_stripe_count,
-				   lsm->lsm_md_magic);
-	if (!*lmmp) {
-		*lmmp = libcfs_kvzalloc(lmm_size, GFP_NOFS);
-		if (!*lmmp)
-			return -ENOMEM;
-		allocated = true;
-	}
-
-	switch (lsm->lsm_md_magic) {
-	case LMV_MAGIC_V1:
-		rc = lmv_pack_md_v1(lsm, &(*lmmp)->lmv_md_v1);
-		break;
-	default:
-		rc = -EINVAL;
-		break;
-	}
-
-	if (rc && allocated) {
-		kvfree(*lmmp);
-		*lmmp = NULL;
-	}
-
-	return lmm_size;
-}
-
 static int lmv_unpack_md_v1(struct obd_export *exp, struct lmv_stripe_md *lsm,
 			    const struct lmv_mds_md_v1 *lmm1)
 {
@@ -2959,26 +2875,6 @@ static int lmv_unpackmd(struct obd_export *exp, struct lov_stripe_md **lsmp,
 			     (union lmv_mds_md *)lmm, disk_len);
 }
 
-static int lmv_packmd(struct obd_export *exp, struct lov_mds_md **lmmp,
-		      struct lov_stripe_md *lsm)
-{
-	const struct lmv_stripe_md *lmv = (struct lmv_stripe_md *)lsm;
-	struct obd_device *obd = exp->exp_obd;
-	struct lmv_obd *lmv_obd = &obd->u.lmv;
-	int stripe_count;
-
-	if (!lmmp) {
-		if (lsm)
-			stripe_count = lmv->lsm_md_stripe_count;
-		else
-			stripe_count = lmv_obd->desc.ld_tgt_count;
-
-		return lmv_mds_md_size(stripe_count, LMV_MAGIC_V1);
-	}
-
-	return lmv_pack_md((union lmv_mds_md **)lmmp, lmv, 0);
-}
-
 static int lmv_cancel_unused(struct obd_export *exp, const struct lu_fid *fid,
 			     ldlm_policy_data_t *policy, enum ldlm_mode mode,
 			     enum ldlm_cancel_flags flags, void *opaque)
@@ -3282,7 +3178,6 @@ static struct obd_ops lmv_obd_ops = {
 	.statfs		= lmv_statfs,
 	.get_info	= lmv_get_info,
 	.set_info_async	= lmv_set_info_async,
-	.packmd		= lmv_packmd,
 	.unpackmd	= lmv_unpackmd,
 	.notify		= lmv_notify,
 	.get_uuid	= lmv_get_uuid,
diff --git a/drivers/staging/lustre/lustre/lov/lov_ea.c b/drivers/staging/lustre/lustre/lov/lov_ea.c
index 63dcd29..d7dc0aa 100644
--- a/drivers/staging/lustre/lustre/lov/lov_ea.c
+++ b/drivers/staging/lustre/lustre/lov/lov_ea.c
@@ -206,7 +206,7 @@ static int lsm_unpackmd_common(struct lov_obd *lov,
 
 static void
 lsm_stripe_by_index_plain(struct lov_stripe_md *lsm, int *stripeno,
-			  u64 *lov_off, u64 *swidth)
+			  loff_t *lov_off, loff_t *swidth)
 {
 	if (swidth)
 		*swidth = (u64)lsm->lsm_stripe_size * lsm->lsm_stripe_count;
@@ -214,7 +214,7 @@ lsm_stripe_by_index_plain(struct lov_stripe_md *lsm, int *stripeno,
 
 static void
 lsm_stripe_by_offset_plain(struct lov_stripe_md *lsm, int *stripeno,
-			   u64 *lov_off, u64 *swidth)
+			   loff_t *lov_off, loff_t *swidth)
 {
 	if (swidth)
 		*swidth = (u64)lsm->lsm_stripe_size * lsm->lsm_stripe_count;
diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index bd105d9..41e7c5f 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -36,6 +36,84 @@
 #include "../include/obd_class.h"
 #include "../include/lustre/lustre_user.h"
 
+/*
+ * If we are unable to get the maximum object size from the OST in
+ * ocd_maxbytes using OBD_CONNECT_MAXBYTES, then we fall back to using
+ * the old maximum object size from ext3.
+ */
+#define LUSTRE_EXT3_STRIPE_MAXBYTES 0x1fffffff000ULL
+
+struct lov_stripe_md {
+	atomic_t	lsm_refc;
+	spinlock_t	lsm_lock;
+	pid_t		lsm_lock_owner; /* debugging */
+
+	/*
+	 * maximum possible file size, might change as OSTs status changes,
+	 * e.g. disconnected, deactivated
+	 */
+	loff_t		lsm_maxbytes;
+	struct ost_id	lsm_oi;
+	u32		lsm_magic;
+	u32		lsm_stripe_size;
+	u32		lsm_pattern; /* RAID0, RAID1, released, ... */
+	u16		lsm_stripe_count;
+	u16		lsm_layout_gen;
+	char		lsm_pool_name[LOV_MAXPOOLNAME + 1];
+	struct lov_oinfo	*lsm_oinfo[0];
+};
+
+static inline bool lsm_is_released(struct lov_stripe_md *lsm)
+{
+	return !!(lsm->lsm_pattern & LOV_PATTERN_F_RELEASED);
+}
+
+static inline bool lsm_has_objects(struct lov_stripe_md *lsm)
+{
+	if (!lsm)
+		return false;
+
+	if (lsm_is_released(lsm))
+		return false;
+
+	return true;
+}
+
+static inline int lov_stripe_md_size(unsigned int stripe_count)
+{
+	struct lov_stripe_md lsm;
+
+	return sizeof(lsm) + stripe_count * sizeof(lsm.lsm_oinfo[0]);
+}
+
+struct lsm_operations {
+	void (*lsm_free)(struct lov_stripe_md *);
+	void (*lsm_stripe_by_index)(struct lov_stripe_md *, int *, loff_t *,
+				    loff_t *);
+	void (*lsm_stripe_by_offset)(struct lov_stripe_md *, int *, loff_t *,
+				     loff_t *);
+	int (*lsm_lmm_verify)(struct lov_mds_md *lmm, int lmm_bytes,
+			      u16 *stripe_count);
+	int (*lsm_unpackmd)(struct lov_obd *lov, struct lov_stripe_md *lsm,
+			    struct lov_mds_md *lmm);
+};
+
+extern const struct lsm_operations lsm_v1_ops;
+extern const struct lsm_operations lsm_v3_ops;
+
+static inline const struct lsm_operations *lsm_op_find(int magic)
+{
+	switch (magic) {
+	case LOV_MAGIC_V1:
+		return &lsm_v1_ops;
+	case LOV_MAGIC_V3:
+		return &lsm_v3_ops;
+	default:
+		CERROR("unrecognized lsm_magic %08x\n", magic);
+		return NULL;
+	}
+}
+
 /* lov_do_div64(a, b) returns a % b, and a = a / b.
  * The 32-bit code is LOV-specific due to knowing about stripe limits in
  * order to reduce the divisor to a 32-bit number.  If the divisor is
@@ -176,8 +254,6 @@ int lov_del_target(struct obd_device *obd, __u32 index,
 /* lov_pack.c */
 ssize_t lov_lsm_pack(const struct lov_stripe_md *lsm, void *buf,
 		     size_t buf_size);
-int lov_packmd(struct obd_export *exp, struct lov_mds_md **lmm,
-	       struct lov_stripe_md *lsm);
 int lov_unpackmd(struct obd_export *exp, struct lov_stripe_md **lsmp,
 		 struct lov_mds_md *lmm, int lmm_bytes);
 int lov_alloc_memmd(struct lov_stripe_md **lsmp, __u16 stripe_count,
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index 6530187..621f66e 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -971,14 +971,6 @@ out:
 	return rc;
 }
 
-#define ASSERT_LSM_MAGIC(lsmp)						  \
-do {									    \
-	LASSERT((lsmp));						\
-	LASSERTF(((lsmp)->lsm_magic == LOV_MAGIC_V1 ||			  \
-		 (lsmp)->lsm_magic == LOV_MAGIC_V3),			    \
-		 "%p->lsm_magic=%x\n", (lsmp), (lsmp)->lsm_magic);	      \
-} while (0)
-
 int lov_statfs_interpret(struct ptlrpc_request_set *rqset, void *data, int rc)
 {
 	struct lov_request_set *lovset = (struct lov_request_set *)data;
@@ -1414,7 +1406,6 @@ static struct obd_ops lov_obd_ops = {
 	.disconnect     = lov_disconnect,
 	.statfs         = lov_statfs,
 	.statfs_async   = lov_statfs_async,
-	.packmd         = lov_packmd,
 	.unpackmd       = lov_unpackmd,
 	.iocontrol      = lov_iocontrol,
 	.get_info       = lov_get_info,
diff --git a/drivers/staging/lustre/lustre/lov/lov_pack.c b/drivers/staging/lustre/lustre/lov/lov_pack.c
index 1156ef9..17bcead 100644
--- a/drivers/staging/lustre/lustre/lov/lov_pack.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pack.c
@@ -153,96 +153,6 @@ ssize_t lov_lsm_pack(const struct lov_stripe_md *lsm, void *buf,
 	return lmm_size;
 }
 
-/* Pack LOV object metadata for disk storage.  It is packed in LE byte
- * order and is opaque to the networking layer.
- *
- * XXX In the future, this will be enhanced to get the EA size from the
- *     underlying OSC device(s) to get their EA sizes so we can stack
- *     LOVs properly.  For now lov_mds_md_size() just assumes one u64
- *     per stripe.
- */
-int lov_obd_packmd(struct lov_obd *lov, struct lov_mds_md **lmmp,
-		   struct lov_stripe_md *lsm)
-{
-	__u16 stripe_count;
-	int lmm_size, lmm_magic;
-
-	if (lsm) {
-		lmm_magic = lsm->lsm_magic;
-	} else {
-		if (lmmp && *lmmp)
-			lmm_magic = le32_to_cpu((*lmmp)->lmm_magic);
-		else
-			/* lsm == NULL and lmmp == NULL */
-			lmm_magic = LOV_MAGIC;
-	}
-
-	if ((lmm_magic != LOV_MAGIC_V1) &&
-	    (lmm_magic != LOV_MAGIC_V3)) {
-		CERROR("bad mem LOV MAGIC: 0x%08X != 0x%08X nor 0x%08X\n",
-		       lmm_magic, LOV_MAGIC_V1, LOV_MAGIC_V3);
-		return -EINVAL;
-	}
-
-	if (lsm) {
-		/* If we are just sizing the EA, limit the stripe count
-		 * to the actual number of OSTs in this filesystem.
-		 */
-		if (!lmmp) {
-			stripe_count = lov_get_stripecnt(lov, lmm_magic,
-							 lsm->lsm_stripe_count);
-			lsm->lsm_stripe_count = stripe_count;
-		} else if (!lsm_is_released(lsm)) {
-			stripe_count = lsm->lsm_stripe_count;
-		} else {
-			stripe_count = 0;
-		}
-	} else {
-		/*
-		 * To calculate maximum easize by active targets at present,
-		 * which is exactly the maximum easize to be seen by LOV
-		 */
-		stripe_count = lov->desc.ld_active_tgt_count;
-	}
-
-	/* XXX LOV STACKING call into osc for sizes */
-	lmm_size = lov_mds_md_size(stripe_count, lmm_magic);
-
-	if (!lmmp)
-		return lmm_size;
-
-	if (*lmmp && !lsm) {
-		stripe_count = le16_to_cpu((*lmmp)->lmm_stripe_count);
-		lmm_size = lov_mds_md_size(stripe_count, lmm_magic);
-		kvfree(*lmmp);
-		*lmmp = NULL;
-		return 0;
-	}
-
-	if (!*lmmp) {
-		*lmmp = libcfs_kvzalloc(lmm_size, GFP_NOFS);
-		if (!*lmmp)
-			return -ENOMEM;
-	}
-
-	CDEBUG(D_INFO, "lov_packmd: LOV_MAGIC 0x%08X, lmm_size = %d\n",
-	       lmm_magic, lmm_size);
-
-	if (!lsm)
-		return lmm_size;
-
-	return lov_lsm_pack(lsm, *lmmp, lmm_size);
-}
-
-int lov_packmd(struct obd_export *exp, struct lov_mds_md **lmmp,
-	       struct lov_stripe_md *lsm)
-{
-	struct obd_device *obd = class_exp2obd(exp);
-	struct lov_obd *lov = &obd->u.lov;
-
-	return lov_obd_packmd(lov, lmmp, lsm);
-}
-
 /* Find the max stripecount we should use */
 __u16 lov_get_stripecnt(struct lov_obd *lov, __u32 magic, __u16 stripe_count)
 {
@@ -393,15 +303,14 @@ int lov_unpackmd(struct obd_export *exp,  struct lov_stripe_md **lsmp,
 int lov_getstripe(struct lov_object *obj, struct lov_stripe_md *lsm,
 		  struct lov_user_md __user *lump)
 {
-	/*
-	 * XXX huge struct allocated on stack.
-	 */
 	/* we use lov_user_md_v3 because it is larger than lov_user_md_v1 */
-	struct lov_obd *lov;
 	struct lov_user_md_v3 lum;
-	struct lov_mds_md *lmmk = NULL;
-	int rc, lmmk_size, lmm_size;
-	int lum_size;
+	struct lov_mds_md *lmmk;
+	u32 stripe_count;
+	ssize_t lmm_size;
+	size_t lmmk_size;
+	size_t lum_size;
+	int rc;
 	mm_segment_t seg;
 
 	if (!lsm)
@@ -414,6 +323,18 @@ int lov_getstripe(struct lov_object *obj, struct lov_stripe_md *lsm,
 	seg = get_fs();
 	set_fs(KERNEL_DS);
 
+	if (lsm->lsm_magic != LOV_MAGIC_V1 && lsm->lsm_magic != LOV_MAGIC_V3) {
+		CERROR("bad LSM MAGIC: 0x%08X != 0x%08X nor 0x%08X\n",
+		       lsm->lsm_magic, LOV_MAGIC_V1, LOV_MAGIC_V3);
+		rc = -EIO;
+		goto out;
+	}
+
+	if (!lsm_is_released(lsm))
+		stripe_count = lsm->lsm_stripe_count;
+	else
+		stripe_count = 0;
+
 	/* we only need the header part from user space to get lmm_magic and
 	 * lmm_stripe_count, (the header part is common to v1 and v3)
 	 */
@@ -432,32 +353,40 @@ int lov_getstripe(struct lov_object *obj, struct lov_stripe_md *lsm,
 	if (lum.lmm_stripe_count &&
 	    (lum.lmm_stripe_count < lsm->lsm_stripe_count)) {
 		/* Return right size of stripe to user */
-		lum.lmm_stripe_count = lsm->lsm_stripe_count;
+		lum.lmm_stripe_count = stripe_count;
 		rc = copy_to_user(lump, &lum, lum_size);
 		rc = -EOVERFLOW;
 		goto out;
 	}
-	lov = lu2lov_dev(obj->lo_cl.co_lu.lo_dev)->ld_lov;
-	rc = lov_obd_packmd(lov, &lmmk, lsm);
-	if (rc < 0)
+	lmmk_size = lov_mds_md_size(stripe_count, lsm->lsm_magic);
+
+
+	lmmk = libcfs_kvzalloc(lmmk_size, GFP_NOFS);
+	if (!lmmk) {
+		rc = -ENOMEM;
 		goto out;
-	lmmk_size = rc;
-	lmm_size = rc;
-	rc = 0;
+	}
+
+	lmm_size = lov_lsm_pack(lsm, lmmk, lmmk_size);
+	if (lmm_size < 0) {
+		rc = lmm_size;
+		goto out_free;
+	}
 
 	/* FIXME: Bug 1185 - copy fields properly when structs change */
 	/* struct lov_user_md_v3 and struct lov_mds_md_v3 must be the same */
 	CLASSERT(sizeof(lum) == sizeof(struct lov_mds_md_v3));
 	CLASSERT(sizeof(lum.lmm_objects[0]) == sizeof(lmmk->lmm_objects[0]));
 
-	if ((cpu_to_le32(LOV_MAGIC) != LOV_MAGIC) &&
-	    ((lmmk->lmm_magic == cpu_to_le32(LOV_MAGIC_V1)) ||
-	    (lmmk->lmm_magic == cpu_to_le32(LOV_MAGIC_V3)))) {
+	if (cpu_to_le32(LOV_MAGIC) != LOV_MAGIC &&
+	    (lmmk->lmm_magic == cpu_to_le32(LOV_MAGIC_V1) ||
+	     lmmk->lmm_magic == cpu_to_le32(LOV_MAGIC_V3))) {
 		lustre_swab_lov_mds_md(lmmk);
 		lustre_swab_lov_user_md_objects(
 				(struct lov_user_ost_data *)lmmk->lmm_objects,
 				lmmk->lmm_stripe_count);
 	}
+
 	if (lum.lmm_magic == LOV_USER_MAGIC) {
 		/* User request for v1, we need skip lmm_pool_name */
 		if (lmmk->lmm_magic == LOV_MAGIC_V3) {
@@ -491,7 +420,7 @@ int lov_getstripe(struct lov_object *obj, struct lov_stripe_md *lsm,
 		rc = -EFAULT;
 
 out_free:
-	kfree(lmmk);
+	kvfree(lmmk);
 out:
 	set_fs(seg);
 	return rc;
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index af373af..10d0b9d 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -447,14 +447,6 @@ static int mdc_get_lustre_md(struct obd_export *exp,
 		if (rc < 0)
 			goto out;
 
-		if (rc < (typeof(rc))sizeof(*md->lsm)) {
-			CDEBUG(D_INFO,
-			       "lsm size too small: rc < sizeof (*md->lsm) (%d < %d)\n",
-			       rc, (int)sizeof(*md->lsm));
-			rc = -EPROTO;
-			goto out;
-		}
-
 	} else if (md->body->mbo_valid & OBD_MD_FLDIREA) {
 		int lmvsize;
 		struct lov_mds_md *lmv;
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index 0985bda..038f00c 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -103,68 +103,6 @@ static void osc_release_ppga(struct brw_page **ppga, u32 count);
 static int brw_interpret(const struct lu_env *env,
 			 struct ptlrpc_request *req, void *data, int rc);
 
-/* Unpack OSC object metadata from disk storage (LE byte order). */
-static int osc_unpackmd(struct obd_export *exp, struct lov_stripe_md **lsmp,
-			struct lov_mds_md *lmm, int lmm_bytes)
-{
-	int lsm_size;
-	struct obd_import *imp = class_exp2cliimp(exp);
-
-	if (lmm) {
-		if (lmm_bytes < sizeof(*lmm)) {
-			CERROR("%s: lov_mds_md too small: %d, need %d\n",
-			       exp->exp_obd->obd_name, lmm_bytes,
-			       (int)sizeof(*lmm));
-			return -EINVAL;
-		}
-		/* XXX LOV_MAGIC etc check? */
-
-		if (unlikely(ostid_id(&lmm->lmm_oi) == 0)) {
-			CERROR("%s: zero lmm_object_id: rc = %d\n",
-			       exp->exp_obd->obd_name, -EINVAL);
-			return -EINVAL;
-		}
-	}
-
-	lsm_size = lov_stripe_md_size(1);
-	if (!lsmp)
-		return lsm_size;
-
-	if (*lsmp && !lmm) {
-		kfree((*lsmp)->lsm_oinfo[0]);
-		kfree(*lsmp);
-		*lsmp = NULL;
-		return 0;
-	}
-
-	if (!*lsmp) {
-		*lsmp = kzalloc(lsm_size, GFP_NOFS);
-		if (unlikely(!*lsmp))
-			return -ENOMEM;
-		(*lsmp)->lsm_oinfo[0] = kzalloc(sizeof(struct lov_oinfo),
-						GFP_NOFS);
-		if (unlikely(!(*lsmp)->lsm_oinfo[0])) {
-			kfree(*lsmp);
-			return -ENOMEM;
-		}
-		loi_init((*lsmp)->lsm_oinfo[0]);
-	} else if (unlikely(ostid_id(&(*lsmp)->lsm_oi) == 0)) {
-		return -EBADF;
-	}
-
-	if (lmm)
-		/* XXX zero *lsmp? */
-		ostid_le_to_cpu(&lmm->lmm_oi, &(*lsmp)->lsm_oi);
-
-	if (imp &&
-	    (imp->imp_connect_data.ocd_connect_flags & OBD_CONNECT_MAXBYTES))
-		(*lsmp)->lsm_maxbytes = imp->imp_connect_data.ocd_maxbytes;
-	else
-		(*lsmp)->lsm_maxbytes = LUSTRE_EXT3_STRIPE_MAXBYTES;
-
-	return lsm_size;
-}
-
 static inline void osc_pack_req_body(struct ptlrpc_request *req,
 				     struct obdo *oa)
 {
@@ -2884,7 +2822,6 @@ static struct obd_ops osc_obd_ops = {
 	.disconnect     = osc_disconnect,
 	.statfs         = osc_statfs,
 	.statfs_async   = osc_statfs_async,
-	.unpackmd       = osc_unpackmd,
 	.create         = osc_create,
 	.destroy        = osc_destroy,
 	.getattr        = osc_getattr,
-- 
1.7.1

* [lustre-devel] [PATCH 40/41] staging: lustre: lov: move LSM to LOV layer
@ 2016-10-03  2:28   ` James Simmons
  0 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	John L. Hammond, Jinshan Xiong, James Simmons

From: John L. Hammond <john.hammond@intel.com>

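Move the definition of struct lov_stripe_md, along with its supporting
functions, from obd.h to lov_internal.h. Remove the unused functions
obd_packmd() and obd_free_diskmd(), the packmd method of struct obd_ops
with its lmv and lov implementations, and osc_unpackmd(). With
lov_obd_packmd() gone, lov_getstripe() sizes its buffer with
lov_mds_md_size() and packs it with lov_lsm_pack() directly.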

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5814
Reviewed-on: http://review.whamcloud.com/13696
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
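The flow that lov_getstripe() follows after this patch (check the magic,
treat a released layout as having zero stripes, size the buffer, allocate,
pack, free on every exit path) can be sketched in userspace roughly as
below. This is an illustration under stated assumptions: the demo_* names,
the constants and the header layout are invented for the example and are
not the Lustre values, and calloc()/free() stand in for
libcfs_kvzalloc()/kvfree().

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative constants only; these are not the Lustre values. */
#define DEMO_MAGIC_V1		0x0001u
#define DEMO_MAGIC_V3		0x0003u
#define DEMO_PATTERN_RELEASED	0x8000u

struct demo_lsm {			/* in-memory layout */
	uint32_t	magic;
	uint32_t	pattern;
	uint16_t	stripe_count;
};

struct demo_lmm_hdr {			/* packed header */
	uint32_t	magic;
	uint16_t	stripe_count;
};

/* A released layout carries no objects, so it packs zero stripes. */
static uint16_t demo_effective_stripes(const struct demo_lsm *lsm)
{
	return (lsm->pattern & DEMO_PATTERN_RELEASED) ? 0 : lsm->stripe_count;
}

/* Packed size: header plus one 64-bit slot per stripe (an assumption). */
static size_t demo_lmm_size(uint16_t stripe_count)
{
	return sizeof(struct demo_lmm_hdr) + stripe_count * sizeof(uint64_t);
}

/* Pack into a caller-supplied buffer; returns bytes used or -errno. */
static long demo_pack(const struct demo_lsm *lsm, void *buf, size_t buf_size)
{
	struct demo_lmm_hdr hdr = {
		.magic = lsm->magic,
		.stripe_count = demo_effective_stripes(lsm),
	};
	size_t need = demo_lmm_size(hdr.stripe_count);

	if (buf_size < need)
		return -ERANGE;

	memset(buf, 0, need);
	memcpy(buf, &hdr, sizeof(hdr));
	return (long)need;
}

/* Validate, size, allocate, pack; the buffer is freed on every path. */
static int demo_getstripe(const struct demo_lsm *lsm, size_t *packed_size)
{
	size_t buf_size;
	void *buf;
	long rc;

	if (!lsm)
		return -EINVAL;

	if (lsm->magic != DEMO_MAGIC_V1 && lsm->magic != DEMO_MAGIC_V3)
		return -EIO;

	buf_size = demo_lmm_size(demo_effective_stripes(lsm));
	buf = calloc(1, buf_size);
	if (!buf)
		return -ENOMEM;

	rc = demo_pack(lsm, buf, buf_size);
	if (rc < 0)
		goto out_free;

	*packed_size = (size_t)rc;
	rc = 0;
out_free:
	free(buf);
	return (int)rc;
}
```

Having the caller size and own the buffer keeps allocation and release in
one function. The same reasoning applies to the kfree() to kvfree() change
in the out_free path: the release call has to match the allocator used.
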
 drivers/staging/lustre/lustre/include/obd.h        |   77 +----------
 drivers/staging/lustre/lustre/include/obd_class.h  |   38 -----
 .../staging/lustre/lustre/llite/llite_internal.h   |    2 -
 drivers/staging/lustre/lustre/llite/vvp_internal.h |    1 -
 drivers/staging/lustre/lustre/lmv/lmv_obd.c        |  105 --------------
 drivers/staging/lustre/lustre/lov/lov_ea.c         |    4 +-
 drivers/staging/lustre/lustre/lov/lov_internal.h   |   80 +++++++++++-
 drivers/staging/lustre/lustre/lov/lov_obd.c        |    9 --
 drivers/staging/lustre/lustre/lov/lov_pack.c       |  145 +++++---------------
 drivers/staging/lustre/lustre/mdc/mdc_request.c    |    8 -
 drivers/staging/lustre/lustre/osc/osc_request.c    |   63 ---------
 11 files changed, 118 insertions(+), 414 deletions(-)
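
The lsm_op_find() helper moved by this patch picks an operations table
from the layout magic and returns NULL for anything it does not recognize.
A small userspace sketch of that dispatch follows. The demo_* names and
magic values are hypothetical, and the single stripe_width operation is
only there to mirror the widening multiply used in lov_ea.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative magic values; not the Lustre constants. */
#define DEMO_MAGIC_V1	0x0001u
#define DEMO_MAGIC_V3	0x0003u

struct demo_layout {
	uint32_t	stripe_size;
	uint16_t	stripe_count;
};

struct demo_ops {
	const char	*name;
	uint64_t	(*stripe_width)(const struct demo_layout *layout);
};

static uint64_t demo_width_plain(const struct demo_layout *layout)
{
	/* Widen before multiplying so the product cannot wrap at 32 bits. */
	return (uint64_t)layout->stripe_size * layout->stripe_count;
}

static const struct demo_ops demo_v1_ops = { "v1", demo_width_plain };
static const struct demo_ops demo_v3_ops = { "v3", demo_width_plain };

/* Return the table for a known magic, or NULL so callers can bail out. */
static const struct demo_ops *demo_op_find(uint32_t magic)
{
	switch (magic) {
	case DEMO_MAGIC_V1:
		return &demo_v1_ops;
	case DEMO_MAGIC_V3:
		return &demo_v3_ops;
	default:
		return NULL;
	}
}
```

Callers are expected to check the returned pointer before dereferencing
it, since an unknown magic yields NULL rather than a default table.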

diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index f63336f..ebb3012 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -73,53 +73,7 @@ static inline void loi_init(struct lov_oinfo *loi)
 {
 }
 
-/*
- * If we are unable to get the maximum object size from the OST in
- * ocd_maxbytes using OBD_CONNECT_MAXBYTES, then we fall back to using
- * the old maximum object size from ext3.
- */
-#define LUSTRE_EXT3_STRIPE_MAXBYTES 0x1fffffff000ULL
-
-struct lov_stripe_md {
-	atomic_t     lsm_refc;
-	spinlock_t	lsm_lock;
-	pid_t	    lsm_lock_owner; /* debugging */
-
-	/* maximum possible file size, might change as OSTs status changes,
-	 * e.g. disconnected, deactivated
-	 */
-	loff_t		lsm_maxbytes;
-	struct ost_id	lsm_oi;
-	__u32		lsm_magic;
-	__u32		lsm_stripe_size;
-	__u32		lsm_pattern;	/* striping pattern (RAID0, RAID1) */
-	__u16		lsm_stripe_count;
-	__u16		lsm_layout_gen;
-	char		lsm_pool_name[LOV_MAXPOOLNAME + 1];
-	struct lov_oinfo *lsm_oinfo[0];
-};
-
-static inline bool lsm_is_released(struct lov_stripe_md *lsm)
-{
-	return !!(lsm->lsm_pattern & LOV_PATTERN_F_RELEASED);
-}
-
-static inline bool lsm_has_objects(struct lov_stripe_md *lsm)
-{
-	if (!lsm)
-		return false;
-	if (lsm_is_released(lsm))
-		return false;
-	return true;
-}
-
-static inline int lov_stripe_md_size(unsigned int stripe_count)
-{
-	struct lov_stripe_md lsm;
-
-	return sizeof(lsm) + stripe_count * sizeof(lsm.lsm_oinfo[0]);
-}
-
+struct lov_stripe_md;
 struct obd_info;
 
 typedef int (*obd_enqueue_update_f)(void *cookie, int rc);
@@ -854,8 +808,6 @@ struct obd_ops {
 		      struct obd_statfs *osfs, __u64 max_age, __u32 flags);
 	int (*statfs_async)(struct obd_export *exp, struct obd_info *oinfo,
 			    __u64 max_age, struct ptlrpc_request_set *set);
-	int (*packmd)(struct obd_export *exp, struct lov_mds_md **disk_tgt,
-		      struct lov_stripe_md *mem_src);
 	int (*unpackmd)(struct obd_export *exp,
 			struct lov_stripe_md **mem_tgt,
 			struct lov_mds_md *disk_src, int disk_len);
@@ -1033,33 +985,6 @@ struct md_ops {
 	 */
 };
 
-struct lsm_operations {
-	void (*lsm_free)(struct lov_stripe_md *);
-	void (*lsm_stripe_by_index)(struct lov_stripe_md *, int *, u64 *,
-				    u64 *);
-	void (*lsm_stripe_by_offset)(struct lov_stripe_md *, int *, u64 *,
-				     u64 *);
-	int (*lsm_lmm_verify)(struct lov_mds_md *lmm, int lmm_bytes,
-			      __u16 *stripe_count);
-	int (*lsm_unpackmd)(struct lov_obd *lov, struct lov_stripe_md *lsm,
-			    struct lov_mds_md *lmm);
-};
-
-extern const struct lsm_operations lsm_v1_ops;
-extern const struct lsm_operations lsm_v3_ops;
-static inline const struct lsm_operations *lsm_op_find(int magic)
-{
-	switch (magic) {
-	case LOV_MAGIC_V1:
-	       return &lsm_v1_ops;
-	case LOV_MAGIC_V3:
-	       return &lsm_v3_ops;
-	default:
-	       CERROR("Cannot recognize lsm_magic %08x\n", magic);
-	       return NULL;
-	}
-}
-
 static inline struct md_open_data *obd_mod_alloc(void)
 {
 	struct md_open_data *mod;
diff --git a/drivers/staging/lustre/lustre/include/obd_class.h b/drivers/staging/lustre/lustre/include/obd_class.h
index 0eaea54..aba96c3 100644
--- a/drivers/staging/lustre/lustre/include/obd_class.h
+++ b/drivers/staging/lustre/lustre/include/obd_class.h
@@ -609,44 +609,6 @@ obd_process_config(struct obd_device *obd, int datalen, void *data)
 	return rc;
 }
 
-/* Pack an in-memory MD struct for storage on disk.
- * Returns +ve size of packed MD (0 for free), or -ve error.
- *
- * If @disk_tgt == NULL, MD size is returned (max size if @mem_src == NULL).
- * If @*disk_tgt != NULL and @mem_src == NULL, @*disk_tgt will be freed.
- * If @*disk_tgt == NULL, it will be allocated
- */
-static inline int obd_packmd(struct obd_export *exp,
-			     struct lov_mds_md **disk_tgt,
-			     struct lov_stripe_md *mem_src)
-{
-	int rc;
-
-	EXP_CHECK_DT_OP(exp, packmd);
-	EXP_COUNTER_INCREMENT(exp, packmd);
-
-	rc = OBP(exp->exp_obd, packmd)(exp, disk_tgt, mem_src);
-	return rc;
-}
-
-static inline int obd_free_diskmd(struct obd_export *exp,
-				  struct lov_mds_md **disk_tgt)
-{
-	LASSERT(disk_tgt);
-	LASSERT(*disk_tgt);
-	/*
-	 * LU-2590, for caller's convenience, *disk_tgt could be host
-	 * endianness, it needs swab to LE if necessary, while just
-	 * lov_mds_md header needs it for figuring out how much memory
-	 * needs to be freed.
-	 */
-	if ((cpu_to_le32(LOV_MAGIC) != LOV_MAGIC) &&
-	    (((*disk_tgt)->lmm_magic == LOV_MAGIC_V1) ||
-	     ((*disk_tgt)->lmm_magic == LOV_MAGIC_V3)))
-		lustre_swab_lov_mds_md(*disk_tgt);
-	return obd_packmd(exp, disk_tgt, NULL);
-}
-
 /* Unpack an MD struct from disk to in-memory format.
  * Returns +ve size of unpacked MD (0 for free), or -ve error.
  *
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index cf95a72..24ce243 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -609,8 +609,6 @@ struct ll_file_data {
 	struct list_head fd_lccs; /* list of ll_cl_context */
 };
 
-struct lov_stripe_md;
-
 extern struct dentry *llite_root;
 extern struct kset *llite_kset;
 
diff --git a/drivers/staging/lustre/lustre/llite/vvp_internal.h b/drivers/staging/lustre/lustre/llite/vvp_internal.h
index 09fa357..43e19da 100644
--- a/drivers/staging/lustre/lustre/llite/vvp_internal.h
+++ b/drivers/staging/lustre/lustre/llite/vvp_internal.h
@@ -42,7 +42,6 @@
 
 enum obd_notify_event;
 struct inode;
-struct lov_stripe_md;
 struct lustre_md;
 struct obd_device;
 struct obd_export;
diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index 75f5958..679cd87 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -2736,90 +2736,6 @@ static int lmv_set_info_async(const struct lu_env *env, struct obd_export *exp,
 	return -EINVAL;
 }
 
-static int lmv_pack_md_v1(const struct lmv_stripe_md *lsm,
-			  struct lmv_mds_md_v1 *lmm1)
-{
-	int cplen;
-	int i;
-
-	lmm1->lmv_magic = cpu_to_le32(lsm->lsm_md_magic);
-	lmm1->lmv_stripe_count = cpu_to_le32(lsm->lsm_md_stripe_count);
-	lmm1->lmv_master_mdt_index = cpu_to_le32(lsm->lsm_md_master_mdt_index);
-	lmm1->lmv_hash_type = cpu_to_le32(lsm->lsm_md_hash_type);
-	cplen = strlcpy(lmm1->lmv_pool_name, lsm->lsm_md_pool_name,
-			sizeof(lmm1->lmv_pool_name));
-	if (cplen >= sizeof(lmm1->lmv_pool_name))
-		return -E2BIG;
-
-	for (i = 0; i < lsm->lsm_md_stripe_count; i++)
-		fid_cpu_to_le(&lmm1->lmv_stripe_fids[i],
-			      &lsm->lsm_md_oinfo[i].lmo_fid);
-	return 0;
-}
-
-static int
-lmv_pack_md(union lmv_mds_md **lmmp, const struct lmv_stripe_md *lsm,
-	    int stripe_count)
-{
-	int lmm_size = 0, rc = 0;
-	bool allocated = false;
-
-	LASSERT(lmmp);
-
-	/* Free lmm */
-	if (*lmmp && !lsm) {
-		int stripe_cnt;
-
-		stripe_cnt = lmv_mds_md_stripe_count_get(*lmmp);
-		lmm_size = lmv_mds_md_size(stripe_cnt,
-					   le32_to_cpu((*lmmp)->lmv_magic));
-		if (!lmm_size)
-			return -EINVAL;
-		kvfree(*lmmp);
-		*lmmp = NULL;
-		return 0;
-	}
-
-	/* Alloc lmm */
-	if (!*lmmp && !lsm) {
-		lmm_size = lmv_mds_md_size(stripe_count, LMV_MAGIC);
-		LASSERT(lmm_size > 0);
-		*lmmp = libcfs_kvzalloc(lmm_size, GFP_NOFS);
-		if (!*lmmp)
-			return -ENOMEM;
-		lmv_mds_md_stripe_count_set(*lmmp, stripe_count);
-		(*lmmp)->lmv_magic = cpu_to_le32(LMV_MAGIC);
-		return lmm_size;
-	}
-
-	/* pack lmm */
-	LASSERT(lsm);
-	lmm_size = lmv_mds_md_size(lsm->lsm_md_stripe_count,
-				   lsm->lsm_md_magic);
-	if (!*lmmp) {
-		*lmmp = libcfs_kvzalloc(lmm_size, GFP_NOFS);
-		if (!*lmmp)
-			return -ENOMEM;
-		allocated = true;
-	}
-
-	switch (lsm->lsm_md_magic) {
-	case LMV_MAGIC_V1:
-		rc = lmv_pack_md_v1(lsm, &(*lmmp)->lmv_md_v1);
-		break;
-	default:
-		rc = -EINVAL;
-		break;
-	}
-
-	if (rc && allocated) {
-		kvfree(*lmmp);
-		*lmmp = NULL;
-	}
-
-	return lmm_size;
-}
-
 static int lmv_unpack_md_v1(struct obd_export *exp, struct lmv_stripe_md *lsm,
 			    const struct lmv_mds_md_v1 *lmm1)
 {
@@ -2959,26 +2875,6 @@ static int lmv_unpackmd(struct obd_export *exp, struct lov_stripe_md **lsmp,
 			     (union lmv_mds_md *)lmm, disk_len);
 }
 
-static int lmv_packmd(struct obd_export *exp, struct lov_mds_md **lmmp,
-		      struct lov_stripe_md *lsm)
-{
-	const struct lmv_stripe_md *lmv = (struct lmv_stripe_md *)lsm;
-	struct obd_device *obd = exp->exp_obd;
-	struct lmv_obd *lmv_obd = &obd->u.lmv;
-	int stripe_count;
-
-	if (!lmmp) {
-		if (lsm)
-			stripe_count = lmv->lsm_md_stripe_count;
-		else
-			stripe_count = lmv_obd->desc.ld_tgt_count;
-
-		return lmv_mds_md_size(stripe_count, LMV_MAGIC_V1);
-	}
-
-	return lmv_pack_md((union lmv_mds_md **)lmmp, lmv, 0);
-}
-
 static int lmv_cancel_unused(struct obd_export *exp, const struct lu_fid *fid,
 			     ldlm_policy_data_t *policy, enum ldlm_mode mode,
 			     enum ldlm_cancel_flags flags, void *opaque)
@@ -3282,7 +3178,6 @@ static struct obd_ops lmv_obd_ops = {
 	.statfs		= lmv_statfs,
 	.get_info	= lmv_get_info,
 	.set_info_async	= lmv_set_info_async,
-	.packmd		= lmv_packmd,
 	.unpackmd	= lmv_unpackmd,
 	.notify		= lmv_notify,
 	.get_uuid	= lmv_get_uuid,
diff --git a/drivers/staging/lustre/lustre/lov/lov_ea.c b/drivers/staging/lustre/lustre/lov/lov_ea.c
index 63dcd29..d7dc0aa 100644
--- a/drivers/staging/lustre/lustre/lov/lov_ea.c
+++ b/drivers/staging/lustre/lustre/lov/lov_ea.c
@@ -206,7 +206,7 @@ static int lsm_unpackmd_common(struct lov_obd *lov,
 
 static void
 lsm_stripe_by_index_plain(struct lov_stripe_md *lsm, int *stripeno,
-			  u64 *lov_off, u64 *swidth)
+			  loff_t *lov_off, loff_t *swidth)
 {
 	if (swidth)
 		*swidth = (u64)lsm->lsm_stripe_size * lsm->lsm_stripe_count;
@@ -214,7 +214,7 @@ lsm_stripe_by_index_plain(struct lov_stripe_md *lsm, int *stripeno,
 
 static void
 lsm_stripe_by_offset_plain(struct lov_stripe_md *lsm, int *stripeno,
-			   u64 *lov_off, u64 *swidth)
+			   loff_t *lov_off, loff_t *swidth)
 {
 	if (swidth)
 		*swidth = (u64)lsm->lsm_stripe_size * lsm->lsm_stripe_count;
diff --git a/drivers/staging/lustre/lustre/lov/lov_internal.h b/drivers/staging/lustre/lustre/lov/lov_internal.h
index bd105d9..41e7c5f 100644
--- a/drivers/staging/lustre/lustre/lov/lov_internal.h
+++ b/drivers/staging/lustre/lustre/lov/lov_internal.h
@@ -36,6 +36,84 @@
 #include "../include/obd_class.h"
 #include "../include/lustre/lustre_user.h"
 
+/*
+ * If we are unable to get the maximum object size from the OST in
+ * ocd_maxbytes using OBD_CONNECT_MAXBYTES, then we fall back to using
+ * the old maximum object size from ext3.
+ */
+#define LUSTRE_EXT3_STRIPE_MAXBYTES 0x1fffffff000ULL
+
+struct lov_stripe_md {
+	atomic_t	lsm_refc;
+	spinlock_t	lsm_lock;
+	pid_t		lsm_lock_owner; /* debugging */
+
+	/*
+	 * maximum possible file size, might change as OSTs status changes,
+	 * e.g. disconnected, deactivated
+	 */
+	loff_t		lsm_maxbytes;
+	struct ost_id	lsm_oi;
+	u32		lsm_magic;
+	u32		lsm_stripe_size;
+	u32		lsm_pattern; /* RAID0, RAID1, released, ... */
+	u16		lsm_stripe_count;
+	u16		lsm_layout_gen;
+	char		lsm_pool_name[LOV_MAXPOOLNAME + 1];
+	struct lov_oinfo	*lsm_oinfo[0];
+};
+
+static inline bool lsm_is_released(struct lov_stripe_md *lsm)
+{
+	return !!(lsm->lsm_pattern & LOV_PATTERN_F_RELEASED);
+}
+
+static inline bool lsm_has_objects(struct lov_stripe_md *lsm)
+{
+	if (!lsm)
+		return false;
+
+	if (lsm_is_released(lsm))
+		return false;
+
+	return true;
+}
+
+static inline int lov_stripe_md_size(unsigned int stripe_count)
+{
+	struct lov_stripe_md lsm;
+
+	return sizeof(lsm) + stripe_count * sizeof(lsm.lsm_oinfo[0]);
+}
+
+struct lsm_operations {
+	void (*lsm_free)(struct lov_stripe_md *);
+	void (*lsm_stripe_by_index)(struct lov_stripe_md *, int *, loff_t *,
+				    loff_t *);
+	void (*lsm_stripe_by_offset)(struct lov_stripe_md *, int *, loff_t *,
+				     loff_t *);
+	int (*lsm_lmm_verify)(struct lov_mds_md *lmm, int lmm_bytes,
+			      u16 *stripe_count);
+	int (*lsm_unpackmd)(struct lov_obd *lov, struct lov_stripe_md *lsm,
+			    struct lov_mds_md *lmm);
+};
+
+extern const struct lsm_operations lsm_v1_ops;
+extern const struct lsm_operations lsm_v3_ops;
+
+static inline const struct lsm_operations *lsm_op_find(int magic)
+{
+	switch (magic) {
+	case LOV_MAGIC_V1:
+		return &lsm_v1_ops;
+	case LOV_MAGIC_V3:
+		return &lsm_v3_ops;
+	default:
+		CERROR("unrecognized lsm_magic %08x\n", magic);
+		return NULL;
+	}
+}
+
 /* lov_do_div64(a, b) returns a % b, and a = a / b.
  * The 32-bit code is LOV-specific due to knowing about stripe limits in
  * order to reduce the divisor to a 32-bit number.  If the divisor is
@@ -176,8 +254,6 @@ int lov_del_target(struct obd_device *obd, __u32 index,
 /* lov_pack.c */
 ssize_t lov_lsm_pack(const struct lov_stripe_md *lsm, void *buf,
 		     size_t buf_size);
-int lov_packmd(struct obd_export *exp, struct lov_mds_md **lmm,
-	       struct lov_stripe_md *lsm);
 int lov_unpackmd(struct obd_export *exp, struct lov_stripe_md **lsmp,
 		 struct lov_mds_md *lmm, int lmm_bytes);
 int lov_alloc_memmd(struct lov_stripe_md **lsmp, __u16 stripe_count,
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index 6530187..621f66e 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -971,14 +971,6 @@ out:
 	return rc;
 }
 
-#define ASSERT_LSM_MAGIC(lsmp)						  \
-do {									    \
-	LASSERT((lsmp));						\
-	LASSERTF(((lsmp)->lsm_magic == LOV_MAGIC_V1 ||			  \
-		 (lsmp)->lsm_magic == LOV_MAGIC_V3),			    \
-		 "%p->lsm_magic=%x\n", (lsmp), (lsmp)->lsm_magic);	      \
-} while (0)
-
 int lov_statfs_interpret(struct ptlrpc_request_set *rqset, void *data, int rc)
 {
 	struct lov_request_set *lovset = (struct lov_request_set *)data;
@@ -1414,7 +1406,6 @@ static struct obd_ops lov_obd_ops = {
 	.disconnect     = lov_disconnect,
 	.statfs         = lov_statfs,
 	.statfs_async   = lov_statfs_async,
-	.packmd         = lov_packmd,
 	.unpackmd       = lov_unpackmd,
 	.iocontrol      = lov_iocontrol,
 	.get_info       = lov_get_info,
diff --git a/drivers/staging/lustre/lustre/lov/lov_pack.c b/drivers/staging/lustre/lustre/lov/lov_pack.c
index 1156ef9..17bcead 100644
--- a/drivers/staging/lustre/lustre/lov/lov_pack.c
+++ b/drivers/staging/lustre/lustre/lov/lov_pack.c
@@ -153,96 +153,6 @@ ssize_t lov_lsm_pack(const struct lov_stripe_md *lsm, void *buf,
 	return lmm_size;
 }
 
-/* Pack LOV object metadata for disk storage.  It is packed in LE byte
- * order and is opaque to the networking layer.
- *
- * XXX In the future, this will be enhanced to get the EA size from the
- *     underlying OSC device(s) to get their EA sizes so we can stack
- *     LOVs properly.  For now lov_mds_md_size() just assumes one u64
- *     per stripe.
- */
-int lov_obd_packmd(struct lov_obd *lov, struct lov_mds_md **lmmp,
-		   struct lov_stripe_md *lsm)
-{
-	__u16 stripe_count;
-	int lmm_size, lmm_magic;
-
-	if (lsm) {
-		lmm_magic = lsm->lsm_magic;
-	} else {
-		if (lmmp && *lmmp)
-			lmm_magic = le32_to_cpu((*lmmp)->lmm_magic);
-		else
-			/* lsm == NULL and lmmp == NULL */
-			lmm_magic = LOV_MAGIC;
-	}
-
-	if ((lmm_magic != LOV_MAGIC_V1) &&
-	    (lmm_magic != LOV_MAGIC_V3)) {
-		CERROR("bad mem LOV MAGIC: 0x%08X != 0x%08X nor 0x%08X\n",
-		       lmm_magic, LOV_MAGIC_V1, LOV_MAGIC_V3);
-		return -EINVAL;
-	}
-
-	if (lsm) {
-		/* If we are just sizing the EA, limit the stripe count
-		 * to the actual number of OSTs in this filesystem.
-		 */
-		if (!lmmp) {
-			stripe_count = lov_get_stripecnt(lov, lmm_magic,
-							 lsm->lsm_stripe_count);
-			lsm->lsm_stripe_count = stripe_count;
-		} else if (!lsm_is_released(lsm)) {
-			stripe_count = lsm->lsm_stripe_count;
-		} else {
-			stripe_count = 0;
-		}
-	} else {
-		/*
-		 * To calculate maximum easize by active targets at present,
-		 * which is exactly the maximum easize to be seen by LOV
-		 */
-		stripe_count = lov->desc.ld_active_tgt_count;
-	}
-
-	/* XXX LOV STACKING call into osc for sizes */
-	lmm_size = lov_mds_md_size(stripe_count, lmm_magic);
-
-	if (!lmmp)
-		return lmm_size;
-
-	if (*lmmp && !lsm) {
-		stripe_count = le16_to_cpu((*lmmp)->lmm_stripe_count);
-		lmm_size = lov_mds_md_size(stripe_count, lmm_magic);
-		kvfree(*lmmp);
-		*lmmp = NULL;
-		return 0;
-	}
-
-	if (!*lmmp) {
-		*lmmp = libcfs_kvzalloc(lmm_size, GFP_NOFS);
-		if (!*lmmp)
-			return -ENOMEM;
-	}
-
-	CDEBUG(D_INFO, "lov_packmd: LOV_MAGIC 0x%08X, lmm_size = %d\n",
-	       lmm_magic, lmm_size);
-
-	if (!lsm)
-		return lmm_size;
-
-	return lov_lsm_pack(lsm, *lmmp, lmm_size);
-}
-
-int lov_packmd(struct obd_export *exp, struct lov_mds_md **lmmp,
-	       struct lov_stripe_md *lsm)
-{
-	struct obd_device *obd = class_exp2obd(exp);
-	struct lov_obd *lov = &obd->u.lov;
-
-	return lov_obd_packmd(lov, lmmp, lsm);
-}
-
 /* Find the max stripecount we should use */
 __u16 lov_get_stripecnt(struct lov_obd *lov, __u32 magic, __u16 stripe_count)
 {
@@ -393,15 +303,14 @@ int lov_unpackmd(struct obd_export *exp,  struct lov_stripe_md **lsmp,
 int lov_getstripe(struct lov_object *obj, struct lov_stripe_md *lsm,
 		  struct lov_user_md __user *lump)
 {
-	/*
-	 * XXX huge struct allocated on stack.
-	 */
 	/* we use lov_user_md_v3 because it is larger than lov_user_md_v1 */
-	struct lov_obd *lov;
 	struct lov_user_md_v3 lum;
-	struct lov_mds_md *lmmk = NULL;
-	int rc, lmmk_size, lmm_size;
-	int lum_size;
+	struct lov_mds_md *lmmk;
+	u32 stripe_count;
+	ssize_t lmm_size;
+	size_t lmmk_size;
+	size_t lum_size;
+	int rc;
 	mm_segment_t seg;
 
 	if (!lsm)
@@ -414,6 +323,18 @@ int lov_getstripe(struct lov_object *obj, struct lov_stripe_md *lsm,
 	seg = get_fs();
 	set_fs(KERNEL_DS);
 
+	if (lsm->lsm_magic != LOV_MAGIC_V1 && lsm->lsm_magic != LOV_MAGIC_V3) {
+		CERROR("bad LSM MAGIC: 0x%08X != 0x%08X nor 0x%08X\n",
+		       lsm->lsm_magic, LOV_MAGIC_V1, LOV_MAGIC_V3);
+		rc = -EIO;
+		goto out;
+	}
+
+	if (!lsm_is_released(lsm))
+		stripe_count = lsm->lsm_stripe_count;
+	else
+		stripe_count = 0;
+
 	/* we only need the header part from user space to get lmm_magic and
 	 * lmm_stripe_count, (the header part is common to v1 and v3)
 	 */
@@ -432,32 +353,40 @@ int lov_getstripe(struct lov_object *obj, struct lov_stripe_md *lsm,
 	if (lum.lmm_stripe_count &&
 	    (lum.lmm_stripe_count < lsm->lsm_stripe_count)) {
 		/* Return right size of stripe to user */
-		lum.lmm_stripe_count = lsm->lsm_stripe_count;
+		lum.lmm_stripe_count = stripe_count;
 		rc = copy_to_user(lump, &lum, lum_size);
 		rc = -EOVERFLOW;
 		goto out;
 	}
-	lov = lu2lov_dev(obj->lo_cl.co_lu.lo_dev)->ld_lov;
-	rc = lov_obd_packmd(lov, &lmmk, lsm);
-	if (rc < 0)
+	lmmk_size = lov_mds_md_size(stripe_count, lsm->lsm_magic);
+
+
+	lmmk = libcfs_kvzalloc(lmmk_size, GFP_NOFS);
+	if (!lmmk) {
+		rc = -ENOMEM;
 		goto out;
-	lmmk_size = rc;
-	lmm_size = rc;
-	rc = 0;
+	}
+
+	lmm_size = lov_lsm_pack(lsm, lmmk, lmmk_size);
+	if (lmm_size < 0) {
+		rc = lmm_size;
+		goto out_free;
+	}
 
 	/* FIXME: Bug 1185 - copy fields properly when structs change */
 	/* struct lov_user_md_v3 and struct lov_mds_md_v3 must be the same */
 	CLASSERT(sizeof(lum) == sizeof(struct lov_mds_md_v3));
 	CLASSERT(sizeof(lum.lmm_objects[0]) == sizeof(lmmk->lmm_objects[0]));
 
-	if ((cpu_to_le32(LOV_MAGIC) != LOV_MAGIC) &&
-	    ((lmmk->lmm_magic == cpu_to_le32(LOV_MAGIC_V1)) ||
-	    (lmmk->lmm_magic == cpu_to_le32(LOV_MAGIC_V3)))) {
+	if (cpu_to_le32(LOV_MAGIC) != LOV_MAGIC &&
+	    (lmmk->lmm_magic == cpu_to_le32(LOV_MAGIC_V1) ||
+	     lmmk->lmm_magic == cpu_to_le32(LOV_MAGIC_V3))) {
 		lustre_swab_lov_mds_md(lmmk);
 		lustre_swab_lov_user_md_objects(
 				(struct lov_user_ost_data *)lmmk->lmm_objects,
 				lmmk->lmm_stripe_count);
 	}
+
 	if (lum.lmm_magic == LOV_USER_MAGIC) {
 		/* User request for v1, we need skip lmm_pool_name */
 		if (lmmk->lmm_magic == LOV_MAGIC_V3) {
@@ -491,7 +420,7 @@ int lov_getstripe(struct lov_object *obj, struct lov_stripe_md *lsm,
 		rc = -EFAULT;
 
 out_free:
-	kfree(lmmk);
+	kvfree(lmmk);
 out:
 	set_fs(seg);
 	return rc;
diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index af373af..10d0b9d 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -447,14 +447,6 @@ static int mdc_get_lustre_md(struct obd_export *exp,
 		if (rc < 0)
 			goto out;
 
-		if (rc < (typeof(rc))sizeof(*md->lsm)) {
-			CDEBUG(D_INFO,
-			       "lsm size too small: rc < sizeof (*md->lsm) (%d < %d)\n",
-			       rc, (int)sizeof(*md->lsm));
-			rc = -EPROTO;
-			goto out;
-		}
-
 	} else if (md->body->mbo_valid & OBD_MD_FLDIREA) {
 		int lmvsize;
 		struct lov_mds_md *lmv;
diff --git a/drivers/staging/lustre/lustre/osc/osc_request.c b/drivers/staging/lustre/lustre/osc/osc_request.c
index 0985bda..038f00c 100644
--- a/drivers/staging/lustre/lustre/osc/osc_request.c
+++ b/drivers/staging/lustre/lustre/osc/osc_request.c
@@ -103,68 +103,6 @@ static void osc_release_ppga(struct brw_page **ppga, u32 count);
 static int brw_interpret(const struct lu_env *env,
 			 struct ptlrpc_request *req, void *data, int rc);
 
-/* Unpack OSC object metadata from disk storage (LE byte order). */
-static int osc_unpackmd(struct obd_export *exp, struct lov_stripe_md **lsmp,
-			struct lov_mds_md *lmm, int lmm_bytes)
-{
-	int lsm_size;
-	struct obd_import *imp = class_exp2cliimp(exp);
-
-	if (lmm) {
-		if (lmm_bytes < sizeof(*lmm)) {
-			CERROR("%s: lov_mds_md too small: %d, need %d\n",
-			       exp->exp_obd->obd_name, lmm_bytes,
-			       (int)sizeof(*lmm));
-			return -EINVAL;
-		}
-		/* XXX LOV_MAGIC etc check? */
-
-		if (unlikely(ostid_id(&lmm->lmm_oi) == 0)) {
-			CERROR("%s: zero lmm_object_id: rc = %d\n",
-			       exp->exp_obd->obd_name, -EINVAL);
-			return -EINVAL;
-		}
-	}
-
-	lsm_size = lov_stripe_md_size(1);
-	if (!lsmp)
-		return lsm_size;
-
-	if (*lsmp && !lmm) {
-		kfree((*lsmp)->lsm_oinfo[0]);
-		kfree(*lsmp);
-		*lsmp = NULL;
-		return 0;
-	}
-
-	if (!*lsmp) {
-		*lsmp = kzalloc(lsm_size, GFP_NOFS);
-		if (unlikely(!*lsmp))
-			return -ENOMEM;
-		(*lsmp)->lsm_oinfo[0] = kzalloc(sizeof(struct lov_oinfo),
-						GFP_NOFS);
-		if (unlikely(!(*lsmp)->lsm_oinfo[0])) {
-			kfree(*lsmp);
-			return -ENOMEM;
-		}
-		loi_init((*lsmp)->lsm_oinfo[0]);
-	} else if (unlikely(ostid_id(&(*lsmp)->lsm_oi) == 0)) {
-		return -EBADF;
-	}
-
-	if (lmm)
-		/* XXX zero *lsmp? */
-		ostid_le_to_cpu(&lmm->lmm_oi, &(*lsmp)->lsm_oi);
-
-	if (imp &&
-	    (imp->imp_connect_data.ocd_connect_flags & OBD_CONNECT_MAXBYTES))
-		(*lsmp)->lsm_maxbytes = imp->imp_connect_data.ocd_maxbytes;
-	else
-		(*lsmp)->lsm_maxbytes = LUSTRE_EXT3_STRIPE_MAXBYTES;
-
-	return lsm_size;
-}
-
 static inline void osc_pack_req_body(struct ptlrpc_request *req,
 				     struct obdo *oa)
 {
@@ -2884,7 +2822,6 @@ static struct obd_ops osc_obd_ops = {
 	.disconnect     = osc_disconnect,
 	.statfs         = osc_statfs,
 	.statfs_async   = osc_statfs_async,
-	.unpackmd       = osc_unpackmd,
 	.create         = osc_create,
 	.destroy        = osc_destroy,
 	.getattr        = osc_getattr,
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread
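
As an illustration of the lov_stripe_md_size() helper that the patch above moves into lov_internal.h, here is a minimal userspace sketch of sizing and allocating a structure that ends in a variable-length array of per-stripe pointers. The names stripe_md, stripe_info, stripe_md_size() and stripe_md_alloc() are hypothetical stand-ins, not Lustre symbols, and calloc() is used in place of the kernel allocators.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct stripe_info;	/* opaque per-stripe data (hypothetical) */

/* Simplified stand-in for struct lov_stripe_md. */
struct stripe_md {
	uint32_t magic;
	uint16_t stripe_count;
	struct stripe_info *oinfo[];	/* C99 flexible array member */
};

/* Bytes needed for the header plus one pointer slot per stripe. */
static size_t stripe_md_size(unsigned int stripe_count)
{
	return sizeof(struct stripe_md) +
	       (size_t)stripe_count * sizeof(struct stripe_info *);
}

/* Allocate a zeroed descriptor; the caller releases it with free(). */
static struct stripe_md *stripe_md_alloc(uint32_t magic, uint16_t stripe_count)
{
	struct stripe_md *md = calloc(1, stripe_md_size(stripe_count));

	if (!md)
		return NULL;
	md->magic = magic;
	md->stripe_count = stripe_count;
	return md;
}
```

Keeping the size calculation in one helper means the allocation and any later bounds check agree on the layout, which is presumably why the patch keeps the helper next to the structure definition.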

* [PATCH 41/41] staging: lustre: echo: request pages in batches
  2016-10-03  2:27 ` [lustre-devel] " James Simmons
@ 2016-10-03  2:28   ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-03  2:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: Linux Kernel Mailing List, Lustre Development List,
	Alex Zhuravlev, James Simmons

From: Alex Zhuravlev <alexey.zhuravlev@intel.com>

Request pages in batches rather than fetching them one by one.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5278
Reviewed-on: http://review.whamcloud.com/13612
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
---
 .../staging/lustre/lustre/obdecho/echo_client.c    |   38 ++++++++-----------
 1 files changed, 16 insertions(+), 22 deletions(-)

diff --git a/drivers/staging/lustre/lustre/obdecho/echo_client.c b/drivers/staging/lustre/lustre/obdecho/echo_client.c
index d8e3e96..c69588c 100644
--- a/drivers/staging/lustre/lustre/obdecho/echo_client.c
+++ b/drivers/staging/lustre/lustre/obdecho/echo_client.c
@@ -1335,7 +1335,7 @@ static int echo_client_prep_commit(const struct lu_env *env,
 {
 	struct obd_ioobj ioo;
 	struct niobuf_local *lnb;
-	struct niobuf_remote *rnb;
+	struct niobuf_remote rnb;
 	u64 off;
 	u64 npages, tot_pages;
 	int i, ret = 0, brw_flags = 0;
@@ -1347,9 +1347,7 @@ static int echo_client_prep_commit(const struct lu_env *env,
 	tot_pages = count >> PAGE_SHIFT;
 
 	lnb = kcalloc(npages, sizeof(struct niobuf_local), GFP_NOFS);
-	rnb = kcalloc(npages, sizeof(struct niobuf_remote), GFP_NOFS);
-
-	if (!lnb || !rnb) {
+	if (!lnb) {
 		ret = -ENOMEM;
 		goto out;
 	}
@@ -1361,25 +1359,22 @@ static int echo_client_prep_commit(const struct lu_env *env,
 
 	off = offset;
 
-	for (; tot_pages; tot_pages -= npages) {
+	for (; tot_pages > 0; tot_pages -= npages) {
 		int lpages;
 
 		if (tot_pages < npages)
 			npages = tot_pages;
 
-		for (i = 0; i < npages; i++, off += PAGE_SIZE) {
-			rnb[i].rnb_offset = off;
-			rnb[i].rnb_len = PAGE_SIZE;
-			rnb[i].rnb_flags = brw_flags;
-		}
-
-		ioo.ioo_bufcnt = npages;
+		rnb.rnb_offset = off;
+		rnb.rnb_len = npages * PAGE_SIZE;
+		rnb.rnb_flags = brw_flags;
+		ioo.ioo_bufcnt = 1;
+		off += npages * PAGE_SIZE;
 
 		lpages = npages;
-		ret = obd_preprw(env, rw, exp, oa, 1, &ioo, rnb, &lpages, lnb);
+		ret = obd_preprw(env, rw, exp, oa, 1, &ioo, &rnb, &lpages, lnb);
 		if (ret != 0)
 			goto out;
-		LASSERT(lpages == npages);
 
 		for (i = 0; i < lpages; i++) {
 			struct page *page = lnb[i].lnb_page;
@@ -1398,17 +1393,17 @@ static int echo_client_prep_commit(const struct lu_env *env,
 
 			if (rw == OBD_BRW_WRITE)
 				echo_client_page_debug_setup(page, rw,
-							    ostid_id(&oa->o_oi),
-							     rnb[i].rnb_offset,
-							     rnb[i].rnb_len);
+							     ostid_id(&oa->o_oi),
+							     lnb[i].lnb_file_offset,
+							     lnb[i].lnb_len);
 			else
 				echo_client_page_debug_check(page,
-							    ostid_id(&oa->o_oi),
-							     rnb[i].rnb_offset,
-							     rnb[i].rnb_len);
+							     ostid_id(&oa->o_oi),
+							     lnb[i].lnb_file_offset,
+							     lnb[i].lnb_len);
 		}
 
-		ret = obd_commitrw(env, rw, exp, oa, 1, &ioo, rnb, npages, lnb,
+		ret = obd_commitrw(env, rw, exp, oa, 1, &ioo, &rnb, npages, lnb,
 				   ret);
 		if (ret != 0)
 			goto out;
@@ -1420,7 +1415,6 @@ static int echo_client_prep_commit(const struct lu_env *env,
 
 out:
 	kfree(lnb);
-	kfree(rnb);
 	return ret;
 }
 
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 98+ messages in thread
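
The batching idea in the patch above can be sketched outside the kernel as follows: instead of filling one remote-buffer descriptor per page, each pass describes a whole batch of contiguous pages with a single descriptor. This is only a sketch under stated assumptions. struct extent_desc is a hypothetical stand-in for struct niobuf_remote, SKETCH_PAGE_SIZE assumes 4096-byte pages, and describe_batches() is not a Lustre function.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* Hypothetical stand-in for struct niobuf_remote: one contiguous extent. */
struct extent_desc {
	uint64_t offset;
	uint32_t len;
	uint32_t flags;
};

/*
 * Describe tot_pages pages starting at offset, using one descriptor per
 * batch of at most batch_pages pages. Returns the number of descriptors
 * needed; only the first out_cap of them are written to out.
 * Assumes batch_pages * SKETCH_PAGE_SIZE fits in 32 bits.
 */
static size_t describe_batches(uint64_t offset, uint64_t tot_pages,
			       uint64_t batch_pages, uint32_t flags,
			       struct extent_desc *out, size_t out_cap)
{
	size_t n = 0;

	if (batch_pages == 0)
		return 0;

	while (tot_pages > 0) {
		uint64_t npages = tot_pages < batch_pages ?
				  tot_pages : batch_pages;

		if (n < out_cap) {
			out[n].offset = offset;
			out[n].len = (uint32_t)(npages * SKETCH_PAGE_SIZE);
			out[n].flags = flags;
		}
		n++;
		offset += npages * SKETCH_PAGE_SIZE;
		tot_pages -= npages;
	}
	return n;
}
```

With 10 pages and a batch size of 4, this yields three descriptors (4, 4 and 2 pages) rather than ten single-page ones, which mirrors the move from rnb[i] to a single rnb in the diff.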

* Re: [PATCH 32/41] staging: lustre: llite: restart short read/write for normal IO
  2016-10-03  2:28   ` [lustre-devel] " James Simmons
@ 2016-10-09 14:16     ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 98+ messages in thread
From: Greg Kroah-Hartman @ 2016-10-09 14:16 UTC (permalink / raw)
  To: James Simmons
  Cc: devel, Andreas Dilger, Oleg Drokin, Linux Kernel Mailing List,
	Lustre Development List, Bobi Jam, Jinshan Xiong

On Sun, Oct 02, 2016 at 10:28:28PM -0400, James Simmons wrote:
> From: Bobi Jam <bobijam.xu@intel.com>
> 
> If normal IO got short read/write, we'd restart the IO from where
> we've accomplished until we meet EOF or error happens.
> 
> Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
> Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6389
> Reviewed-on: http://review.whamcloud.com/14123
> Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> Signed-off-by: James Simmons <jsimmons@infradead.org>
> ---
>  drivers/staging/lustre/lnet/libcfs/fail.c          |    1 +
>  .../staging/lustre/lustre/include/obd_support.h    |    2 +
>  drivers/staging/lustre/lustre/llite/file.c         |   41 ++++++++++++--------
>  drivers/staging/lustre/lustre/llite/vvp_io.c       |   19 ++++++++-
>  4 files changed, 45 insertions(+), 18 deletions(-)

Due to other changes in the filesystem tree, this patch no longer
applies :(

Can you rebase it and resend?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 98+ messages in thread
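
For readers unfamiliar with the pattern named in the quoted commit message, restarting a short read from where the previous one stopped until EOF or an error, here is a userspace analogy using POSIX read(). It is not the llite implementation from the patch, and read_full() is a hypothetical helper name chosen for this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read up to count bytes from fd into buf, restarting after each short
 * read from where the previous one stopped. Stops at EOF (read() returns
 * 0) or on error. Returns the number of bytes read, or -1 if an error
 * occurs before any data was transferred.
 */
static ssize_t read_full(int fd, void *buf, size_t count)
{
	char *p = buf;
	size_t done = 0;

	while (done < count) {
		ssize_t n = read(fd, p + done, count - done);

		if (n == 0)		/* EOF */
			break;
		if (n < 0) {
			if (errno == EINTR)	/* interrupted, try again */
				continue;
			return done ? (ssize_t)done : -1;
		}
		done += (size_t)n;
	}
	return (ssize_t)done;
}
```

The design choice worth noting is that a partial result is reported in preference to an error once some data has been transferred, so the caller never loses bytes that were already read.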

* Re: [PATCH 35/41] staging: lustre: hsm: Use file lease to implement migration
  2016-10-03  2:28   ` [lustre-devel] " James Simmons
@ 2016-10-09 14:18     ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 98+ messages in thread
From: Greg Kroah-Hartman @ 2016-10-09 14:18 UTC (permalink / raw)
  To: James Simmons
  Cc: devel, Andreas Dilger, Oleg Drokin, Linux Kernel Mailing List,
	Lustre Development List, Henri Doreau, Jinshan Xiong

On Sun, Oct 02, 2016 at 10:28:31PM -0400, James Simmons wrote:
> From: Henri Doreau <henri.doreau@cea.fr>
> 
> Implement non-blocking migration based on exclusive open instead of
> group lock. Implemented exclusive close operation to atomically put
> a lease, swap two layouts and close a file. This allows race-free
> migrations.
> 
> Make the caller responsible for retrying on failure (EBUSY, EAGAIN)
> in non-blocking mode.
> 
> In blocking mode, allow applications to trigger layout swaps using a
> grouplock they already own, to prevent race conditions between the
> actual data copy and the layout swap. Updated lfs accordingly. File
> leases are also taken in blocking mode, so that lfs migrate can issue
> a warning if an application attempts to open a file that is being
> migrated and gets blocked.
> 
> Timestamps (atime/mtime) are set from userland, after the layout swap
> is performed, to prevent conflicts with the grouplock.
> 
> lli_trunc_sem is taken/released in the vvp_io layer, under the DLM
> lock. This re-ordering fixes the original issue between truncate and
> migrate.
> 
> Signed-off-by: Henri Doreau <henri.doreau@cea.fr>
> Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-4840
> Reviewed-on: http://review.whamcloud.com/10013
> Reviewed-by: John L. Hammond <john.hammond@intel.com>
> Reviewed-by: frank zago <fzago@cray.com>
> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> Signed-off-by: James Simmons <jsimmons@infradead.org>
> ---
>  .../lustre/lustre/include/lustre/lustre_idl.h      |    5 +-
>  .../lustre/lustre/include/lustre/lustre_user.h     |    1 +
>  .../lustre/lustre/include/lustre_req_layout.h      |    2 +-
>  drivers/staging/lustre/lustre/llite/file.c         |  231 ++++++++++++--------
>  drivers/staging/lustre/lustre/llite/llite_lib.c    |    4 -
>  drivers/staging/lustre/lustre/llite/vvp_io.c       |   82 +++++---
>  drivers/staging/lustre/lustre/mdc/mdc_lib.c        |   34 ++--
>  drivers/staging/lustre/lustre/mdc/mdc_request.c    |    7 +-
>  drivers/staging/lustre/lustre/ptlrpc/layout.c      |   10 +-
>  9 files changed, 235 insertions(+), 141 deletions(-)

This patch also failed to apply :(

^ permalink raw reply	[flat|nested] 98+ messages in thread
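
The quoted commit message makes the caller responsible for retrying on EBUSY or EAGAIN in non-blocking mode. A caller-side loop for that could look roughly like the sketch below. Everything here is hypothetical: retry_transient() and the op_fn callback type are illustration only, busy_then_ok() is a stand-in operation for demonstration, and none of them is part of the lfs or llite code.

```c
#include <errno.h>

/* Hypothetical operation: returns 0 on success or a negative errno. */
typedef int (*op_fn)(void *arg);

/*
 * Call op until it succeeds, fails with something other than
 * -EBUSY/-EAGAIN, or max_tries attempts have been made. Returns the
 * result of the last attempt, or -EINVAL if max_tries is not positive.
 */
static int retry_transient(op_fn op, void *arg, int max_tries)
{
	int rc = -EINVAL;
	int i;

	for (i = 0; i < max_tries; i++) {
		rc = op(arg);
		if (rc != -EBUSY && rc != -EAGAIN)
			break;
	}
	return rc;
}

/* Stand-in operation: fails with -EBUSY while *remaining is positive. */
static int busy_then_ok(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		return -EBUSY;
	}
	return 0;
}
```

A real caller would likely sleep or back off between attempts; that is left out here to keep the sketch short.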

* Re: [PATCH 32/41] staging: lustre: llite: restart short read/write for normal IO
  2016-10-09 14:16     ` [lustre-devel] " Greg Kroah-Hartman
@ 2016-10-11 23:22       ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-11 23:22 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: devel, Andreas Dilger, Oleg Drokin, Linux Kernel Mailing List,
	Lustre Development List, Bobi Jam, Jinshan Xiong


> On Sun, Oct 02, 2016 at 10:28:28PM -0400, James Simmons wrote:
> > From: Bobi Jam <bobijam.xu@intel.com>
> > 
> > If normal IO got short read/write, we'd restart the IO from where
> > we've accomplished until we meet EOF or error happens.
> > 
> > Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
> > Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
> > Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6389
> > Reviewed-on: http://review.whamcloud.com/14123
> > Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
> > Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> > Signed-off-by: James Simmons <jsimmons@infradead.org>
> > ---
> >  drivers/staging/lustre/lnet/libcfs/fail.c          |    1 +
> >  .../staging/lustre/lustre/include/obd_support.h    |    2 +
> >  drivers/staging/lustre/lustre/llite/file.c         |   41 ++++++++++++--------
> >  drivers/staging/lustre/lustre/llite/vvp_io.c       |   19 ++++++++-
> >  4 files changed, 45 insertions(+), 18 deletions(-)
> 
> Due to other changes in the filesystem tree, this patch no longer
> applies :(
> 
> Can you rebase it and resend?

How long will you be accepting patches to merge for? If it's going
to be a few weeks, I'd like to just include the two missing patches with
the next batch.

Another issue I need to look at is the IB changes. That's going to
require some heavy surgery to the ko2iblnd driver, so it's going to
take time for me to port it to the new RDMA RW API. That will
need to be pushed to Linus so ko2iblnd can work with the 4.9 tree,
if that is okay with you.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* [lustre-devel] [PATCH 32/41] staging: lustre: llite: restart short read/write for normal IO
@ 2016-10-11 23:22       ` James Simmons
  0 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-11 23:22 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: devel, Andreas Dilger, Oleg Drokin, Linux Kernel Mailing List,
	Lustre Development List, Bobi Jam, Jinshan Xiong


> On Sun, Oct 02, 2016 at 10:28:28PM -0400, James Simmons wrote:
> > From: Bobi Jam <bobijam.xu@intel.com>
> > 
> > If normal IO got short read/write, we'd restart the IO from where
> > we've accomplished until we meet EOF or error happens.
> > 
> > Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
> > Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
> > Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6389
> > Reviewed-on: http://review.whamcloud.com/14123
> > Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
> > Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> > Signed-off-by: James Simmons <jsimmons@infradead.org>
> > ---
> >  drivers/staging/lustre/lnet/libcfs/fail.c          |    1 +
> >  .../staging/lustre/lustre/include/obd_support.h    |    2 +
> >  drivers/staging/lustre/lustre/llite/file.c         |   41 ++++++++++++--------
> >  drivers/staging/lustre/lustre/llite/vvp_io.c       |   19 ++++++++-
> >  4 files changed, 45 insertions(+), 18 deletions(-)
> 
> Due to other changes in the filesystem tree, this patch no longer
> applies :(
> 
> Can you rebase it and resend?

How long will you be accepting patches to merge for? If it's going
to be a few weeks, I'd like to just include the two missing patches with
the next batch.

Another issue I need to look at is the IB changes. That's going to
require some heavy surgery to the ko2iblnd driver, so it's going to
take time for me to port it to the new RDMA RW API. That will
need to be pushed to Linus so ko2iblnd can work with the 4.9 tree,
if that is okay with you.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 32/41] staging: lustre: llite: restart short read/write for normal IO
  2016-10-11 23:22       ` [lustre-devel] " James Simmons
@ 2016-10-12  6:08         ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 98+ messages in thread
From: Greg Kroah-Hartman @ 2016-10-12  6:08 UTC (permalink / raw)
  To: James Simmons
  Cc: devel, Andreas Dilger, Linux Kernel Mailing List, Oleg Drokin,
	Bobi Jam, Jinshan Xiong, Lustre Development List

On Wed, Oct 12, 2016 at 12:22:35AM +0100, James Simmons wrote:
> 
> > On Sun, Oct 02, 2016 at 10:28:28PM -0400, James Simmons wrote:
> > > From: Bobi Jam <bobijam.xu@intel.com>
> > > 
> > > If normal IO got short read/write, we'd restart the IO from where
> > > we've accomplished until we meet EOF or error happens.
> > > 
> > > Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
> > > Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
> > > Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6389
> > > Reviewed-on: http://review.whamcloud.com/14123
> > > Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
> > > Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> > > Signed-off-by: James Simmons <jsimmons@infradead.org>
> > > ---
> > >  drivers/staging/lustre/lnet/libcfs/fail.c          |    1 +
> > >  .../staging/lustre/lustre/include/obd_support.h    |    2 +
> > >  drivers/staging/lustre/lustre/llite/file.c         |   41 ++++++++++++--------
> > >  drivers/staging/lustre/lustre/llite/vvp_io.c       |   19 ++++++++-
> > >  4 files changed, 45 insertions(+), 18 deletions(-)
> > 
> > Due to other changes in the filesystem tree, this patch no longer
> > applies :(
> > 
> > Can you rebase it and resend?
> 
> How long will you be accepting patches to merge for? If its going
> to be a few weeks like to just include the missing two patches with
> the next batch.

I don't understand the question.  I always accept patches, no need to
not send them, I'll queue them up to the proper branches as needed.  So
what do you mean here?

> Another issue I need to look at is the IB changes. That's going to
> require some heavy surgery to the ko2iblnd driver so its going to
> take time for me to port this to the new RDMA RW api. That will
> need to be push to linus so ko2iblnd can work with the 4.9 tree
> if that is okay with you.

Sure, send the patches, but maybe it is a 4.10 thing if it's too much
work?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 98+ messages in thread

* [lustre-devel] [PATCH 32/41] staging: lustre: llite: restart short read/write for normal IO
@ 2016-10-12  6:08         ` Greg Kroah-Hartman
  0 siblings, 0 replies; 98+ messages in thread
From: Greg Kroah-Hartman @ 2016-10-12  6:08 UTC (permalink / raw)
  To: James Simmons
  Cc: devel, Andreas Dilger, Linux Kernel Mailing List, Oleg Drokin,
	Bobi Jam, Jinshan Xiong, Lustre Development List

On Wed, Oct 12, 2016 at 12:22:35AM +0100, James Simmons wrote:
> 
> > On Sun, Oct 02, 2016 at 10:28:28PM -0400, James Simmons wrote:
> > > From: Bobi Jam <bobijam.xu@intel.com>
> > > 
> > > If normal IO got short read/write, we'd restart the IO from where
> > > we've accomplished until we meet EOF or error happens.
> > > 
> > > Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
> > > Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
> > > Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6389
> > > Reviewed-on: http://review.whamcloud.com/14123
> > > Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
> > > Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> > > Signed-off-by: James Simmons <jsimmons@infradead.org>
> > > ---
> > >  drivers/staging/lustre/lnet/libcfs/fail.c          |    1 +
> > >  .../staging/lustre/lustre/include/obd_support.h    |    2 +
> > >  drivers/staging/lustre/lustre/llite/file.c         |   41 ++++++++++++--------
> > >  drivers/staging/lustre/lustre/llite/vvp_io.c       |   19 ++++++++-
> > >  4 files changed, 45 insertions(+), 18 deletions(-)
> > 
> > Due to other changes in the filesystem tree, this patch no longer
> > applies :(
> > 
> > Can you rebase it and resend?
> 
> How long will you be accepting patches to merge for? If its going
> to be a few weeks like to just include the missing two patches with
> the next batch.

I don't understand the question.  I always accept patches, no need to
not send them, I'll queue them up to the proper branches as needed.  So
what do you mean here?

> Another issue I need to look at is the IB changes. That's going to
> require some heavy surgery to the ko2iblnd driver so its going to
> take time for me to port this to the new RDMA RW api. That will
> need to be push to linus so ko2iblnd can work with the 4.9 tree
> if that is okay with you.

Sure, send the patches, but maybe it is a 4.10 thing if it's too much
work?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 32/41] staging: lustre: llite: restart short read/write for normal IO
  2016-10-12  6:08         ` [lustre-devel] " Greg Kroah-Hartman
@ 2016-10-13 22:45           ` James Simmons
  -1 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-13 22:45 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: devel, Andreas Dilger, Linux Kernel Mailing List, Oleg Drokin,
	Bobi Jam, Jinshan Xiong, Lustre Development List


> On Wed, Oct 12, 2016 at 12:22:35AM +0100, James Simmons wrote:
> > 
> > > On Sun, Oct 02, 2016 at 10:28:28PM -0400, James Simmons wrote:
> > > > From: Bobi Jam <bobijam.xu@intel.com>
> > > > 
> > > > If normal IO got short read/write, we'd restart the IO from where
> > > > we've accomplished until we meet EOF or error happens.
> > > > 
> > > > Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
> > > > Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
> > > > Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6389
> > > > Reviewed-on: http://review.whamcloud.com/14123
> > > > Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
> > > > Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> > > > Signed-off-by: James Simmons <jsimmons@infradead.org>
> > > > ---
> > > >  drivers/staging/lustre/lnet/libcfs/fail.c          |    1 +
> > > >  .../staging/lustre/lustre/include/obd_support.h    |    2 +
> > > >  drivers/staging/lustre/lustre/llite/file.c         |   41 ++++++++++++--------
> > > >  drivers/staging/lustre/lustre/llite/vvp_io.c       |   19 ++++++++-
> > > >  4 files changed, 45 insertions(+), 18 deletions(-)
> > > 
> > > Due to other changes in the filesystem tree, this patch no longer
> > > applies :(
> > > 
> > > Can you rebase it and resend?
> > 
> > How long will you be accepting patches to merge for? If its going
> > to be a few weeks like to just include the missing two patches with
> > the next batch.
> 
> I don't understand the question.  I always accept patches, no need to
> not send them, I'll queue them up to the proper branches as needed.  So
> what do you mean here?

I had the impression that more complex patches like the ones I have been
sending tend to be accepted only at the start of the release cycle and only
simpler patches go into *-rc[3-7] versions. That is why I asked the
above question.
 
> > Another issue I need to look at is the IB changes. That's going to
> > require some heavy surgery to the ko2iblnd driver so its going to
> > take time for me to port this to the new RDMA RW api. That will
> > need to be push to linus so ko2iblnd can work with the 4.9 tree
> > if that is okay with you.
> 
> Sure, send the patches, but maybe it is a 4.10 thing if it's too much
> work?

We can work it out once I have something useful. Thanks.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* [lustre-devel] [PATCH 32/41] staging: lustre: llite: restart short read/write for normal IO
@ 2016-10-13 22:45           ` James Simmons
  0 siblings, 0 replies; 98+ messages in thread
From: James Simmons @ 2016-10-13 22:45 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: devel, Andreas Dilger, Linux Kernel Mailing List, Oleg Drokin,
	Bobi Jam, Jinshan Xiong, Lustre Development List


> On Wed, Oct 12, 2016 at 12:22:35AM +0100, James Simmons wrote:
> > 
> > > On Sun, Oct 02, 2016 at 10:28:28PM -0400, James Simmons wrote:
> > > > From: Bobi Jam <bobijam.xu@intel.com>
> > > > 
> > > > If normal IO got short read/write, we'd restart the IO from where
> > > > we've accomplished until we meet EOF or error happens.
> > > > 
> > > > Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
> > > > Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
> > > > Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6389
> > > > Reviewed-on: http://review.whamcloud.com/14123
> > > > Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
> > > > Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> > > > Signed-off-by: James Simmons <jsimmons@infradead.org>
> > > > ---
> > > >  drivers/staging/lustre/lnet/libcfs/fail.c          |    1 +
> > > >  .../staging/lustre/lustre/include/obd_support.h    |    2 +
> > > >  drivers/staging/lustre/lustre/llite/file.c         |   41 ++++++++++++--------
> > > >  drivers/staging/lustre/lustre/llite/vvp_io.c       |   19 ++++++++-
> > > >  4 files changed, 45 insertions(+), 18 deletions(-)
> > > 
> > > Due to other changes in the filesystem tree, this patch no longer
> > > applies :(
> > > 
> > > Can you rebase it and resend?
> > 
> > How long will you be accepting patches to merge for? If its going
> > to be a few weeks like to just include the missing two patches with
> > the next batch.
> 
> I don't understand the question.  I always accept patches, no need to
> not send them, I'll queue them up to the proper branches as needed.  So
> what do you mean here?

I had the impression that more complex patches like the ones I have been
sending tend to be accepted only at the start of the release cycle and only
simpler patches go into *-rc[3-7] versions. That is why I asked the
above question.
 
> > Another issue I need to look at is the IB changes. That's going to
> > require some heavy surgery to the ko2iblnd driver so its going to
> > take time for me to port this to the new RDMA RW api. That will
> > need to be push to linus so ko2iblnd can work with the 4.9 tree
> > if that is okay with you.
> 
> Sure, send the patches, but maybe it is a 4.10 thing if it's too much
> work?

We can work it out once I have something useful. Thanks.

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 32/41] staging: lustre: llite: restart short read/write for normal IO
  2016-10-13 22:45           ` [lustre-devel] " James Simmons
@ 2016-10-14  7:41             ` Greg Kroah-Hartman
  -1 siblings, 0 replies; 98+ messages in thread
From: Greg Kroah-Hartman @ 2016-10-14  7:41 UTC (permalink / raw)
  To: James Simmons
  Cc: devel, Andreas Dilger, Linux Kernel Mailing List, Oleg Drokin,
	Bobi Jam, Jinshan Xiong, Lustre Development List

On Thu, Oct 13, 2016 at 11:45:28PM +0100, James Simmons wrote:
> 
> > On Wed, Oct 12, 2016 at 12:22:35AM +0100, James Simmons wrote:
> > > 
> > > > On Sun, Oct 02, 2016 at 10:28:28PM -0400, James Simmons wrote:
> > > > > From: Bobi Jam <bobijam.xu@intel.com>
> > > > > 
> > > > > If normal IO got short read/write, we'd restart the IO from where
> > > > > we've accomplished until we meet EOF or error happens.
> > > > > 
> > > > > Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
> > > > > Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
> > > > > Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6389
> > > > > Reviewed-on: http://review.whamcloud.com/14123
> > > > > Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
> > > > > Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> > > > > Signed-off-by: James Simmons <jsimmons@infradead.org>
> > > > > ---
> > > > >  drivers/staging/lustre/lnet/libcfs/fail.c          |    1 +
> > > > >  .../staging/lustre/lustre/include/obd_support.h    |    2 +
> > > > >  drivers/staging/lustre/lustre/llite/file.c         |   41 ++++++++++++--------
> > > > >  drivers/staging/lustre/lustre/llite/vvp_io.c       |   19 ++++++++-
> > > > >  4 files changed, 45 insertions(+), 18 deletions(-)
> > > > 
> > > > Due to other changes in the filesystem tree, this patch no longer
> > > > applies :(
> > > > 
> > > > Can you rebase it and resend?
> > > 
> > > How long will you be accepting patches to merge for? If its going
> > > to be a few weeks like to just include the missing two patches with
> > > the next batch.
> > 
> > I don't understand the question.  I always accept patches, no need to
> > not send them, I'll queue them up to the proper branches as needed.  So
> > what do you mean here?
> 
> I had the impression that more complex patches like the ones I have been
> sending tend to accepted only at the start of the release cycle and only
> simpler patches go into *-rc[3-7] versions. That is why I asked the
> above question.

Yes, that is true, but I will take your "complex" patches and put them
into the -next branch to go to the next kernel release, and only take
bug and regression fixes and add them to the -linus branch to go to the
-rc3-7 releases.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 98+ messages in thread

* [lustre-devel] [PATCH 32/41] staging: lustre: llite: restart short read/write for normal IO
@ 2016-10-14  7:41             ` Greg Kroah-Hartman
  0 siblings, 0 replies; 98+ messages in thread
From: Greg Kroah-Hartman @ 2016-10-14  7:41 UTC (permalink / raw)
  To: James Simmons
  Cc: devel, Andreas Dilger, Linux Kernel Mailing List, Oleg Drokin,
	Bobi Jam, Jinshan Xiong, Lustre Development List

On Thu, Oct 13, 2016 at 11:45:28PM +0100, James Simmons wrote:
> 
> > On Wed, Oct 12, 2016 at 12:22:35AM +0100, James Simmons wrote:
> > > 
> > > > On Sun, Oct 02, 2016 at 10:28:28PM -0400, James Simmons wrote:
> > > > > From: Bobi Jam <bobijam.xu@intel.com>
> > > > > 
> > > > > If normal IO got short read/write, we'd restart the IO from where
> > > > > we've accomplished until we meet EOF or error happens.
> > > > > 
> > > > > Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
> > > > > Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
> > > > > Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6389
> > > > > Reviewed-on: http://review.whamcloud.com/14123
> > > > > Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
> > > > > Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> > > > > Signed-off-by: James Simmons <jsimmons@infradead.org>
> > > > > ---
> > > > >  drivers/staging/lustre/lnet/libcfs/fail.c          |    1 +
> > > > >  .../staging/lustre/lustre/include/obd_support.h    |    2 +
> > > > >  drivers/staging/lustre/lustre/llite/file.c         |   41 ++++++++++++--------
> > > > >  drivers/staging/lustre/lustre/llite/vvp_io.c       |   19 ++++++++-
> > > > >  4 files changed, 45 insertions(+), 18 deletions(-)
> > > > 
> > > > Due to other changes in the filesystem tree, this patch no longer
> > > > applies :(
> > > > 
> > > > Can you rebase it and resend?
> > > 
> > > How long will you be accepting patches to merge for? If its going
> > > to be a few weeks like to just include the missing two patches with
> > > the next batch.
> > 
> > I don't understand the question.  I always accept patches, no need to
> > not send them, I'll queue them up to the proper branches as needed.  So
> > what do you mean here?
> 
> I had the impression that more complex patches like the ones I have been
> sending tend to accepted only at the start of the release cycle and only
> simpler patches go into *-rc[3-7] versions. That is why I asked the
> above question.

Yes, that is true, but I will take your "complex" patches and put them
into the -next branch to go to the next kernel release, and only take
bug and regression fixes and add them to the -linus branch to go to the
-rc3-7 releases.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 98+ messages in thread

* Re: [PATCH 24/41] staging: lustre: clio: add CIT_DATA_VERSION and remove IOC_LOV_GETINFO
  2016-10-03  2:28   ` [lustre-devel] " James Simmons
@ 2018-02-22  3:06     ` NeilBrown
  -1 siblings, 0 replies; 98+ messages in thread
From: NeilBrown @ 2018-02-22  3:06 UTC (permalink / raw)
  To: James Simmons, Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: John L. Hammond, Bobi Jam, Linux Kernel Mailing List,
	Lustre Development List



Another ancient patch....


On Sun, Oct 02 2016, James Simmons wrote:

> From: John L. Hammond <john.hammond@intel.com>
>
> During development a new api, cl_object_obd_info_get()
> and cl_object_data_version() which then were later
> replaced by a better solution CIT_DATA_VERSION. For
> the case of the upstream client their is no point in
> introducing a API to only have it removed later. Due
> to the way the patches landed with their dependencies
> it is not possible to separate out two patches. These
> two combined patches do the following:
>
>  * Add a new cl_io type CIT_DATA_VERSION to get file
>    data version.
>  * Remove the unused IOC_LOV_GETINFO ioctl.
>  * Remove ll_glimpse_ioctl() and ll_lsm_getattr().
>  * Remove the OBD API method obd_getattr_async().
>
> Signed-off-by: John L. Hammond <john.hammond@intel.com>
> Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5823
> Reviewed-on: http://review.whamcloud.com/12748
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6356
> Reviewed-on: http://review.whamcloud.com/14649
> Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
> Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> Signed-off-by: James Simmons <jsimmons@infradead.org>

With so many Reviewed-by :-)


>  
> +static void
> +lov_io_data_version_end(const struct lu_env *env, const struct cl_io_slice *ios)
> +{
> +	struct lov_io *lio = cl2lov_io(env, ios);
> +	struct cl_io *parent = lio->lis_cl.cis_io;
> +	struct lov_io_sub *sub;
> +
> +	list_for_each_entry(sub, &lio->lis_active, sub_linkage) {
> +		lov_io_end_wrapper(env, sub->sub_io);

This sort of construct occurs several other times in the same file:

lov_io_read_ahead:
	rc = cl_io_read_ahead(sub->sub_env, sub->sub_io,

lov_io_submit:
		rc = cl_io_submit_rw(sub->sub_env, sub->sub_io,
				     crt, queue);

lov_io_commit_async:
		rc = cl_io_commit_async(sub->sub_env, sub->sub_io, queue,
					from, to, cb);

lov_io_fsync_end:
		struct cl_io *subio = sub->sub_io;
		lov_io_end_wrapper(sub->sub_env, subio);


Every other time, sub->sub_env is used with sub->sub_io.
In this new code, 'env' is (incorrectly) used with sub->sub_io.

This reliably causes my testing to crash as the LINVRNT() in cl2osc_io()
fails, and I test with
   CONFIG_LUSTRE_DEBUG_EXPENSIVE_CHECK=y
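
To make the mismatch concrete, here is a minimal user-space sketch. It is
not the Lustre code: the structs are simplified stand-ins, and the `owner`
field, end_wrapper_ok() and sketch_count_passing() are made-up names used
only for illustration. It assumes each sub-io is only valid together with
the env it was set up under, and models the invariant check as a simple
pointer comparison.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; fields are illustrative only. */
struct lu_env { int id; };
struct cl_io { const struct lu_env *owner; };	/* env the io was set up with */

struct lov_io_sub {
	struct lu_env *sub_env;
	struct cl_io *sub_io;
};

/* Models the invariant check: 1 if env belongs to this io, 0 otherwise. */
static int end_wrapper_ok(const struct lu_env *env, const struct cl_io *io)
{
	return io->owner == env;
}

/* Pattern from the new code: the caller's env is paired with every sub-io. */
static int end_all_with_parent_env(const struct lu_env *env,
				   const struct lov_io_sub *subs, size_t n)
{
	int ok = 0;

	for (size_t i = 0; i < n; i++)
		ok += end_wrapper_ok(env, subs[i].sub_io);
	return ok;
}

/* Pattern used elsewhere in the file: each sub-io is paired with its own env. */
static int end_all_with_sub_env(const struct lov_io_sub *subs, size_t n)
{
	int ok = 0;

	for (size_t i = 0; i < n; i++)
		ok += end_wrapper_ok(subs[i].sub_env, subs[i].sub_io);
	return ok;
}

/* Builds two sub-ios and reports how many pass the check for each pattern. */
int sketch_count_passing(int use_sub_env)
{
	struct lu_env parent = { 0 }, e1 = { 1 }, e2 = { 2 };
	struct cl_io io1 = { &e1 }, io2 = { &e2 };
	struct lov_io_sub subs[] = { { &e1, &io1 }, { &e2, &io2 } };

	if (use_sub_env)
		return end_all_with_sub_env(subs, 2);
	return end_all_with_parent_env(&parent, subs, 2);
}
```

In this model, sketch_count_passing(0) passes the check for neither sub-io,
while sketch_count_passing(1) passes it for both, which is the difference
between passing 'env' and passing sub->sub_env.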

Does anyone have any idea why no other testing trips over this?

lustre-release has the same bug in
  Commit: fcd45488711a ("LU-5683 clio: add CIT_DATA_VERSION")

I'll send a patch for Linux.  Someone else might like to fix
lustre-release.

Thanks,
NeilBrown


^ permalink raw reply	[flat|nested] 98+ messages in thread

* [lustre-devel] [PATCH 24/41] staging: lustre: clio: add CIT_DATA_VERSION and remove IOC_LOV_GETINFO
@ 2018-02-22  3:06     ` NeilBrown
  0 siblings, 0 replies; 98+ messages in thread
From: NeilBrown @ 2018-02-22  3:06 UTC (permalink / raw)
  To: James Simmons, Greg Kroah-Hartman, devel, Andreas Dilger, Oleg Drokin
  Cc: John L. Hammond, Bobi Jam, Linux Kernel Mailing List,
	Lustre Development List


Another ancient patch....


On Sun, Oct 02 2016, James Simmons wrote:

> From: John L. Hammond <john.hammond@intel.com>
>
> During development a new api, cl_object_obd_info_get()
> and cl_object_data_version() which then were later
> replaced by a better solution CIT_DATA_VERSION. For
> the case of the upstream client their is no point in
> introducing a API to only have it removed later. Due
> to the way the patches landed with their dependencies
> it is not possible to separate out two patches. These
> two combined patches do the following:
>
>  * Add a new cl_io type CIT_DATA_VERSION to get file
>    data version.
>  * Remove the unused IOC_LOV_GETINFO ioctl.
>  * Remove ll_glimpse_ioctl() and ll_lsm_getattr().
>  * Remove the OBD API method obd_getattr_async().
>
> Signed-off-by: John L. Hammond <john.hammond@intel.com>
> Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5823
> Reviewed-on: http://review.whamcloud.com/12748
> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6356
> Reviewed-on: http://review.whamcloud.com/14649
> Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
> Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
> Signed-off-by: James Simmons <jsimmons@infradead.org>

With so many Reviewed-by :-)


>  
> +static void
> +lov_io_data_version_end(const struct lu_env *env, const struct cl_io_slice *ios)
> +{
> +	struct lov_io *lio = cl2lov_io(env, ios);
> +	struct cl_io *parent = lio->lis_cl.cis_io;
> +	struct lov_io_sub *sub;
> +
> +	list_for_each_entry(sub, &lio->lis_active, sub_linkage) {
> +		lov_io_end_wrapper(env, sub->sub_io);

This sort of construct occurs several other times in the same file:

lov_io_read_ahead:
	rc = cl_io_read_ahead(sub->sub_env, sub->sub_io,

lov_io_submit:
		rc = cl_io_submit_rw(sub->sub_env, sub->sub_io,
				     crt, queue);

lov_io_commit_async:
		rc = cl_io_commit_async(sub->sub_env, sub->sub_io, queue,
					from, to, cb);

lov_io_fsync_end:
		struct cl_io *subio = sub->sub_io;
		lov_io_end_wrapper(sub->sub_env, subio);


Every other time, sub->sub_env is used with sub->sub_io.
In this new code, 'env' is (incorrectly) used with sub->sub_io.

This reliably causes my testing to crash as the LINVRNT() in cl2osc_io()
fails, and I test with
   CONFIG_LUSTRE_DEBUG_EXPENSIVE_CHECK=y

Does anyone have any idea why no other testing trips over this?

lustre-release has the same bug in
  Commit: fcd45488711a ("LU-5683 clio: add CIT_DATA_VERSION")

I'll send a patch for Linux.  Someone else might like to fix
lustre-release.

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 98+ messages in thread

end of thread, other threads:[~2018-02-22  3:06 UTC | newest]

Thread overview: 98+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-03  2:27 [PATCH 00/41] missing patches for lustre 2.7.50 to 2.7.55 James Simmons
2016-10-03  2:27 ` [lustre-devel] " James Simmons
2016-10-03  2:27 ` [PATCH 01/41] staging: lustre: obdclass: fix race during key quiescency James Simmons
2016-10-03  2:27   ` [lustre-devel] " James Simmons
2016-10-03  2:27 ` [PATCH 02/41] staging: lustre: obdclass: Add synchro in lu_context_key_degister() James Simmons
2016-10-03  2:27   ` [lustre-devel] " James Simmons
2016-10-03  2:27 ` [PATCH 03/41] staging: lustre: llite: remove client Size on MDS support James Simmons
2016-10-03  2:27   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 04/41] staging: lustre: obd: " James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 05/41] staging: lustre: clio: Revise read ahead implementation James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 06/41] staging: lustre: ldlm: remove unnecessary EXPORT_SYMBOL James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 07/41] staging: lustre: llite: remove duplicate fiemap defines James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 08/41] staging: lustre: ptlrpc: ret -ECONNREFUSED if not context found in req James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 09/41] staging: lustre: llite: default dir stripe index only for mkdir James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 10/41] staging: lustre: libcfs: shortcut to create CPT from NUMA topology James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 11/41] staging: lustre: ptlrpc: Add OBD_CONNECT_MULTIMODRPCS flag James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 12/41] staging: lustre: clio: get rid of lov_stripe_md reference James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 13/41] staging: lustre: clio: use CIT_SETATTR for FSFILT_IOC_SETFLAGS James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 14/41] staging: lustre: ptlrpc: Add a tag field to ptlrpc messages James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 15/41] staging: lustre: osc: fix bug when setting max_pages_per_rpc James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 16/41] staging: lustre: ldlm: Do not use cbpending for group locks James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 17/41] staging: lustre: ptlrpc: remove old protocol compatibility James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 18/41] staging: lustre: llite: Report first encountered error James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 19/41] staging: lustre: ptlrpc: dont take unwrap in req_waittime calculation James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 20/41] staging: lustre: remove Size on MDS support James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 21/41] staging: lustre: mdc: Removed unneeded NULL check James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 22/41] staging: lustre: obd: remove unused LSM parameters James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 23/41] staging: lustre: mgc: MGC should retry for invalid import James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 24/41] staging: lustre: clio: add CIT_DATA_VERSION and remove IOC_LOV_GETINFO James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2018-02-22  3:06   ` NeilBrown
2018-02-22  3:06     ` [lustre-devel] " NeilBrown
2016-10-03  2:28 ` [PATCH 25/41] staging: lustre: lov: add cl_object_layout_get() James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 26/41] staging: lustre: llite: remove lli_has_smd James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 27/41] staging: lustre: llite: add cl_object_maxbytes() James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 28/41] staging: lustre: hsm: make HSM modification requests replayable James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 29/41] staging: lustre: ptlrpc: Move NRS structures out of lustre_net.h James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 30/41] staging: lustre: quota: remove obsolete quota code James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 31/41] staging: lustre: obd: remove destroy cookie handling James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 32/41] staging: lustre: llite: restart short read/write for normal IO James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-09 14:16   ` Greg Kroah-Hartman
2016-10-09 14:16     ` [lustre-devel] " Greg Kroah-Hartman
2016-10-11 23:22     ` James Simmons
2016-10-11 23:22       ` [lustre-devel] " James Simmons
2016-10-12  6:08       ` Greg Kroah-Hartman
2016-10-12  6:08         ` [lustre-devel] " Greg Kroah-Hartman
2016-10-13 22:45         ` James Simmons
2016-10-13 22:45           ` [lustre-devel] " James Simmons
2016-10-14  7:41           ` Greg Kroah-Hartman
2016-10-14  7:41             ` [lustre-devel] " Greg Kroah-Hartman
2016-10-03  2:28 ` [PATCH 33/41] staging: lustre: lov: use obd_get_info() to get def/max LOV EA sizes James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 34/41] staging: lustre: ldlm: cancel aged locks for LRUR James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 35/41] staging: lustre: hsm: Use file lease to implement migration James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-09 14:18   ` Greg Kroah-Hartman
2016-10-09 14:18     ` [lustre-devel] " Greg Kroah-Hartman
2016-10-03  2:28 ` [PATCH 36/41] staging: lustre: ldlm: interval tree search in ldlm_lock_match() James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 37/41] staging: lustre: lov: copy_to_user uses wrong casting James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 38/41] staging: lustre: mdc: add max modify RPCs in flight variable James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 39/41] staging: lustre: osc: remove remaining bits for capa support James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 40/41] staging: lustre: lov: move LSM to LOV layer James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
2016-10-03  2:28 ` [PATCH 41/41] staging: lustre: echo: request pages in batches James Simmons
2016-10-03  2:28   ` [lustre-devel] " James Simmons
