* [lustre-devel] Lustre upstream client TODO list
@ 2018-02-11 23:17 James Simmons
  2018-02-11 23:54 ` NeilBrown
  2019-12-19  5:31 ` [lustre-devel] Lustre upstreaming status NeilBrown
  0 siblings, 2 replies; 19+ messages in thread
From: James Simmons @ 2018-02-11 23:17 UTC (permalink / raw)
  To: lustre-devel


So I sent a patch upstream that laid out what most needs to be done for
the linux lustre client to leave staging. I placed the new text here for
ease of reading so you don't have to go searching for it. Feedback is
welcome. Hopefully posting it will make it clear what needs to be done.

Currently all the work directed toward the lustre upstream client is tracked
at the following link:

https://jira.hpdd.intel.com/browse/LU-9679

Under this ticket you will see the following work items that need to be
addressed:

******************************************************************************
* libcfs cleanup
*
* https://jira.hpdd.intel.com/browse/LU-9859
*
* Track all the cleanups and simplification of the libcfs module. Remove
* functions the kernel provides. Possibly integrate some of the functionality
* into the kernel proper.
*
******************************************************************************

https://jira.hpdd.intel.com/browse/LU-100086

LNET_MINOR conflicts with USERIO_MINOR

------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-8130

Fix and simplify libcfs hash handling

------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-8703

The current way we handle SMP is wrong. Platforms like ARM and KNL can have
core and NUMA setups with things like NUMA nodes with no cores. We need to
handle such cases. This work also greatly simplified the lustre SMP code.
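
As an illustration of the empty-node case, here is a minimal user-space
sketch in C. It is not the libcfs implementation; build_partitions and the
node_cpus array are hypothetical names. The idea is only that a node with
no cores is skipped when partitions are built, instead of assuming every
NUMA node contributes at least one core.

```c
#include <stddef.h>

/*
 * Hypothetical helper: assign a partition index to every NUMA node that
 * has at least one core. Nodes with zero cores get -1 so callers never
 * try to place work on them. Returns the number of partitions built.
 */
static int build_partitions(const int *node_cpus, size_t nnodes,
                            int *part_of_node)
{
    int nparts = 0;
    size_t i;

    for (i = 0; i < nnodes; i++)
        part_of_node[i] = (node_cpus[i] > 0) ? nparts++ : -1;

    return nparts;
}
```

With an assumed topology of { 8, 0, 8, 4 } cores per node, the second node
is memory-only and is left without a partition.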

------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9019

Replace libcfs time API with standard kernel APIs. Also migrate away from
jiffies. We found jiffies can vary on nodes which can lead to corner cases
that can break the file system due to nodes having inconsistent behavior.
So move to time64_t and ktime_t as much as possible.
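
To make the jiffies concern concrete, here is a small user-space C sketch.
It assumes the variation comes from nodes running with different tick rates
(HZ), which the text above does not spell out; the helper names and HZ
values are illustrative only. A value counted in ticks means a different
wall-clock duration on each such node, while a value in seconds (what
time64_t carries) means the same thing everywhere.

```c
#include <stdint.h>

/* Interpret a tick count on a node with the given tick rate (ticks/second). */
static int64_t ticks_to_seconds(int64_t ticks, int64_t hz)
{
    return ticks / hz;
}

/*
 * Going the other way: keep the shared value in seconds and convert to
 * ticks only locally, right before arming a timer on this node.
 */
static int64_t seconds_to_ticks(int64_t seconds, int64_t hz)
{
    return seconds * hz;
}
```

The same 1000 ticks is 10 seconds on a node with HZ=100 but 1 second on a
node with HZ=1000, so only the seconds value is safe to share between nodes.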

******************************************************************************
* Proper IB support for ko2iblnd
******************************************************************************
https://jira.hpdd.intel.com/browse/LU-9179

Poor performance for the ko2iblnd driver. This is related to many of the
patches below that are missing from the linux client.
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9886

Crash in upstream kiblnd_handle_early_rxs()
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-10394 / LU-10526 / LU-10089

Default to using MEM_REG
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-10459

throttle tx based on queue depth
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9943

correct WR fast reg accounting
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-10291

remove concurrent_sends tunable
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-10213

calculate qp max_send_wrs properly
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9810

use less CQ entries for each connection
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-10129 / LU-9180

rework map_on_demand behavior
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-10129

query device capabilities
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-10015

fix race at kiblnd_connect_peer
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9983

allow for discontiguous fragments
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9500

Don't Page Align remote_addr with FastReg
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9448

handle empty CPTs
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9507

Don't Assert On Reconnect with MultiQP
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9472

Fix FastReg map/unmap for MLX5
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9425

Turn on 2 sges by default
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-8943

Enable Multiple OPA Endpoints between Nodes
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-5718

multiple sges for work request
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9094

kill timedout txs from ibp_tx_queue
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9094

reconnect peer for REJ_INVALID_SERVICE_ID
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-8752

Stop MLX5 triggering a dump_cqe
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-8874

Move ko2iblnd to latest RDMA changes
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-8875 / LU-8874

Change to new RDMA done callback mechanism

------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9164 / LU-8874

Incorporate RDMA map/unmap APIs into ko2iblnd

******************************************************************************
* sysfs/debugfs fixes
*
* https://jira.hpdd.intel.com/browse/LU-8066
*
* The original migration to sysfs was done in haste without properly working
* utilities to test the changes. This covers the work to restore the proper
* behavior. Huge project to make this right.
*
******************************************************************************

https://jira.hpdd.intel.com/browse/LU-9431

The function class_process_proc_param was used for our mass updates of proc
tunables. It didn't work with sysfs and it was just ugly so it was removed.
In the process the ability to mass update thousands of clients was lost. This
work restores this in a sane way.

------------------------------------------------------------------------------
https://jira.hpdd.intel.com/browse/LU-9091

One of the major requests from users is the ability to pass parameters to a
sysfs file in various units. For example, we can set max_pages_per_rpc,
but this can vary between platforms due to different page sizes. So you can
set it like max_pages_per_rpc=16MiB. The original code to handle this was
written before the string helpers were created, so the code doesn't follow
that format, but it would be easy to move to. Currently the string helpers do
the reverse of what we need, changing bytes to a string. We need to change a
string to bytes.
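
A string-to-bytes conversion of the kind described above could look roughly
like the following user-space C sketch. It is an assumption-laden
illustration, not the Lustre code or a kernel helper: parse_size_to_bytes is
a hypothetical name, and only the binary suffixes KiB, MiB, GiB and TiB are
accepted.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical parser: turn "16MiB", "4KiB" or "512" into a byte count.
 * Returns 0 on success, -EINVAL on malformed input, -ERANGE on overflow.
 */
static int parse_size_to_bytes(const char *str, uint64_t *bytes)
{
    static const char units[] = "KMGT";
    unsigned long long val;
    const char *unit;
    char *end;
    int shift = 0;

    /* Reject NULL, empty input, signs and leading whitespace up front. */
    if (!str || !isdigit((unsigned char)str[0]))
        return -EINVAL;

    errno = 0;
    val = strtoull(str, &end, 10);
    if (errno == ERANGE)
        return -ERANGE;

    if (*end != '\0') {
        /* Expect exactly one unit letter followed by "iB". */
        unit = strchr(units, *end);
        if (!unit || strcmp(end + 1, "iB") != 0)
            return -EINVAL;
        shift = 10 * (int)(unit - units + 1);
    }

    /* Make sure the shifted value still fits in 64 bits. */
    if (val > (UINT64_MAX >> shift))
        return -ERANGE;

    *bytes = (uint64_t)val << shift;
    return 0;
}
```

A caller handling max_pages_per_rpc=16MiB would then divide the resulting
byte count by the local page size to get the number of pages.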

******************************************************************************
* Proper user land to kernel space interface for Lustre
*
* https://jira.hpdd.intel.com/browse/LU-9680
*
******************************************************************************

https://jira.hpdd.intel.com/browse/LU-8915

Don't use linux list structure as user land arguments for lnet selftest.
This code is pretty poor quality and really needs to be reworked.
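
The problem with list structures as user land arguments is that a list head
is a pair of pointers: they are meaningless outside the kernel and change
size between 32-bit and 64-bit callers. One possible shape for a replacement
is a flat, pointer-free layout with fixed-width fields, sketched below. The
structure and field names are hypothetical and are not taken from the LNet
selftest code.

```c
#include <stddef.h>
#include <stdint.h>

#define LST_MAX_ENTRIES 4   /* illustrative limit, not from LNet */

/* One entry; fixed-width fields only, so the layout is the same everywhere. */
struct lst_entry_flat {
    uint64_t nid;     /* peer identifier (assumed field) */
    uint32_t pid;     /* assumed field */
    uint32_t state;   /* assumed field */
};

/* The request carries a count plus an array instead of linked list nodes. */
struct lst_request_flat {
    uint32_t count;   /* number of valid entries[] */
    uint32_t pad;     /* explicit padding keeps entries[] 8-byte aligned */
    struct lst_entry_flat entries[LST_MAX_ENTRIES];
};

/* Entries are walked by index; no pointers cross the user/kernel boundary. */
static uint32_t count_in_state(const struct lst_request_flat *req,
                               uint32_t state)
{
    uint32_t i, n = 0;

    for (i = 0; i < req->count && i < LST_MAX_ENTRIES; i++)
        if (req->entries[i].state == state)
            n++;

    return n;
}
```

Bounding the loop by both count and LST_MAX_ENTRIES matters because count
comes from the caller and cannot be trusted.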

------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-8834

The lustre ioctl LL_IOC_FUTIMES_3 is very generic. Need to either work with
other file systems with similar functionality and make a common syscall
interface or rework our server code to automagically do it for us.

------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-6202

Clean up ioctl handling. We have many obsolete ioctls. Also the way we do
ioctls can be changed over to netlink. This also has the benefit of working
better with HPC systems that do IO forwarding. Such systems don't handle
ioctls very well.

------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9667

More cleanups by making our utilities use sysfs instead of ioctls for LNet.
Also it has been requested to move the remaining ioctls to the netlink API.

******************************************************************************
* Misc
******************************************************************************

------------------------------------------------------------------------------
https://jira.hpdd.intel.com/browse/LU-9855

Clean up obdclass preprocessor code. One of the major eyesores is the various
pointer redirections and macros used by the obdclass. This makes the code very
difficult to understand. Al Viro requested that this be cleaned up before
we leave staging.

------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9633

Migrate to sphinx kernel-doc style comments. Add documents in Documentation.

------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-6142

Possible remaining coding style fixes. Remove dead code. Enforce kernel code
style. Other minor misc cleanups...

------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-8837

Separate client/server functionality. Functions only used by the server can be
removed from the client. Most of this has been done but we need to inspect the
code to make sure.

------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-8964

Lustre client readahead/writeback control needs to better suit what the kernel
provides. This is currently being explored. We could end up replacing the CLIO
readahead abstraction with the kernel proper version.

------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9862

Patch that landed for LU-7890 leads to static checker errors
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-9868

dcache/namei fixes for lustre
------------------------------------------------------------------------------

https://jira.hpdd.intel.com/browse/LU-10467

Use the standard Linux wait_event macros; work by Neil Brown

------------------------------------------------------------------------------


* [lustre-devel] Lustre upstream client TODO list
  2018-02-11 23:17 [lustre-devel] Lustre upstream client TODO list James Simmons
@ 2018-02-11 23:54 ` NeilBrown
  2018-02-12  1:15   ` Patrick Farrell
  2018-03-22 23:21   ` [lustre-devel] Current results and status of my upstream work James Simmons
  2019-12-19  5:31 ` [lustre-devel] Lustre upstreaming status NeilBrown
  1 sibling, 2 replies; 19+ messages in thread
From: NeilBrown @ 2018-02-11 23:54 UTC (permalink / raw)
  To: lustre-devel

On Sun, Feb 11 2018, James Simmons wrote:

> So I sent a patch upstream that laid out what most needs to be done for
> the linux lustre client to leave staging. I placed the new text here for
> ease of reading so you don't have to go searching for it. Feedback is
> welcome. Hopefully posting it will make it clear what needs to be done.


Thanks so much for putting this together and pushing it out.  I really
appreciated it and hope to show that appreciation with patches :-)

NeilBrown



* [lustre-devel] Lustre upstream client TODO list
  2018-02-11 23:54 ` NeilBrown
@ 2018-02-12  1:15   ` Patrick Farrell
  2018-02-12  2:09     ` NeilBrown
  2018-03-22 23:21   ` [lustre-devel] Current results and status of my upstream work James Simmons
  1 sibling, 1 reply; 19+ messages in thread
From: Patrick Farrell @ 2018-02-12  1:15 UTC (permalink / raw)
  To: lustre-devel

Neil,

Apologies if you've answered this elsewhere, but what's the genesis of your current (extremely welcome) interest in Lustre?  Some commitment by SUSE?

Regards,
- Patrick
________________________________
From: lustre-devel <lustre-devel-bounces@lists.lustre.org> on behalf of NeilBrown <neilb@suse.com>
Sent: Sunday, February 11, 2018 5:54:51 PM
To: James Simmons; Lustre Development List
Cc: Oleg Drokin
Subject: Re: [lustre-devel] Lustre upstream client TODO list

On Sun, Feb 11 2018, James Simmons wrote:

> So I sent a patch upstream that laid out what most needs to be done for
> the linux lustre client to leave staging. I placed the new text here for
> ease of reading so you don't have to go searching for it. Feedback is
> welcome. Hopefully posting it will make it clear what needs to be done.


Thanks so much for putting this together and pushing it out.  I really
appreciated it and hope to show that appreciation with patches :-)

NeilBrown

>
> Currently all the work directed toward the lustre upstream client is tracked
> at the following link:
>
> https://jira.hpdd.intel.com/browse/LU-9679
>
> Under this ticket you will see the following work items that need to be
> addressed:
>
> ******************************************************************************
> * libcfs cleanup
> *
> * https://jira.hpdd.intel.com/browse/LU-9859
> *
> * Track all the cleanups and simplification of the libcfs module. Remove
> * functions the kernel provides. Possibly integrate some of the functionality
> * into the kernel proper.
> *
> ******************************************************************************
>
> https://jira.hpdd.intel.com/browse/LU-100086
>
> LNET_MINOR conflicts with USERIO_MINOR
>
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-8130
>
> Fix and simplify libcfs hash handling
>
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-8703
>
> The current way we handle SMP is wrong. Platforms like ARM and KNL can have
> core and NUMA setups with things like NUMA nodes with no cores. We need to
> handle such cases. This work also greatly simplified the lustre SMP code.
>
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9019
>
> Replace libcfs time API with standard kernel APIs. Also migrate away from
> jiffies. We found jiffies can vary on nodes which can lead to corner cases
> that can break the file system due to nodes having inconsistent behavior.
> So move to time64_t and ktime_t as much as possible.
>
> ******************************************************************************
> * Proper IB support for ko2iblnd
> ******************************************************************************
> https://jira.hpdd.intel.com/browse/LU-9179
>
> Poor performance for the ko2iblnd driver. This is related to many of the
> patches below that are missing from the linux client.
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9886
>
> Crash in upstream kiblnd_handle_early_rxs()
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-10394 / LU-10526 / LU-10089
>
> Default to using MEM_REG
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-10459
>
> throttle tx based on queue depth
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9943
>
> correct WR fast reg accounting
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-10291
>
> remove concurrent_sends tunable
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-10213
>
> calculate qp max_send_wrs properly
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9810
>
> use less CQ entries for each connection
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-10129 / LU-9180
>
> rework map_on_demand behavior
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-10129
>
> query device capabilities
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-10015
>
> fix race at kiblnd_connect_peer
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9983
>
> allow for discontiguous fragments
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9500
>
> Don't Page Align remote_addr with FastReg
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9448
>
> handle empty CPTs
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9507
>
> Don't Assert On Reconnect with MultiQP
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9472
>
> Fix FastReg map/unmap for MLX5
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9425
>
> Turn on 2 sges by default
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-8943
>
> Enable Multiple OPA Endpoints between Nodes
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-5718
>
> multiple sges for work request
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9094
>
> kill timedout txs from ibp_tx_queue
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9094
>
> reconnect peer for REJ_INVALID_SERVICE_ID
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-8752
>
> Stop MLX5 triggering a dump_cqe
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-8874
>
> Move ko2iblnd to latest RDMA changes
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-8875 / LU-8874
>
> Change to new RDMA done callback mechanism
>
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9164 / LU-8874
>
> Incorporate RDMA map/unmap APIs into ko2iblnd
>
> ******************************************************************************
> * sysfs/debugfs fixes
> *
> * https://jira.hpdd.intel.com/browse/LU-8066
> *
> * The original migration to sysfs was done in haste without properly working
> * utilities to test the changes. This covers the work to restore the proper
> * behavior. Huge project to make this right.
> *
> ******************************************************************************
>
> https://jira.hpdd.intel.com/browse/LU-9431
>
> The function class_process_proc_param was used for our mass updates of proc
> tunables. It didn't work with sysfs and it was just ugly so it was removed.
> In the process the ability to mass update thousands of clients was lost. This
> work restores this in a sane way.
>
> ------------------------------------------------------------------------------
> https://jira.hpdd.intel.com/browse/LU-9091
>
> One of the major requests from users is the ability to pass parameters into a
> sysfs file in various different units. For example, we can set max_pages_per_rpc,
> but this can vary on platforms due to different platform sizes. So you can
> set this like max_pages_per_rpc=16MiB. The original code to handle this was
> written before the string helpers were created, so the code doesn't follow that
> format, but it would be easy to move to. Currently the string helpers do the
> reverse of what we need, changing bytes to a string. We need to change a string
> to bytes.
>
> ******************************************************************************
> * Proper user land to kernel space interface for Lustre
> *
> * https://jira.hpdd.intel.com/browse/LU-9680
> *
> ******************************************************************************
>
> https://jira.hpdd.intel.com/browse/LU-8915
>
> Don't use linux list structure as user land arguments for lnet selftest.
> This code is pretty poor quality and really needs to be reworked.
>
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-8834
>
> The lustre ioctl LL_IOC_FUTIMES_3 is very generic. Need to either work with
> other file systems with similar functionality and make a common syscall
> interface or rework our server code to automagically do it for us.
>
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-6202
>
> Clean up ioctl handling. We have many obsolete ioctls. Also the way we do
> ioctls can be changed over to netlink. This also has the benefit of working
> better with HPC systems that do IO forwarding. Such systems don't handle ioctls
> very well.
>
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9667
>
> More cleanups by making our utilities use sysfs instead of ioctls for LNet.
> Also it has been requested to move the remaining ioctls to the netlink API.
>
> ******************************************************************************
> * Misc
> ******************************************************************************
>
> ------------------------------------------------------------------------------
> https://jira.hpdd.intel.com/browse/LU-9855
>
> Clean up obdclass preprocessor code. One of the major eyesores is the various
> pointer redirections and macros used by the obdclass. This makes the code very
> difficult to understand. It was requested by Al Viro to clean this up before
> we leave staging.
>
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9633
>
> Migrate to sphinx kernel-doc style comments. Add documents in Documentation.
>
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-6142
>
> Possible remaining coding style fixes. Remove dead code. Enforce kernel code
> style. Other minor misc cleanups...
>
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-8837
>
> Separate client/server functionality. Functions only used by server can be
> removed from client. Most of this has been done but we need an inspection of
> the code to make sure.
>
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-8964
>
> Lustre client readahead/writeback control needs to better suit what the kernel
> provides. Currently it's being explored. We could end up replacing the CLIO
> read ahead abstraction with the kernel proper version.
>
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9862
>
> Patch that landed for LU-7890 leads to static checker errors
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-9868
>
> dcache/namei fixes for lustre
> ------------------------------------------------------------------------------
>
> https://jira.hpdd.intel.com/browse/LU-10467
>
> use standard linux wait_events macros work by Neil Brown
>
> ------------------------------------------------------------------------------

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [lustre-devel] Lustre upstream client TODO list
  2018-02-12  1:15   ` Patrick Farrell
@ 2018-02-12  2:09     ` NeilBrown
  0 siblings, 0 replies; 19+ messages in thread
From: NeilBrown @ 2018-02-12  2:09 UTC (permalink / raw)
  To: lustre-devel

On Mon, Feb 12 2018, Patrick Farrell wrote:

> Neil,
>
> Apologies if you've answered this elsewhere, but what's the genesis of
> your current (extremely welcome) interest in Lustre?  Some commitment
> by SUSE? 

"commitment" might be too strong a word - certainly too strong for me to
use - but "interest" is probably fair.  Some interest within SUSE.

NeilBrown


>
> Regards,
> - Patrick
> ________________________________
> From: lustre-devel <lustre-devel-bounces@lists.lustre.org> on behalf of NeilBrown <neilb@suse.com>
> Sent: Sunday, February 11, 2018 5:54:51 PM
> To: James Simmons; Lustre Development List
> Cc: Oleg Drokin
> Subject: Re: [lustre-devel] Lustre upstream client TODO list
>
> On Sun, Feb 11 2018, James Simmons wrote:
>
>> So I sent a patch upstream that laid out what most needs to be done for
>> the linux lustre client to leave staging. I placed the new text here for
>> ease of reading so you don't have to go searching for it. Feedback is
>> welcome. Hopefully posting it will make it clear what needs to be done.
>
>
> Thanks so much for putting this together and pushing it out.  I really
> appreciated it and hope to show that appreciation with patches :-)
>
> NeilBrown
>
>>
>> Currently all the work directed toward the lustre upstream client is tracked
>> at the following link:
>>
>> https://jira.hpdd.intel.com/browse/LU-9679
>>
>> Under this ticket you will see the following work items that need to be
>> addressed:
>>
>> ******************************************************************************
>> * libcfs cleanup
>> *
>> * https://jira.hpdd.intel.com/browse/LU-9859
>> *
>> * Track all the cleanups and simplification of the libcfs module. Remove
>> * functions the kernel provides. Possibly integrate some of the functionality
>> * into the kernel proper.
>> *
>> ******************************************************************************
>>
>> https://jira.hpdd.intel.com/browse/LU-100086
>>
>> LNET_MINOR conflicts with USERIO_MINOR
>>
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-8130
>>
>> Fix and simplify libcfs hash handling
>>
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-8703
>>
>> The current way we handle SMP is wrong. Platforms like ARM and KNL can have
>> core and NUMA setups with things like NUMA nodes with no cores. We need to
>> handle such cases. This work also greatly simplified the lustre SMP code.
>>
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-9019
>>
>> Replace libcfs time API with standard kernel APIs. Also migrate away from
>> jiffies. We found jiffies can vary on nodes which can lead to corner cases
>> that can break the file system due to nodes having inconsistent behavior.
>> So move to time64_t and ktime_t as much as possible.
>>
>> ******************************************************************************
>> * Proper IB support for ko2iblnd
>> ******************************************************************************
>> https://jira.hpdd.intel.com/browse/LU-9179
>>
>> Poor performance for the ko2iblnd driver. This is related to many of the
>> patches below that are missing from the linux client.
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-9886
>>
>> Crash in upstream kiblnd_handle_early_rxs()
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-10394 / LU-10526 / LU-10089
>>
>> Default to using MEM_REG
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-10459
>>
>> throttle tx based on queue depth
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-9943
>>
>> correct WR fast reg accounting
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-10291
>>
>> remove concurrent_sends tunable
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-10213
>>
>> calculate qp max_send_wrs properly
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-9810
>>
>> use less CQ entries for each connection
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-10129 / LU-9180
>>
>> rework map_on_demand behavior
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-10129
>>
>> query device capabilities
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-10015
>>
>> fix race at kiblnd_connect_peer
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-9983
>>
>> allow for discontiguous fragments
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-9500
>>
>> Don't Page Align remote_addr with FastReg
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-9448
>>
>> handle empty CPTs
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-9507
>>
>> Don't Assert On Reconnect with MultiQP
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-9472
>>
>> Fix FastReg map/unmap for MLX5
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-9425
>>
>> Turn on 2 sges by default
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-8943
>>
>> Enable Multiple OPA Endpoints between Nodes
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-5718
>>
>> multiple sges for work request
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-9094
>>
>> kill timedout txs from ibp_tx_queue
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-9094
>>
>> reconnect peer for REJ_INVALID_SERVICE_ID
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-8752
>>
>> Stop MLX5 triggering a dump_cqe
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-8874
>>
>> Move ko2iblnd to latest RDMA changes
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-8875 / LU-8874
>>
>> Change to new RDMA done callback mechanism
>>
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-9164 / LU-8874
>>
>> Incorporate RDMA map/unmap APIs into ko2iblnd
>>
>> ******************************************************************************
>> * sysfs/debugfs fixes
>> *
>> * https://jira.hpdd.intel.com/browse/LU-8066
>> *
>> * The original migration to sysfs was done in haste without properly working
>> * utilities to test the changes. This covers the work to restore the proper
>> * behavior. Huge project to make this right.
>> *
>> ******************************************************************************
>>
>> https://jira.hpdd.intel.com/browse/LU-9431
>>
>> The function class_process_proc_param was used for our mass updates of proc
>> tunables. It didn't work with sysfs and it was just ugly so it was removed.
>> In the process the ability to mass update thousands of clients was lost. This
>> work restores this in a sane way.
>>
>> ------------------------------------------------------------------------------
>> https://jira.hpdd.intel.com/browse/LU-9091
>>
>> One of the major requests from users is the ability to pass parameters into a
>> sysfs file in various different units. For example, we can set max_pages_per_rpc,
>> but this can vary on platforms due to different platform sizes. So you can
>> set this like max_pages_per_rpc=16MiB. The original code to handle this was
>> written before the string helpers were created, so the code doesn't follow that
>> format, but it would be easy to move to. Currently the string helpers do the
>> reverse of what we need, changing bytes to a string. We need to change a string
>> to bytes.
>>
>> ******************************************************************************
>> * Proper user land to kernel space interface for Lustre
>> *
>> * https://jira.hpdd.intel.com/browse/LU-9680
>> *
>> ******************************************************************************
>>
>> https://jira.hpdd.intel.com/browse/LU-8915
>>
>> Don't use linux list structure as user land arguments for lnet selftest.
>> This code is pretty poor quality and really needs to be reworked.
>>
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-8834
>>
>> The lustre ioctl LL_IOC_FUTIMES_3 is very generic. Need to either work with
>> other file systems with similar functionality and make a common syscall
>> interface or rework our server code to automagically do it for us.
>>
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-6202
>>
>> Clean up ioctl handling. We have many obsolete ioctls. Also the way we do
>> ioctls can be changed over to netlink. This also has the benefit of working
>> better with HPC systems that do IO forwarding. Such systems don't handle ioctls
>> very well.
>>
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-9667
>>
>> More cleanups by making our utilities use sysfs instead of ioctls for LNet.
>> Also it has been requested to move the remaining ioctls to the netlink API.
>>
>> ******************************************************************************
>> * Misc
>> ******************************************************************************
>>
>> ------------------------------------------------------------------------------
>> https://jira.hpdd.intel.com/browse/LU-9855
>>
>> Clean up obdclass preprocessor code. One of the major eyesores is the various
>> pointer redirections and macros used by the obdclass. This makes the code very
>> difficult to understand. It was requested by Al Viro to clean this up before
>> we leave staging.
>>
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-9633
>>
>> Migrate to sphinx kernel-doc style comments. Add documents in Documentation.
>>
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-6142
>>
>> Possible remaining coding style fixes. Remove dead code. Enforce kernel code
>> style. Other minor misc cleanups...
>>
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-8837
>>
>> Separate client/server functionality. Functions only used by server can be
>> removed from client. Most of this has been done but we need an inspection of
>> the code to make sure.
>>
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-8964
>>
>> Lustre client readahead/writeback control needs to better suit what the kernel
>> provides. Currently it's being explored. We could end up replacing the CLIO
>> read ahead abstraction with the kernel proper version.
>>
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-9862
>>
>> Patch that landed for LU-7890 leads to static checker errors
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-9868
>>
>> dcache/namei fixes for lustre
>> ------------------------------------------------------------------------------
>>
>> https://jira.hpdd.intel.com/browse/LU-10467
>>
>> use standard linux wait_events macros work by Neil Brown
>>
>> ------------------------------------------------------------------------------

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [lustre-devel] Current results and status of my upstream work
  2018-02-11 23:54 ` NeilBrown
  2018-02-12  1:15   ` Patrick Farrell
@ 2018-03-22 23:21   ` James Simmons
  2018-03-27  5:32     ` NeilBrown
  1 sibling, 1 reply; 19+ messages in thread
From: James Simmons @ 2018-03-22 23:21 UTC (permalink / raw)
  To: lustre-devel


Hi Neil

	I have been testing the upstream client using lustre 2.10 tools 
and the test suite that comes with it. I see the following failures in
my testing and wonder how it compares to your testing:

sanity: FAIL: test_17n migrate remote dir error 1
sanity: FAIL: test_17o stat file should fail
sanity: FAIL: test_24v large readdir doesn't take effect:  2 should be about 0
sanity: FAIL: test_27z FF stripe count 1 != 0
sanity: FAIL: test_27D llapi_layout_test failed
sanity: FAIL: test_29 No mdc lock count
sanity: FAIL: test_42d failed: client:75497472 server: 92274688.
sanity: FAIL: test_42e failed: client:76218368 server: 92995584.
sanity: FAIL: test_56c OST lustre-OST0000 is in status of '', not 'D'
sanity: FAIL: test_56t "lfs find -S 4M /lustre/lustre/d56t.sanityt" wrong: found 5, expected 3
sanity: FAIL: test_56w /usr/bin/lfs getstripe -c /lustre/lustre/d56w.sanityw/dir1/file1 wrong: found 2, expected 1
sanity: FAIL: test_63a failed: client:75497472 server: 92274688.
sanity: FAIL: test_63b failed: client:75497472 server: 92274688.
sanity: FAIL: test_64a failed: client:75497472 server: 92274688.
sanity: FAIL: test_64c failed: client:75497472 server: 92274688.
sanity: FAIL: test_76 inode slab grew from 182313 to 182399
sanity: FAIL: test_77c no checksum dump file on Client
sanity: FAIL: test_101g unable to set max_pages_per_rpc=16M
sanity: FAIL: test_102a /lustre/lustre/f102a.sanity missing 3 trusted.name xattrs
sanity: FAIL: test_102b can't get trusted.lov from /lustre/lustre/f102b.sanity
sanity: FAIL: test_102n setxattr invalid 'trusted.lov' success
sanity: FAIL: test_103a run_acl_subtest cp failed
sanity: FAIL: test_125 setfacl /lustre/lustre/d125 failed
sanity: FAIL: test_154  kernel panics in ll_splice code
sanity: FAIL: test_154B decode linkea /lustre/lustre/d154B.sanity/f154B.sanity failed
sanity: FAIL: test_160d migrate fails
sanity: FAIL: test_161d create should be blocked
sanity: FAIL: test_162a check path d162a.sanity/d2/p/q/r/slink failed
sanity: FAIL: test_200 unable to mount /lustre/lustre on MGS
sanity: FAIL: test_220 unable to mount /lustre/lustre on MGS
sanity: FAIL: test_225  kills the MDS server
sanity: FAIL: test_226a cannot get path of FIFO by /lustre/lustre /lustre/lustre/d226a.sanity/fifo
sanity: FAIL: test_226b cannot get path of FIFO by /lustre/lustre /lustre/lustre/d226b.sanity/remote_dir/fifo
sanity: FAIL: test_230b fails on migrating remote dir to MDT1
sanity: FAIL: test_230c mkdir succeeds under migrating directory
sanity: FAIL: test_230d migrate remote dir error
sanity: FAIL: test_230e migrate dir fails
sanity: FAIL: test_230f #1 migrate dir fails
sanity: FAIL: test_230h migrating d230h.sanity fail
sanity: FAIL: test_230i migration fails with a tailing slash
sanity: FAIL: test_233a cannot access /lustre/lustre using its FID '[0x200000007:0x1:0x0]'
sanity: FAIL: test_233b cannot access /lustre/lustre/.lustre using its FID '[0x200000002:0x1:0x0]'
sanity: FAIL: test_234 touch failed
sanity: FAIL: test_240 umount failed
sanity: FAIL: test_242 ls /lustre/lustre/d242.sanity failed
sanity: FAIL: test_251 short write happened
sanity: FAIL: test_300a 1:stripe_count is 0, expect 2
sanity: FAIL: test_300e set striped bdir under striped dir error
sanity: FAIL: test_300g create dir1 fails
sanity: FAIL: test_300h expect 4 get 0 for striped_dir
sanity: FAIL: test_300i set striped hashdir error
sanity: FAIL: test_300n create test_dir1 fails
sanity: FAIL: test_315 read is not accounted (0)
sanity: FAIL: test_399a fake write is slower
sanity: FAIL: test_405 One layout swap locked test failed
sanity: FAIL: test_406 unable to mount /lustre/lustre on MGS
sanity: FAIL: test_410 no inode match
sanity: FAIL: test_900 never finishes. ldlm_lock dumps

Some of those failures are due to new functionality that the upstream
client doesn't support, which at this point is not important. Another
batch is due to xattr/acl support being broken. John Hammond and I
have been looking into those failures. I have a bunch of patches done
that are being tested. Letting you know so we don't duplicate work.
The other source of the bugs is the sysfs support. I'm porting the
fixes I have done to upstream and I'm in the process of validating
the patches. As a last bunch of changes, it was found that lustre's
SMP code doesn't work properly on systems like KNL and ARM at one
end, and at the other end doesn't work well on massive systems with
100s of cores. I have those patches already finished
and tested. I will be pushing those after the next merge window.
I'm almost done working out the 64 bit time code as well. Haven't
ported those yet.
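
For reference, the string-to-bytes conversion that the TODO list describes
under LU-9091 (and that test_101g exercises with max_pages_per_rpc=16M) could
look roughly like the userspace sketch below. Whether test_101g fails for
exactly this reason is not established in this thread. The function name
parse_size_bytes() and the accepted suffixes (K, M, G, T, optionally followed
by "iB", all treated as powers of two) are assumptions made for illustration;
this is not the Lustre code and not a kernel API.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not a Lustre or kernel API: convert a string such as
 * "16MiB", "16M" or "4096" into a byte count. Returns 0 on success and a
 * negative errno value on malformed input or overflow.
 */
static int parse_size_bytes(const char *s, uint64_t *out)
{
	char *end;
	unsigned long long val;
	unsigned int shift = 0;

	assert(s != NULL && out != NULL);

	/* strtoull() would accept a leading '-' or whitespace; reject both. */
	if (!isdigit((unsigned char)s[0]))
		return -EINVAL;

	errno = 0;
	val = strtoull(s, &end, 10);
	if (errno == ERANGE)
		return -ERANGE;

	/* Optional binary unit prefix: K, M, G or T (case-insensitive). */
	switch (toupper((unsigned char)*end)) {
	case 'K': shift = 10; break;
	case 'M': shift = 20; break;
	case 'G': shift = 30; break;
	case 'T': shift = 40; break;
	default: break;
	}
	if (shift)
		end++;

	/* After a prefix, an optional "iB" is allowed; nothing else may follow. */
	if (shift && strcmp(end, "iB") == 0)
		end += 2;
	if (*end != '\0')
		return -EINVAL;

	/* Refuse values that would overflow 64 bits once scaled. */
	if (val > (UINT64_MAX >> shift))
		return -ERANGE;

	*out = (uint64_t)val << shift;
	return 0;
}
```

A sysfs store handler would still have to convert the resulting byte count
into pages for the platform it runs on; that step is left out here.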

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [lustre-devel] Current results and status of my upstream work
  2018-03-22 23:21   ` [lustre-devel] Current results and status of my upstream work James Simmons
@ 2018-03-27  5:32     ` NeilBrown
  2018-03-27  6:17       ` Dilger, Andreas
  2018-03-30 18:55       ` James Simmons
  0 siblings, 2 replies; 19+ messages in thread
From: NeilBrown @ 2018-03-27  5:32 UTC (permalink / raw)
  To: lustre-devel


On Thu, Mar 22 2018, James Simmons wrote:

> Hi Neil
>
> 	I have been testing the upstream client using lustre 2.10 tools 
> and the test suite that comes with it. I see the following failures in
> my testing and wonder how it compares to your testing:
>
> sanity: FAIL: test_17n migrate remote dir error 1
> sanity: FAIL: test_17o stat file should fail
> sanity: FAIL: test_24v large readdir doesn't take effect:  2 should be about 0
> sanity: FAIL: test_27z FF stripe count 1 != 0
> sanity: FAIL: test_27D llapi_layout_test failed
> sanity: FAIL: test_29 No mdc lock count
> sanity: FAIL: test_42d failed: client:75497472 server: 92274688.
> sanity: FAIL: test_42e failed: client:76218368 server: 92995584.
> sanity: FAIL: test_56c OST lustre-OST0000 is in status of '', not 'D'
> sanity: FAIL: test_56t "lfs find -S 4M /lustre/lustre/d56t.sanityt" wrong: found 5, expected 3
> sanity: FAIL: test_56w /usr/bin/lfs getstripe -c /lustre/lustre/d56w.sanityw/dir1/file1 wrong: found 2, expected 1
> sanity: FAIL: test_63a failed: client:75497472 server: 92274688.
> sanity: FAIL: test_63b failed: client:75497472 server: 92274688.
> sanity: FAIL: test_64a failed: client:75497472 server: 92274688.
> sanity: FAIL: test_64c failed: client:75497472 server: 92274688.
> sanity: FAIL: test_76 inode slab grew from 182313 to 182399
> sanity: FAIL: test_77c no checksum dump file on Client
> sanity: FAIL: test_101g unable to set max_pages_per_rpc=16M
> sanity: FAIL: test_102a /lustre/lustre/f102a.sanity missing 3 trusted.name xattrs
> sanity: FAIL: test_102b can't get trusted.lov from /lustre/lustre/f102b.sanity
> sanity: FAIL: test_102n setxattr invalid 'trusted.lov' success
> sanity: FAIL: test_103a run_acl_subtest cp failed
> sanity: FAIL: test_125 setfacl /lustre/lustre/d125 failed
> sanity: FAIL: test_154  kernel panics in ll_splice code
> sanity: FAIL: test_154B decode linkea /lustre/lustre/d154B.sanity/f154B.sanity failed
> sanity: FAIL: test_160d migrate fails
> sanity: FAIL: test_161d create should be blocked
> sanity: FAIL: test_162a check path d162a.sanity/d2/p/q/r/slink failed
> sanity: FAIL: test_200 unable to mount /lustre/lustre on MGS
> sanity: FAIL: test_220 unable to mount /lustre/lustre on MGS
> sanity: FAIL: test_225  kills the MDS server
> sanity: FAIL: test_226a cannot get path of FIFO by /lustre/lustre /lustre/lustre/d226a.sanity/fifo
> sanity: FAIL: test_226b cannot get path of FIFO by /lustre/lustre /lustre/lustre/d226b.sanity/remote_dir/fifo
> sanity: FAIL: test_230b fails on migrating remote dir to MDT1
> sanity: FAIL: test_230c mkdir succeeds under migrating directory
> sanity: FAIL: test_230d migrate remote dir error
> sanity: FAIL: test_230e migrate dir fails
> sanity: FAIL: test_230f #1 migrate dir fails
> sanity: FAIL: test_230h migrating d230h.sanity fail
> sanity: FAIL: test_230i migration fails with a tailing slash
> sanity: FAIL: test_233a cannot access /lustre/lustre using its FID '[0x200000007:0x1:0x0]'
> sanity: FAIL: test_233b cannot access /lustre/lustre/.lustre using its FID '[0x200000002:0x1:0x0]'
> sanity: FAIL: test_234 touch failed
> sanity: FAIL: test_240 umount failed
> sanity: FAIL: test_242 ls /lustre/lustre/d242.sanity failed
> sanity: FAIL: test_251 short write happened
> sanity: FAIL: test_300a 1:stripe_count is 0, expect 2
> sanity: FAIL: test_300e set striped bdir under striped dir error
> sanity: FAIL: test_300g create dir1 fails
> sanity: FAIL: test_300h expect 4 get 0 for striped_dir
> sanity: FAIL: test_300i set striped hashdir error
> sanity: FAIL: test_300n create test_dir1 fails
> sanity: FAIL: test_315 read is not accounted (0)
> sanity: FAIL: test_399a fake write is slower
> sanity: FAIL: test_405 One layout swap locked test failed
> sanity: FAIL: test_406 unable to mount /lustre/lustre on MGS
> sanity: FAIL: test_410 no inode match
> sanity: FAIL: test_900 never finishes. ldlm_lock dumps
>
> Some of those failures are due to new functionality that the upstream
> client doesn't support, which at this point is not important. Another
> batch is due to xattr/acl support being broken. John Hammond and I
> have been looking into those failures. I have a bunch of patches done
> that are being tested. Letting you know so we don't duplicate work.
> The other source of the bugs is the sysfs support. I'm porting the
> fixes I have done to upstream and I'm in the process of validating
> the patches. As a last bunch of changes, it was found that lustre's
> SMP code doesn't work properly on systems like KNL and ARM at one
> end, and at the other end doesn't work well on massive systems with
> 100s of cores. I have those patches already finished
> and tested. I will be pushing those after the next merge window.
> I'm almost done working out the 64 bit time code as well. Haven't
> ported those yet.

Hi,
 thanks for this.
 Yes, my list is very similar, though not identical.
 I've modified the test harness a little so that it unmounts and
 remounts the filesystem on every test.  I was chasing down a bug that
 happened on unmount, and wanted to trigger it as quickly as possible.
 That might explain some of the differences.

 Tests in your list, not in mine:
   56[ctw] 76 154  200 300e 315 399a 900

 Tests in my list, not in yours
   56z 60aa 64b 83 104b 120e  130[abcde] 161c 205 215 

 It might be worth looking into some of these(?). Last time I
 tried to understand one of the failures, I quickly realized that my
 understanding of how lustre works wasn't deep enough.  So I went back
 to code cleaning.  Doing that has slowly improved my understanding
 so it might be worth it to go hunting again.

 I don't think anything you have mentioned will duplicate anything I've
 been working on.  Most of my time recently has been understanding the
 various hash tables and working to convert lustre to use rhashtable.
 I look forward to looking over your patches, I'll probably learn something!

 One failure that I have looked into but haven't posted a patch for yet,
 is that sometimes
        LINVRNT(io->ci_state == CIS_IO_GOING || io->ci_state == CIS_LOCKED);
 in cl_io_read_ahead() fails.  When it does, the value is CIS_INIT.
 Tracing through the code, it seems that CIS_INIT is an easy possibility
 since 1e1db2a97be5 ("staging: lustre: clio: Revise read ahead implementation")
 However CIS_IO_GOING and CIS_LOCKED do also happen and I cannot see
 the path that leads to those - so I didn't feel that I could
 correctly explain the patch.
 I currently have:

       LINVRNT(io->ci_state == CIS_IO_GOING || io->ci_state == CIS_LOCKED ||
               io->ci_state == CIS_INIT);

 Do you have any idea if that is right?
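
 The relaxed condition above can also be written as a small predicate, which
 makes the set of accepted states explicit. The sketch below is standalone
 userspace C: the enum is an illustrative stand-in (its names only echo the
 states mentioned in this message, and its values and ordering are
 assumptions), and read_ahead_state_ok() is a hypothetical helper, not a
 function from the Lustre tree. It does not answer whether accepting
 CIS_INIT is the right fix.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustrative subset of I/O states. The names echo the thread; the values
 * and ordering are assumptions made for this sketch.
 */
enum cl_io_state_sketch {
	SK_CIS_INIT,
	SK_CIS_LOCKED,
	SK_CIS_IO_GOING,
	SK_CIS_FINI,
};

/* True when the state is one the relaxed assertion would accept. */
static bool read_ahead_state_ok(enum cl_io_state_sketch state)
{
	switch (state) {
	case SK_CIS_INIT:
	case SK_CIS_LOCKED:
	case SK_CIS_IO_GOING:
		return true;
	default:
		return false;
	}
}

/* Same shape as the assertion in the message: abort on an unexpected state. */
static void assert_read_ahead_state(enum cl_io_state_sketch state)
{
	assert(read_ahead_state_ok(state));
}
```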

Thanks,
NeilBrown

 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [lustre-devel] Current results and status of my upstream work
  2018-03-27  5:32     ` NeilBrown
@ 2018-03-27  6:17       ` Dilger, Andreas
  2018-03-27 21:17         ` Jinshan Xiong
  2018-03-30 18:55       ` James Simmons
  1 sibling, 1 reply; 19+ messages in thread
From: Dilger, Andreas @ 2018-03-27  6:17 UTC (permalink / raw)
  To: lustre-devel

On Mar 26, 2018, at 23:32, NeilBrown <neilb@suse.com> wrote:
> 
> One failure that I have looked into but haven't posted a patch for yet,
> is that sometimes
>        LINVRNT(io->ci_state == CIS_IO_GOING || io->ci_state == CIS_LOCKED);
> in cl_io_read_ahead() fails.  When it does, the value is CIS_INIT.
> Tracing through the code, it seems that CIS_INIT is an easy possibility
> since 1e1db2a97be5 ("staging: lustre: clio: Revise read ahead implementation")
> However CIS_IO_GOING and CIS_LOCKED do also happen and I cannot see
> the path that leads to those - so I didn't feel that I could
> correctly explain the patch.
> I currently have:
> 
>       LINVRNT(io->ci_state == CIS_IO_GOING || io->ci_state == CIS_LOCKED ||
>               io->ci_state == CIS_INIT);
> Do you have any idea if that is right?

It is worth mentioning that LINVRNT() is rarely enabled, since it turns
on some very expensive correctness checks.  This means it is entirely
possible that this assertion has been incorrect ever since the
referenced commit.
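
To illustrate how a stale invariant can go unnoticed, here is a minimal
userspace sketch of an assertion that is only compiled in under an
expensive-checks option. The DEMO_EXPENSIVE_CHECKS switch, the DEMO_INVRNT
macro and the demo_* names are assumptions made for this example and are
not the actual Lustre definitions; only the state names come from the
thread.

```c
#include <assert.h>

/* Hypothetical stand-ins for the cl_io states named in this thread. */
enum demo_io_state { DEMO_CIS_INIT, DEMO_CIS_LOCKED, DEMO_CIS_IO_GOING };

/*
 * Hypothetical invariant macro: it only checks when DEMO_EXPENSIVE_CHECKS
 * is defined at build time.  Otherwise the expression is type-checked by
 * sizeof but never evaluated.
 */
#ifdef DEMO_EXPENSIVE_CHECKS
#define DEMO_INVRNT(exp) assert(exp)
#else
#define DEMO_INVRNT(exp) ((void)sizeof(!!(exp)))
#endif

static int demo_checks_run;	/* counts evaluations of the condition */

static int demo_state_ok(enum demo_io_state s)
{
	demo_checks_run++;
	return s == DEMO_CIS_IO_GOING || s == DEMO_CIS_LOCKED;
}

/* Returns 0; with checks disabled a CIS_INIT caller slips through. */
static int demo_read_ahead(enum demo_io_state s)
{
	DEMO_INVRNT(demo_state_ok(s));
	return 0;
}
```

In a default build the condition is never evaluated, so a caller arriving
in the INIT state is not reported; building with -DDEMO_EXPENSIVE_CHECKS
would make the same call abort.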

Possibly Jinshan can comment?

Cheers, Andreas
--
Andreas Dilger
Lustre Principal Architect
Intel Corporation

* [lustre-devel] Current results and status of my upstream work
  2018-03-27  6:17       ` Dilger, Andreas
@ 2018-03-27 21:17         ` Jinshan Xiong
  2018-03-27 21:58           ` NeilBrown
  0 siblings, 1 reply; 19+ messages in thread
From: Jinshan Xiong @ 2018-03-27 21:17 UTC (permalink / raw)
  To: lustre-devel

Yes, what Andreas said is correct. This check is really expensive and has
never been enabled after the initial development of client I/O code. We
should get rid of it.

Jinshan

* [lustre-devel] Current results and status of my upstream work
  2018-03-27 21:17         ` Jinshan Xiong
@ 2018-03-27 21:58           ` NeilBrown
  0 siblings, 0 replies; 19+ messages in thread
From: NeilBrown @ 2018-03-27 21:58 UTC (permalink / raw)
  To: lustre-devel

On Tue, Mar 27 2018, Jinshan Xiong wrote:

> Yes, what Andreas said is correct. This check is really expensive and has
> never been enabled after the initial development of client I/O code. We
> should get rid of it.

"never" isn't correct.  I always do development with all the LINVRNTs
enabled - it seems like a good idea for catching bugs early.

But if you say this invariant is wrong and should be deleted, then I'll
submit a patch to do that.

Thanks,
NeilBrown


* [lustre-devel] Current results and status of my upstream work
  2018-03-27  5:32     ` NeilBrown
  2018-03-27  6:17       ` Dilger, Andreas
@ 2018-03-30 18:55       ` James Simmons
  2018-03-31  5:47         ` NeilBrown
  1 sibling, 1 reply; 19+ messages in thread
From: James Simmons @ 2018-03-30 18:55 UTC (permalink / raw)
  To: lustre-devel


> > Hi Neil
> >
> > 	I have been testing the upstream client using lustre 2.10 tools 
> > and the test suite that comes with it. I see the following failures in
> > my testing and wonder at how it compares to your testing:
> >
> > sanity: FAIL: test_17n migrate remote dir error 1
> > sanity: FAIL: test_17o stat file should fail
> > sanity: FAIL: test_24v large readdir doesn't take effect:  2 should be about 0
> > sanity: FAIL: test_27z FF stripe count 1 != 0
> > sanity: FAIL: test_27D llapi_layout_test failed
> > sanity: FAIL: test_29 No mdc lock count
> > sanity: FAIL: test_42d failed: client:75497472 server: 92274688.
> > sanity: FAIL: test_42e failed: client:76218368 server: 92995584.
> > sanity: FAIL: test_56c OST lustre-OST0000 is in status of '', not 'D'
> > sanity: FAIL: test_56t "lfs find -S 4M /lustre/lustre/d56t.sanityt" wrong: found 5, expected 3
> > sanity: FAIL: test_56w /usr/bin/lfs getstripe -c /lustre/lustre/d56w.sanityw/dir1/file1 wrong: found 2, expected 1
> > sanity: FAIL: test_63a failed: client:75497472 server: 92274688.
> > sanity: FAIL: test_63b failed: client:75497472 server: 92274688.
> > sanity: FAIL: test_64a failed: client:75497472 server: 92274688.
> > sanity: FAIL: test_64c failed: client:75497472 server: 92274688.
> > sanity: FAIL: test_76 inode slab grew from 182313 to 182399
> > sanity: FAIL: test_77c no checksum dump file on Client
> > sanity: FAIL: test_101g unable to set max_pages_per_rpc=16M
> > sanity: FAIL: test_102a /lustre/lustre/f102a.sanity missing 3 trusted.name xattrs
> > sanity: FAIL: test_102b can't get trusted.lov from /lustre/lustre/f102b.sanity
> > sanity: FAIL: test_102n setxattr invalid 'trusted.lov' success
> > sanity: FAIL: test_103a run_acl_subtest cp failed
> > sanity: FAIL: test_125 setfacl /lustre/lustre/d125 failed
> > sanity: FAIL: test_154  kernel panics in ll_splice code
> > sanity: FAIL: test_154B decode linkea /lustre/lustre/d154B.sanity/f154B.sanity failed
> > sanity: FAIL: test_160d migrate fails
> > sanity: FAIL: test_161d create should be blocked
> > sanity: FAIL: test_162a check path d162a.sanity/d2/p/q/r/slink failed
> > sanity: FAIL: test_200 unable to mount /lustre/lustre on MGS
> > sanity: FAIL: test_220 unable to mount /lustre/lustre on MGS
> > sanity: FAIL: test_225  kills the MDS server
> > sanity: FAIL: test_226a cannot get path of FIFO by /lustre/lustre /lustre/lustre/d226a.sanity/fifo
> > sanity: FAIL: test_226b cannot get path of FIFO by /lustre/lustre /lustre/lustre/d226b.sanity/remote_dir/fifo
> > sanity: FAIL: test_230b fails on migrating remote dir to MDT1
> > sanity: FAIL: test_230c mkdir succeeds under migrating directory
> > sanity: FAIL: test_230d migrate remote dir error
> > sanity: FAIL: test_230e migrate dir fails
> > sanity: FAIL: test_230f #1 migrate dir fails
> > sanity: FAIL: test_230h migrating d230h.sanity fail
> > sanity: FAIL: test_230i migration fails with a tailing slash
> > sanity: FAIL: test_233a cannot access /lustre/lustre using its FID '[0x200000007:0x1:0x0]'
> > sanity: FAIL: test_233b cannot access /lustre/lustre/.lustre using its FID '[0x200000002:0x1:0x0]'
> > sanity: FAIL: test_234 touch failed
> > sanity: FAIL: test_240 umount failed
> > sanity: FAIL: test_242 ls /lustre/lustre/d242.sanity failed
> > sanity: FAIL: test_251 short write happened
> > sanity: FAIL: test_300a 1:stripe_count is 0, expect 2
> > sanity: FAIL: test_300e set striped bdir under striped dir error
> > sanity: FAIL: test_300g create dir1 fails
> > sanity: FAIL: test_300h expect 4 get 0 for striped_dir
> > sanity: FAIL: test_300i set striped hashdir error
> > sanity: FAIL: test_300n create test_dir1 fails
> > sanity: FAIL: test_315 read is not accounted (0)
> > sanity: FAIL: test_399a fake write is slower
> > sanity: FAIL: test_405 One layout swap locked test failed
> > sanity: FAIL: test_406 unable to mount /lustre/lustre on MGS
> > sanity: FAIL: test_410 no inode match
> > sanity: FAIL: test_900 never finishes. ldlm_lock dumps
> >
> > Some of those failures are due to new functionality that the upstream 
> > client doesn't support which at this point is not important. Another
> > batch is due to xattr/acl support being broken. John Hammond and I
> > have been looking into those failures. I have a bunch of patches done
> > that are being tested. Letting you know so we don't duplicate work.
> > The other source of bugs is the sysfs support. I'm porting the
> > fixes I have done to upstream and I'm in the process of validating
> > the patches. As a last batch of changes, it was found that lustre's
> > SMP code doesn't work properly on systems like KNL and ARM at one
> > end, nor on massive systems with 100s of cores at the other end.
> > I have those patches already finished
> > and tested. I will be pushing those after the next merge window.
> > I'm almost done working out the 64 bit time code as well. Haven't
> > ported those yet.
> 
> Hi,
>  thanks for this.
>  Yes, my list is very similar, though not identical.
>  I've modified the test harness a little so that it unmounts and
>  remounts the filesystem on every test.  I was chasing down a bug that
>  happened on unmount, and wanted to trigger it as quickly as possible.
>  That might explain some of the differences.

Actually I know what is causing the umount issues. It's a bug in the
sysfs/debugfs implementation. I fixed it in my sysfs/debugfs patches.
I'm still working on the sysfs stuff and it's going to be a huge patch
set, 100+ patches.
 
>  Tests in your list, not in mine:
>    56[ctw] 76 154  200 300e 315 399a 900

I'm using the test suite from the latest lustre 2.10.3 release.

Test 900 still has issues with ldlm locks not being freed when I applied
the current sysfs patch set I have. The output is:

2018-03-19T10:35:33.206057-04:00 ninja19.ccs.ornl.gov kernel: LustreError: 7244:0:(ldlm_resource.c:842:ldlm_resource_complain()) lustre-MDT0000-mdc-000000007eed4fca: namespace resource [0x200000002:0x1:0x0].0 (00000000cd1c4e2f) refcount nonzero (1) after lock cleanup; forcing cleanup.

Once the sysfs stuff is fixed this can be looked into.

To let you know test 154 always triggers a LBUG due to splice changes that
landed. I'm seeing

2018-03-30 13:41:37 [ 6351.749923] BUG: unable to handle kernel paging request at ffffffffffffffbc
2018-03-30 13:41:37 [ 6351.757031] IP: ll_splice_alias+0x1df/0x250 [lustre]
2018-03-30 13:41:37 [ 6351.762111] PGD 1e0a067 P4D 1e0a067 PUD 1e0c067 PMD 0
2018-03-30 13:41:37 [ 6351.767373] Oops: 0000 [#1] SMP
2018-03-30 13:41:37 [ 6351.770627] Modules linked in: loop lustre(C) obdecho(C) mgc(C) lov(C) osc(C) mdc(C) lmv(C) fid(C) fld(C) ptlrpc(C) obdclass(C) ksocklnd(C) lnet(C) sha512_generic md5 libcfs(C) ip6table_filter ip6_tables joydev iptable_filter dm_mirror dm_region_hash dm_log dm_mod coretemp x86_pkg_temp_thermal crc32_pclmul ahci libahci ipmi_si ehci_pci pcspkr i2c_i801 libata ehci_hcd ipmi_devintf wmi ipmi_msghandler rpcrdma button ib_iser libiscsi scsi_transport_iscsi scsi_mod ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm configfs ib_cm iw_cm mlx4_ib ib_core nfsd ip_tables x_tables autofs4 nfsv3 nfs_acl nfs lockd grace mlx4_en mlx4_core igb i2c_algo_bit i2c_core hwmon ipv6 crc_ccitt sunrpc
2018-03-30 13:41:37 [ 6351.831324] CPU: 5 PID: 15718 Comm: mrename Tainted: G        WC       4.16.0-rc7+ #1
2018-03-30 13:41:37 [ 6351.839335] Hardware name: Supermicro X9DRT/X9DRT, BIOS 3.0a 02/19/2014
2018-03-30 13:41:37 [ 6351.846070] RIP: 0010:ll_splice_alias+0x1df/0x250 [lustre]
2018-03-30 13:41:37 [ 6351.851675] RSP: 0018:ffffc90023307c18 EFLAGS: 00010282
2018-03-30 13:41:37 [ 6351.857014] RAX: ffffffffffffff8c RBX: ffffffffffffff8c RCX: 0000000000000000
2018-03-30 13:41:37 [ 6351.864263] RDX: ffffffffffffff8c RSI: ffffffffa074a8b8 RDI: ffffffffa0759540
2018-03-30 13:41:37 [ 6351.871512] RBP: ffffc90023307c38 R08: 0000000000000000 R09: ffff881053a87762
2018-03-30 13:41:38 [ 6351.878758] R10: 0000000000000000 R11: 000000000000000f R12: ffff880fbea8e180
2018-03-30 13:41:38 [ 6351.886011] R13: ffff880fad4589c8 R14: ffff880faf12b288 R15: 0000000000000013
2018-03-30 13:41:38 [ 6351.893263] FS:  00007f4b72cd6740(0000) GS:ffff88107fc80000(0000) knlGS:0000000000000000
2018-03-30 13:41:38 [ 6351.901542] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2018-03-30 13:41:38 [ 6351.907401] CR2: ffffffffffffffbc CR3: 0000001053db6005 CR4: 00000000000606e0
2018-03-30 13:41:38 [ 6351.914650] Call Trace:
2018-03-30 13:41:38 [ 6351.917228]  ll_lookup_it_finish+0x57/0xc50 [lustre]
2018-03-30 13:41:38 [ 6351.922315]  ? ll_invalidate_negative_children+0x190/0x190 [lustre]
2018-03-30 13:41:38 [ 6351.928711]  ll_lookup_it+0x258/0x8e0 [lustre]
2018-03-30 13:41:38 [ 6351.933278]  ll_lookup_nd+0xf7/0x160 [lustre]
2018-03-30 13:41:38 [ 6351.938977]  __lookup_hash+0x54/0xa0
2018-03-30 13:41:38 [ 6351.942666]  ? lock_rename+0x5c/0xe0
2018-03-30 13:41:38 [ 6351.946359]  SyS_renameat2+0x1bc/0x490
2018-03-30 13:41:38 [ 6351.950226]  SyS_rename+0x19/0x20
2018-03-30 13:41:38 [ 6351.953653]  do_syscall_64+0x5b/0x110
2018-03-30 13:41:38 [ 6351.957433]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
2018-03-30 13:41:38 [ 6351.962597] RIP: 0033:0x7f4b727698d7
2018-03-30 13:41:38 [ 6351.966286] RSP: 002b:00007fff9b8dc838 EFLAGS: 00000202 ORIG_RAX: 0000000000000052
2018-03-30 13:41:38 [ 6351.974033] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f4b727698d7
2018-03-30 13:41:38 [ 6351.981279] RDX: 00007fff9b8dc948 RSI: 00007fff9b8dda6c RDI: 00007fff9b8dda50
2018-03-30 13:41:38 [ 6351.988530] RBP: 0000000000000000 R08: 00007f4b72abee80 R09: 0000000000000000
2018-03-30 13:41:38 [ 6351.995784] R10: 00007fff9b8dc550 R11: 0000000000000202 R12: 0000000000400650
2018-03-30 13:41:38 [ 6352.003035] R13: 00007fff9b8dc920 R14: 0000000000000000 R15: 0000000000000000
2018-03-30 13:41:38 [ 6352.010285] Code: d8 02 03 00 c1 01 00 00 48 c7 c6 b8 a8 74 a0 48 c7 05 d2 02 03 00 00 00 00 00 c7 05 c0 02 03 00 00 20 00 00 48 c7 c7 40 95 75 a0 <48> 8b 48 30 44 8b 08 44 8b 40 5c 31 c0 e8 df 80 d3 ff 48 89 d8
2018-03-30 13:41:38 [ 6352.029415] RIP: ll_splice_alias+0x1df/0x250 [lustre] RSP: ffffc90023307c18
2018-03-30 13:41:38 [ 6352.036485] CR2: ffffffffffffffbc
2018-03-30 13:41:38 [ 6352.039910] ---[ end trace 07dc9db2fd0c6402 ]---

>  Tests in my list, not in yours
>    56z 60aa 64b 83 104b 120e  130[abcde] 161c 205 215 

For sanity test 215 that is a test issue. It is testing for a lnet
proc file that doesn't exist anymore. For lustre 2.11 the test was
updated and it can pass now.

The sanity 130 tests are the fiemap failures. We see those errors on
Ubuntu 16 as well during our testing. They are due to a bug in
e2fsprogs. Andreas sent a fix out for this. You can read about it
under ticket https://jira.hpdd.intel.com/browse/LU-10335. I suspect
SuSE might need to patch up their e2fsprogs.
 
The rest I believe are real bugs.

>  It might be worth looking in to some of these(?). Last time I
>  tried to understand one of the failures, I quickly realized that my
>  understand of how lustre works wasn't deep enough.  So I went back
>  to code cleaning.  Doing that has slowly improved my understanding
>  so it might be worth it to go hunting again.
> 
>  I don't think anything you have mentioned will duplicate anything I've
>  been working on.  Most of my time recently has been understanding the
>  various hash tables and working to convert lustre to use rhashtable.

That is a big change. I was looking to fix that up but if you want to
tackle that go for it. Another piece that lustre implements is an rbtree.
The kernel has an rbtree that could replace what lustre uses.

>  I look forward to looking over your patches, I'll probably learn something

I fixed up the xattr problems. Lustre heavily uses xattrs for its user
land tools so this is a big step forward. I need to break up the xattr
patches into smaller pieces to make Greg happy. With the latest patches
you sent I will need to rebase the SMP patches. In that patch set I
removed linux-cpu.h. I have yet to handle linux-cpu.c. I can either
merge both linux-cpu.c and libcfs_cpu.c into one file and have lots of
ifdef CONFIG_SMP, or put static inline functions into libcfs_cpu.h for
the UP (uniprocessor) case. I'm leaning toward the second option. What
do you think is better? Note I also have a patch already that removes
linux/libcfs.h so you don't need to worry about that one. With that
last patch there is no more libcfs/linux directory.
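
The second option could look roughly like the sketch below. All names
here (demo_cpt_number(), demo_cpt_of_cpu()) are hypothetical and are not
the real libcfs interface; the sketch only shows the pattern of keeping
the SMP implementation in a .c file while the header supplies static
inline stubs when CONFIG_SMP is not set.

```c
/* Header-style sketch: hypothetical names, not the libcfs interface. */
#ifdef CONFIG_SMP
/* SMP build: the real implementations would live in a separate .c file. */
int demo_cpt_number(void);
int demo_cpt_of_cpu(int cpu);
#else
/* UP build: a single partition that every CPU maps to; no .c file needed. */
static inline int demo_cpt_number(void)
{
	return 1;
}

static inline int demo_cpt_of_cpu(int cpu)
{
	(void)cpu;	/* only one partition exists */
	return 0;
}
#endif
```

Callers then use the same function names in both configurations, and the
ifdef is confined to one place in the header instead of being scattered
through a merged .c file.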

Lastly I'm working on the sysfs port which is a big task. Currently I'm
working out the bugs I introduced in the port :-/ One thing that lustre
implements for its sysfs handling is the ability to pass in units (KiB, MB,
etc.). The way lustre does it is very lustre specific, so I'm looking at
creating a string helper that does the opposite of string_get_size(): it
turns a string with a unit suffix into a numeric value. I will post it
here once I have it ready. Since Greg is open to merging lustre specific
stuff into the kernel for general use, I think this would be a nice
feature for people in general. I will need to update the lustre tools to
handle this.
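
As a rough illustration of what such a helper might do, here is a
userspace sketch. The name demo_parse_size() and its rules (every unit is
treated as a power of 1024, with an optional "iB" or "B" tail) are
assumptions for this example and not the helper that will be posted. The
kernel's existing memparse() already covers similar ground for plain
K/M/G suffixes.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse "<number>[K|M|G|T][iB|B]" into bytes.
 * Assumption for this sketch: every unit is a power of 1024.
 * Returns 0 on success, -EINVAL on malformed input, -ERANGE on overflow.
 */
static int demo_parse_size(const char *s, uint64_t *bytes)
{
	char *end;
	unsigned long long val;
	unsigned int shift = 0;

	if (!isdigit((unsigned char)*s))
		return -EINVAL;

	errno = 0;
	val = strtoull(s, &end, 10);
	if (errno == ERANGE)
		return -ERANGE;

	switch (toupper((unsigned char)*end)) {
	case 'K': shift = 10; end++; break;
	case 'M': shift = 20; end++; break;
	case 'G': shift = 30; end++; break;
	case 'T': shift = 40; end++; break;
	default: break;
	}

	/* Accept an optional "iB" or "B" tail, e.g. "KiB" or "MB". */
	if (shift && *end == 'i')
		end++;
	if (*end == 'B' || *end == 'b')
		end++;
	if (*end != '\0')
		return -EINVAL;

	/* Refuse values that would not fit after scaling. */
	if (shift && val > (UINT64_MAX >> shift))
		return -ERANGE;

	*bytes = (uint64_t)val << shift;
	return 0;
}
```

With these rules a value such as "16M" becomes 16777216 and "4KiB"
becomes 4096, while an unknown suffix is rejected rather than silently
ignored.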

>  One failure that I have looked into but haven't posted a patch for yet,
>  is that sometimes
>         LINVRNT(io->ci_state == CIS_IO_GOING || io->ci_state == CIS_LOCKED);
>  in cl_io_read_ahead() fails.  When it does, the value is CIS_INIT.
>  Tracing through the code, it seems that CIS_INIT is an easy possibility
>  since 1e1db2a97be5 ("staging: lustre: clio: Revise read ahead implementation").
>  However CIS_IO_GOING and CIS_LOCKED do also happen and I cannot see
>  the path that leads to those - so I didn't feel that I could
>  correctly explain the patch.
>  I currently have:
> 
>        LINVRNT(io->ci_state == CIS_IO_GOING || io->ci_state == CIS_LOCKED ||
>                io->ci_state == CIS_INIT);
> 
>  Do you have any idea if that is right?

I see this was answered already :-)

* [lustre-devel] Current results and status of my upstream work
  2018-03-30 18:55       ` James Simmons
@ 2018-03-31  5:47         ` NeilBrown
  0 siblings, 0 replies; 19+ messages in thread
From: NeilBrown @ 2018-03-31  5:47 UTC (permalink / raw)
  To: lustre-devel

On Fri, Mar 30 2018, James Simmons wrote:

>
> To let you know test 154 always triggers a LBUG due to splice changes that
> landed. I'm seeing
>
> 2018-03-30 13:41:37 [ 6351.749923] BUG: unable to handle kernel paging 
> request at ffffffffffffffbc
> 2018-03-30 13:41:37 [ 6351.757031] IP: ll_splice_alias+0x1df/0x250 
> [lustre]

Can you run
  ./scripts/faddr2line drivers/staging/lustre/lustre/llite/namei.o  ll_splice_alias+0x1df/0x250 

(you might need a different pathname for namei.o, depending on how you
build the kernel).

d_splice_alias() can return an error (very rarely I think) and the
CDEBUG() at the end of ll_splice_alias wouldn't be happy about that.

The code shows a crash at:
  2b:*	48 8b 48 30          	mov    0x30(%rax),%rcx		<-- trapping instruction

and 
   RAX: ffffffffffffff8c

which is -116 or -ESTALE.
That is an error which d_splice_alias() can get from __d_unalias, so
I guess that is what is happening.
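
The arithmetic above can be checked with a small userspace sketch. The
helper names are made up for illustration; 116 is the value of ESTALE on
Linux, and 4095 mirrors the MAX_ERRNO bound that the kernel's IS_ERR()
test uses. The same numbers also show that RAX plus the 0x30 displacement
of the trapping mov gives the faulting address reported in CR2.

```c
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095	/* same bound the kernel's IS_ERR() uses */
#define DEMO_ESTALE    116	/* ESTALE on Linux */

/* True if a register value looks like an ERR_PTR()-encoded errno. */
static int demo_is_err_value(uint64_t reg)
{
	return reg >= (uint64_t)-DEMO_MAX_ERRNO;
}

/* Decode the register as a signed errno (only meaningful if the above is true). */
static int64_t demo_errno_of(uint64_t reg)
{
	return (int64_t)reg;
}

/* Address touched by "mov disp(%rax),%rcx". */
static uint64_t demo_fault_addr(uint64_t rax, uint64_t disp)
{
	return rax + disp;
}
```

Seen this way, the oops fits the suggestion that an error pointer was
dereferenced as if it were a valid dentry.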

I'll dig through d_splice_alias() and try to work out what that means.

Thanks,
NeilBrown

* [lustre-devel] Lustre upstreaming status.
  2018-02-11 23:17 [lustre-devel] Lustre upstream client TODO list James Simmons
  2018-02-11 23:54 ` NeilBrown
@ 2019-12-19  5:31 ` NeilBrown
  2019-12-27 16:04   ` Degremont, Aurelien
  2020-01-07  0:02   ` James Simmons
  1 sibling, 2 replies; 19+ messages in thread
From: NeilBrown @ 2019-12-19  5:31 UTC (permalink / raw)
  To: lustre-devel


Hi all,
 At the LUG in Houston, I said that I hoped to submit something upstream
 by the end of 2019.  Clearly that isn't going to happen now.

 The main reason that caused me to not even try is IPv6 support.
 It became apparent to me that LNet would not be accepted until it has
 working IPv6 support, and that doesn't exist yet.
 I hope to put some development time into IPv6, and to have something
 that works and is worth reviewing by the end of January 2020.

 The other issue is that development has progressed slowly because
 there is no spare review bandwidth.  James has contributed a lot, and
 others have helped, but reviewing patches for two code streams (OpenSFS
 and Linux-upstream) turns out to be too much to ask for.
 So I've decided to take a different approach.  From now on I'm not
 going to wait for reviews for patches going into my linux-lustre tree.
 Part of my justification for this is that historically, review hasn't
 really provided much promise of correctness.  Patches go missing.
 Random lines from patches go missing.  Errors creep in in other ways.

 Instead, I am developing a tool which will compare OpenSFS lustre
 and Linux-lustre and report relevant differences.  I have a prototype
 working, and it is helping me to find missing patches and parts of
 patches in both trees.

 I will continue to submit patches to gerrit to bring OpenSFS closer to
 my linux tree when that is needed, and will apply patches from OpenSFS
 to my tree without extra review when that is needed.

 When the time comes to submit upstream, I plan to present the tool so
 that other developers can confirm that what I am submitting is
 functionally equivalent to OpenSFS, and so that we can ensure the
 equivalence remains.

 Consequently my "lustre" branch will jump forward to v5.4 soon,
 probably tomorrow, and will remain close to mainline.
 I will also be growing my list of outstanding OpenSFS patches
 (currently about 100, many of which haven't been submitted to gerrit
 yet) and will hope to get those reviewed.  Any changes that result from
 the review will be detected by my comparison script when the patch
 lands, and I'll update linux-lustre to match.

 My new goal for upstream submission is the end of Q1-2020.  This is
 probably a bit optimistic, but gives me a suitable focus.

NeilBrown

* [lustre-devel] Lustre upstreaming status.
  2019-12-19  5:31 ` [lustre-devel] Lustre upstreaming status NeilBrown
@ 2019-12-27 16:04   ` Degremont, Aurelien
  2020-01-07  0:02   ` James Simmons
  1 sibling, 0 replies; 19+ messages in thread
From: Degremont, Aurelien @ 2019-12-27 16:04 UTC (permalink / raw)
  To: lustre-devel

Thanks for that.

My understanding is that the only sustainable long-term plan is for the linux lustre client to be the same code as the OpenSFS master branch:
- without backward-compat ifdefs and similar
- without server-specific code.

We need to avoid managing a kind of fork, because it will be too difficult to maintain both branches. IPv6 support in LNet should be at least half accepted in the OpenSFS branch before going too far on that front.

Aurélien

* [lustre-devel] Lustre upstreaming status.
  2019-12-19  5:31 ` [lustre-devel] Lustre upstreaming status NeilBrown
  2019-12-27 16:04   ` Degremont, Aurelien
@ 2020-01-07  0:02   ` James Simmons
  2020-01-07  1:53     ` Andreas Dilger
  2020-01-07  4:05     ` NeilBrown
  1 sibling, 2 replies; 19+ messages in thread
From: James Simmons @ 2020-01-07  0:02 UTC (permalink / raw)
  To: lustre-devel


> Hi all,
>  At the LUG in Houston, I said that I hoped to submit something upstream
>  by the end of 2019.  Clearly that isn't going to happen now.
> 
>  The main reason that caused me to not even try is IPv6 support.
>  It became apparent to me that LNet would not be accepted until it has
>  working IPv6 support, and that doesn't exist yet.
>  I hope to put some development time into IPv6, and to have something
>  that works and is worth reviewing by the end of January 2020.

That would be awesome. I believe the original plan was for IPv6 support
in 2.14, but UDSP didn't make it into 2.13 so everything got delayed.

>  The other issue is that development has progressed slowly because
>  there is no spare review bandwidth.  James has contributed a lot, and
>  others have helped, but reviewing patches for two code streams (OpenSFS
>  and Linux-upstream) turns out to be too much to ask for.
>  So I've decided to take a different approach.  From now on I'm not
>  going to wait for reviews for patches going into my linux-lustre tree.
>  Part of my justification for this is that historically, review hasn't
>  really provided much promise of correctness.  Patches go missing.
>  Random lines from patches go missing.  Errors creep in in other ways.

I have been going over the patches from your backport tree to find
missing patches and test for regressions. I think all the regressions I
saw have been stamped out for 2.12. I'm doing a full regression run
right now. The only bug I see now is unique to the linux client.

2020-01-06T16:24:58.006823-05:00 ninja81.ccs.ornl.gov kernel: RIP: 0010:ll_dcompare+0x62/0xf0 [lustre]
2020-01-06T16:24:58.006880-05:00 ninja81.ccs.ornl.gov kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
2020-01-06T16:24:58.006934-05:00 ninja81.ccs.ornl.gov kernel: RDX: 0000000000000001 RSI: 0000000000000001 RDI: 00000000ffffffff
2020-01-06T16:24:58.006992-05:00 ninja81.ccs.ornl.gov kernel: Code: 85 c0 89 c3 75 2d f6 05 c8 c7 c8 ff 20 74 09 f6 05 c2 c7 c8 ff 80 75 2b 41 f7 04 24 00 00 01 10 75 c2 49 8b 84 24 f8 00 00 00 <0f> b6 58 0c 83 e3 01 eb b1 48 83 c4 08 bb 01 00 00 00 89 d8 5b 5d
2020-01-06T16:24:58.007051-05:00 ninja81.ccs.ornl.gov kernel: RBP: ffffc90009137cf0 R08: 0000000000000000 R09: 0000000000000000
2020-01-06T16:24:58.007105-05:00 ninja81.ccs.ornl.gov kernel: R10: 0000000000000000 R11: 000000000000000f R12: ffff888fecf4ab40
2020-01-06T16:24:58.007157-05:00 ninja81.ccs.ornl.gov kernel: RSP: 0018:ffffc9000944b950 EFLAGS: 00010246
2020-01-06T16:24:58.007216-05:00 ninja81.ccs.ornl.gov kernel: R13: 000000137118a4ee R14: ffffc90009137cf0 R15: 0000000000000000
2020-01-06T16:24:58.007270-05:00 ninja81.ccs.ornl.gov kernel: FS:  00007fb072a3a740(0000) GS:ffff88885ec00000(0000) knlGS:0000000000000000
2020-01-06T16:24:58.007334-05:00 ninja81.ccs.ornl.gov kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
2020-01-06T16:24:58.007394-05:00 ninja81.ccs.ornl.gov kernel: RDX: 0000000000000001 RSI: 0000000000000001 RDI: 00000000ffffffff
2020-01-06T16:24:58.007448-05:00 ninja81.ccs.ornl.gov kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2020-01-06T16:24:58.007505-05:00 ninja81.ccs.ornl.gov kernel: CR2: 000000000000000c CR3: 00000007cd282001 CR4: 00000000001606e0
2020-01-06T16:24:58.007562-05:00 ninja81.ccs.ornl.gov kernel: RBP: ffffc9000944bcf0 R08: 0000000000000000 R09: 0000000000000000
2020-01-06T16:24:58.007616-05:00 ninja81.ccs.ornl.gov kernel: Call Trace:
2020-01-06T16:24:58.007669-05:00 ninja81.ccs.ornl.gov kernel: R10: 0000000000000000 R11: 000000000000000f R12: ffff8887d0562640
2020-01-06T16:24:58.007727-05:00 ninja81.ccs.ornl.gov kernel: R13: 000000137118a4ee R14: ffffc9000944bcf0 R15: 0000000000000000
2020-01-06T16:24:58.007780-05:00 ninja81.ccs.ornl.gov kernel: __d_lookup_rcu+0x183/0x1e0
2020-01-06T16:24:58.007832-05:00 ninja81.ccs.ornl.gov kernel: __d_lookup_rcu+0x183/0x1e0
2020-01-06T16:24:58.007885-05:00 ninja81.ccs.ornl.gov kernel: d_alloc_parallel+0x15e/0x7c0
2020-01-06T16:24:58.007936-05:00 ninja81.ccs.ornl.gov kernel: d_alloc_parallel+0x15e/0x7c0
2020-01-06T16:24:58.007999-05:00 ninja81.ccs.ornl.gov kernel: ? __lookup_slow+0xf5/0x1d0
2020-01-06T16:24:58.008056-05:00 ninja81.ccs.ornl.gov kernel: ? __lookup_slow+0xf5/0x1d0
2020-01-06T16:24:58.008112-05:00 ninja81.ccs.ornl.gov kernel: ? wake_up_q+0x80/0x80
2020-01-06T16:24:58.008169-05:00 ninja81.ccs.ornl.gov kernel: ? _raw_spin_unlock_irq+0x34/0x50

This might be resolved with

https://review.whamcloud.com/#/c/24175

I also have started working through the 2.13 release. I'm up to 2.12.54
but no heavy testing as of yet of those patches. Once I'm done testing
2.12 in depth I can push quickly through 2.13 and even sync up to
OpenSFS branch. I think the back porting work can be wrapped up by the
end of the month.
 
>  Instead, I am developing a tool which will compare OpenSFS lustre
>  and Linux-lustre and report relevant differences.  I have a prototype
>  working, and it is helping me to find missing patches and parts of
>  patches in both trees.
> 
>  I will continue to submit patches to gerrit to bring OpenSFS closer to
>  my linux tree when that is needed, and will apply patches from OpenSFS
>  to my tree without extra review when that is needed.
> 
>  When the time comes to submit upstream, I plan to present the tool so
>  that other developers can confirm that what I am submitting is
>  functionally equivalent to OpenSFS, and so that we can ensure the
>  equivalence remains.
> 
>  Consequently my "lustre" branch will jump forward to v5.4 soon,
>  probably tomorrow, and will remain close to mainline.
>  I will also be growing my list of outstanding OpenSFS patches
>  (currently about 100, many of which haven't been submitted to gerrit
>  yet) and will hope to get those reviewed.  Any changes that result from
>  the review will be detected by my comparison script when the patch
>  lands, and I'll update linux-lustre to match.
> 
>  My new goal for upstream submission is the end of Q1-2020.  This is
>  probably a bit optimistic, but gives me a suitable focus.

I believe having it ready for LUG 2020 is a reasonable goal.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [lustre-devel] Lustre upstreaming status.
  2020-01-07  0:02   ` James Simmons
@ 2020-01-07  1:53     ` Andreas Dilger
  2020-01-07  2:24       ` Andreas Dilger
  2020-01-07  4:32       ` NeilBrown
  2020-01-07  4:05     ` NeilBrown
  1 sibling, 2 replies; 19+ messages in thread
From: Andreas Dilger @ 2020-01-07  1:53 UTC (permalink / raw)
  To: lustre-devel

On Jan 6, 2020, at 17:02, James Simmons <jsimmons at infradead.org> wrote:
>
>> Hi all,
>> At the LUG in Houston, I said that I hoped to submit something upstream
>> by the end of 2019.  Clearly that isn't going to happen now.
>>
>> The main reason that caused me to not even try is IPv6 support.
>> It became apparent to me that LNet would not be accepted until it has
>> working IPv6 support, and that doesn't exist yet.
>> I hope to put some development time into IPv6, and to have something
>> that works and is worth reviewing by the end of January 2020.
>
> That would be awesome. I believe the original plan was for IPv6 support
> for 2.14 but USDP didn't make it in for 2.13 so everything got delayed.
>
>> The other issue is that development has progressed slowly because
>> there is no spare review bandwidth.  James has contributed a lot, and
>> others have helped, but reviewing patches for two code streams (OpenSFS
>> and Linux-upstream) turns out to be too much to ask for.
>> So I've decided to take a different approach.  From now on I'm not
>> going to wait for reviews for patches going into my linux-lustre tree.
>> Part of my justification for this is that historically, review hasn't
>> really provided much promise of correctness.  Patches go missing.
>> Random lines from patches go missing.  Errors creep in in other ways.
>
> I have been going over the patches from your backport tree to find
> missing patches and test for regressions. I think all regressions I
> saw was stomped out for everything for 2.12. I'm doing full regression
> right now. The only bug I see now is very unique to the linux client.

[snip]

> I also have started working through the 2.13 release. I'm up to 2.12.54
> but no heavy testing as of yet of those patches. Once I'm done testing
> 2.12 in depth I can push quickly through 2.13 and even sync up to
> OpenSFS branch. I think the back porting work can be wrapped up by the
> end of the month.

I thought the goal was to stop at 2.12.x (following the b2_12 branch to get
important fixes) and try to get that included upstream?  That gives a good
point-in-time to track, and ensures that the upstream code is aligned with
a relatively stable version of the code.  It also has the major benefit that
2.12 is an LTS branch and we will need to keep compatibility with that for
a long time, which isn't always true of intermediate releases.

Cheers, Andreas
--
Andreas Dilger
Principal Lustre Architect
Whamcloud

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [lustre-devel] Lustre upstreaming status.
  2020-01-07  1:53     ` Andreas Dilger
@ 2020-01-07  2:24       ` Andreas Dilger
  2020-01-07  4:32       ` NeilBrown
  1 sibling, 0 replies; 19+ messages in thread
From: Andreas Dilger @ 2020-01-07  2:24 UTC (permalink / raw)
  To: lustre-devel

Replying to my own email to get email addresses correct.

Cheers, Andreas
--
Andreas Dilger
Principal Lustre Architect
Whamcloud

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [lustre-devel] Lustre upstreaming status.
  2020-01-07  0:02   ` James Simmons
  2020-01-07  1:53     ` Andreas Dilger
@ 2020-01-07  4:05     ` NeilBrown
  2020-01-08 21:18       ` NeilBrown
  1 sibling, 1 reply; 19+ messages in thread
From: NeilBrown @ 2020-01-07  4:05 UTC (permalink / raw)
  To: lustre-devel

On Tue, Jan 07 2020, James Simmons wrote:
>
> I have been going over the patches from your backport tree to find
> missing patches and test for regressions. I think all regressions I
> saw was stomped out for everything for 2.12. I'm doing full regression
> right now. The only bug I see now is very unique to the linux client.
>
> 2020-01-06T16:24:58.006823-05:00 ninja81.ccs.ornl.gov kernel: RIP: 
> 0010:ll_dcompare+0x62/0xf0 [lustre]
> 2020-01-06T16:24:58.006880-05:00 ninja81.ccs.ornl.gov kernel: RAX: 
> 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
> 2020-01-06T16:24:58.006934-05:00 ninja81.ccs.ornl.gov kernel: RDX: 
> 0000000000000001 RSI: 0000000000000001 RDI: 00000000ffffffff
> 2020-01-06T16:24:58.006992-05:00 ninja81.ccs.ornl.gov kernel: Code: 85 c0 
> 89 c3 75 2d f6 05 c8 c7 c8 ff 20 74 09 f6 05 c2 c7 c8 ff
>  80 75 2b 41 f7 04 24 00 00 01 10 75 c2 49 8b 84 24 f8 00 00 00 <0f> b6 58 
> 0c 83 e3 01 eb b1 48 83 c4 08 bb 01 00 00 00 89 d8 5b 5
> d

The crashing code here is

  22:	49 8b 84 24 f8 00 00 	mov    0xf8(%r12),%rax
  29:	00 
  2a:*	0f b6 58 0c          	movzbl 0xc(%rax),%ebx		<-- trapping instruction
  2e:	83 e3 01             	and    $0x1,%ebx
  31:	eb b1                	jmp    0xffffffffffffffe4

The only place that could happen in ll_dcompare is the

        if (d_lustre_invalid(dentry))
                return 1;
call at the end.
ll_d2d(dentry) must be NULL.
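
As a cross-check on that reading, the faulting address can be recomputed
from the register dump: the trapping load uses RAX as its base with a
displacement of 0xc, the dump shows RAX as zero, and CR2 (the faulting
address) is 0xc. A small sketch of that arithmetic follows; the helper and
constant names are illustrative only and do not come from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Effective address of "movzbl 0xc(%rax),%ebx": base register plus
 * sign-extended displacement. */
static uint64_t effective_address(uint64_t base, int32_t disp)
{
	return base + (uint64_t)(int64_t)disp;
}

/* Values copied from the oops in this thread. */
static const uint64_t oops_rax = 0x0000000000000000ULL; /* loaded pointer */
static const uint64_t oops_cr2 = 0x000000000000000cULL; /* faulting address */

/* A NULL base plus the 0xc displacement reproduces CR2 exactly. */
static void example_usage(void)
{
	assert(effective_address(oops_rax, 0xc) == oops_cr2);
}
```

The match between base + 0xc and CR2 is what points at a NULL pointer
loaded by the previous instruction rather than at a corrupted one.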

In OpenSFS lustre, d_lustre_invalid() protects against that being NULL.
Linux/lustre lost that protection in

Commit 7126bc2e8d60 ("lustre: switch to use of ->d_init()")

because it really shouldn't need it.
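
To make the difference concrete, here is a minimal userspace model of the
two variants. Everything in it is a hypothetical stand-in (the struct
layouts and the model_* names are not the kernel's or Lustre's real
definitions), and it assumes the guard treats a missing private-data
pointer as "invalid".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the per-dentry private data and the dentry. */
struct model_dentry_data {
	unsigned int invalid:1;
};

struct model_dentry {
	void *d_fsdata;
};

/* Models ll_d2d(): fetch the filesystem-private data. */
static struct model_dentry_data *model_d2d(const struct model_dentry *de)
{
	return de->d_fsdata;
}

/* Unguarded variant: dereferences NULL when d_fsdata is NULL. */
static int model_invalid_unguarded(const struct model_dentry *de)
{
	return model_d2d(de)->invalid;
}

/* Guarded variant: a dentry without private data counts as invalid. */
static int model_invalid_guarded(const struct model_dentry *de)
{
	const struct model_dentry_data *lld = model_d2d(de);

	return lld == NULL || lld->invalid;
}

static void example_usage(void)
{
	struct model_dentry_data data = { .invalid = 0 };
	struct model_dentry de = { .d_fsdata = &data };

	/* Both variants agree while d_fsdata is set. */
	assert(model_invalid_unguarded(&de) == model_invalid_guarded(&de));

	/* Only the guarded variant is safe once d_fsdata is NULL. */
	de.d_fsdata = NULL;
	assert(model_invalid_guarded(&de) == 1);
}
```

The guard hides the symptom, but it does not explain why the pointer was
NULL in the first place.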

Have you reported this to me before? It seems awfully familiar.

>
> This might be resolved with
>
> https://review.whamcloud.com/#/c/24175

That would certainly resolve this particular symptom.
I don't know if it is correct though ... maybe.

>
> I believe having it ready for LUG 2020 is a reasonable goal.

Sounds like a nice goal.


Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [lustre-devel] Lustre upstreaming status.
  2020-01-07  1:53     ` Andreas Dilger
  2020-01-07  2:24       ` Andreas Dilger
@ 2020-01-07  4:32       ` NeilBrown
  1 sibling, 0 replies; 19+ messages in thread
From: NeilBrown @ 2020-01-07  4:32 UTC (permalink / raw)
  To: lustre-devel


(sorry for including the intel addresses in the original :-( )

On Tue, Jan 07 2020, Andreas Dilger wrote:

>
> I thought the goal was to stop at 2.12.x (following the b2_12 branch to get
> important fixes) and try to get that included upstream?  That gives a good
> point-in-time to track, and ensures that the upstream code is aligned with
> a relatively stable version of the code.  It also has the major benefit that
> 2.12 is an LTS branch and we will need to keep compatibility with that for
> a long time, which isn't always true of intermediate releases.

That was suggested at Houston, but I was against it.  I think I said I
would consider it if going all the way to 'master' looked like too much
work - but it didn't.

Upstream Linux is not a place for old code.  It is a place for
developing new code.  If we don't submit the latest code to Linux,
people will want to know why.

Bug fixes will flow into the 'stable' releases with little or no effort
from the lustre team.  For the "community" face of lustre, this is
really all you need.

Compatibility is important, but there should be no need to break it.  There
are feature flags (or similar) in the protocol, and a modest amount of
care with using them should avoid obvious breakage.
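
A rough sketch of how such flags keep mixed versions working: each side
advertises a bitmask of what it supports, and only the intersection is
used. The flag names and helpers below are made up for illustration and
are not Lustre's actual connect flags.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define FEAT_BASIC_IO   (UINT64_C(1) << 0)
#define FEAT_LARGE_BULK (UINT64_C(1) << 1)
#define FEAT_NEW_LOCKS  (UINT64_C(1) << 2)

/* Only features that both peers advertise may be used on the connection. */
static uint64_t negotiate(uint64_t client_flags, uint64_t server_flags)
{
	return client_flags & server_flags;
}

/* True if every bit of 'feature' survived negotiation. */
static int feature_enabled(uint64_t negotiated, uint64_t feature)
{
	return (negotiated & feature) == feature;
}

static void example_usage(void)
{
	/* A newer client talking to an older server. */
	uint64_t client = FEAT_BASIC_IO | FEAT_LARGE_BULK | FEAT_NEW_LOCKS;
	uint64_t server = FEAT_BASIC_IO | FEAT_LARGE_BULK;
	uint64_t agreed = negotiate(client, server);

	assert(feature_enabled(agreed, FEAT_LARGE_BULK));
	/* The client must fall back to the old behaviour here. */
	assert(!feature_enabled(agreed, FEAT_NEW_LOCKS));
}
```

The "modest amount of care" is then mostly a matter of checking the
negotiated mask before taking any code path that depends on a newer
feature.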

You don't need to test every single combination - you can assume that
people who care are doing that testing.  If someone reports a
compatibility regression, we need to fix it.  But if no-one reports one,
then no fix is needed.

Obviously you will test the combinations that you support for your
customers - just as you do now.

We must have a long term goal of doing all (kernel) development against
upstream, and have it appear upstream first.  The lustre-release from
whamcloud will eventually contain no stand-alone kernel code.  Just
user-space code and a selection of backported patches (maybe).

So think of upstream-linux much the same way that you currently think
about "master".  My focus is to keep the two in-sync, so that thinking
this way will be natural.

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [lustre-devel] Lustre upstreaming status.
  2020-01-07  4:05     ` NeilBrown
@ 2020-01-08 21:18       ` NeilBrown
  0 siblings, 0 replies; 19+ messages in thread
From: NeilBrown @ 2020-01-08 21:18 UTC (permalink / raw)
  To: lustre-devel

On Tue, Jan 07 2020, NeilBrown wrote:

> On Tue, Jan 07 2020, James Simmons wrote:
>>
>> I have been going over the patches from your backport tree to find
>> missing patches and test for regressions. I think all regressions I
>> saw was stomped out for everything for 2.12. I'm doing full regression
>> right now. The only bug I see now is very unique to the linux client.
>>
>> 2020-01-06T16:24:58.006823-05:00 ninja81.ccs.ornl.gov kernel: RIP: 
>> 0010:ll_dcompare+0x62/0xf0 [lustre]
>> 2020-01-06T16:24:58.006880-05:00 ninja81.ccs.ornl.gov kernel: RAX: 
>> 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
>> 2020-01-06T16:24:58.006934-05:00 ninja81.ccs.ornl.gov kernel: RDX: 
>> 0000000000000001 RSI: 0000000000000001 RDI: 00000000ffffffff
>> 2020-01-06T16:24:58.006992-05:00 ninja81.ccs.ornl.gov kernel: Code: 85 c0 
>> 89 c3 75 2d f6 05 c8 c7 c8 ff 20 74 09 f6 05 c2 c7 c8 ff
>>  80 75 2b 41 f7 04 24 00 00 01 10 75 c2 49 8b 84 24 f8 00 00 00 <0f> b6 58 
>> 0c 83 e3 01 eb b1 48 83 c4 08 bb 01 00 00 00 89 d8 5b 5
>> d
>
> The crashing code here is
>
>   22:	49 8b 84 24 f8 00 00 	mov    0xf8(%r12),%rax
>   29:	00 
>   2a:*	0f b6 58 0c          	movzbl 0xc(%rax),%ebx		<-- trapping instruction
>   2e:	83 e3 01             	and    $0x1,%ebx
>   31:	eb b1                	jmp    0xffffffffffffffe4
>
> The only place that could happen in ll_dcompare is the
>
>         if (d_lustre_invalid(dentry))
>                 return 1;
> call at the end.
> ll_d2d(dentry) must be NULL.
>
> in OpenSFS lustre, d_lustre_invalid() protects against that being NULL.
> Linux/lustre lost that protection in
>
> Commit 7126bc2e8d60 ("lustre: switch to use of ->d_init()")
>
> because it really shouldn't need it.
>
> Have you reported this to me before? It seems awfully familiar.

I remember now.  You have reported it.
The problem is

   de->d_fsdata = NULL;

in ll_release().
I'll remove that.
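
A simplified userspace model of why that line matters. It assumes the
private data is freed only after a grace period, which this thread does
not show, and every name in it is hypothetical. A reader that still holds
the dentry after release sees NULL in one variant and a still-valid
pointer in the other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins, not the real kernel or Lustre structures. */
struct model_dentry_data {
	unsigned int invalid:1;
};

struct model_dentry {
	void *d_fsdata;
};

/* One-slot stand-in for a deferred-free queue. */
static void *pending_free;

static void model_defer_free(void *p)
{
	pending_free = p;
}

/* Called once no reader can still be looking at the dentry. */
static void model_grace_period_end(void)
{
	free(pending_free);
	pending_free = NULL;
}

/* Variant with the assignment: late readers find d_fsdata == NULL. */
static void model_release_clearing(struct model_dentry *de)
{
	model_defer_free(de->d_fsdata);
	de->d_fsdata = NULL;
}

/* Variant without it: the pointer stays usable until the deferred free. */
static void model_release_keeping(struct model_dentry *de)
{
	model_defer_free(de->d_fsdata);
}

static void example_usage(void)
{
	struct model_dentry de;

	de.d_fsdata = calloc(1, sizeof(struct model_dentry_data));
	assert(de.d_fsdata != NULL);
	model_release_keeping(&de);
	/* A late reader can still dereference the private data. */
	assert(((struct model_dentry_data *)de.d_fsdata)->invalid == 0);
	model_grace_period_end();
}
```

In the clearing variant the same late reader would load NULL and then
fault on the flag read, which is the pattern seen in the oops.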

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2020-01-08 21:18 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-11 23:17 [lustre-devel] Lustre upstream client TODO list James Simmons
2018-02-11 23:54 ` NeilBrown
2018-02-12  1:15   ` Patrick Farrell
2018-02-12  2:09     ` NeilBrown
2018-03-22 23:21   ` [lustre-devel] Current results and status of my upstream work James Simmons
2018-03-27  5:32     ` NeilBrown
2018-03-27  6:17       ` Dilger, Andreas
2018-03-27 21:17         ` Jinshan Xiong
2018-03-27 21:58           ` NeilBrown
2018-03-30 18:55       ` James Simmons
2018-03-31  5:47         ` NeilBrown
2019-12-19  5:31 ` [lustre-devel] Lustre upstreaming status NeilBrown
2019-12-27 16:04   ` Degremont, Aurelien
2020-01-07  0:02   ` James Simmons
2020-01-07  1:53     ` Andreas Dilger
2020-01-07  2:24       ` Andreas Dilger
2020-01-07  4:32       ` NeilBrown
2020-01-07  4:05     ` NeilBrown
2020-01-08 21:18       ` NeilBrown
