All of lore.kernel.org
 help / color / mirror / Atom feed
From: quentin.bouget at cea.fr <quentin.bouget@cea.fr>
To: lustre-devel@lists.lustre.org
Subject: [lustre-devel] Error checking for llapi_hsm_action_progress().
Date: Mon, 31 Aug 2020 14:53:31 +0200	[thread overview]
Message-ID: <2659fe05-9cd7-afd3-e1ce-c1726129e354@cea.fr> (raw)
In-Reply-To: <87pn77qx60.fsf@notabene.neil.brown.name>

Hi Neil,


On 31/08/2020 06:03, NeilBrown wrote:
> I have a question about llapi_hsm_action_progress().  The documentation
> says that every interval sent "must" be unique, and must not overlap
> (which not exactly the same as 'unique').  The code (on server side)
> only partially enforces this.  It causes any request for an empty
> interval (start>end) to fail, but otherwise accepts any interval.  If it
> gets two identical intervals (not just overlapping, but identical), it
> ignores the second.  This seems weird.
>
> It would make some sense to just accept any interval - all it does is
> sum the lengths, and use this to report status, so no corruption would
> result.  It would also make sense to return an error if an interval
> overlaps any previous interval, as this violates the spec.  It might
> make sense to accept any interval, but only count the overlapped length
> once.  But the current behaviour of only ignoring exact duplicates is
> weird.  I tried removing that check, but there is a test (hsm_test 108)
> which checks for repeating identical intervals.

test108 handles a "growing" extent (offset=0, length=1, 2, 3, ...).
test112 handles acknowledging non-overlapping extents twice each.

I wrote a test to check what happens if you acknowledge overlapping extents:

  * (offset=0, length=256)
  * (offset=128, length=256)

And surely enough mdt_cdt_get_work_done() returns "512" rather than the 
expected "384" (ie. 128 + 256).

Even worse, when acknowledging a "shrinking" extent (offset=0, length=N, 
N - 1, N - 2, ...), only the last value is kept in store.

 From this, I think that exact duplicates are not really ignored, 
rather, intervals that share the same starting point overwrite one 
another, until only the last one remains.
Bug or feature? I don't really know.


> I want to clean up this code as I'm converting all users of the lustre
> interval-tree to use the upstream-linux interval tree code.  What should
> I do?
>
> Should I remove test 108 because it is only testing one particular
> corner case, or should I improve the code to handle all overlaps
> consistently?  Would it be OK to fail an overlap (I'd need to change
> test 108), it must they be quietly accepted?

How does the upstream-linux interval tree compares to Lustre's?

If their behaviours match, there should not be any issue (so far as the 
current behaviour can be considered issue-free).

Otherwise, I think it would be OK to just assume sending overlapping 
extents is a programming error and the server does not need to protect 
itself against it.
In terms of security, this isn't very good, but the problem already 
exists, and copytools are supposedly trusted binaries run as root.
You could then remove test108 as it is itself a programming error, 
test112 as well, and maybe others.

Just to be sure, you could open a new issue on Jira, and let others rule 
how much of a bug/feature the whole thing is.

Cheers,
Quentin

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.lustre.org/pipermail/lustre-devel-lustre.org/attachments/20200831/dc117d63/attachment.html>

  reply	other threads:[~2020-08-31 12:53 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-31  4:03 [lustre-devel] Error checking for llapi_hsm_action_progress() NeilBrown
2020-08-31 12:53 ` quentin.bouget at cea.fr [this message]
2020-09-01  0:58   ` NeilBrown
2020-09-01  9:33     ` quentin.bouget at cea.fr
2020-09-01 12:07       ` Degremont, Aurelien
2020-09-02  0:36       ` NeilBrown
2020-08-31 15:36 ` Joseph Benjamin Evans
2020-09-01  1:27   ` NeilBrown
2020-09-01  7:41     ` Degremont, Aurelien
2020-09-01 13:10       ` Joseph Benjamin Evans
2020-09-18 17:33 Nathan Rutman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2659fe05-9cd7-afd3-e1ce-c1726129e354@cea.fr \
    --to=quentin.bouget@cea.fr \
    --cc=lustre-devel@lists.lustre.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.