From: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
To: linux-raid@vger.kernel.org
Subject: Re: "md/raid10:md5: sdf: redirecting sector 2979126480 to another mirror"
Date: Tue, 12 Jan 2021 12:30:37 +0100
Message-ID: <e090f945-d616-bd93-cc61-268cf881c367@cloud.ionos.com>
In-Reply-To: <20210112020336.GJ3712@bitfolk.com>

Hi Andy,

On 1/12/21 03:03, Andy Smith wrote:
> Hi Guoqing,
> 
> Thanks for following up on this. I have a couple of questions.
> 
> On Tue, Jan 12, 2021 at 01:36:55AM +0100, Guoqing Jiang wrote:
>> On 1/7/21 00:27, Andy Smith wrote:
>>> err_rdev there can only be set inside the block above that starts
>>> with:
>>>
>>>      if (r10_bio->devs[slot].rdev) {
>>>          /*
>>>           * This is an error retry, but we cannot
>>>           * safely dereference the rdev in the r10_bio,
>>>           * we must use the one in conf.
>>>
>>> …but why is this an error retry? Nothing was logged so how do I find
>>> out what the error was?
>>
>> This is because handle_read_error also calls raid10_read_request, pls see
>> commit 545250f2480 ("md/raid10: simplify handle_read_error()").
> 
> So if I understand you correctly, this is a consequence of
> raid10_read_request being reworked so that it can be called by
> handle_read_error, but in my case it is not being called by
> handle_read_error but instead raid10_make_request and is incorrectly
> going down an error path and reporting a redirected read?

Yes, that is my guess too, if the message appears but no read error is
reported by the component device.

> 
> From my stack trace it seemed that it was just
> raid10.c:__make_request that was calling raid10_read_request but I
> could not see where in __make_request the r10_bio->devs[slot].rdev
> was being set, enabling the above test to succeed. All I could see
> was a memset to 0.

IIUC, the rdev is set in raid10_run rather than just before the I/O
request is dispatched.

> 
> I understand that your patch makes it so this test can no longer
> succeed when being called by __make_request, meaning that aside from
> not logging about a redirected read it will also not do the
> rcu_read_lock() / rcu_dereference() / rcu_read_unlock() that's in
> that if block. Is that a significant amount of work that is being
> needlessly done right now or is it trivial?

I think checking whether raid10_read_request is called from the read
error path is enough.
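
To make the idea concrete, here is a minimal user-space sketch in C. It
is a toy model, not kernel code and not the actual patch: toy_rdev,
toy_r10bio, the error_retry flag and all function names are hypothetical
stand-ins. It only contrasts inferring a retry from a non-NULL slot
pointer with having the caller mark the retry explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; these are NOT the kernel's struct md_rdev / struct r10bio. */
struct toy_rdev {
    int id;
};

struct toy_r10bio {
    struct toy_rdev *slot_rdev; /* models r10_bio->devs[slot].rdev */
    bool error_retry;           /* hypothetical: set only by the error path */
    int redirect_msgs;          /* counts "redirecting sector" messages */
};

/* Pointer-only test: any non-NULL slot pointer is treated as a retry. */
static bool is_retry_by_pointer(const struct toy_r10bio *bio)
{
    return bio->slot_rdev != NULL;
}

/* Guarded test: the caller must also have marked this as an error retry. */
static bool is_retry_by_flag(const struct toy_r10bio *bio)
{
    return bio->error_retry && bio->slot_rdev != NULL;
}

/* Models only the decision to log a redirect, nothing else. */
static void toy_read_request(struct toy_r10bio *bio, bool use_flag)
{
    assert(bio != NULL);

    bool retry = use_flag ? is_retry_by_flag(bio)
                          : is_retry_by_pointer(bio);
    if (retry)
        bio->redirect_msgs++;
}
```

With the pointer-only test, a first-attempt read whose slot pointer
happens to be non-NULL is counted as a redirect. With the guarded test
it is not, while a request marked by the error path still is.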

> 
> I'm trying to understand how big of a problem this is, beyond some
> spurious logging.
> 
> Right now when it is logging about redirecting a read, does that
> mean that it isn't actually redirecting a read? That is, when it
> says:
> 
> Jan 11 17:10:40 hostname kernel: [1318773.480077] md/raid10:md3: nvme1n1p5: redirecting sector 699122984 to another mirror
> 
> in the absence of any other error logging it is in fact its first
> try at reading and it will really be using device nvme1n1p5 (rdev)
> to satisfy that?
> 
> I suppose I am also confused why this happens so rarely. I can only
> encourage it to happen a couple of times by putting the array under
> very heavy read load, and it only seems to happen with pretty high
> performing arrays (all SSD). But perhaps that is the result of the
> rate-limited logging with pr_err_ratelimited() causing me to only
> see very few of the actual events.

If the message ("redirecting sector ...") does not come from the
handle_read_error path, then I suppose the message is a false alarm.


Thanks,
Guoqing

