All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
To: Pavel <pavel2000@ngs.ru>
Cc: linux-raid@vger.kernel.org
Subject: Re: Misbehavior of md-raid RAID on failed NVMe.
Date: Wed, 8 Jun 2022 10:32:09 +0200	[thread overview]
Message-ID: <20220608103209.00001d6a@linux.intel.com> (raw)
In-Reply-To: <984f2ca5-2565-025d-62a2-2425b518a01f@ngs.ru>

On Wed, 8 Jun 2022 10:48:09 +0700
Pavel <pavel2000@ngs.ru> wrote:

> Hi, linux-raid community.
> 
> Today we found a strange and even scaring behavior of md-raid RAID based 
> on NVMe devices.
> 
> We ordered new server, and started data transfer (using dd, filesystems 
> was unmounted on source, etc - no errors here).
> 
> While data in transfer, kernel started IO errors reporting on one of 
> NVMe devices. (dmesg output below)
> But md-raid not reacted on them in any way. RAID array not went into any 
> failed state, and "clean" state reported all the time.
> 
> Based on earlier practice, we trusted md-raid and thought things goes ok.
> After data transfer finished, server was turned off and cables was 
> replaced on suspicion.
> 
> After OS started on this new server, we found MySQL crashes.
> Thorough checksum check showed us mismatches on files content.
> (Of course, we did checksumming of untouched files, not MySQL database 
> files)
> 
> So, there is data-loss possible when NVMe device behaves wrong.
> We think, md-raid has to remove failed device from raid in a such case.
> That it didn't happen is wrong behaviour, so want to inform community 
> about finding.
> 
> Hope, this will help to make kernel ever better.
> Thanks for your work.
> 

Hi Pavel,
IMO it is not a RAID problem. In this case some part of requests hangs
inside nvme driver and raid1d hanged too. It is rather nvme problem not a raid.
RAID should handle it well if IO errors are continuously reported.

Thanks,
Mariusz

> ---
> [Tue Jun  7 09:58:45 2022] Call Trace:
> [Tue Jun  7 09:58:45 2022]  <IRQ>
> [Tue Jun  7 09:58:45 2022]  nvme_pci_complete_rq+0x5b/0x67 [nvme]
> [Tue Jun  7 09:58:45 2022]  nvme_poll_cq+0x1e4/0x265 [nvme]
> [Tue Jun  7 09:58:45 2022]  nvme_irq+0x36/0x6e [nvme]
> [Tue Jun  7 09:58:45 2022]  __handle_irq_event_percpu+0x73/0x13e
> [Tue Jun  7 09:58:45 2022]  handle_irq_event_percpu+0x31/0x77
> [Tue Jun  7 09:58:45 2022]  handle_irq_event+0x2e/0x51
> [Tue Jun  7 09:58:45 2022]  handle_edge_irq+0xc9/0xee
> [Tue Jun  7 09:58:45 2022]  __common_interrupt+0x41/0x9e
> [Tue Jun  7 09:58:45 2022]  common_interrupt+0x6e/0x8b
> [Tue Jun  7 09:58:45 2022]  </IRQ>
> [Tue Jun  7 09:58:45 2022]  <TASK>
> [Tue Jun  7 09:58:45 2022]  asm_common_interrupt+0x1e/0x40
> [Tue Jun  7 09:58:45 2022] RIP: 0010:__blk_mq_try_issue_directly+0x12/0x136
> [Tue Jun  7 09:58:45 2022] Code: fe ff ff 48 8b 97 78 01 00 00 48 8b 92 
> 80 00 00 00 48 89 34 c2 b0 01 c3 0f 1f 44 00 00 41 57 41 56 41 55 41 54 
> 55 48 89 f5 53 <89> d3 48 83 ec 18 65 48 8b 04 25 28 00 00 00 48 89 44 
> 24 10 31 c0
> [Tue Jun  7 09:58:45 2022] RSP: 0018:ffff88810bbf7ad8 EFLAGS: 00000246
> [Tue Jun  7 09:58:45 2022] RAX: 0000000000000000 RBX: 0000000000000001 
> RCX: 0000000000000001
> [Tue Jun  7 09:58:45 2022] RDX: 0000000000000001 RSI: ffff8881137e6e40 
> RDI: ffff88810dfdf400
> [Tue Jun  7 09:58:45 2022] RBP: ffff8881137e6e40 R08: 0000000000000001 
> R09: 0000000000000a20
> [Tue Jun  7 09:58:45 2022] R10: 0000000000000000 R11: 0000000000000000 
> R12: 0000000000000000
> [Tue Jun  7 09:58:45 2022] R13: ffff88810dfdf400 R14: ffff8881137e6e40 
> R15: ffff88810bbf7df0
> [Tue Jun  7 09:58:45 2022] blk_mq_request_issue_directly+0x46/0x78
> [Tue Jun  7 09:58:45 2022] blk_mq_try_issue_list_directly+0x41/0xba
> [Tue Jun  7 09:58:45 2022]  blk_mq_sched_insert_requests+0x86/0xd0
> [Tue Jun  7 09:58:45 2022]  blk_mq_flush_plug_list+0x1b5/0x214
> [Tue Jun  7 09:58:45 2022]  ? __blk_mq_alloc_requests+0x1c7/0x21d
> [Tue Jun  7 09:58:45 2022]  blk_mq_submit_bio+0x437/0x518
> [Tue Jun  7 09:58:45 2022]  submit_bio_noacct+0x93/0x1e6
> [Tue Jun  7 09:58:45 2022]  ? bio_associate_blkg_from_css+0x137/0x15c
> [Tue Jun  7 09:58:45 2022]  flush_bio_list+0x96/0xa5
> [Tue Jun  7 09:58:45 2022]  flush_pending_writes+0x7a/0xbf
> [Tue Jun  7 09:58:45 2022]  ? md_check_recovery+0x8a/0x4bd
> [Tue Jun  7 09:58:45 2022]  raid1d+0x194/0x10e8
> [Tue Jun  7 09:58:45 2022]  ? common_interrupt+0xf/0x8b
> [Tue Jun  7 09:58:45 2022]  md_thread+0x12c/0x155
> [Tue Jun  7 09:58:45 2022]  ? init_wait_entry+0x29/0x29
> [Tue Jun  7 09:58:45 2022]  ? signal_pending+0x19/0x19
> [Tue Jun  7 09:58:45 2022]  kthread+0x104/0x10c
> [Tue Jun  7 09:58:45 2022]  ? set_kthread_struct+0x32/0x32
> [Tue Jun  7 09:58:45 2022]  ret_from_fork+0x22/0x30
> [Tue Jun  7 09:58:45 2022]  </TASK>
> [Tue Jun  7 09:58:45 2022] ---[ end trace 15dc74ae2e04f737 ]---
> [Tue Jun  7 09:58:45 2022] ------------[ cut here ]------------
> [Tue Jun  7 09:58:45 2022] refcount_t: underflow; use-after-free.




  reply	other threads:[~2022-06-08  9:11 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-06-08  3:48 Misbehavior of md-raid RAID on failed NVMe Pavel
2022-06-08  8:32 ` Mariusz Tkaczyk [this message]
     [not found]   ` <8b0c4bf1-a165-95ca-9746-8ef7be46092e@areainter.net>
2022-06-08  9:11     ` Mariusz Tkaczyk
2022-06-08 16:52 ` Wol
2022-06-08 18:16   ` Pavel
     [not found]   ` <CAAMCDef5jamJa+um=DSM08CPdzoTvEQuFOdrGo7jiNivrNVbpg@mail.gmail.com>
2022-06-09  6:43     ` Pavel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220608103209.00001d6a@linux.intel.com \
    --to=mariusz.tkaczyk@linux.intel.com \
    --cc=linux-raid@vger.kernel.org \
    --cc=pavel2000@ngs.ru \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.