From: Jens Axboe <axboe@kernel.dk>
To: Bruno Goncalves <bgoncalv@redhat.com>,
	linux-block <linux-block@vger.kernel.org>
Cc: CKI Project <cki-project@redhat.com>
Subject: Re: kernel BUG at lib/list_debug.c:30! (list_add corruption. prev->next should be nex)
Date: Wed, 23 Nov 2022 06:46:05 -0700
Message-ID: <2e5f0ed1-4771-1b24-e6da-b63393506e47@kernel.dk>
In-Reply-To: <CA+QYu4oxiRKC6hJ7F27whXy-PRBx=Tvb+-7TQTONN8qTtV3aDA@mail.gmail.com>

On 11/23/22 1:48 AM, Bruno Goncalves wrote:
> Hello,
> 
> We recently started to hit the following panic when testing the block
> tree (for-next branch).
> 
> [ 5076.172749] list_add corruption. prev->next should be next
> (ffff91cd6f7fa568), but was ffff91c991ca6670. (prev=ffff91c991ca6670).
> [ 5076.173863] ------------[ cut here ]------------
> [ 5076.174853] kernel BUG at lib/list_debug.c:30!
> [ 5076.175523] invalid opcode: 0000 [#1] PREEMPT SMP PTI
> [ 5076.175853] CPU: 15 PID: 16415 Comm: kworker/15:13 Tainted: G
>    I        6.1.0-rc6 #1
> [ 5076.176799] Hardware name: HP ProLiant DL360p Gen8, BIOS P71 05/24/2019
> [ 5076.177198] Workqueue: cgwb_release cgwb_release_workfn
> [ 5076.177497] RIP: 0010:__list_add_valid.cold+0x3a/0x5b
> [ 5076.177788] Code: f2 48 89 c1 48 89 fe 48 c7 c7 48 d8 76 ad e8 5a
> 8f fd ff 0f 0b 48 89 d1 48 89 c6 4c 89 c2 48 c7 c7 f0 d7 76 ad e8 43
> 8f fd ff <0f> 0b 48 89 c1 48 c7 c7 98 d7 76 ad e8 32 8f fd ff 0f 0b 48
> c7 c7
> [ 5076.179173] RSP: 0018:ffffa1c98a6afdb0 EFLAGS: 00010082
> [ 5076.179472] RAX: 0000000000000075 RBX: ffff91c991ca6668 RCX: 0000000000000000
> [ 5076.180241] RDX: 0000000000000002 RSI: ffffffffad752ad3 RDI: 00000000ffffffff
> [ 5076.181069] RBP: ffff91cd6f7fa500 R08: 0000000000000000 R09: ffffa1c98a6afc60
> [ 5076.182209] R10: 0000000000000003 R11: ffff91cd7ff42fe8 R12: ffff91cd6f7fa568
> [ 5076.183002] R13: ffff91c991ca6670 R14: ffff91c991ca6670 R15: ffff91cd6f7f1440
> [ 5076.183902] FS:  0000000000000000(0000) GS:ffff91cd6f7c0000(0000)
> knlGS:0000000000000000
> [ 5076.184377] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 5076.185084] CR2: 0000560ff67e11b8 CR3: 000000020d010005 CR4: 00000000000606e0
> [ 5076.185945] Call Trace:
> [ 5076.186110]  <TASK>
> [ 5076.186916]  insert_work+0x46/0xc0
> [ 5076.187533]  __queue_work+0x1d4/0x460
> [ 5076.187788]  queue_work_on+0x37/0x40
> [ 5076.187993]  blkcg_unpin_online+0x1ad/0x1b0
> [ 5076.188244]  cgwb_release_workfn+0x6a/0x200
> [ 5076.188464]  process_one_work+0x1c7/0x380
> [ 5076.188675]  worker_thread+0x4d/0x380
> [ 5076.188881]  ? rescuer_thread+0x380/0x380
> [ 5076.189089]  kthread+0xe9/0x110
> [ 5076.189716]  ? kthread_complete_and_exit+0x20/0x20
> [ 5076.190407]  ret_from_fork+0x22/0x30
> [ 5076.190677]  </TASK>
> [ 5076.190816] Modules linked in: nvme nvme_core nvme_common loop tls
> rfkill intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal
> intel_powerclamp coretemp sunrpc kvm_intel kvm iTCO_wdt iapl
> intel_cstate intel_uncore pcspkr lpc_ich ipmi_ssif hpilo tg3 acpi_ipmi
> ioatdma ipmi_si ipmi_devintf dca ipmi_msghandler acpi_power_meter fuse
> zram xfs crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni
> polyval_generic ghash_clmulni_intel sha512_ssse3 serio_raw hpsa
> mgag200 scsi_transport_sas [last unloaded: scsi_debug]
> [ 5076.293315] ---[ end trace 0000000000000000 ]---
> [ 5076.295226] RIP: 0010:__list_add_valid.cold+0x3a/0x5b
> [ 5076.295587] Code: f2 48 89 c1 48 89 fe 48 c7 c7 48 d8 76 ad e8 5a
> 8f fd ff 0f 0b 48 89 d1 48 89 c6 4c 89 c2 48 c7 c7 f0 d7 76 ad e8 43
> 8f fd ff <0f> 0b 48 89 c1 48 c7 c7 98 d7 76 ad e8 32 8f fd ff 0f 0b 48
> c7 c7
> [ 5076.296921] RSP: 0018:ffffa1c98a6afdb0 EFLAGS: 00010082
> [ 5076.297239] RAX: 0000000000000075 RBX: ffff91c991ca6668 RCX: 0000000000000000
> [ 5076.297983] RDX: 0000000000000002 RSI: ffffffffad752ad3 RDI: 00000000ffffffff
> [ 5076.298768] RBP: ffff91cd6f7fa500 R08: 0000000000000000 R09: ffffa1c98a6afc60
> [ 5076.299525] R10: 0000S:  0000000000000000(0000)
> GS:ffff91cd6f7c0000(0000) knlGS:0000000000000000
> [ 5076.700351] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 5076.701046] CR2: 0000560ff67e11b8 CR3: 000000020d010005 CR4: 00000000000606e0
> [ 5076ernel panic - not syncing: Fatal exception
> [ 5077.924713] Shutting down cpus with NMI
> [ 5077.924986] Kernel Offset: 0x2b000000 from 0xffffffff81000000
> (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [ 5077.927946] ---[ end Kernel panic - not syncing: Fatal exception ]---
> 
> It seems to happen often during different tests.
> 
> full console.log:
> https://s3.us-east-1.amazonaws.com/arr-cki-prod-datawarehouse-public/datawarehouse-public/2022/11/21/redhat:700955106/build_x86_64_redhat:700955106_x86_64/tests/1/results_0001/console.log/console.log
> 
> kernel tarball:
> https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/700955106/publish%20x86_64/3356091217/artifacts/kernel-block-redhat_700955106_x86_64.tar.gz
> 
> kernel config: https://s3.amazonaws.com/arr-cki-prod-trusted-artifacts/trusted-artifacts/700955106/build%20x86_64/3356091207/artifacts/kernel-block-redhat_700955106_x86_64.config
> 
> test logs: https://datawarehouse.cki-project.org/kcidb/tests/6061677
> 
> We didn't bisect, but the first commit where we hit the problem was
> "f65d92c600fe6eecdbd6e7fab7893c9c094dfcbf
> (io_uring-6.1-2022-11-18-2180-gf65d92c600fe)" and the last one where
> we didn't hit it was
> "40fa774af7fd04d06014ac74947c351649b6f64f
> (io_uring-6.1-2022-11-11-1843-g40fa774af7fd)".
> 
> cki issue tracker: https://datawarehouse.cki-project.org/issue/1732

Could you please clone for-6.2/block from the block tree and bisect
it?
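
For reference, a minimal sketch of how such a bisect could be started. The
helper name start_bisect, its arguments, and the default directory name are
hypothetical and not from this thread. Note that the two hashes quoted above
come from for-next, so they may not be reachable from for-6.2/block; pick
good/bad revisions that are on the branch being bisected.

```shell
#!/bin/sh
# Sketch only: clone one branch and start a bisect between two revisions.
# Usage: start_bisect REPO_URL BRANCH GOOD_REV BAD_REV [DIR]
# (start_bisect and its arguments are illustrative, not an existing tool.)
start_bisect() {
    repo_url=$1 branch=$2 good=$3 bad=$4 dir=${5:-linux-block}

    # Fetch only what is needed to check out the requested branch.
    git clone --quiet --branch "$branch" "$repo_url" "$dir" || return 1
    cd "$dir" || return 1

    # Mark the endpoints; git then checks out a commit roughly halfway.
    git bisect start || return 1
    git bisect bad "$bad" || return 1
    git bisect good "$good" || return 1
}

# After each step: build and boot the checked-out kernel, run the failing
# test, then report the result with one of:
#   git bisect good     # no list_add corruption seen
#   git bisect bad      # panic reproduced
# Once git names the first bad commit:
#   git bisect log      # record of the steps, useful to post to the list
#   git bisect reset    # return to the original branch
```

The repository URL, branch, and revisions are passed in rather than
hard-coded, since the right endpoints depend on which branch is under test.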

-- 
Jens Axboe



Thread overview: 9+ messages
2022-11-23  8:48 kernel BUG at lib/list_debug.c:30! (list_add corruption. prev->next should be nex) Bruno Goncalves
2022-11-23 13:46 ` Jens Axboe [this message]
2022-11-24 14:57   ` Bruno Goncalves
2022-11-25  8:38     ` Yi Zhang
2022-11-26 14:29       ` [bisected]kernel " Yi Zhang
2022-11-26 15:53         ` Jens Axboe
2022-11-26 22:54           ` Waiman Long
2022-11-27  4:13             ` Waiman Long
2022-11-28 18:55         ` Bart Van Assche
