From: Marc Smith
Date: Wed, 11 Nov 2020 11:25:14 -0500
Subject: STATE: TASK_UNINTERRUPTIBLE (PANIC)
To: kernelnewbies@kernelnewbies.org

Hi,

I have an issue with the 'bcache' Linux subsystem (block I/O cache).
I hit a kernel panic when using this software, and I've reported that
upstream on the "linux-bcache" mailing list:
https://www.spinics.net/lists/linux-bcache/msg09069.html

I'd like to contribute and learn more on how to debug this myself.

Here is the output from 'crash' on a dumpfile from this panic:

  SYSTEM MAP: /home/marc.smith/Downloads/System.map-esos.prod
DEBUG KERNEL: /home/marc.smith/Downloads/vmlinux-esos.prod (5.4.69-esos.prod)
    DUMPFILE: /home/marc.smith/Downloads/dumpfile-1604062993
        CPUS: 8
        DATE: Fri Oct 30 09:02:56 2020
      UPTIME: 2 days, 12:38:15
LOAD AVERAGE: 9.48, 8.89, 7.69
       TASKS: 980
    NODENAME: node-10cccd-2
     RELEASE: 5.4.69-esos.prod
     VERSION: #1 SMP Thu Oct 22 19:45:11 UTC 2020
     MACHINE: x86_64  (2799 Mhz)
      MEMORY: 24 GB
       PANIC: "Oops: 0002 [#1] SMP NOPTI" (check log for details)
         PID: 18272
     COMMAND: "kworker/2:13"
        TASK: ffff88841d9e8000  [THREAD_INFO: ffff88841d9e8000]
         CPU: 2
       STATE: TASK_UNINTERRUPTIBLE (PANIC)

crash> bt
PID: 18272  TASK: ffff88841d9e8000  CPU: 2  COMMAND: "kworker/2:13"
 #0 [ffffc90000100938] machine_kexec at ffffffff8103d6b5
 #1 [ffffc90000100980] __crash_kexec at ffffffff8110d37b
 #2 [ffffc90000100a48] crash_kexec at ffffffff8110e07d
 #3 [ffffc90000100a58] oops_end at ffffffff8101a9de
 #4 [ffffc90000100a78] no_context at ffffffff81045e99
 #5 [ffffc90000100ae0] async_page_fault at ffffffff81e010cf
    [exception RIP: atomic_try_cmpxchg+2]
    RIP: ffffffff810d3e3b  RSP: ffffc90000100b98  RFLAGS: 00010046
    RAX: 0000000000000000  RBX: 0000000000000003  RCX: 0000000000080006
    RDX: 0000000000000001  RSI: ffffc90000100ba4  RDI: 0000000000000a6c
    RBP: 0000000000000010   R8: 0000000000000001   R9: ffffffffa0418d4e
    R10: ffff88841c8b3000  R11: ffff88841c8b3000  R12: 0000000000000046
    R13: 0000000000000000  R14: ffff8885a3a0a000  R15: 0000000000000a6c
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #6 [ffffc90000100b98] _raw_spin_lock_irqsave at ffffffff81cf7d7d
 #7 [ffffc90000100bb8] try_to_wake_up at ffffffff810c1624
 #8 [ffffc90000100c08] closure_sync_fn at ffffffffa040fb07 [bcache]
 #9 [ffffc90000100c10] clone_endio at ffffffff81aac48c
#10 [ffffc90000100c40] call_bio_endio at ffffffff81a78e20
#11 [ffffc90000100c58] raid_end_bio_io at ffffffff81a78e69
#12 [ffffc90000100c88] raid1_end_write_request at ffffffff81a79ad9
#13 [ffffc90000100cf8] blk_update_request at ffffffff814c3ab1
#14 [ffffc90000100d38] blk_mq_end_request at ffffffff814caaf2
#15 [ffffc90000100d50] blk_mq_complete_request at ffffffff814c91c1
#16 [ffffc90000100d78] nvme_complete_cqes at ffffffffa002fb03 [nvme]
#17 [ffffc90000100db8] nvme_irq at ffffffffa002fb7f [nvme]
#18 [ffffc90000100de0] __handle_irq_event_percpu at ffffffff810e0d60
#19 [ffffc90000100e20] handle_irq_event_percpu at ffffffff810e0e65
#20 [ffffc90000100e48] handle_irq_event at ffffffff810e0ecb
#21 [ffffc90000100e60] handle_edge_irq at ffffffff810e494d
#22 [ffffc90000100e78] do_IRQ at ffffffff81e01900
#23 [ffffc90000100eb0] common_interrupt at ffffffff81e00a0a
#24 [ffffc90000100f38] __softirqentry_text_start at ffffffff8200006a
#25 [ffffc90000100fc8] irq_exit at ffffffff810a3f6a
#26 [ffffc90000100fd0] smp_apic_timer_interrupt at ffffffff81e020b2
bt: invalid kernel virtual address: ffffc90000101000  type: "pt_regs"
crash>

Looking at the call trace, I see this was the last function from
'bcache' in the trace (linux-5.4.69/drivers/md/bcache/closure.c):

static void closure_sync_fn(struct closure *cl)
{
	struct closure_syncer *s = cl->s;
	struct task_struct *p;

	rcu_read_lock();
	p = READ_ONCE(s->task);
	s->done = 1;
	wake_up_process(p);
	rcu_read_unlock();
}

And I believe the calls above this in my crash-backtrace output come
from this call: wake_up_process()

Is the panic perhaps because the task/process is already gone/finished?
Not sure where to start looking next. Any help would be greatly
appreciated.

--Marc
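
One possible first step is to decode the error code in the PANIC line
("Oops: 0002"). On x86 this is the hardware page-fault error code, a
small bit field. The sketch below is only an illustration: the helper
names describe_pf_error() and looks_like_null_deref() are made up for
this example and are not kernel or crash APIs, and the "below one page"
check is a common rule of thumb, not a proof.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* x86 page-fault error code bits (defined by the hardware). */
#define PF_PROT  (1UL << 0)  /* 0: page not present, 1: protection violation */
#define PF_WRITE (1UL << 1)  /* 0: read access,      1: write access */
#define PF_USER  (1UL << 2)  /* 0: kernel mode,      1: user mode */
#define PF_RSVD  (1UL << 3)  /* reserved bit set in a page-table entry */
#define PF_INSTR (1UL << 4)  /* fault was an instruction fetch */

/*
 * Hypothetical helper: write a short description of an Oops error code
 * into buf. Returns the snprintf() result (length of the description).
 */
static int describe_pf_error(unsigned long code, char *buf, size_t len)
{
	assert(buf != NULL);
	return snprintf(buf, len, "%s %s, %s mode%s%s",
			(code & PF_WRITE) ? "write to" : "read from",
			(code & PF_PROT) ? "protected page" : "not-present page",
			(code & PF_USER) ? "user" : "kernel",
			(code & PF_RSVD) ? ", reserved bit set" : "",
			(code & PF_INSTR) ? ", instruction fetch" : "");
}

/*
 * Hypothetical heuristic: an address inside the first 4 KiB page usually
 * means "NULL pointer plus a structure member offset".
 */
static int looks_like_null_deref(unsigned long addr)
{
	return addr < 4096UL;
}
```

Calling describe_pf_error(0x0002, ...) yields "write to not-present
page, kernel mode". The exception frame shows RDI = 0xa6c, and RDI
carries the first argument under the x86-64 calling convention, so
looks_like_null_deref(0xa6c) returning true would be consistent with
s->task having been NULL when wake_up_process(p) ran, with 0xa6c being
the offset of the lock inside struct task_struct. That is an inference
from the register dump, not something the dump confirms; the structure
layout for this kernel build would need to be checked in crash.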