* [Bug 218601] Regression - dd if=/dev/zero of=/zero causes shift-out-of-bounds &&  NULL pointer dereference, address: 0000000000000003
       [not found] <bug-218601-13602@https.bugzilla.kernel.org/>
@ 2024-03-15 10:04 ` bugzilla-daemon
  2024-03-15 10:07 ` bugzilla-daemon
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2024-03-15 10:04 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=218601

Artem S. Tashkinov (aros@gmx.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|NVMe                        |ext4
           Assignee|io_nvme@kernel-bugs.kernel. |fs_ext4@kernel-bugs.osdl.or
                   |org                         |g
            Product|IO/Storage                  |File System

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug 218601] Regression - dd if=/dev/zero of=/zero causes shift-out-of-bounds &&  NULL pointer dereference, address: 0000000000000003
       [not found] <bug-218601-13602@https.bugzilla.kernel.org/>
  2024-03-15 10:04 ` [Bug 218601] Regression - dd if=/dev/zero of=/zero causes shift-out-of-bounds && NULL pointer dereference, address: 0000000000000003 bugzilla-daemon
@ 2024-03-15 10:07 ` bugzilla-daemon
  2024-03-15 10:18 ` bugzilla-daemon
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2024-03-15 10:07 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=218601

--- Comment #3 from Artem S. Tashkinov (aros@gmx.com) ---
Did you actually bisect?

The commit you've provided doesn't look like it might have caused the issue:

https://github.com/torvalds/linux/commit/326e1c208f3f24d14b93f910b8ae32c94923d22c


* [Bug 218601] Regression - dd if=/dev/zero of=/zero causes shift-out-of-bounds &&  NULL pointer dereference, address: 0000000000000003
       [not found] <bug-218601-13602@https.bugzilla.kernel.org/>
  2024-03-15 10:04 ` [Bug 218601] Regression - dd if=/dev/zero of=/zero causes shift-out-of-bounds && NULL pointer dereference, address: 0000000000000003 bugzilla-daemon
  2024-03-15 10:07 ` bugzilla-daemon
@ 2024-03-15 10:18 ` bugzilla-daemon
  2024-03-15 16:55 ` bugzilla-daemon
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2024-03-15 10:18 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=218601

Artem S. Tashkinov (aros@gmx.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Bisected commit-id|326e1c208f3f24d14b93f910b8a |
                   |e32c94923d22c               |

--- Comment #4 from Artem S. Tashkinov (aros@gmx.com) ---
I'm not an expert, but your backtrace looks weird. Please run memtest86 or
memtest86+ for an hour or two.

https://www.memtest86.com/download.htm

https://github.com/memtest86plus/memtest86plus/releases


I'm also removing the bisected ID because it doesn't look like it has anything
to do with this issue.

You don't just copy random stuff from other bug reports which look similar to
yours. You actually bisect and provide the full bisect history.

Ubuntu 22.10 comes with kernel 5.19.

Ubuntu 23.10 comes with kernel 6.5.

That's a very large regression window. Also, you can only bisect on vanilla
kernels, not Ubuntu ones. First compile 5.19 and make sure it absolutely works
for you. Then try consecutive kernel releases one by one. Once you hit the
issue, you will have just two adjacent kernel releases to bisect between,
instead of trying to find the regression between two very distant kernels.

https://docs.kernel.org/admin-guide/bug-bisect.html
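
As a rough sketch of the workflow described above (an illustration, not part of
the original comment): once two adjacent vanilla releases are known to be good
and bad, `git bisect` narrows the range, and `git bisect log` produces the
history to attach to the report. The helper names `bisect_steps` and
`start_bisect` are hypothetical, and the tags v6.4 and v6.5 are only example
endpoints.

```shell
#!/bin/sh
# Mainline releases between the two Ubuntu base kernels mentioned above:
#   v5.19 v6.0 v6.1 v6.2 v6.3 v6.4 v6.5
# Build and boot each in turn until the crash first appears.

# Hypothetical helper: rough number of build-and-test rounds needed to
# bisect a range of $1 commits (ceiling of log2).
bisect_steps() {
    n=$1
    steps=0
    while [ "$n" -gt 1 ]; do
        n=$(( (n + 1) / 2 ))
        steps=$(( steps + 1 ))
    done
    echo "$steps"
}

# Hypothetical helper: start a bisect between a known-good tag ($1) and a
# known-bad tag ($2). Must be run inside a clone of the mainline tree.
start_bisect() {
    git bisect start &&
    git bisect bad "$2" &&
    git bisect good "$1"
}

# Example session (not executed here):
#   start_bisect v6.4 v6.5
#   ... build, boot, run the dd reproducer, then mark the result:
#   git bisect good    # or: git bisect bad
#   git bisect log     # full bisect history for the bug report
```

For a hypothetical range of 16000 commits, `bisect_steps 16000` prints 14, so
narrowing to adjacent releases first keeps the number of kernel builds
manageable.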

Best of luck.


* [Bug 218601] Regression - dd if=/dev/zero of=/zero causes shift-out-of-bounds &&  NULL pointer dereference, address: 0000000000000003
       [not found] <bug-218601-13602@https.bugzilla.kernel.org/>
                   ` (2 preceding siblings ...)
  2024-03-15 10:18 ` bugzilla-daemon
@ 2024-03-15 16:55 ` bugzilla-daemon
  2024-03-19  6:22 ` bugzilla-daemon
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2024-03-15 16:55 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=218601

Theodore Tso (tytso@mit.edu) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tytso@mit.edu

--- Comment #5 from Theodore Tso (tytso@mit.edu) ---
Also note that upstream Linux kernel developers do not provide free kernel
support for Ubuntu (or any other distro) kernels. If you want support for your
distro kernel, in the case of Ubuntu, you need to pay $$$ to Canonical.
Otherwise, please try to replicate the problem on the *current* upstream
kernel, and not an ancient kernel such as 5.19 or 6.5.


* [Bug 218601] Regression - dd if=/dev/zero of=/zero causes shift-out-of-bounds &&  NULL pointer dereference, address: 0000000000000003
       [not found] <bug-218601-13602@https.bugzilla.kernel.org/>
                   ` (3 preceding siblings ...)
  2024-03-15 16:55 ` bugzilla-daemon
@ 2024-03-19  6:22 ` bugzilla-daemon
  2024-03-19 15:08 ` bugzilla-daemon
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2024-03-19  6:22 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=218601

--- Comment #6 from Colin (colin.kernel@i-pentest.info) ---
Firstly, apologies for any incorrectness. The commit reference link did not
accept inputs such as 'probably this sha' or similar. It was not meant in
malice or laziness, but admittedly I was very tired as a result of the
unpredictability of this issue.

I suspect this may actually be dead hardware, but it's hard to tell. If somebody
is interested in exploring this issue further, feel free to provide instructions
over the next few days; otherwise I'll buy a new motherboard, and RMA the
processor if the motherboard does not fix the issue. I value both my and all of
your time, so buying new equipment in the hopes that I'm rid of the problem is
my preference unless there's some burning desire for further exploration.

Here are some facts:

- `stress-ng --class cpu --seq 32` reliably crashes the machine in less than 60
seconds with the error message 'stress-ng: fail:  [2701] af-alg: ctr(twofish):
decrypted data different from original data (possible kernel bug)'. The same
failure appears with other algorithms (pcbc(fcrypt), cbc(sm4), etc.). Note that
`dd if=/dev/zero` is not cryptographic. I routinely checksum files with sha and
do not notice any inconsistencies.
- I have tried vanilla kernels, including 6.0.1 and 6.8. `dd` _seems_ to fail
faster with more recent kernels, but maybe that's just due to a small test
sample size.
- I have been unable to crash Windows using Prime95 (24h), FurMark (24h), or dd,
nor am I aware of any errors from any of these tools. The Windows install
routinely hit BSODs, but those went away as soon as I switched to a different
USB stick (both sticks freshly flashed). It's possible this is related, but it
could also just be a bad USB stick.
- The motherboard is an ASUS Prime Z790-P WiFi D4 LGA 1700, the processor a
13900K. I have been experiencing the issue for about 12 months, and it seems to
have become more frequent lately.
- I cannot visually see any problems with the motherboard; no bloated
capacitors as far as I can tell.
- I have replaced the RAM, PSU, and SSDs (with a non-Samsung model) and removed
all aux cards, with the exception of the onboard WiFi 6, which cannot be removed.
- Memtest86+ was run with my original 128 GB RAM configuration on 2x24h
occasions and did not yield any errors, suggesting that CPU<->memory integrity
is not the issue.
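
One way to check the `dd` symptom separately from the cryptographic failures is
to test whether zeros written to a file read back intact. The following is a
minimal sketch, assuming GNU coreutils (`dd`, `head`, `sha256sum`); the function
name `verify_zero_write` is hypothetical. A read that is served from the page
cache does not exercise the SSD itself, so a pass here does not rule out
storage problems.

```shell
#!/bin/sh
# Hypothetical helper: write $2 MiB of zeros to file $1, then compare the
# SHA-256 of the file with the SHA-256 of an equally long stream of zeros.
# Returns 0 if they match, 1 if they differ, 2 if the write itself failed.
verify_zero_write() {
    target=$1
    mib=$2
    # conv=fsync asks dd to flush the data to the device before exiting.
    dd if=/dev/zero of="$target" bs=1M count="$mib" conv=fsync 2>/dev/null || return 2
    expected=$(head -c "$(( mib * 1048576 ))" /dev/zero | sha256sum | cut -d' ' -f1)
    actual=$(sha256sum < "$target" | cut -d' ' -f1)
    [ "$expected" = "$actual" ]
}
```

For example, `verify_zero_write /tmp/zero.bin 1024 && echo intact` writes 1 GiB
and reports whether it read back unchanged; the path is only an example.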


* [Bug 218601] Regression - dd if=/dev/zero of=/zero causes shift-out-of-bounds &&  NULL pointer dereference, address: 0000000000000003
       [not found] <bug-218601-13602@https.bugzilla.kernel.org/>
                   ` (4 preceding siblings ...)
  2024-03-19  6:22 ` bugzilla-daemon
@ 2024-03-19 15:08 ` bugzilla-daemon
  2024-03-21  5:20 ` bugzilla-daemon
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2024-03-19 15:08 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=218601

--- Comment #7 from Keith Busch (kbusch@kernel.org) ---
(In reply to Colin from comment #6)
> I suspect this may actually be dead hardware but it's hard to tell.

I can't necessarily rule that out; however, based on the stack trace you
attached, this looks like a software bug. If you want to go further, I
recommend following Ted's suggestion and attempting to reproduce with a recent
upstream kernel. The most recent stable version as I write this is 6.8.1.


* [Bug 218601] Regression - dd if=/dev/zero of=/zero causes shift-out-of-bounds &&  NULL pointer dereference, address: 0000000000000003
       [not found] <bug-218601-13602@https.bugzilla.kernel.org/>
                   ` (5 preceding siblings ...)
  2024-03-19 15:08 ` bugzilla-daemon
@ 2024-03-21  5:20 ` bugzilla-daemon
  2024-03-21  5:20 ` bugzilla-daemon
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2024-03-21  5:20 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=218601

--- Comment #8 from Colin (colin.kernel@i-pentest.info) ---
Attached is the dmesg output of 6.8.1 built manually from vanilla sources under
Ubuntu 23.10 when running `stress-ng --class cpu --seq 32`. 

A few things I find odd: 

- If this were a bug, surely the hardware is not esoteric enough that nobody
else would have experienced this?
- Again, it feels like this has 'gotten worse' since I first experienced it,
and older kernels seem more stable.
- Building the Linux kernel was segfaulting at different locations, so I ended
up building it on another machine.


* [Bug 218601] Regression - dd if=/dev/zero of=/zero causes shift-out-of-bounds &&  NULL pointer dereference, address: 0000000000000003
       [not found] <bug-218601-13602@https.bugzilla.kernel.org/>
                   ` (6 preceding siblings ...)
  2024-03-21  5:20 ` bugzilla-daemon
@ 2024-03-21  5:20 ` bugzilla-daemon
  2024-03-21 15:07 ` bugzilla-daemon
  2024-03-21 16:20 ` bugzilla-daemon
  9 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2024-03-21  5:20 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=218601

--- Comment #9 from Colin (colin.kernel@i-pentest.info) ---
Created attachment 306016
  --> https://bugzilla.kernel.org/attachment.cgi?id=306016&action=edit
kernel 6.8.1 dmesg of stress-ng


* [Bug 218601] Regression - dd if=/dev/zero of=/zero causes shift-out-of-bounds &&  NULL pointer dereference, address: 0000000000000003
       [not found] <bug-218601-13602@https.bugzilla.kernel.org/>
                   ` (7 preceding siblings ...)
  2024-03-21  5:20 ` bugzilla-daemon
@ 2024-03-21 15:07 ` bugzilla-daemon
  2024-03-21 16:20 ` bugzilla-daemon
  9 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2024-03-21 15:07 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=218601

Christian Kujau (kernel@nerdbynature.de) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kernel@nerdbynature.de

--- Comment #10 from Christian Kujau (kernel@nerdbynature.de) ---
> Building the linux kernel was segfaulting at different locations, 
> I ended up building it on another machine.

Not good. Maybe swap/remove some RAM modules. See also:
https://bitwizard.nl/sig11/
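
The reasoning behind that link is that deterministic work which fails at
different points on each run points toward hardware rather than software. A
small sketch of the same idea, assuming GNU coreutils (the function name
`distinct_results` is hypothetical): repeat an identical computation and count
how many different answers come back.

```shell
#!/bin/sh
# Hypothetical helper: hash the same $2 MiB of zeros $1 times and print the
# number of distinct digests seen. Identical input should always give the
# same digest, so any value other than 1 means a run produced a wrong result.
distinct_results() {
    runs=$1
    mib=$2
    i=0
    while [ "$i" -lt "$runs" ]; do
        head -c "$(( mib * 1048576 ))" /dev/zero | sha256sum | cut -d' ' -f1
        i=$(( i + 1 ))
    done | sort -u | wc -l | tr -d ' '
}
```

On a healthy machine, a call such as `distinct_results 20 256` should print 1.
This is a much lighter load than a kernel build, so a clean result here does
not prove the hardware is fine.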


* [Bug 218601] Regression - dd if=/dev/zero of=/zero causes shift-out-of-bounds &&  NULL pointer dereference, address: 0000000000000003
       [not found] <bug-218601-13602@https.bugzilla.kernel.org/>
                   ` (8 preceding siblings ...)
  2024-03-21 15:07 ` bugzilla-daemon
@ 2024-03-21 16:20 ` bugzilla-daemon
  9 siblings, 0 replies; 10+ messages in thread
From: bugzilla-daemon @ 2024-03-21 16:20 UTC (permalink / raw)
  To: linux-ext4

https://bugzilla.kernel.org/show_bug.cgi?id=218601

--- Comment #11 from Keith Busch (kbusch@kernel.org) ---
(In reply to Colin from comment #9)
> Created attachment 306016 [details]
> kernel 6.8.1 dmesg of stress-ng

Doesn't look like the same failure as the first attachment. Maybe your hardware
really is broken.
