* [Bug 206443] general protection fault in ext4 during simultaneous online resize and write operations
2020-02-06 19:16 [Bug 206443] New: general protection fault in ext4 during simultaneous online resize and write operations bugzilla-daemon
@ 2020-02-06 19:17 ` bugzilla-daemon
2020-02-06 19:19 ` bugzilla-daemon
` (12 subsequent siblings)
13 siblings, 0 replies; 15+ messages in thread
From: bugzilla-daemon @ 2020-02-06 19:17 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=206443
--- Comment #1 from Suraj (surajjs@amazon.com) ---
Created attachment 287191
--> https://bugzilla.kernel.org/attachment.cgi?id=287191&action=edit
testcase
--
You are receiving this mail because:
You are watching the assignee of the bug.
* [Bug 206443] general protection fault in ext4 during simultaneous online resize and write operations
From: bugzilla-daemon @ 2020-02-06 19:19 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=206443
--- Comment #2 from Suraj (surajjs@amazon.com) ---
Created attachment 287193
--> https://bugzilla.kernel.org/attachment.cgi?id=287193&action=edit
ext4_mb_load_buddy_gfp.trace
Call trace in ext4_mb_load_buddy_gfp
* [Bug 206443] general protection fault in ext4 during simultaneous online resize and write operations
From: bugzilla-daemon @ 2020-02-06 19:19 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=206443
--- Comment #3 from Suraj (surajjs@amazon.com) ---
Created attachment 287195
--> https://bugzilla.kernel.org/attachment.cgi?id=287195&action=edit
ext4_free_blocks.trace
Call trace in ext4_free_blocks
* [Bug 206443] general protection fault in ext4 during simultaneous online resize and write operations
From: bugzilla-daemon @ 2020-02-06 19:20 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=206443
--- Comment #4 from Suraj (surajjs@amazon.com) ---
Created attachment 287197
--> https://bugzilla.kernel.org/attachment.cgi?id=287197&action=edit
ext4_free_inode.trace
Call trace in ext4_free_inode
* [Bug 206443] general protection fault in ext4 during simultaneous online resize and write operations
From: bugzilla-daemon @ 2020-02-06 19:21 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=206443
--- Comment #5 from Suraj (surajjs@amazon.com) ---
Created attachment 287199
--> https://bugzilla.kernel.org/attachment.cgi?id=287199&action=edit
__ext4_new_inode.trace
Call trace in __ext4_new_inode
* [Bug 206443] general protection fault in ext4 during simultaneous online resize and write operations
From: bugzilla-daemon @ 2020-02-06 19:25 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=206443
--- Comment #6 from Suraj (surajjs@amazon.com) ---
The initial bug was hit reliably (~95% of the time) within 30 minutes.
The attached traces occurred on only ~50% of runs, sometimes taking up to
5 hours to hit.
* [Bug 206443] general protection fault in ext4 during simultaneous online resize and write operations
From: bugzilla-daemon @ 2020-02-07 6:05 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=206443
--- Comment #7 from Suraj (surajjs@amazon.com) ---
The other crashes look to be related to accesses to sbi->s_flex_groups and
sbi->s_group_info, which are reallocated on resize in the same way that
s_group_desc is.
* [Bug 206443] general protection fault in ext4 during simultaneous online resize and write operations
From: bugzilla-daemon @ 2020-02-07 23:22 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=206443
--- Comment #8 from Suraj (surajjs@amazon.com) ---
I've attached 2 additional patches which seem to resolve the s_group_info and
s_flex_groups issues in the same way as the s_group_desc reallocation was
resolved, that is, by using RCU to protect access to the pointer.
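To illustrate the idea, here is a minimal userspace sketch of the pattern, not
the attached patches themselves. The struct layouts, get_group_desc() and
grow_group_desc() are hypothetical, the rcu_read_lock()/rcu_read_unlock()
macros are empty stand-ins for the kernel primitives, and C11 acquire/release
atomics only approximate rcu_dereference() and rcu_assign_pointer().

```c
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Empty stand-ins for the kernel primitives (assumption for this sketch). */
#define rcu_read_lock()   ((void)0)
#define rcu_read_unlock() ((void)0)

struct group_desc { unsigned int free_blocks; };   /* hypothetical payload */

struct sb_info {                                   /* hypothetical, loosely after ext4_sb_info */
	_Atomic(struct group_desc **) s_group_desc; /* array pointer, replaced on resize */
	unsigned int ngroups;
};

/*
 * Reader: load the array pointer exactly once inside the read-side section,
 * so a concurrent resize cannot swap the array between the load and the index.
 * Only the array is replaced on resize; the elements it points to stay valid.
 */
static struct group_desc *get_group_desc(struct sb_info *sbi, unsigned int group)
{
	struct group_desc **arr;
	struct group_desc *gd;

	rcu_read_lock();
	arr = atomic_load_explicit(&sbi->s_group_desc, memory_order_acquire);
	gd = arr[group];
	rcu_read_unlock();
	return gd;
}

/*
 * Resizer: build a larger array, copy the old entries, then publish it.
 * new_ngroups is assumed to be >= the current count. The old array is handed
 * back through old_out and must only be freed once all readers that might
 * still hold it have finished (a grace period, in kernel terms).
 */
static int grow_group_desc(struct sb_info *sbi, unsigned int new_ngroups,
			   struct group_desc ***old_out)
{
	struct group_desc **old =
		atomic_load_explicit(&sbi->s_group_desc, memory_order_relaxed);
	struct group_desc **new_arr = calloc(new_ngroups, sizeof(*new_arr));

	if (!new_arr)
		return -1;
	if (old)
		memcpy(new_arr, old, sbi->ngroups * sizeof(*new_arr));
	atomic_store_explicit(&sbi->s_group_desc, new_arr, memory_order_release);
	sbi->ngroups = new_ngroups;
	*old_out = old;
	return 0;
}
```

The crash described in this bug corresponds to a reader indexing an array that
the resizer has already freed; deferring the free of the old array until
readers are done is what closes that window.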
* [Bug 206443] general protection fault in ext4 during simultaneous online resize and write operations
From: bugzilla-daemon @ 2020-02-07 23:22 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=206443
--- Comment #9 from Suraj (surajjs@amazon.com) ---
Created attachment 287249
--> https://bugzilla.kernel.org/attachment.cgi?id=287249&action=edit
s_group_info.patch
* [Bug 206443] general protection fault in ext4 during simultaneous online resize and write operations
From: bugzilla-daemon @ 2020-02-07 23:23 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=206443
--- Comment #10 from Suraj (surajjs@amazon.com) ---
Created attachment 287251
--> https://bugzilla.kernel.org/attachment.cgi?id=287251&action=edit
s_flex_group.patch
* [Bug 206443] general protection fault in ext4 during simultaneous online resize and write operations
From: bugzilla-daemon @ 2020-02-15 20:43 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=206443
Theodore Tso (tytso@mit.edu) changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tytso@mit.edu
--- Comment #11 from Theodore Tso (tytso@mit.edu) ---
Hi Suraj,
The two patches s_group_info.patch and s_flex_group.patch appear to use the
helper function rcu_sbi_array_dereference() without defining it?
* [Bug 206443] general protection fault in ext4 during simultaneous online resize and write operations
From: bugzilla-daemon @ 2020-02-16 1:46 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=206443
--- Comment #12 from Theodore Tso (tytso@mit.edu) ---
I've posted a proposed improvement[1] to the first proposed patch[2] on LKML.
[1] https://lore.kernel.org/r/20200215233817.GA670792@mit.edu
[2] https://bugzilla.kernel.org/attachment.cgi?id=287189
Suraj, please note that your patches are whitespace damaged, and are lacking
the Developer's Certificate of Origin. In the future, it would save me a
lot of time if you took a look at the Submitting Patches[3] instructions from
the kernel documentation.
[3] https://www.kernel.org/doc/html/latest/process/submitting-patches.html
You can either send e-mail to linux-ext4@vger.kernel.org or attach patches to a
Bugzilla entry, although the former is certainly preferred. It's better to
send a proposal to linux-ext4@vger.kernel.org, since that way the patch can
also get tracked via patchwork[4], and on lore.kernel.org, as in [1] above.
[4] http://patchwork.ozlabs.org/project/linux-ext4/list/
* [Bug 206443] general protection fault in ext4 during simultaneous online resize and write operations
From: bugzilla-daemon @ 2020-02-19 19:30 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=206443
--- Comment #13 from Suraj (surajjs@amazon.com) ---
Your approach using call_rcu is very similar to something I was also exploring
so that looks good to me.
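For context, the deferred-free pattern that call_rcu enables can be sketched in
userspace as below. This is an illustration under assumptions, not the patch
posted to the list: struct old_array, free_array_rcu() and run_grace_period()
are hypothetical names, and call_rcu() here is a toy stand-in that queues the
callback until the simulated grace period runs.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for the kernel's struct rcu_head and call_rcu(). */
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};

static struct rcu_head *pending;   /* callbacks waiting for a grace period */
static int freed_count;            /* bookkeeping for this sketch only */

static void call_rcu(struct rcu_head *head, void (*func)(struct rcu_head *head))
{
	head->func = func;
	head->next = pending;
	pending = head;
}

/* Hypothetical: models "all pre-existing readers have finished". */
static void run_grace_period(void)
{
	while (pending) {
		struct rcu_head *head = pending;

		pending = head->next;
		head->func(head);
	}
}

/* Hypothetical wrapper that carries the old array until it is safe to free. */
struct old_array {
	struct rcu_head rcu;
	void *ptr;
};

static void free_old_array(struct rcu_head *head)
{
	/* Recover the wrapper from its embedded rcu_head (container_of by hand). */
	struct old_array *p =
		(struct old_array *)((char *)head - offsetof(struct old_array, rcu));

	free(p->ptr);
	free(p);
	freed_count++;
}

/* Queue the old array for freeing instead of freeing it immediately. */
static int free_array_rcu(void *to_free)
{
	struct old_array *p = malloc(sizeof(*p));

	if (!p)
		return -1;   /* a real caller needs a fallback path here */
	p->ptr = to_free;
	call_rcu(&p->rcu, free_old_array);
	return 0;
}
```

The point of queueing rather than waiting is that the resize path does not
block for a grace period; the old array simply outlives any reader that may
still be indexing it.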
Apologies, I didn't realise that patches attached to the BZ had to be mailing
list ready.
The same issue is also present in the resizing of the s_group_info and
s_flex_groups arrays; it is addressed in the following patches, which I have
posted to the ext4 mailing list:
[1/3] ext4: introduce macro sbi_array_rcu_deref() to access rcu protected fields [1]
[2/3] ext4: fix potential race between s_group_info online resizing and access [2]
[3/3] ext4: fix potential race between s_flex_groups online resizing and access [3]
[1] http://patchwork.ozlabs.org/patch/1240507/
[2] http://patchwork.ozlabs.org/patch/1240506/
[3] http://patchwork.ozlabs.org/patch/1240509/
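
For readers without the patches at hand, the general shape of an accessor macro
like sbi_array_rcu_deref() can be sketched as follows. This is an illustration
under assumptions, not a copy of patch [1/3]: the rcu_* macros are userspace
stand-ins, struct demo_sb_info and struct flex_groups are hypothetical, and the
macro relies on the GNU C statement-expression and __typeof__ extensions.

```c
/* Userspace stand-ins for the kernel primitives (assumptions for this sketch). */
static int rcu_depth;                      /* tracks read-side nesting */
#define rcu_read_lock()    (rcu_depth++)
#define rcu_read_unlock()  (rcu_depth--)
#define rcu_dereference(p) (p)             /* the kernel version also orders the load */

/*
 * Fetch element `index` of the array that (sbi)->field points to, loading
 * the array pointer inside a read-side critical section. The element value
 * is copied out, so it stays usable after the section ends.
 */
#define sbi_array_rcu_deref(sbi, field, index)              \
({                                                          \
	__typeof__(*((sbi)->field)) _v;                     \
	rcu_read_lock();                                    \
	_v = rcu_dereference((sbi)->field)[index];          \
	rcu_read_unlock();                                  \
	_v;                                                 \
})

struct flex_groups { int free_clusters; };          /* hypothetical payload */

struct demo_sb_info {                               /* hypothetical superblock info */
	struct flex_groups **s_flex_groups;         /* array pointer, replaced on resize */
};

/* Example caller: read one counter through the macro. */
static int flex_free_clusters(struct demo_sb_info *sbi, unsigned int flex_group)
{
	struct flex_groups *fg = sbi_array_rcu_deref(sbi, s_flex_groups, flex_group);

	return fg->free_clusters;
}
```

Funnelling every access through one macro keeps callers from caching the array
pointer across a resize, which is the mistake behind the traces in this bug.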
* [Bug 206443] general protection fault in ext4 during simultaneous online resize and write operations
From: bugzilla-daemon @ 2020-02-20 4:26 UTC (permalink / raw)
To: linux-ext4
https://bugzilla.kernel.org/show_bug.cgi?id=206443
--- Comment #14 from Theodore Tso (tytso@mit.edu) ---
Patches to BZ don't have to be perfect, or mailing list ready. But it would be
nice if they actually applied (e.g., not be white-space damaged) and if they
actually compiled (not be missing macro definitions). :-)
In my experience, bugzilla is good for collecting data when we are trying to
root-cause a problem. But it's a lot more work to look at a patch in BZ, since
we have to download it first. Whereas if it is sent to the mailing list,
it's a lot easier to review it and to send back comments.
For that matter, it's fine to send patches to the mailing list that aren't
ready to be applied. Using a [PATCH RFC] subject prefix is a good way to make
that clear; Linus Torvalds has been known to post patches with "Warning! I
haven't even tried to compile it yet"; this is just to show the approach I'm
thinking of. What's important is to make sure expectations are set for why
the patch is being sent to the list or being uploaded to BZ.
Thanks for your work on this bug!