Kernel Newbies archive on lore.kernel.org
* general protection fault vs Oops
@ 2020-05-16 12:35 Subhashini Rao Beerisetty
  2020-05-16 13:53 ` Valdis Klētnieks
  0 siblings, 1 reply; 7+ messages in thread
From: Subhashini Rao Beerisetty @ 2020-05-16 12:35 UTC (permalink / raw)
  To: kernelnewbies, linux-kernel

Hi all,

On my Linux box, the kernel crashes on a known test case.

On the first attempt at running the test case I hit a "general
protection fault: 0000 [#1] SMP". I then rebooted and ran the same
test again, but this time it produced an "Oops: 0002 [#1] SMP".
In both cases the call trace looks exactly the same, and RIP points to
"native_queued_spin_lock_slowpath+0xfe/0x170".

Can someone clarify the difference between "general protection
fault: 0000 [#1] SMP" and "Oops: 0002 [#1] SMP"? In which scenario
does the kernel report an "Oops" rather than a "general protection
fault"?
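For reference, here is how I understand the hex values in those
banners: the value after "Oops:" appears to be the x86 page-fault
error code, while the "0000" after "general protection fault" is the
(usually zero) segment-selector error field. A minimal decoder sketch,
assuming the standard x86 page-fault bit layout:

```python
# Sketch: decode the hex value printed after "Oops:" on x86,
# assuming the page-fault error code bit layout (bit 0: protection
# violation vs. not-present page, bit 1: write vs. read, bit 2: user
# vs. kernel mode, bit 3: reserved bit set, bit 4: instruction fetch).
PF_BITS = [
    (1 << 0, "protection violation", "page not present"),
    (1 << 1, "write access", "read access"),
    (1 << 2, "user mode", "kernel mode"),
    (1 << 3, "reserved bit set", None),
    (1 << 4, "instruction fetch", None),
]

def decode_oops_code(code):
    """Return the human-readable meaning of each error-code bit."""
    out = []
    for bit, on, off in PF_BITS:
        if code & bit:
            out.append(on)
        elif off:
            out.append(off)
    return out

# "Oops: 0002" would then mean: kernel-mode write to a not-present page
print(decode_oops_code(0x0002))
```

If that reading is right, 0x0002 here means a kernel-mode write to a
page that is not present, which would fit the faulting store in the
Code: dump below.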


May 16 12:06:17 test-pc kernel: [96934.528114] general protection
fault: 0000 [#1] SMP
May 16 12:06:17 test-pc kernel: [96934.528990] Modules linked in:
dbg(OE) mcore(OE) osa(OE) cfg80211 ppdev intel_rapl intel_soc_dts_iosf
intel_powerclamp coretemp kvm irqbypass punit_atom_debug cdc_acm
mei_txe  mei lpc_ich shpchp parport_pc mac_hid parport tpm_infineon
8250_fintek ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr
iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs
raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor
async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel i915
aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd igb
drm_kms_helper dca syscopyarea ptp sysfillrect sysimgblt pps_core
fb_sys_fops i2c_algo_bit drm ahci libahci fjes video [last unloaded:
osa]
May 16 12:06:17 test-pc kernel: [96934.537343] CPU: 2 PID: 5457 Comm:
multi/complete Tainted: G           OE   4.4.0-66-generic #87-Ubuntu
May 16 12:06:17 test-pc kernel: [96934.538715] Hardware name:
Supermicro Super Server/X10SDV-TLN4F, BIOS 1.0b 09/09/2015
May 16 12:06:17 test-pc kernel: [96934.540146] task: ffff880139d02640
ti: ffff880034fe0000 task.ti: ffff880034fe0000
May 16 12:06:17 test-pc kernel: [96934.541632] RIP:
0010:[<ffffffff810cafee>]  [<ffffffff810cafee>]
native_queued_spin_lock_slowpath+0xfe/0x170
May 16 12:06:17 test-pc kernel: [96934.543224] RSP:
0018:ffff880034fe3e08  EFLAGS: 00010002
May 16 12:06:17 test-pc kernel: [96934.544837] RAX: 0000000000002be1
RBX: 0000000000000082 RCX: 0004b8107b8cc138
May 16 12:06:17 test-pc kernel: [96934.546511] RDX: ffff88013fd17900
RSI: 00000000000c0000 RDI: ffff8800b35fbb88
May 16 12:06:17 test-pc kernel: [96934.548213] RBP: ffff880034fe3e08
R08: 0000000000000101 R09: 0000000180200019
May 16 12:06:17 test-pc kernel: [96934.549949] R10: ffff8800340c9680
R11: 0000000000000001 R12: ffff8800b35fbb88
May 16 12:06:17 test-pc kernel: [96934.551713] R13: 0000000000000246
R14: 0000000000000001 R15: ffff8800340c9d00
May 16 12:06:17 test-pc kernel: [96934.553501] FS:
0000000000000000(0000) GS:ffff88013fd00000(0000)
knlGS:0000000000000000
May 16 12:06:17 test-pc kernel: [96934.555346] CS:  0010 DS: 0000 ES:
0000 CR0: 000000008005003b
May 16 12:06:17 test-pc kernel: [96934.557216] CR2: 00007ffe8614d938
CR3: 00000000b35ef000 CR4: 00000000001006e0
May 16 12:06:17 test-pc kernel: [96934.559146] Stack:
May 16 12:06:17 test-pc kernel: [96934.561081]  ffff880034fe3e20
ffffffff8183c427 ffff8800b35fbb70 ffff880034fe3e50
May 16 12:06:17 test-pc kernel: [96934.563133]  ffffffffc0606812
ffff880035581cc0 ffff8800b1eb5ec8 0000000000000246
May 16 12:06:17 test-pc kernel: [96934.565220]  ffff8800b1eb5ef0
ffff880034fe3e60 ffffffffc06aa2a4 ffff880034fe3ea
May 16 12:06:17 test-pc kernel: [96934.567347] Call Trace:
May 16 12:06:17 test-pc kernel: [96934.569475]  [<ffffffff8183c427>]
_raw_spin_lock_irqsave+0x37/0x40
May 16 12:06:17 test-pc kernel: [96934.571686]  [<ffffffffc0606812>]
event_raise+0x22/0x60 [osa]
May 16 12:06:17 test-pc kernel: [96934.573935]  [<ffffffffc06aa2a4>]
multi_q_completed_one_buffer+0x34/0x40 [mcore]
May 16 12:06:17 test-pc kernel: [96934.576236]  [<ffffffffc06a919e>]
complete_thread+0x7e/0x110 [mcore]
May 16 12:06:17 test-pc kernel: [96934.578567]  [<ffffffffc0606a10>] ?
thread_should_stop+0x10/0x10 [osa]
May 16 12:06:17 test-pc kernel: [96934.580934]  [<ffffffffc0606a26>]
thread_func+0x16/0x50 [osa]
May 16 12:06:17 test-pc kernel: [96934.583326]  [<ffffffffc0606a10>] ?
thread_should_stop+0x10/0x10 [osa]
May 16 12:06:17 test-pc kernel: [96934.585762]  [<ffffffff810a0ba8>]
kthread+0xd8/0xf0
May 16 12:06:17 test-pc kernel: [96934.588219]  [<ffffffff810a0ad0>] ?
kthread_create_on_node+0x1e0/0x1e0
May 16 12:06:17 test-pc kernel: [96934.590724]  [<ffffffff8183c98f>]
ret_from_fork+0x3f/0x70
May 16 12:06:17 test-pc kernel: [96934.593251]  [<ffffffff810a0ad0>] ?
kthread_create_on_node+0x1e0/0x1e0
May 16 12:06:17 test-pc kernel: [96934.595822] Code: 87 47 02 c1 e0 10
85 c0 74 38 48 89 c1 c1 e8 12 48 c1 e9 0c 83 e8 01 83 e1 30 48 98 48
81 c1 00 79 01 00 48 03 0c c5 40 75 f3 81 <48> 89 11 8b 42 08 85 c0 75
0d f3 90 8b 42 08 85 c0 74 f7 eb 02
May 16 12:06:17 test-pc kernel: [96934.601479] RIP
[<ffffffff810cafee>] native_queued_spin_lock_slowpath+0xfe/0x170
May 16 12:06:17 test-pc kernel: [96934.604306]  RSP <ffff880034fe3e08>
May 16 12:06:17 test-pc kernel: [96934.617229] ---[ end trace
0b60bd63d72bdffa ]---





May 16 12:59:22 test-pc kernel: [ 3011.360710] BUG: unable to handle
kernel paging request at 0000000000017900
May 16 12:59:22 test-pc kernel: [ 3011.361623] IP:
[<ffffffff810cafee>] native_queued_spin_lock_slowpath+0xfe/0x170
May 16 12:59:22 test-pc kernel: [ 3011.362547] PGD
May 16 12:59:22 test-pc kernel: [ 3011.363419] Oops: 0002 [#1] SMP
May 16 12:59:22 test-pc kernel: [ 3011.364298] Modules linked in:
dbg(OE) mcore(OE) osa(OE) cfg80211 ppdev intel_rapl intel_soc_dts_iosf
intel_powerclamp coretemp kvm irqbypass punit_atom_debug cdc_acm
mei_txe  mei lpc_ich shpchp parport_pc mac_hid parport tpm_infineon
8250_fintek ib_iser rdma_cm iw_cm ib_cm ib_sa ib_mad ib_core ib_addr
iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs
raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor
async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel i915
aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd igb
drm_kms_helper dca syscopyarea ptp sysfillrect sysimgblt pps_core
fb_sys_fops i2c_algo_bit drm ahci libahci fjes video [last unloaded:
osa]
May 16 12:59:22 test-pc kernel: [ 3011.373097] CPU: 1 PID: 4103 Comm:
multi/complete Tainted: G           OE   4.4.0-66-generic #87-Ubuntu
May 16 12:59:22 test-pc kernel: [ 3011.374622] Hardware name:
Supermicro Super Server/X10SDV-TLN4F, BIOS 1.0b 09/09/2015
May 16 12:59:22 test-pc kernel: [ 3011.376203] task: ffff8800b3b972c0
ti: ffff8800b1cb0000 task.ti: ffff8800b1cb0000
May 16 12:59:22 test-pc kernel: [ 3011.377825] RIP:
0010:[<ffffffff810cafee>]  [<ffffffff810cafee>]
native_queued_spin_lock_slowpath+0xfe/0x170
May 16 12:59:22 test-pc kernel: [ 3011.379575] RSP:
0018:ffff8800b1cb3e08  EFLAGS: 00010006
May 16 12:59:22 test-pc kernel: [ 3011.381324] RAX: 0000000000000e8c
RBX: 0000000000000082 RCX: 0000000000017900
May 16 12:59:22 test-pc kernel: [ 3011.383141] RDX: ffff88013fc97900
RSI: 0000000000080000 RDI: ffff88013a347b88
May 16 12:59:22 test-pc kernel: [ 3011.384997] RBP: ffff8800b1cb3e08
R08: 0000000000000101 R09: 0000000180200003
May 16 12:59:22 test-pc kernel: [ 3011.386878] R10: ffff880034fb4580
R11: 0000000000000001 R12: ffff88013a347b88
May 16 12:59:22 test-pc kernel: [ 3011.388792] R13: 0000000000000246
R14: 0000000000000001 R15: ffff880034fb4100
May 16 12:59:22 test-pc kernel: [ 3011.390726] FS:
0000000000000000(0000) GS:ffff88013fc80000(0000)
knlGS:0000000000000000
May 16 12:59:22 test-pc kernel: [ 3011.392722] CS:  0010 DS: 0000 ES:
0000 CR0: 000000008005003b
May 16 12:59:22 test-pc kernel: [ 3011.394738] CR2: 0000000000017900
CR3: 0000000001e0a000 CR4: 00000000001006e0
May 16 12:59:22 test-pc kernel: [ 3011.396819] Stack:
May 16 12:59:22 test-pc kernel: [ 3011.398896]  ffff8800b1cb3e20
ffffffff8183c427 ffff88013a347b70 ffff8800b1cb3e50
May 16 12:59:22 test-pc kernel: [ 3011.401093]  ffffffffc0604812
ffff8800b49f46c0 ffff880034a238c8 0000000000000246
May 16 12:59:22 test-pc kernel: [ 3011.403327]  ffff880034a238f0
ffff8800b1cb3e60 ffffffffc06b72a4 ffff8800b1cb3ea8
May 16 12:59:22 test-pc kernel: [ 3011.405602] Call Trace:
May 16 12:59:22 test-pc kernel: [ 3011.407892]  [<ffffffff8183c427>]
_raw_spin_lock_irqsave+0x37/0x40
May 16 12:59:22 test-pc kernel: [ 3011.410256]  [<ffffffffc0604812>]
event_raise+0x22/0x60 [osa]
May 16 12:59:22 test-pc kernel: [ 3011.412652]  [<ffffffffc06b72a4>]
multi_q_completed_one_buffer+0x34/0x40 [mcore]
May 16 12:59:22 test-pc kernel: [ 3011.415103]  [<ffffffffc06b619e>]
complete_thread+0x7e/0x110 [mcore]
May 16 12:59:22 test-pc kernel: [ 3011.417584]  [<ffffffffc0604a10>] ?
thread_should_stop+0x10/0x10 [osa]
May 16 12:59:22 test-pc kernel: [ 3011.420113]  [<ffffffffc0604a26>]
thread_func+0x16/0x50 [osa]
May 16 12:59:22 test-pc kernel: [ 3011.422654]  [<ffffffffc0604a10>] ?
thread_should_stop+0x10/0x10 [osa]
May 16 12:59:22 test-pc kernel: [ 3011.425241]  [<ffffffff810a0ba8>]
kthread+0xd8/0xf0
May 16 12:59:22 test-pc kernel: [ 3011.427857]  [<ffffffff810a0ad0>] ?
kthread_create_on_node+0x1e0/0x1e0
May 16 12:59:22 test-pc kernel: [ 3011.430511]  [<ffffffff8183c98f>]
ret_from_fork+0x3f/0x70
May 16 12:59:22 test-pc kernel: [ 3011.433185]  [<ffffffff810a0ad0>] ?
kthread_create_on_node+0x1e0/0x1e0
May 16 12:59:22 test-pc kernel: [ 3011.435915] Code: 87 47 02 c1 e0 10
85 c0 74 38 48 89 c1 c1 e8 12 48 c1 e9 0c 83 e8 01 83 e1 30 48 98 48
81 c1 00 79 01 00 48 03 0c c5 40 75 f3 81 <48>
89 11 8b 42 08 85 c0 75 0d f3 90 8b 42 08 85 c0 74 f7 eb 02
May 16 12:59:22 test-pc kernel: [ 3011.441870] RIP
[<ffffffff810cafee>] native_queued_spin_lock_slowpath+0xfe/0x170
May 16 12:59:22 test-pc kernel: [ 3011.444858]  RSP <ffff8800b1cb3e08>
May 16 12:59:22 test-pc kernel: [ 3011.447817] CR2: 0000000000017900
May 16 12:59:22 test-pc kernel: [ 3011.460906] ---[ end trace
0337c6fc94b1cb3d ]---

Thanks

_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: general protection fault vs Oops
  2020-05-16 12:35 general protection fault vs Oops Subhashini Rao Beerisetty
@ 2020-05-16 13:53 ` Valdis Klētnieks
  2020-05-16 15:10   ` Subhashini Rao Beerisetty
  2020-05-16 15:59   ` Randy Dunlap
  0 siblings, 2 replies; 7+ messages in thread
From: Valdis Klētnieks @ 2020-05-16 13:53 UTC (permalink / raw)
  To: Subhashini Rao Beerisetty; +Cc: linux-kernel, kernelnewbies


On Sat, 16 May 2020 18:05:07 +0530, Subhashini Rao Beerisetty said:

> In the first attempt when I run that test case I landed into “general
> protection fault: 0000 [#1] SMP" .. Next I rebooted and ran the same
> test , but now it resulted the “Oops: 0002 [#1] SMP".

And the 0002 is telling you that there's been 2 previous bug/oops since the
reboot, so you need to go back through your dmesg and find the *first* one.

> In both cases the call trace looks exactly same and RIP points to
> “native_queued_spin_lock_slowpath+0xfe/0x170"..

The first few entries in the call trace are the oops handler itself. So...


> May 16 12:06:17 test-pc kernel: [96934.567347] Call Trace:
> May 16 12:06:17 test-pc kernel: [96934.569475]  [<ffffffff8183c427>]__raw_spin_lock_irqsave+0x37/0x40
> May 16 12:06:17 test-pc kernel: [96934.571686]  [<ffffffffc0606812>] event_raise+0x22/0x60 [osa]
> May 16 12:06:17 test-pc kernel: [96934.573935]  [<ffffffffc06aa2a4>] multi_q_completed_one_buffer+0x34/0x40 [mcore]

The above line is the one where you hit the wall.

> May 16 12:59:22 test-pc kernel: [ 3011.405602] Call Trace:
> May 16 12:59:22 test-pc kernel: [ 3011.407892]  [<ffffffff8183c427>] _raw_spin_lock_irqsave+0x37/0x40
> May 16 12:59:22 test-pc kernel: [ 3011.410256]  [<ffffffffc0604812>] event_raise+0x22/0x60 [osa]
> May 16 12:59:22 test-pc kernel: [ 3011.412652]  [<ffffffffc06b72a4>] multi_q_completed_one_buffer+0x34/0x40 [mcore]

And again.

However, given that it's a 4.4 kernel from 4 years ago, it's going to be
hard to find anybody who really cares.

In fact, I'm wondering if this is from some out-of-tree or vendor patch,
because I'm not finding any sign of that function in either the 5.7 or 4.4
tree. Not even a sign of ## catenation abuse - no relevant hits for
"completed_one_buffer" or "multi_q" either.

I don't think anybody's going to be able to help unless somebody first
identifies where that function is....




* Re: general protection fault vs Oops
  2020-05-16 13:53 ` Valdis Klētnieks
@ 2020-05-16 15:10   ` Subhashini Rao Beerisetty
  2020-05-16 15:59   ` Randy Dunlap
  1 sibling, 0 replies; 7+ messages in thread
From: Subhashini Rao Beerisetty @ 2020-05-16 15:10 UTC (permalink / raw)
  To: Valdis Klētnieks; +Cc: linux-kernel, kernelnewbies

On Sat, May 16, 2020 at 7:23 PM Valdis Klētnieks
<valdis.kletnieks@vt.edu> wrote:
>
> On Sat, 16 May 2020 18:05:07 +0530, Subhashini Rao Beerisetty said:
>
> > In the first attempt when I run that test case I landed into “general
> > protection fault: 0000 [#1] SMP" .. Next I rebooted and ran the same
> > test , but now it resulted the “Oops: 0002 [#1] SMP".
>
> And the 0002 is telling you that there's been 2 previous bug/oops since the
> reboot, so you need to go back through your dmesg and find the *first* one.
I could not find "Oops: 0001" in kern.log.
I originally captured the crash call trace by running tail -f
/var/log/kern.log, but after the reboot I could not find the same
trace in the kern.log file. Why did the kernel fail to store it in
kern.log, and in that case how did the tail command capture it? Could
you please clarify this?

$strings /var/log/kern.log | grep -i oops
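This is roughly how I scanned the saved log for any earlier crash
banners (shown here against a small inline sample standing in for the
real /var/log/kern.log):

```shell
# Sketch: look for the first crash banner (Oops / GPF / BUG) in a saved
# kernel log. A two-line sample file stands in for /var/log/kern.log.
cat > /tmp/kern.sample <<'EOF'
May 16 12:06:17 test-pc kernel: [96934.528114] general protection fault: 0000 [#1] SMP
May 16 12:59:22 test-pc kernel: [ 3011.363419] Oops: 0002 [#1] SMP
EOF

# -n prints line numbers so the earliest hit is easy to locate
grep -niE 'Oops:|general protection fault|BUG:' /tmp/kern.sample | head -n 1
```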

>
> > In both cases the call trace looks exactly same and RIP points to
> > “native_queued_spin_lock_slowpath+0xfe/0x170"..
>
> The first few entries in the call trace are the oops handler itself. So...
>
>
> > May 16 12:06:17 test-pc kernel: [96934.567347] Call Trace:
> > May 16 12:06:17 test-pc kernel: [96934.569475]  [<ffffffff8183c427>]__raw_spin_lock_irqsave+0x37/0x40
> > May 16 12:06:17 test-pc kernel: [96934.571686]  [<ffffffffc0606812>] event_raise+0x22/0x60 [osa]
> > May 16 12:06:17 test-pc kernel: [96934.573935]  [<ffffffffc06aa2a4>] multi_q_completed_one_buffer+0x34/0x40 [mcore]
>
> The above line is the one where you hit the wall.
>
> > May 16 12:59:22 test-pc kernel: [ 3011.405602] Call Trace:
> > May 16 12:59:22 test-pc kernel: [ 3011.407892]  [<ffffffff8183c427>] _raw_spin_lock_irqsave+0x37/0x40
> > May 16 12:59:22 test-pc kernel: [ 3011.410256]  [<ffffffffc0604812>] event_raise+0x22/0x60 [osa]
> > May 16 12:59:22 test-pc kernel: [ 3011.412652]  [<ffffffffc06b72a4>] multi_q_completed_one_buffer+0x34/0x40 [mcore]
>
> And again.
>
> However,  given that it's a 4.4 kernel from 4 years ago, it's going to be
> hard to find anybody who really cares.
>
> In fact. I'm wondering if this is from some out-of-tree or vendor patch,
> because I'm not finding any sign of that function in either the 5.7 or 4.4
> tree.  Not even a sign of ## catenation abuse - no relevant hits for
> "completed_one_buffer" or "multi_q" either
>
> I don't think anybody's going to be able to help unless somebody first
> identifies where that function is....
>


* Re: general protection fault vs Oops
  2020-05-16 13:53 ` Valdis Klētnieks
  2020-05-16 15:10   ` Subhashini Rao Beerisetty
@ 2020-05-16 15:59   ` Randy Dunlap
  2020-05-16 16:15     ` Subhashini Rao Beerisetty
  1 sibling, 1 reply; 7+ messages in thread
From: Randy Dunlap @ 2020-05-16 15:59 UTC (permalink / raw)
  To: Valdis Klētnieks, Subhashini Rao Beerisetty
  Cc: linux-kernel, kernelnewbies

On 5/16/20 6:53 AM, Valdis Klētnieks wrote:
> On Sat, 16 May 2020 18:05:07 +0530, Subhashini Rao Beerisetty said:
> 
>> In the first attempt when I run that test case I landed into “general
>> protection fault: 0000 [#1] SMP" .. Next I rebooted and ran the same
>> test , but now it resulted the “Oops: 0002 [#1] SMP".
> 
> And the 0002 is telling you that there's been 2 previous bug/oops since the
> reboot, so you need to go back through your dmesg and find the *first* one.
> 
>> In both cases the call trace looks exactly same and RIP points to
>> “native_queued_spin_lock_slowpath+0xfe/0x170"..
> 
> The first few entries in the call trace are the oops handler itself. So...
> 
> 
>> May 16 12:06:17 test-pc kernel: [96934.567347] Call Trace:
>> May 16 12:06:17 test-pc kernel: [96934.569475]  [<ffffffff8183c427>]__raw_spin_lock_irqsave+0x37/0x40
>> May 16 12:06:17 test-pc kernel: [96934.571686]  [<ffffffffc0606812>] event_raise+0x22/0x60 [osa]
>> May 16 12:06:17 test-pc kernel: [96934.573935]  [<ffffffffc06aa2a4>] multi_q_completed_one_buffer+0x34/0x40 [mcore]
> 
> The above line is the one where you hit the wall.
> 
>> May 16 12:59:22 test-pc kernel: [ 3011.405602] Call Trace:
>> May 16 12:59:22 test-pc kernel: [ 3011.407892]  [<ffffffff8183c427>] _raw_spin_lock_irqsave+0x37/0x40
>> May 16 12:59:22 test-pc kernel: [ 3011.410256]  [<ffffffffc0604812>] event_raise+0x22/0x60 [osa]
>> May 16 12:59:22 test-pc kernel: [ 3011.412652]  [<ffffffffc06b72a4>] multi_q_completed_one_buffer+0x34/0x40 [mcore]
> 
> And again.
> 
> However,  given that it's a 4.4 kernel from 4 years ago, it's going to be
> hard to find anybody who really cares.

Right.

> In fact. I'm wondering if this is from some out-of-tree or vendor patch,
> because I'm not finding any sign of that function in either the 5.7 or 4.4
> tree.  Not even a sign of ## catenation abuse - no relevant hits for
> "completed_one_buffer" or "multi_q" either

Modules linked in:
dbg(OE) mcore(OE) osa(OE)

Out-of-tree, unsigned modules loaded.
We don't know what those are or how to debug them.

> I don't think anybody's going to be able to help unless somebody first
> identifies where that function is....
> 


-- 
~Randy



* Re: general protection fault vs Oops
  2020-05-16 15:59   ` Randy Dunlap
@ 2020-05-16 16:15     ` Subhashini Rao Beerisetty
  2020-05-17 20:46       ` Cong Wang
  0 siblings, 1 reply; 7+ messages in thread
From: Subhashini Rao Beerisetty @ 2020-05-16 16:15 UTC (permalink / raw)
  To: Randy Dunlap; +Cc: Valdis Klētnieks, linux-kernel, kernelnewbies

On Sat, May 16, 2020 at 9:29 PM Randy Dunlap <rdunlap@infradead.org> wrote:
>
> On 5/16/20 6:53 AM, Valdis Klētnieks wrote:
> > On Sat, 16 May 2020 18:05:07 +0530, Subhashini Rao Beerisetty said:
> >
> >> In the first attempt when I run that test case I landed into “general
> >> protection fault: 0000 [#1] SMP" .. Next I rebooted and ran the same
> >> test , but now it resulted the “Oops: 0002 [#1] SMP".
> >
> > And the 0002 is telling you that there's been 2 previous bug/oops since the
> > reboot, so you need to go back through your dmesg and find the *first* one.
> >
> >> In both cases the call trace looks exactly same and RIP points to
> >> “native_queued_spin_lock_slowpath+0xfe/0x170"..
> >
> > The first few entries in the call trace are the oops handler itself. So...
> >
> >
> >> May 16 12:06:17 test-pc kernel: [96934.567347] Call Trace:
> >> May 16 12:06:17 test-pc kernel: [96934.569475]  [<ffffffff8183c427>]__raw_spin_lock_irqsave+0x37/0x40
> >> May 16 12:06:17 test-pc kernel: [96934.571686]  [<ffffffffc0606812>] event_raise+0x22/0x60 [osa]
> >> May 16 12:06:17 test-pc kernel: [96934.573935]  [<ffffffffc06aa2a4>] multi_q_completed_one_buffer+0x34/0x40 [mcore]
> >
> > The above line is the one where you hit the wall.
> >
> >> May 16 12:59:22 test-pc kernel: [ 3011.405602] Call Trace:
> >> May 16 12:59:22 test-pc kernel: [ 3011.407892]  [<ffffffff8183c427>] _raw_spin_lock_irqsave+0x37/0x40
> >> May 16 12:59:22 test-pc kernel: [ 3011.410256]  [<ffffffffc0604812>] event_raise+0x22/0x60 [osa]
> >> May 16 12:59:22 test-pc kernel: [ 3011.412652]  [<ffffffffc06b72a4>] multi_q_completed_one_buffer+0x34/0x40 [mcore]
> >
> > And again.
> >
> > However,  given that it's a 4.4 kernel from 4 years ago, it's going to be
> > hard to find anybody who really cares.
>
> Right.
>
> > In fact. I'm wondering if this is from some out-of-tree or vendor patch,
> > because I'm not finding any sign of that function in either the 5.7 or 4.4
> > tree.  Not even a sign of ## catenation abuse - no relevant hits for
> > "completed_one_buffer" or "multi_q" either
>
> Modules linked in:
> dbg(OE) mcore(OE) osa(OE)
>
> Out-of-tree, unsigned modules loaded.
Yes, those are out-of-tree modules. Basically, my question is: in
general, what is the difference between a 'general protection fault'
and an 'Oops' in kernel mode?

> We don't know what those are or how to debug them.
>
> > I don't think anybody's going to be able to help unless somebody first
> > identifies where that function is....
> >
>
>
> --
> ~Randy
>


* Re: general protection fault vs Oops
  2020-05-16 16:15     ` Subhashini Rao Beerisetty
@ 2020-05-17 20:46       ` Cong Wang
  2020-05-18  5:45         ` Subhashini Rao Beerisetty
  0 siblings, 1 reply; 7+ messages in thread
From: Cong Wang @ 2020-05-17 20:46 UTC (permalink / raw)
  To: Subhashini Rao Beerisetty
  Cc: Randy Dunlap, Valdis Klētnieks, LKML, kernelnewbies

On Sat, May 16, 2020 at 9:16 AM Subhashini Rao Beerisetty
<subhashbeerisetty@gmail.com> wrote:
> Yes, those are out-of-tree modules. Basically, my question is, in
> general what is the difference between 'general protection fault' and
> 'Oops' failure in kernel mode.

In your case, they are likely just different consequences of the same
memory error. Suppose it is a use-after-free; the behavior of a UAF is
undefined. If the memory freed by the kernel is also unmapped from the
kernel address space, you get a page fault when using it afterward,
and that is an Oops. If instead the freed memory gets reallocated and
remapped read-only, you get a general protection fault when you write
to it afterward.


* Re: general protection fault vs Oops
  2020-05-17 20:46       ` Cong Wang
@ 2020-05-18  5:45         ` Subhashini Rao Beerisetty
  0 siblings, 0 replies; 7+ messages in thread
From: Subhashini Rao Beerisetty @ 2020-05-18  5:45 UTC (permalink / raw)
  To: Cong Wang; +Cc: Randy Dunlap, Valdis Klētnieks, LKML, kernelnewbies

On Mon, May 18, 2020 at 2:16 AM Cong Wang <xiyou.wangcong@gmail.com> wrote:
>
> On Sat, May 16, 2020 at 9:16 AM Subhashini Rao Beerisetty
> <subhashbeerisetty@gmail.com> wrote:
> > Yes, those are out-of-tree modules. Basically, my question is, in
> > general what is the difference between 'general protection fault' and
> > 'Oops' failure in kernel mode.
>
> For your case, they are likely just different consequences of a same
> memory error. Let's assume it is a use-after-free, the behavior is UAF
> is undefined: If that memory freed by kernel is also unmapped from
> kernel address space, you would get a page fault when using it
> afterward, that is an Oops. Or if that memory freed by kernel gets
> reallocated and remapped as read-only, you would get a general
> protection error when you writing to it afterward.
Cool, thanks for the clarification.

