On 27/01/18 8:40 AM, Matthew Wilcox wrote:

> On Fri, Jan 26, 2018 at 07:54:06PM +1300, xen@randomwebstuff.com wrote:
>> Re-tried with the current latest 4.14 (4.14.15).  Received the following:
>>
>> [2018-01-24 19:26:57] dev login: [44501.106868] BUG: unable to handle kernel
>> NULL pointer dereference at 00000008
>> [2018-01-25 07:47:50] [44501.106897] IP: __radix_tree_lookup+0x14/0xa0
>
> Please try including this patch:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=198497#c7
>
> And have you had the chance to run memtest86 yet?

Added the patch at https://bugzilla.kernel.org/show_bug.cgi?id=198497#c7

With that patch applied, I received the stack trace below.

Have not tried memtest86.  These are production hosts.  This has occurred on multiple hosts.  I can only recall this occurring on 32 bit kernels.  I cannot recall issues with other VMs not running that kernel on the same hosts.

[  125.329163] Bad swp_entry: e000000
[  125.329202] ------------[ cut here ]------------
[  125.329219] WARNING: CPU: 0 PID: 4175 at mm/swap_state.c:339 lookup_swap_cache+0x140/0x160
[  125.329233] CPU: 0 PID: 4175 Comm: apt-show-versio Not tainted 4.14.15-rh14-20180126233810.xenU.i386-00001-g6ba70cb #1
[  125.329245] task: ead9a940 task.stack: e7c8c000
[  125.329253] EIP: lookup_swap_cache+0x140/0x160
[  125.329260] EFLAGS: 00010282 CPU: 0
[  125.329267] EAX: 00000016 EBX: 00000000 ECX: ec5289c4 EDX: 0100016d
[  125.329275] ESI: b6312000 EDI: e7d94ea0 EBP: e7c8de24 ESP: e7c8de0c
[  125.329284]  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0069
[  125.329295] CR0: 80050033 CR2: b63124b0 CR3: 2718c000 CR4: 00002660
[  125.329308] Call Trace:
[  125.329323]  ? percpu_counter_add_batch+0x91/0xb0
[  125.329332]  swap_readahead_detect+0x66/0x2e0
[  125.329343]  ? radix_tree_tag_set+0x7a/0xe0
[  125.329352]  do_swap_page+0x1fa/0x860
[  125.329361]  ? __set_page_dirty_buffers+0xb1/0xe0
[  125.329372]  ? ext4_set_page_dirty+0x22/0x60
[  125.329383]  ? fault_dirty_shared_page.isra.90+0x3e/0xa0
[  125.329396]  ? xen_pmd_val+0x10/0x20
[  125.329403]  handle_mm_fault+0x6f8/0x1020
[  125.329414]  ? handle_irq_event_percpu+0x3c/0x50
[  125.329424]  __do_page_fault+0x18a/0x450
[  125.329432]  ? vmalloc_sync_all+0x250/0x250
[  125.329439]  do_page_fault+0x21/0x30
[  125.329449]  common_exception+0x45/0x4a
[  125.329456] EIP: 0xb7ce397b
[  125.329462] EFLAGS: 00010202 CPU: 0
[  125.329469] EAX: 0000052a EBX: b7d77ff4 ECX: 000004fa EDX: b6311000
[  125.329477] ESI: bf90eae0 EDI: b6ed4b20 EBP: bf90ea60 ESP: bf90ea20
[  125.329486]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
[  125.329493] Code: 18 1f 14 c2 85 ff 0f 85 41 ff ff ff f0 ff 05 38 fb 02 c2 e9 35 ff ff ff 8d 76 00 89 44 24 04 c7 04 24 55 93 f3 c1 e8 8c e7 f5 ff <0f> ff 8b 5d f4 31 c0 8b 75 f8 8b 7d fc 89 ec 5d c3 64 ff 05 18
[  125.329558] ---[ end trace dd2704ca649b44ba ]---
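
In case it helps when reading the "Bad swp_entry" line, here is a rough sketch of how that value might be split into a swap type and offset. It assumes the generic swp_entry_t layout from include/linux/swapops.h in 4.14 on a 32-bit kernel, with MAX_SWAPFILES_SHIFT = 5 and RADIX_TREE_EXCEPTIONAL_SHIFT = 2. Those constants and the helper name decode_swp_entry are my assumptions, not something taken from the log or the debug patch, so treat the result as a hint only.

```python
# Assumed constants, modeled on include/linux/swapops.h in 4.14.
# None of these values appear in the report itself.
BITS_PER_LONG = 32                 # i386 guest
MAX_SWAPFILES_SHIFT = 5            # assumption
RADIX_TREE_EXCEPTIONAL_SHIFT = 2   # assumption

# The swap type sits in the top bits, the offset in the remaining low bits.
SWP_TYPE_SHIFT = BITS_PER_LONG - (MAX_SWAPFILES_SHIFT + RADIX_TREE_EXCEPTIONAL_SHIFT)
SWP_OFFSET_MASK = (1 << SWP_TYPE_SHIFT) - 1


def decode_swp_entry(val):
    """Split a raw swp_entry_t value into (type, offset). Hypothetical helper."""
    return val >> SWP_TYPE_SHIFT, val & SWP_OFFSET_MASK


# Value printed in the "Bad swp_entry" line of the trace.
swp_type, swp_offset = decode_swp_entry(0x0E000000)
print(f"type={swp_type} offset={swp_offset:#x}")
```

Under those assumptions the entry decodes to type 7, offset 0. Whether that type corresponds to a configured swap device on these hosts is not something the trace shows.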