linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [BUG] non-swapcache page in end_swap_bio_read()
@ 2013-06-07 12:59 Artem Savkov
  2013-06-07 15:26 ` Minchan Kim
  0 siblings, 1 reply; 5+ messages in thread
From: Artem Savkov @ 2013-06-07 12:59 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Dan Magenheimer, Andrew Morton, Rafael J. Wysocki, linux-mm,
	linux-pm, linux-kernel

Hello all,

I'm hitting the following BUG_ON during boot when
CONFIG_PM_STD_PARTITION or "resume" kernel boot option are set. Looks
like this issue was introduced in (or brought up to light by)
"mm: remove compressed copy from zram in-memory"
(84e5bb4f06e6d6f0c4dfc033b4700702ed8aaccc in linux-next.git)
What happens is that during swsusp_check() bio is created with
bio_end_io set to end_swap_bio_read(), but the page is not in swap
cache.
Not sure how to handle this the right way, but proceeding with the
optimization in end_swap_bio_read() only after checking PageSwapCache
flag does help.

[    2.065206] kernel BUG at mm/swapfile.c:2361!
[    2.065469] invalid opcode: 0000 [#1] SMP 
[    2.065469] Modules linked in:
[    2.065469] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.10.0-rc4-next-20130607+ #61
[    2.065469] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[    2.065469] task: ffff88001e5ccfc0 ti: ffff88001e5ea000 task.ti: ffff88001e5ea000
[    2.065469] RIP: 0010:[<ffffffff811462eb>]  [<ffffffff811462eb>] page_swap_info+0xab/0xb0
[    2.065469] RSP: 0000:ffff88001ec03c78  EFLAGS: 00010246
[    2.065469] RAX: 0100000000000009 RBX: ffffea0000794780 RCX: 0000000000000c0b
[    2.065469] RDX: 0000000000000046 RSI: 0000000000000000 RDI: 0000000000000000
[    2.065469] RBP: ffff88001ec03c88 R08: 0000000000000000 R09: 0000000000000000
[    2.065469] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[    2.065469] R13: 0000000000000001 R14: ffff88001e7f6200 R15: 0000000000001000
[    2.065469] FS:  0000000000000000(0000) GS:ffff88001ec00000(0000) knlGS:0000000000000000
[    2.065469] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[    2.065469] CR2: 0000000000000000 CR3: 000000000240b000 CR4: 00000000000006e0
[    2.065469] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    2.065469] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[    2.065469] Stack:
[    2.065469]  ffffea0000794780 ffff88001e7f6200 ffff88001ec03cb8 ffffffff81145486
[    2.065469]  ffff88001e5cd5c0 ffff88001c02cd20 0000000000001000 0000000000000000
[    2.065469]  ffff88001ec03cc8 ffffffff81199518 ffff88001ec03d28 ffffffff81518ec3
[    2.065469] Call Trace:
[    2.065469]  <IRQ> 
[    2.065469]  [<ffffffff81145486>] end_swap_bio_read+0x96/0x130
[    2.065469]  [<ffffffff81199518>] bio_endio+0x18/0x40
[    2.065469]  [<ffffffff81518ec3>] blk_update_request+0x213/0x540
[    2.065469]  [<ffffffff81518fa0>] ? blk_update_request+0x2f0/0x540
[    2.065469]  [<ffffffff817986a6>] ? ata_hsm_qc_complete+0x46/0x130
[    2.065469]  [<ffffffff81519212>] blk_update_bidi_request+0x22/0x90
[    2.065469]  [<ffffffff8151b9ea>] blk_end_bidi_request+0x2a/0x80
[    2.065469]  [<ffffffff8151ba8b>] blk_end_request+0xb/0x10
[    2.065469]  [<ffffffff817693aa>] scsi_io_completion+0xaa/0x6b0
[    2.065469]  [<ffffffff817608d8>] scsi_finish_command+0xc8/0x130
[    2.065469]  [<ffffffff81769aff>] scsi_softirq_done+0x13f/0x160
[    2.065469]  [<ffffffff81521ebc>] blk_done_softirq+0x7c/0x90
[    2.065469]  [<ffffffff81049030>] __do_softirq+0x130/0x3f0
[    2.065469]  [<ffffffff810d454e>] ? handle_irq_event+0x4e/0x70
[    2.065469]  [<ffffffff81049405>] irq_exit+0xa5/0xb0
[    2.065469]  [<ffffffff81003cb1>] do_IRQ+0x61/0xe0
[    2.065469]  [<ffffffff81c2832f>] common_interrupt+0x6f/0x6f
[    2.065469]  <EOI> 
[    2.065469]  [<ffffffff8107ebff>] ? local_clock+0x4f/0x60
[    2.065469]  [<ffffffff81c27f85>] ? _raw_spin_unlock_irq+0x35/0x50
[    2.065469]  [<ffffffff81c27f7b>] ? _raw_spin_unlock_irq+0x2b/0x50
[    2.065469]  [<ffffffff81078bd0>] finish_task_switch+0x80/0x110
[    2.065469]  [<ffffffff81078b93>] ? finish_task_switch+0x43/0x110
[    2.065469]  [<ffffffff81c2525c>] __schedule+0x32c/0x8c0
[    2.065469]  [<ffffffff81c2c010>] ? notifier_call_chain+0x150/0x150
[    2.065469]  [<ffffffff81c259d4>] schedule+0x24/0x70
[    2.065469]  [<ffffffff81c25d42>] schedule_preempt_disabled+0x22/0x30
[    2.065469]  [<ffffffff81093645>] cpu_startup_entry+0x335/0x380
[    2.065469]  [<ffffffff81c1ed7e>] start_secondary+0x217/0x219
[    2.065469] Code: 69 bc 16 82 48 c7 c7 77 bc 16 82 31 c0 49 c1 ec 39 49 c1 e9 10 41 83 e1 01 e8 6c d2 ad 00 5b 4a 8b 04 e5 e0 bf 14 83 41 5c c9 c3 <0f> 0b eb fe 90 48 8b 07 55 48 89 e5 a9 00 00 01 00 74 12 e8 3d 
[    2.065469] RIP  [<ffffffff811462eb>] page_swap_info+0xab/0xb0
[    2.065469]  RSP <ffff88001ec03c78>

-- 
Regards,
    Artem

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [BUG] non-swapcache page in end_swap_bio_read()
  2013-06-07 12:59 [BUG] non-swapcache page in end_swap_bio_read() Artem Savkov
@ 2013-06-07 15:26 ` Minchan Kim
  2013-06-07 20:23   ` [PATCH] non-swapcache pages " Artem Savkov
  0 siblings, 1 reply; 5+ messages in thread
From: Minchan Kim @ 2013-06-07 15:26 UTC (permalink / raw)
  To: Artem Savkov
  Cc: Dan Magenheimer, Andrew Morton, Rafael J. Wysocki, linux-mm,
	linux-pm, linux-kernel

Hello Artem,

On Fri, Jun 07, 2013 at 04:59:09PM +0400, Artem Savkov wrote:
> Hello all,
> 
> I'm hitting the following BUG_ON during boot when
> CONFIG_PM_STD_PARTITION or "resume" kernel boot option are set. Looks
> like this issue was introduced in (or brought up to light by)
> "mm: remove compressed copy from zram in-memory"
> (84e5bb4f06e6d6f0c4dfc033b4700702ed8aaccc in linux-next.git)
> What happens is that during swsusp_check() bio is created with
> bio_end_io set to end_swap_bio_read(), but the page is not in swap
> cache.

True. I totally missed it.

> Not sure how to handle this the right way, but proceeding with the
> optimization in end_swap_bio_read() only after checking PageSwapCache
> flag does help.

I'd like to go with your way.
We already have SetPageUptodate so PageSwapCache's cost would be really
cheap from hitting from cacheline. Even, we can optimize it with unlikely
hint.

The credit should be yours so could you send a patch with your SOB?
Please write a comment in code about that why we need such check.

If you have any problem about sending a patch, I will send it for you
on Monday if others don't suggest another solution.

Thanks for the report!

> 
> [    2.065206] kernel BUG at mm/swapfile.c:2361!
> [    2.065469] invalid opcode: 0000 [#1] SMP 
> [    2.065469] Modules linked in:
> [    2.065469] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.10.0-rc4-next-20130607+ #61
> [    2.065469] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
> [    2.065469] task: ffff88001e5ccfc0 ti: ffff88001e5ea000 task.ti: ffff88001e5ea000
> [    2.065469] RIP: 0010:[<ffffffff811462eb>]  [<ffffffff811462eb>] page_swap_info+0xab/0xb0
> [    2.065469] RSP: 0000:ffff88001ec03c78  EFLAGS: 00010246
> [    2.065469] RAX: 0100000000000009 RBX: ffffea0000794780 RCX: 0000000000000c0b
> [    2.065469] RDX: 0000000000000046 RSI: 0000000000000000 RDI: 0000000000000000
> [    2.065469] RBP: ffff88001ec03c88 R08: 0000000000000000 R09: 0000000000000000
> [    2.065469] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> [    2.065469] R13: 0000000000000001 R14: ffff88001e7f6200 R15: 0000000000001000
> [    2.065469] FS:  0000000000000000(0000) GS:ffff88001ec00000(0000) knlGS:0000000000000000
> [    2.065469] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [    2.065469] CR2: 0000000000000000 CR3: 000000000240b000 CR4: 00000000000006e0
> [    2.065469] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [    2.065469] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [    2.065469] Stack:
> [    2.065469]  ffffea0000794780 ffff88001e7f6200 ffff88001ec03cb8 ffffffff81145486
> [    2.065469]  ffff88001e5cd5c0 ffff88001c02cd20 0000000000001000 0000000000000000
> [    2.065469]  ffff88001ec03cc8 ffffffff81199518 ffff88001ec03d28 ffffffff81518ec3
> [    2.065469] Call Trace:
> [    2.065469]  <IRQ> 
> [    2.065469]  [<ffffffff81145486>] end_swap_bio_read+0x96/0x130
> [    2.065469]  [<ffffffff81199518>] bio_endio+0x18/0x40
> [    2.065469]  [<ffffffff81518ec3>] blk_update_request+0x213/0x540
> [    2.065469]  [<ffffffff81518fa0>] ? blk_update_request+0x2f0/0x540
> [    2.065469]  [<ffffffff817986a6>] ? ata_hsm_qc_complete+0x46/0x130
> [    2.065469]  [<ffffffff81519212>] blk_update_bidi_request+0x22/0x90
> [    2.065469]  [<ffffffff8151b9ea>] blk_end_bidi_request+0x2a/0x80
> [    2.065469]  [<ffffffff8151ba8b>] blk_end_request+0xb/0x10
> [    2.065469]  [<ffffffff817693aa>] scsi_io_completion+0xaa/0x6b0
> [    2.065469]  [<ffffffff817608d8>] scsi_finish_command+0xc8/0x130
> [    2.065469]  [<ffffffff81769aff>] scsi_softirq_done+0x13f/0x160
> [    2.065469]  [<ffffffff81521ebc>] blk_done_softirq+0x7c/0x90
> [    2.065469]  [<ffffffff81049030>] __do_softirq+0x130/0x3f0
> [    2.065469]  [<ffffffff810d454e>] ? handle_irq_event+0x4e/0x70
> [    2.065469]  [<ffffffff81049405>] irq_exit+0xa5/0xb0
> [    2.065469]  [<ffffffff81003cb1>] do_IRQ+0x61/0xe0
> [    2.065469]  [<ffffffff81c2832f>] common_interrupt+0x6f/0x6f
> [    2.065469]  <EOI> 
> [    2.065469]  [<ffffffff8107ebff>] ? local_clock+0x4f/0x60
> [    2.065469]  [<ffffffff81c27f85>] ? _raw_spin_unlock_irq+0x35/0x50
> [    2.065469]  [<ffffffff81c27f7b>] ? _raw_spin_unlock_irq+0x2b/0x50
> [    2.065469]  [<ffffffff81078bd0>] finish_task_switch+0x80/0x110
> [    2.065469]  [<ffffffff81078b93>] ? finish_task_switch+0x43/0x110
> [    2.065469]  [<ffffffff81c2525c>] __schedule+0x32c/0x8c0
> [    2.065469]  [<ffffffff81c2c010>] ? notifier_call_chain+0x150/0x150
> [    2.065469]  [<ffffffff81c259d4>] schedule+0x24/0x70
> [    2.065469]  [<ffffffff81c25d42>] schedule_preempt_disabled+0x22/0x30
> [    2.065469]  [<ffffffff81093645>] cpu_startup_entry+0x335/0x380
> [    2.065469]  [<ffffffff81c1ed7e>] start_secondary+0x217/0x219
> [    2.065469] Code: 69 bc 16 82 48 c7 c7 77 bc 16 82 31 c0 49 c1 ec 39 49 c1 e9 10 41 83 e1 01 e8 6c d2 ad 00 5b 4a 8b 04 e5 e0 bf 14 83 41 5c c9 c3 <0f> 0b eb fe 90 48 8b 07 55 48 89 e5 a9 00 00 01 00 74 12 e8 3d 
> [    2.065469] RIP  [<ffffffff811462eb>] page_swap_info+0xab/0xb0
> [    2.065469]  RSP <ffff88001ec03c78>
> 
> -- 
> Regards,
>     Artem
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] non-swapcache pages in end_swap_bio_read()
  2013-06-07 15:26 ` Minchan Kim
@ 2013-06-07 20:23   ` Artem Savkov
  2013-06-07 20:43     ` Andrew Morton
  0 siblings, 1 reply; 5+ messages in thread
From: Artem Savkov @ 2013-06-07 20:23 UTC (permalink / raw)
  To: minchan.kernel.2
  Cc: dan.magenheimer, akpm, rjw, linux-mm, linux-pm, linux-kernel,
	Artem Savkov

There is no guarantee that page in end_swap_bio_read is in swapcache so we need
to check it before calling page_swap_info(). Otherwise kernel hits a bug on
like the one below.
Introduced in "mm: remove compressed copy from zram in-memory"

kernel BUG at mm/swapfile.c:2361!
invalid opcode: 0000 [#1] SMP
Modules linked in:
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.10.0-rc4-next-20130607+ #61
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
task: ffff88001e5ccfc0 ti: ffff88001e5ea000 task.ti: ffff88001e5ea000
RIP: 0010:[<ffffffff811462eb>]  [<ffffffff811462eb>] page_swap_info+0xab/0xb0
RSP: 0000:ffff88001ec03c78  EFLAGS: 00010246
RAX: 0100000000000009 RBX: ffffea0000794780 RCX: 0000000000000c0b
RDX: 0000000000000046 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff88001ec03c88 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000001 R14: ffff88001e7f6200 R15: 0000000000001000
FS:  0000000000000000(0000) GS:ffff88001ec00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 000000000240b000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
ffffea0000794780 ffff88001e7f6200 ffff88001ec03cb8 ffffffff81145486
ffff88001e5cd5c0 ffff88001c02cd20 0000000000001000 0000000000000000
ffff88001ec03cc8 ffffffff81199518 ffff88001ec03d28 ffffffff81518ec3
Call Trace:
<IRQ>
[<ffffffff81145486>] end_swap_bio_read+0x96/0x130
[<ffffffff81199518>] bio_endio+0x18/0x40
[<ffffffff81518ec3>] blk_update_request+0x213/0x540
[<ffffffff81518fa0>] ? blk_update_request+0x2f0/0x540
[<ffffffff817986a6>] ? ata_hsm_qc_complete+0x46/0x130
[<ffffffff81519212>] blk_update_bidi_request+0x22/0x90
[<ffffffff8151b9ea>] blk_end_bidi_request+0x2a/0x80
[<ffffffff8151ba8b>] blk_end_request+0xb/0x10
[<ffffffff817693aa>] scsi_io_completion+0xaa/0x6b0
[<ffffffff817608d8>] scsi_finish_command+0xc8/0x130
[<ffffffff81769aff>] scsi_softirq_done+0x13f/0x160
[<ffffffff81521ebc>] blk_done_softirq+0x7c/0x90
[<ffffffff81049030>] __do_softirq+0x130/0x3f0
[<ffffffff810d454e>] ? handle_irq_event+0x4e/0x70
[<ffffffff81049405>] irq_exit+0xa5/0xb0
[<ffffffff81003cb1>] do_IRQ+0x61/0xe0
[<ffffffff81c2832f>] common_interrupt+0x6f/0x6f
<EOI>
[<ffffffff8107ebff>] ? local_clock+0x4f/0x60
[<ffffffff81c27f85>] ? _raw_spin_unlock_irq+0x35/0x50
[<ffffffff81c27f7b>] ? _raw_spin_unlock_irq+0x2b/0x50
[<ffffffff81078bd0>] finish_task_switch+0x80/0x110
[<ffffffff81078b93>] ? finish_task_switch+0x43/0x110
[<ffffffff81c2525c>] __schedule+0x32c/0x8c0
[<ffffffff81c2c010>] ? notifier_call_chain+0x150/0x150
[<ffffffff81c259d4>] schedule+0x24/0x70
[<ffffffff81c25d42>] schedule_preempt_disabled+0x22/0x30
[<ffffffff81093645>] cpu_startup_entry+0x335/0x380
[<ffffffff81c1ed7e>] start_secondary+0x217/0x219
Code: 69 bc 16 82 48 c7 c7 77 bc 16 82 31 c0 49 c1 ec 39 49 c1 e9 10 41 83 e1 01 e8 6c d2 ad 00 5b 4a 8b 04 e5 e0 bf 14 83 41 5c c9 c3 <0f> 0b eb fe 90 48 8b 07 55 48 89 e5 a9 00 00 01 00 74 12 e8 3d
RIP  [<ffffffff811462eb>] page_swap_info+0xab/0xb0
RSP <ffff88001ec03c78>

Signed-off-by: Artem Savkov <artem.savkov@gmail.com>
---
 mm/page_io.c | 68 +++++++++++++++++++++++++++++++++---------------------------
 1 file changed, 38 insertions(+), 30 deletions(-)

diff --git a/mm/page_io.c b/mm/page_io.c
index 1897abb..2b76ac7 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -82,38 +82,46 @@ void end_swap_bio_read(struct bio *bio, int err)
 				iminor(bio->bi_bdev->bd_inode),
 				(unsigned long long)bio->bi_sector);
 	} else {
-		struct swap_info_struct *sis;
-
 		SetPageUptodate(page);
-		sis = page_swap_info(page);
-		if (sis->flags & SWP_BLKDEV) {
-			/*
-			 * The swap subsystem performs lazy swap slot freeing,
-			 * expecting that the page will be swapped out again.
-			 * So we can avoid an unnecessary write if the page
-			 * isn't redirtied.
-			 * This is good for real swap storage because we can
-			 * reduce unnecessary I/O and enhance wear-leveling
-			 * if an SSD is used as the as swap device.
-			 * But if in-memory swap device (eg zram) is used,
-			 * this causes a duplicated copy between uncompressed
-			 * data in VM-owned memory and compressed data in
-			 * zram-owned memory.  So let's free zram-owned memory
-			 * and make the VM-owned decompressed page *dirty*,
-			 * so the page should be swapped out somewhere again if
-			 * we again wish to reclaim it.
-			 */
-			struct gendisk *disk = sis->bdev->bd_disk;
-			if (disk->fops->swap_slot_free_notify) {
-				swp_entry_t entry;
-				unsigned long offset;
 
-				entry.val = page_private(page);
-				offset = swp_offset(entry);
-
-				SetPageDirty(page);
-				disk->fops->swap_slot_free_notify(sis->bdev,
-						offset);
+		/*
+		 * There is no guarantee that the page is in swap cache, so
+		 * we need to check PG_swapcache before proceeding with this
+		 * optimization.
+		 */
+		if (unlikely(PageSwapCache(page))) {
+			struct swap_info_struct *sis;
+
+			sis = page_swap_info(page);
+			if (sis->flags & SWP_BLKDEV) {
+				/*
+				 * The swap subsystem performs lazy swap slot freeing,
+				 * expecting that the page will be swapped out again.
+				 * So we can avoid an unnecessary write if the page
+				 * isn't redirtied.
+				 * This is good for real swap storage because we can
+				 * reduce unnecessary I/O and enhance wear-leveling
+				 * if an SSD is used as the as swap device.
+				 * But if in-memory swap device (eg zram) is used,
+				 * this causes a duplicated copy between uncompressed
+				 * data in VM-owned memory and compressed data in
+				 * zram-owned memory.  So let's free zram-owned memory
+				 * and make the VM-owned decompressed page *dirty*,
+				 * so the page should be swapped out somewhere again if
+				 * we again wish to reclaim it.
+				 */
+				struct gendisk *disk = sis->bdev->bd_disk;
+				if (disk->fops->swap_slot_free_notify) {
+					swp_entry_t entry;
+					unsigned long offset;
+
+					entry.val = page_private(page);
+					offset = swp_offset(entry);
+
+					SetPageDirty(page);
+					disk->fops->swap_slot_free_notify(sis->bdev,
+							offset);
+				}
 			}
 		}
 	}
-- 
1.8.3


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] non-swapcache pages in end_swap_bio_read()
  2013-06-07 20:23   ` [PATCH] non-swapcache pages " Artem Savkov
@ 2013-06-07 20:43     ` Andrew Morton
  2013-06-07 20:51       ` Joe Perches
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2013-06-07 20:43 UTC (permalink / raw)
  To: Artem Savkov
  Cc: minchan.kernel.2, dan.magenheimer, rjw, linux-mm, linux-pm, linux-kernel

On Sat,  8 Jun 2013 00:23:18 +0400 Artem Savkov <artem.savkov@gmail.com> wrote:

> There is no guarantee that page in end_swap_bio_read is in swapcache so we need
> to check it before calling page_swap_info(). Otherwise kernel hits a bug on
> like the one below.
> Introduced in "mm: remove compressed copy from zram in-memory"
> 
> kernel BUG at mm/swapfile.c:2361!

Fair enough.

> --- a/mm/page_io.c
> +++ b/mm/page_io.c
>
> ...
>
> +		/*
> +		 * There is no guarantee that the page is in swap cache, so
> +		 * we need to check PG_swapcache before proceeding with this
> +		 * optimization.
> +		 */

My initial thought was "how the heck can this not be a swapcache page".
So let's add the important details to this comment.

> +		if (unlikely(PageSwapCache(page))) {

Surely this is "likely".

> +			struct swap_info_struct *sis;
> +
> +			sis = page_swap_info(page);
> +			if (sis->flags & SWP_BLKDEV) {
> +				/*
> +				 * The swap subsystem performs lazy swap slot freeing,
> +				 * expecting that the page will be swapped out again.
> +				 * So we can avoid an unnecessary write if the page
> +				 * isn't redirtied.
> +				 * This is good for real swap storage because we can
> +				 * reduce unnecessary I/O and enhance wear-leveling
> +				 * if an SSD is used as the as swap device.
> +				 * But if in-memory swap device (eg zram) is used,
> +				 * this causes a duplicated copy between uncompressed
> +				 * data in VM-owned memory and compressed data in
> +				 * zram-owned memory.  So let's free zram-owned memory
> +				 * and make the VM-owned decompressed page *dirty*,
> +				 * so the page should be swapped out somewhere again if
> +				 * we again wish to reclaim it.
> +				 */

This comment now makes a horrid mess in an 80-col display.  I think we
may as well remove a tabstop here.

> +				struct gendisk *disk = sis->bdev->bd_disk;
> +				if (disk->fops->swap_slot_free_notify) {
> +					swp_entry_t entry;
> +					unsigned long offset;
> +
> +					entry.val = page_private(page);
> +					offset = swp_offset(entry);
> +
> +					SetPageDirty(page);
> +					disk->fops->swap_slot_free_notify(sis->bdev,
> +							offset);
> +				}
>  			}
>  		}
>  	}

Result:

--- a/mm/page_io.c~mm-remove-compressed-copy-from-zram-in-memory-fix-2-fix
+++ a/mm/page_io.c
@@ -81,51 +81,54 @@ void end_swap_bio_read(struct bio *bio,
 				imajor(bio->bi_bdev->bd_inode),
 				iminor(bio->bi_bdev->bd_inode),
 				(unsigned long long)bio->bi_sector);
-	} else {
-		SetPageUptodate(page);
+		goto out;
+	}
+
+	SetPageUptodate(page);
+
+	/*
+	 * There is no guarantee that the page is in swap cache - the software
+	 * suspend code (at least) uses end_swap_bio_read() against a non-
+	 * swapcache page.  So we must check PG_swapcache before proceeding with
+	 * this optimization.
+	 */
+	if (likely(PageSwapCache(page))) {
+		struct swap_info_struct *sis;
+
+		sis = page_swap_info(page);
+		if (sis->flags & SWP_BLKDEV) {
+			/*
+			 * The swap subsystem performs lazy swap slot freeing,
+			 * expecting that the page will be swapped out again.
+			 * So we can avoid an unnecessary write if the page
+			 * isn't redirtied.
+			 * This is good for real swap storage because we can
+			 * reduce unnecessary I/O and enhance wear-leveling
+			 * if an SSD is used as the as swap device.
+			 * But if in-memory swap device (eg zram) is used,
+			 * this causes a duplicated copy between uncompressed
+			 * data in VM-owned memory and compressed data in
+			 * zram-owned memory.  So let's free zram-owned memory
+			 * and make the VM-owned decompressed page *dirty*,
+			 * so the page should be swapped out somewhere again if
+			 * we again wish to reclaim it.
+			 */
+			struct gendisk *disk = sis->bdev->bd_disk;
+			if (disk->fops->swap_slot_free_notify) {
+				swp_entry_t entry;
+				unsigned long offset;
+
+				entry.val = page_private(page);
+				offset = swp_offset(entry);
 
-		/*
-		 * There is no guarantee that the page is in swap cache, so
-		 * we need to check PG_swapcache before proceeding with this
-		 * optimization.
-		 */
-		if (unlikely(PageSwapCache(page))) {
-			struct swap_info_struct *sis;
-
-			sis = page_swap_info(page);
-			if (sis->flags & SWP_BLKDEV) {
-				/*
-				 * The swap subsystem performs lazy swap slot freeing,
-				 * expecting that the page will be swapped out again.
-				 * So we can avoid an unnecessary write if the page
-				 * isn't redirtied.
-				 * This is good for real swap storage because we can
-				 * reduce unnecessary I/O and enhance wear-leveling
-				 * if an SSD is used as the as swap device.
-				 * But if in-memory swap device (eg zram) is used,
-				 * this causes a duplicated copy between uncompressed
-				 * data in VM-owned memory and compressed data in
-				 * zram-owned memory.  So let's free zram-owned memory
-				 * and make the VM-owned decompressed page *dirty*,
-				 * so the page should be swapped out somewhere again if
-				 * we again wish to reclaim it.
-				 */
-				struct gendisk *disk = sis->bdev->bd_disk;
-				if (disk->fops->swap_slot_free_notify) {
-					swp_entry_t entry;
-					unsigned long offset;
-
-					entry.val = page_private(page);
-					offset = swp_offset(entry);
-
-					SetPageDirty(page);
-					disk->fops->swap_slot_free_notify(sis->bdev,
-							offset);
-				}
+				SetPageDirty(page);
+				disk->fops->swap_slot_free_notify(sis->bdev,
+						offset);
 			}
 		}
 	}
 
+out:
 	unlock_page(page);
 	bio_put(bio);
 }
_


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] non-swapcache pages in end_swap_bio_read()
  2013-06-07 20:43     ` Andrew Morton
@ 2013-06-07 20:51       ` Joe Perches
  0 siblings, 0 replies; 5+ messages in thread
From: Joe Perches @ 2013-06-07 20:51 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Artem Savkov, minchan.kernel.2, dan.magenheimer, rjw, linux-mm,
	linux-pm, linux-kernel

On Fri, 2013-06-07 at 13:43 -0700, Andrew Morton wrote:
> On Sat,  8 Jun 2013 00:23:18 +0400 Artem Savkov <artem.savkov@gmail.com> wrote:
[]
> +++ a/mm/page_io.c
[]
> +	/*
> +	 * There is no guarantee that the page is in swap cache - the software
> +	 * suspend code (at least) uses end_swap_bio_read() against a non-
> +	 * swapcache page.  So we must check PG_swapcache before proceeding with
> +	 * this optimization.
> +	 */
> +	if (likely(PageSwapCache(page))) {

or

	if (unlikely(!PageSwapCache(page)))
		goto out;

to save an indent level

> +out:
>  	unlock_page(page);
>  	bio_put(bio);
>  }



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2013-06-07 20:51 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-06-07 12:59 [BUG] non-swapcache page in end_swap_bio_read() Artem Savkov
2013-06-07 15:26 ` Minchan Kim
2013-06-07 20:23   ` [PATCH] non-swapcache pages " Artem Savkov
2013-06-07 20:43     ` Andrew Morton
2013-06-07 20:51       ` Joe Perches

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).