* PROBLEM: repeated remap_file_pages on tmpfs triggers bug on process exit
@ 2014-07-14 12:58 Ingo Korb
  2014-07-14 19:22   ` Hugh Dickins
  0 siblings, 1 reply; 23+ messages in thread
From: Ingo Korb @ 2014-07-14 12:58 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4252 bytes --]

Hi,

repeated mapping of the same file on tmpfs using remap_file_pages
sometimes triggers a "BUG at mm/filemap.c:202" when the process exits, log
message below. The system is an x86_64 VirtualBox machine with 2GB of RAM
running Debian, but it could also be reproduced on a non-virtualized
laptop.

The bug can be triggered in Linux 3.16-rc5, bisecting has located d7c17551
as the first failing commit (mm: implement ->map_pages for shmem/tmpfs).

A test program for this has been attached (I don't trust this webmailer to
not mangle it). With the parameters set in the source code, the BUG
message should be triggered within a small number of tries (usually the
first or second). Changing the size of the memory map sometimes delays the
bug ("while true; do ./remap-demo; done" should still trigger it within a
few seconds) or avoids it completely - I don't see any patterns yet. Using
(at least) two different mappings for the file, each of which has been
remapped, seems to be a requirement for triggering it.

Implementing the same mappings using mmap() does not appear to cause any
problems, but I assume that someone might care about this problem while
remap_file_pages() is still in the kernel.

-ik


------------[ cut here ]------------
kernel BUG at mm/filemap.c:202!
invalid opcode: 0000 [#1] SMP
Modules linked in: uinput nfsd auth_rpcgss oid_registry nfs_acl nfs lockd
fscache sunrpc ext3 jbd loop joydev hid_generic usbhid hid psmouse
parport_pc ohci_pci ohci_hcd ehci_hcd usbcore ac i2c_piix4 pcspkr
serio_raw evdev parport battery button processor i2c_core usb_common
microcode thermal_sys ext4 crc16 jbd2 mbcache sr_mod cdrom sg sd_mod
crc_t10dif crct10dif_common ata_generic e1000 ahci libahci ata_piix libata
scsi_mod
CPU: 3 PID: 2992 Comm: test Not tainted 3.16.0-rc5ik1 #37
Hardware name: innotek GmbH VirtualBox, BIOS VirtualBox 12/01/2006
task: ffff88005a9363d0 ti: ffff880037968000 task.ti: ffff880037968000
RIP: 0010:[<ffffffff810db4d3>]  [<ffffffff810db4d3>] __delete_from_page_cache+0x16f/0x1f6
RSP: 0018:ffff88003796bba8  EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffffea00012ee220 RCX: 00000000ffffffe2
RDX: 0000000000000018 RSI: 0000000000000018 RDI: ffff88005dbeb700
RBP: ffff8800378d1c10 R08: ffff88005dbeb700 R09: 0000000000000013
R10: 0000000000000013 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000003 R14: ffff8800378d1c18 R15: 000000000000000f
FS:  0000000000000000(0000) GS:ffff88005d980000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f69ad38fa30 CR3: 0000000001611000 CR4: 00000000000006e0
Stack:
 0000000000000002 000000000000000f ffff880059899008 ffff8800598990a8
ffff8800378d1c10 ffffea00012ee220 ffff8800378d1c28 0000000000000000
ffff8800378d1ac0 ffff8800374d0600 0000000000000001 ffffffff810db65b
Call Trace:
 [<ffffffff810db65b>] ? delete_from_page_cache+0x32/0x56
 [<ffffffff810e621d>] ? truncate_inode_page+0x62/0x69
 [<ffffffff810edf29>] ? shmem_undo_range+0x13f/0x3f3
 [<ffffffff810df855>] ? get_pfnblock_flags_mask+0x1d/0x4d
 [<ffffffff810e0bcb>] ? free_hot_cold_page+0x76/0x134
 [<ffffffff810e528a>] ? release_pages+0x171/0x180
 [<ffffffff810e4aa2>] ? hpage_nr_pages+0x1b/0x1b
 [<ffffffff811418df>] ? __inode_wait_for_writeback+0x67/0xae
 [<ffffffff810ee1e8>] ? shmem_truncate_range+0xb/0x25
 [<ffffffff810ee76d>] ? shmem_evict_inode+0x4f/0xed
 [<ffffffff810ee71e>] ? shmem_file_setup+0x7/0x7
 [<ffffffff81136947>] ? evict+0xa3/0x147
 [<ffffffff81133576>] ? __dentry_kill+0x103/0x173
 [<ffffffff81133983>] ? dput+0x133/0x150
 [<ffffffff8112489d>] ? __fput+0x163/0x184
 [<ffffffff8105f10c>] ? task_work_run+0x7b/0x8f
 [<ffffffff81049c69>] ? do_exit+0x3f6/0x904
 [<ffffffff8104a282>] ? do_group_exit+0x68/0x9a
 [<ffffffff8104a2c4>] ? SyS_exit_group+0x10/0x10
 [<ffffffff8138fb69>] ? system_call_fastpath+0x16/0x1b
Code: be 0a 00 00 00 48 89 df e8 96 5b 01 00 48 8b 03 a9 00 00 08 00 74 0d
be 18 00 00 00 48 89 df e8 7f 5b 01 00 8b 43 18 85 c0 78 02 <0f> 0b 48 8b
03 a8 10 74 6f 48 8b 85 88 00 00 00 f6 40 20 01 75
RIP  [<ffffffff810db4d3>] __delete_from_page_cache+0x16f/0x1f6
 RSP <ffff88003796bba8>
---[ end trace 79ae5bd27fcedca9 ]---
Fixing recursive fault but reboot is needed!
BUG: Bad rss-counter state mm:ffff88005aae60c0 idx:0 val:1




[-- Attachment #2: remap-demo.c --]
[-- Type: text/x-csrc; name="remap-demo.c", Size: 1884 bytes --]

#define _GNU_SOURCE
#include <sys/mman.h>
#include <sys/resource.h>
#include <errno.h>
#include <limits.h>
#include <malloc.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define PAGE_SIZE 4096
// NOTE: DATA=MAP2=16 seems to trigger in the first few tries
// NOTE: 9/9 needs a loop and a few seconds to trigger
// NOTE: DATA=9, MAP2=8 does not trigger
#define DATA_SIZE 16
#define MAP2_SIZE 16

int shmfd;
char shmpath[] = "/dev/shm/mmaptest-XXXXXX";
unsigned char *map1, *map2;
unsigned int i;

int main(int argc, char *argv[]) {
  /* create a data file on tmpfs */
  shmfd = mkstemp(shmpath);
  if (shmfd < 0) {
    perror("mkstemp");
    exit(2);
  }

  if (unlink(shmpath)) {
    perror("unlink");
    exit(2);
  }

  if (ftruncate(shmfd, DATA_SIZE * PAGE_SIZE)) {
    perror("ftruncate");
    exit(2);
  }

  /* map a single page from the file */
  map1 = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, shmfd, 0);
  if (map1 == MAP_FAILED) {
    perror("mmap 1");
    exit(2);
  }

  /* remap it to another page in the file */
  // NOTE: Does not trigger without remapping
  // NOTE: Does not trigger for 7, but does trigger for 8 if both sizes are 16
  //  (DATA_SIZE-2 is sufficiently generic here)
  if (remap_file_pages(map1, PAGE_SIZE, 0, DATA_SIZE - 2, MAP_SHARED)) {
    perror("remap_file_pages 1");
    exit(2);
  }

  /* create a second mapping */
  map2 = mmap(NULL, MAP2_SIZE * PAGE_SIZE, PROT_READ | PROT_WRITE,
              MAP_SHARED, shmfd, 0);
  if (map2 == MAP_FAILED) {
    perror("mmap 2");
    exit(2);
  }

  /* map all of its pages to page 0 */
  // NOTE: Remapping only the last page does not trigger
  for (i = 0; i < MAP2_SIZE; i++) {
    if (remap_file_pages(map2 + PAGE_SIZE * i, PAGE_SIZE, 0, 0, MAP_SHARED)) {
      perror("remap_file_pages 3");
      exit(2);
    }
  }

  close(shmfd);

  exit(0);
}
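
The report says that building the same mappings with mmap() instead of remap_file_pages() did not trigger the BUG, but it does not show that variant. The following is a minimal sketch of what it might look like, not Ingo's actual code. The helper names alias_all_to_page0 and alias_selfcheck are hypothetical, and the self-check puts its scratch file in /tmp rather than /dev/shm purely as an assumption for portability.

```c
#define _GNU_SOURCE
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: create an npages-long shared mapping of fd and point
 * every page of it at file page 0 with mmap(MAP_FIXED), instead of calling
 * remap_file_pages(). Returns the mapping, or NULL on failure.
 */
static unsigned char *alias_all_to_page0(int fd, size_t npages, size_t psz)
{
  unsigned char *base;
  size_t i;

  base = mmap(NULL, npages * psz, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  if (base == MAP_FAILED)
    return NULL;

  for (i = 0; i < npages; i++) {
    /* MAP_FIXED replaces whatever was mapped at this address */
    if (mmap(base + i * psz, psz, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_FIXED, fd, 0) == MAP_FAILED) {
      munmap(base, npages * psz);
      return NULL;
    }
  }
  return base;
}

/*
 * Hypothetical self-check: returns 0 if a byte written through the first
 * page is visible through the last page, -1 on any failure.
 */
static int alias_selfcheck(size_t npages)
{
  char path[] = "/tmp/alias-demo-XXXXXX"; /* assumption: /tmp is writable */
  long psz = sysconf(_SC_PAGESIZE);
  unsigned char *base;
  int fd, ok;

  if (psz <= 0 || npages == 0)
    return -1;
  fd = mkstemp(path);
  if (fd < 0)
    return -1;
  unlink(path);
  if (ftruncate(fd, (off_t)(npages * (size_t)psz))) {
    close(fd);
    return -1;
  }
  base = alias_all_to_page0(fd, npages, (size_t)psz);
  if (!base) {
    close(fd);
    return -1;
  }
  base[0] = 0x5a;
  ok = base[(npages - 1) * (size_t)psz] == 0x5a;
  munmap(base, npages * (size_t)psz);
  close(fd);
  return ok ? 0 : -1;
}
```

Calling alias_selfcheck(16) mirrors the MAP2_SIZE of 16 used in the test program. Unlike remap_file_pages(), each MAP_FIXED call here creates a separate ordinary (linear) mapping, which may be why this variant behaves differently.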


* Re: PROBLEM: repeated remap_file_pages on tmpfs triggers bug on process exit
  2014-07-14 12:58 PROBLEM: repeated remap_file_pages on tmpfs triggers bug on process exit Ingo Korb
@ 2014-07-14 19:22   ` Hugh Dickins
  0 siblings, 0 replies; 23+ messages in thread
From: Hugh Dickins @ 2014-07-14 19:22 UTC (permalink / raw)
  To: Ingo Korb
  Cc: Kirill A. Shutemov, Konstantin Khlebnikov, Ning Qu, Dave Jones,
	Sasha Levin, Andrew Morton, linux-mm, linux-kernel

On Mon, 14 Jul 2014, Ingo Korb wrote:

> Hi,
> 
> repeated mapping of the same file on tmpfs using remap_file_pages
> sometimes triggers a "BUG at mm/filemap.c:202" when the process exits, log
> message below. The system is an x86_64 VirtualBox machine with 2GB of RAM
> running Debian, but it could also be reproduced on a non-virtualized
> laptop.
> 
> The bug can be triggered in Linux 3.16-rc5, bisecting has located d7c17551
> as the first failing commit (mm: implement ->map_pages for shmem/tmpfs).
> 
> A test program for this has been attached (I don't trust this webmailer to
> not mangle it). With the parameters set in the source code, the BUG
> message should be triggered within a small number of tries (usually the
> first or second). Changing the size of the memory map sometimes delays the
> bug ("while true; do ./remap-demo; done" should still trigger it within a
> few seconds) or avoids it completely - I don't see any patterns yet. Using
> (at least) two different mappings for the file, each of which has been
> remapped, seems to be a requirement for triggering it.
> 
> Implementing the same mappings using mmap() does not appear to cause any
> problems, but I assume that someone might care about this problem while
> remap_file_pages() is still in the kernel.

This is very good news :)  Thank you so much for going to all this
trouble over it.  If you didn't realize, yours is not the first report
of an mm/filemap.c:202! BUG_ON(page_mapped(page)), but most of them
have happened when using the Trinity fuzzer (known to be fond of tmpfs
and remap_file_pages), and too rare to track down further.

I have several times in recent months eyed the (old) remap_file_pages
code, and the filemap_map_pages code, hoping to find the answer in one
or the other; but had no success.

Kirill, Konstantin, would either of you have a moment to try and track
this down further?  I'd love to, but I am _still_ not finished with the
fallocate hang business, then sealing review, then plenty beyond that.
Ingo's remap-demo.c inline below.

Of course, one option will be just to revert d7c17551; but I'd much
rather track down the bug and fix it, if we can in the next couple of
weeks - even if it does turn out to be in code removed in 3.17.

Thanks!
Hugh

> 
> -ik
> 
> 
> ------------[ cut here ]------------
> kernel BUG at mm/filemap.c:202!
> invalid opcode: 0000 [#1] SMP
> Modules linked in: uinput nfsd auth_rpcgss oid_registry nfs_acl nfs lockd
> fscache sunrpc ext3 jbd loop joydev hid_generic usbhid hid psmouse
> parport_pc ohci_pci ohci_hcd ehci_hcd usbcore ac i2c_piix4 pcspkr
> serio_raw evdev parport battery button processor i2c_core usb_common
> microcode thermal_sys ext4 crc16 jbd2 mbcache sr_mod cdrom sg sd_mod
> crc_t10dif crct10dif_common ata_generic e1000 ahci libahci ata_piix libata
> scsi_mod
> CPU: 3 PID: 2992 Comm: test Not tainted 3.16.0-rc5ik1 #37
> Hardware name: innotek GmbH VirtualBox, BIOS VirtualBox 12/01/2006
> task: ffff88005a9363d0 ti: ffff880037968000 task.ti: ffff880037968000
> RIP: 0010:[<ffffffff810db4d3>]  [<ffffffff810db4d3>] __delete_from_page_cache+0x16f/0x1f6
> RSP: 0018:ffff88003796bba8  EFLAGS: 00010046
> RAX: 0000000000000000 RBX: ffffea00012ee220 RCX: 00000000ffffffe2
> RDX: 0000000000000018 RSI: 0000000000000018 RDI: ffff88005dbeb700
> RBP: ffff8800378d1c10 R08: ffff88005dbeb700 R09: 0000000000000013
> R10: 0000000000000013 R11: 0000000000000000 R12: 0000000000000000
> R13: 0000000000000003 R14: ffff8800378d1c18 R15: 000000000000000f
> FS:  0000000000000000(0000) GS:ffff88005d980000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f69ad38fa30 CR3: 0000000001611000 CR4: 00000000000006e0
> Stack:
>  0000000000000002 000000000000000f ffff880059899008 ffff8800598990a8
> ffff8800378d1c10 ffffea00012ee220 ffff8800378d1c28 0000000000000000
> ffff8800378d1ac0 ffff8800374d0600 0000000000000001 ffffffff810db65b
> Call Trace:
>  [<ffffffff810db65b>] ? delete_from_page_cache+0x32/0x56
>  [<ffffffff810e621d>] ? truncate_inode_page+0x62/0x69
>  [<ffffffff810edf29>] ? shmem_undo_range+0x13f/0x3f3
>  [<ffffffff810df855>] ? get_pfnblock_flags_mask+0x1d/0x4d
>  [<ffffffff810e0bcb>] ? free_hot_cold_page+0x76/0x134
>  [<ffffffff810e528a>] ? release_pages+0x171/0x180
>  [<ffffffff810e4aa2>] ? hpage_nr_pages+0x1b/0x1b
>  [<ffffffff811418df>] ? __inode_wait_for_writeback+0x67/0xae
>  [<ffffffff810ee1e8>] ? shmem_truncate_range+0xb/0x25
>  [<ffffffff810ee76d>] ? shmem_evict_inode+0x4f/0xed
>  [<ffffffff810ee71e>] ? shmem_file_setup+0x7/0x7
>  [<ffffffff81136947>] ? evict+0xa3/0x147
>  [<ffffffff81133576>] ? __dentry_kill+0x103/0x173
>  [<ffffffff81133983>] ? dput+0x133/0x150
>  [<ffffffff8112489d>] ? __fput+0x163/0x184
>  [<ffffffff8105f10c>] ? task_work_run+0x7b/0x8f
>  [<ffffffff81049c69>] ? do_exit+0x3f6/0x904
>  [<ffffffff8104a282>] ? do_group_exit+0x68/0x9a
>  [<ffffffff8104a2c4>] ? SyS_exit_group+0x10/0x10
>  [<ffffffff8138fb69>] ? system_call_fastpath+0x16/0x1b
> Code: be 0a 00 00 00 48 89 df e8 96 5b 01 00 48 8b 03 a9 00 00 08 00 74 0d
> be 18 00 00 00 48 89 df e8 7f 5b 01 00 8b 43 18 85 c0 78 02 <0f> 0b 48 8b
> 03 a8 10 74 6f 48 8b 85 88 00 00 00 f6 40 20 01 75
> RIP  [<ffffffff810db4d3>] __delete_from_page_cache+0x16f/0x1f6
>  RSP <ffff88003796bba8>
> ---[ end trace 79ae5bd27fcedca9 ]---
> Fixing recursive fault but reboot is needed!
> BUG: Bad rss-counter state mm:ffff88005aae60c0 idx:0 val:1

And that "Bad rss-counter" report fits some of the reports too, good.

Here's Ingo's remap-demo.c inline, but I've not tried it:

#define _GNU_SOURCE
#include <sys/mman.h>
#include <sys/resource.h>
#include <errno.h>
#include <limits.h>
#include <malloc.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define PAGE_SIZE 4096
// NOTE: DATA=MAP2=16 seems to trigger in the first few tries
// NOTE: 9/9 needs a loop and a few seconds to trigger
// NOTE: DATA=9, MAP2=8 does not trigger
#define DATA_SIZE 16
#define MAP2_SIZE 16

int shmfd;
char shmpath[] = "/dev/shm/mmaptest-XXXXXX";
unsigned char *map1, *map2;
unsigned int i;

int main(int argc, char *argv[]) {
  /* create a data file on tmpfs */
  shmfd = mkstemp(shmpath);
  if (shmfd < 0) {
    perror("mkstemp");
    exit(2);
  }

  if (unlink(shmpath)) {
    perror("unlink");
    exit(2);
  }

  if (ftruncate(shmfd, DATA_SIZE * PAGE_SIZE)) {
    perror("ftruncate");
    exit(2);
  }

  /* map a single page from the file */
  map1 = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, shmfd, 0);
  if (map1 == MAP_FAILED) {
    perror("mmap 1");
    exit(2);
  }

  /* remap it to another page in the file */
  // NOTE: Does not trigger without remapping
  // NOTE: Does not trigger for 7, but does trigger for 8 if both sizes are 16
  //  (DATA_SIZE-2 is sufficiently generic here)
  if (remap_file_pages(map1, PAGE_SIZE, 0, DATA_SIZE - 2, MAP_SHARED)) {
    perror("remap_file_pages 1");
    exit(2);
  }

  /* create a second mapping */
  map2 = mmap(NULL, MAP2_SIZE * PAGE_SIZE, PROT_READ | PROT_WRITE,
              MAP_SHARED, shmfd, 0);
  if (map2 == MAP_FAILED) {
    perror("mmap 2");
    exit(2);
  }

  /* map all of its pages to page 0 */
  // NOTE: Remapping only the last page does not trigger
  for (i = 0; i < MAP2_SIZE; i++) {
    if (remap_file_pages(map2 + PAGE_SIZE * i, PAGE_SIZE, 0, 0, MAP_SHARED)) {
      perror("remap_file_pages 3");
      exit(2);
    }
  }

  close(shmfd);

  exit(0);
}


* Re: PROBLEM: repeated remap_file_pages on tmpfs triggers bug on process exit
  2014-07-14 19:22   ` Hugh Dickins
@ 2014-07-14 20:13     ` Konstantin Khlebnikov
  -1 siblings, 0 replies; 23+ messages in thread
From: Konstantin Khlebnikov @ 2014-07-14 20:13 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Ingo Korb, Kirill A. Shutemov, Ning Qu, Dave Jones, Sasha Levin,
	Andrew Morton, linux-mm, Linux Kernel Mailing List

It seems the bounding logic in do_fault_around is wrong:

start_addr = max(address & fault_around_mask(), vma->vm_start);
off = ((address - start_addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
pte -= off;
pgoff -= off;

OK, off <= 511, but it might be bigger than the pte's offset within its
pte table. So after "pte -= off", pte points into the previous page.

/*
 *  max_pgoff is either end of page table or end of vma
 *  or fault_around_pages() from pgoff, depending what is nearest.
 */
max_pgoff = pgoff - ((start_addr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)) +
	PTRS_PER_PTE - 1;
max_pgoff = min3(max_pgoff, vma_pages(vma) + vma->vm_pgoff - 1,
		pgoff + fault_around_pages() - 1);
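
To try concrete numbers in the start_addr/off arithmetic quoted above, here is a small userspace sketch. It is not kernel code: fa_compute, struct fa_result and the SK_ constants are hypothetical names, the constants assume x86_64 with 4 KiB pages, and the mask is assumed to clear the low (PAGE_SHIFT + order) address bits, which this thread does not spell out. The caller is expected to pass address >= vm_start.

```c
#include <stdint.h>

#define SK_PAGE_SHIFT   12      /* assumption: 4 KiB pages */
#define SK_PTRS_PER_PTE 512ULL  /* assumption: x86_64 page table size */

struct fa_result {
	uint64_t start_addr; /* first address of the fault-around window */
	uint64_t off;        /* pages from start_addr to the faulting address */
	int64_t  pte_index;  /* slot in the page table after "pte -= off" */
	uint64_t pgoff;      /* file page offset after "pgoff -= off" */
};

/* Replays the quoted arithmetic on plain integers (userspace only). */
static struct fa_result fa_compute(uint64_t address, uint64_t vm_start,
				   uint64_t pgoff, unsigned int order)
{
	struct fa_result r;
	/* assumed shape of fault_around_mask() for a window of 2^order pages */
	uint64_t mask = ~((1ULL << (SK_PAGE_SHIFT + order)) - 1);
	uint64_t aligned = address & mask;
	/* slot of the faulting address within its page table */
	int64_t slot = (int64_t)((address >> SK_PAGE_SHIFT) &
				 (SK_PTRS_PER_PTE - 1));

	r.start_addr = aligned > vm_start ? aligned : vm_start;
	r.off = ((address - r.start_addr) >> SK_PAGE_SHIFT) &
		(SK_PTRS_PER_PTE - 1);
	r.pte_index = slot - (int64_t)r.off; /* negative: before the table */
	r.pgoff = pgoff - r.off;             /* unsigned: wraps if off > pgoff */
	return r;
}
```

For instance, with address 0x7f0000005000, vm_start 0x7f0000000000, pgoff 5 and order 4, the helper gives off = 5, pte_index = 0 and pgoff = 0. With address 0x7f0000201000, vm_start 0x7f00001ff000 and order 10, it gives off = 2 and pte_index = -1, one slot before the start of the table; whether the kernel permits a window that large is not something this thread states. Passing a pgoff smaller than off makes the unsigned subtraction wrap around.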



On Mon, Jul 14, 2014 at 11:22 PM, Hugh Dickins <hughd@google.com> wrote:
> On Mon, 14 Jul 2014, Ingo Korb wrote:
>
>> Hi,
>>
>> repeated mapping of the same file on tmpfs using remap_file_pages
>> sometimes triggers a "BUG at mm/filemap.c:202" when the process exits, log
>> message below. The system is an x86_64 VirtualBox machine with 2GB of RAM
>> running Debian, but it could also be reproduced on a non-virtualized
>> laptop.
>>
>> The bug can be triggered in Linux 3.16-rc5, bisecting has located d7c17551
>> as the first failing commit (mm: implement ->map_pages for shmem/tmpfs).
>>
>> A test program for this has been attached (I don't trust this webmailer to
>> not mangle it). With the parameters set in the source code, the BUG
>> message should be triggered within a small number of tries (usually the
>> first or second). Changing the size of the memory map sometimes delays the
>> bug ("while true; do ./remap-demo; done" should still trigger it within a
>> few seconds) or avoids it completely - I don't see any patterns yet. Using
>> (at least) two different mappings for the file, each of which has been
>> remapped, seems to be a requirement for triggering it.
>>
>> Implementing the same mappings using mmap() does not appear to cause any
>> problems, but I assume that someone might care about this problem while
>> remap_file_pages() is still in the kernel.
>
> This is very good news :)  Thank you so much for going to all this
> trouble over it.  If you didn't realize, yours is not the first report
> of an mm/filemap.c:202! BUG_ON(page_mapped(page)), but most of them
> have happened when using the Trinity fuzzer (known to be fond of tmpfs
> and remap_file_pages), and too rare to track down further.
>
> I have several times in recent months eyed the (old) remap_file_pages
> code, and the filemap_map_pages code, hoping to find the answer in one
> or the other; but had no success.
>
> Kirill, Konstantin, would either of you have a moment to try and track
> this down further?  I'd love to, but I am _still_ not finished with the
> fallocate hang business, then sealing review, then plenty beyond that.
> Ingo's remap-demo.c inline below.
>
> Of course, one option will be just to revert d7c17551; but I'd much
> rather track down the bug and fix it, if we can in the next couple of
> weeks - even if it does turn out to be in code removed in 3.17.
>
> Thanks!
> Hugh
>
>>
>> -ik
>>
>>
>> ------------[ cut here ]------------
>> kernel BUG at mm/filemap.c:202!
>> invalid opcode: 0000 [#1] SMP
>> Modules linked in: uinput nfsd auth_rpcgss oid_registry nfs_acl nfs lockd
>> fscache sunrpc ext3 jbd loop joydev hid_generic usbhid hid psmouse
>> parport_pc ohci_pci ohci_hcd ehci_hcd usbcore ac i2c_piix4 pcspkr
>> serio_raw evdev parport battery button processor i2c_core usb_common
>> microcode thermal_sys ext4 crc16 jbd2 mbcache sr_mod cdrom sg sd_mod
>> crc_t10dif crct10dif_common ata_generic e1000 ahci libahci ata_piix libata
>> scsi_mod
>> CPU: 3 PID: 2992 Comm: test Not tainted 3.16.0-rc5ik1 #37
>> Hardware name: innotek GmbH VirtualBox, BIOS VirtualBox 12/01/2006
>> task: ffff88005a9363d0 ti: ffff880037968000 task.ti: ffff880037968000
>> RIP: 0010:[<ffffffff810db4d3>]  [<ffffffff810db4d3>] __delete_from_page_cache+0x16f/0x1f6
>> RSP: 0018:ffff88003796bba8  EFLAGS: 00010046
>> RAX: 0000000000000000 RBX: ffffea00012ee220 RCX: 00000000ffffffe2
>> RDX: 0000000000000018 RSI: 0000000000000018 RDI: ffff88005dbeb700
>> RBP: ffff8800378d1c10 R08: ffff88005dbeb700 R09: 0000000000000013
>> R10: 0000000000000013 R11: 0000000000000000 R12: 0000000000000000
>> R13: 0000000000000003 R14: ffff8800378d1c18 R15: 000000000000000f
>> FS:  0000000000000000(0000) GS:ffff88005d980000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007f69ad38fa30 CR3: 0000000001611000 CR4: 00000000000006e0
>> Stack:
>>  0000000000000002 000000000000000f ffff880059899008 ffff8800598990a8
>> ffff8800378d1c10 ffffea00012ee220 ffff8800378d1c28 0000000000000000
>> ffff8800378d1ac0 ffff8800374d0600 0000000000000001 ffffffff810db65b
>> Call Trace:
>>  [<ffffffff810db65b>] ? delete_from_page_cache+0x32/0x56
>>  [<ffffffff810e621d>] ? truncate_inode_page+0x62/0x69
>>  [<ffffffff810edf29>] ? shmem_undo_range+0x13f/0x3f3
>>  [<ffffffff810df855>] ? get_pfnblock_flags_mask+0x1d/0x4d
>>  [<ffffffff810e0bcb>] ? free_hot_cold_page+0x76/0x134
>>  [<ffffffff810e528a>] ? release_pages+0x171/0x180
>>  [<ffffffff810e4aa2>] ? hpage_nr_pages+0x1b/0x1b
>>  [<ffffffff811418df>] ? __inode_wait_for_writeback+0x67/0xae
>>  [<ffffffff810ee1e8>] ? shmem_truncate_range+0xb/0x25
>>  [<ffffffff810ee76d>] ? shmem_evict_inode+0x4f/0xed
>>  [<ffffffff810ee71e>] ? shmem_file_setup+0x7/0x7
>>  [<ffffffff81136947>] ? evict+0xa3/0x147
>>  [<ffffffff81133576>] ? __dentry_kill+0x103/0x173
>>  [<ffffffff81133983>] ? dput+0x133/0x150
>>  [<ffffffff8112489d>] ? __fput+0x163/0x184
>>  [<ffffffff8105f10c>] ? task_work_run+0x7b/0x8f
>>  [<ffffffff81049c69>] ? do_exit+0x3f6/0x904
>>  [<ffffffff8104a282>] ? do_group_exit+0x68/0x9a
>>  [<ffffffff8104a2c4>] ? SyS_exit_group+0x10/0x10
>>  [<ffffffff8138fb69>] ? system_call_fastpath+0x16/0x1b
>> Code: be 0a 00 00 00 48 89 df e8 96 5b 01 00 48 8b 03 a9 00 00 08 00 74 0d
>> be 18 00 00 00 48 89 df e8 7f 5b 01 00 8b 43 18 85 c0 78 02 <0f> 0b 48 8b
>> 03 a8 10 74 6f 48 8b 85 88 00 00 00 f6 40 20 01 75
>> RIP  [<ffffffff810db4d3>] __delete_from_page_cache+0x16f/0x1f6
>>  RSP <ffff88003796bba8>
>> ---[ end trace 79ae5bd27fcedca9 ]---
>> Fixing recursive fault but reboot is needed!
>> BUG: Bad rss-counter state mm:ffff88005aae60c0 idx:0 val:1
>
> And that "Bad rss-counter" report fits some of the reports too, good.
>
> Here's Ingo's remap-demo.c inline, but I've not tried it:
>
> #define _GNU_SOURCE
> #include <sys/mman.h>
> #include <sys/resource.h>
> #include <errno.h>
> #include <limits.h>
> #include <malloc.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <unistd.h>
>
> #define PAGE_SIZE 4096
> // NOTE: DATA=MAP2=16 seems to trigger in the first few tries
> // NOTE: 9/9 needs a loop and a few seconds to trigger
> // NOTE: DATA=9, MAP2=8 does not trigger
> #define DATA_SIZE 16
> #define MAP2_SIZE 16
>
> int shmfd;
> char shmpath[] = "/dev/shm/mmaptest-XXXXXX";
> unsigned char *map1, *map2;
> unsigned int i;
>
> int main(int argc, char *argv[]) {
>   /* create a data file on tmpfs */
>   shmfd = mkstemp(shmpath);
>   if (shmfd < 0) {
>     perror("mkstemp");
>     exit(2);
>   }
>
>   if (unlink(shmpath)) {
>     perror("unlink");
>     exit(2);
>   }
>
>   if (ftruncate(shmfd, DATA_SIZE * PAGE_SIZE)) {
>     perror("ftruncate");
>     exit(2);
>   }
>
>   /* map a single page from the file */
>   map1 = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, shmfd, 0);
>   if (map1 == MAP_FAILED) {
>     perror("mmap 1");
>     exit(2);
>   }
>
>   /* remap it to another page in the file */
>   // NOTE: Does not trigger without remapping
>   // NOTE: Does not trigger for 7, but does trigger for 8 if both sizes are 16
>   //  (DATA_SIZE-2 is sufficiently generic here)
>   if (remap_file_pages(map1, PAGE_SIZE, 0, DATA_SIZE - 2, MAP_SHARED)) {
>     perror("remap_file_pages 1");
>     exit(2);
>   }
>
>   /* create a second mapping */
>   map2 = mmap(NULL, MAP2_SIZE * PAGE_SIZE, PROT_READ | PROT_WRITE,
>               MAP_SHARED, shmfd, 0);
>   if (map2 == MAP_FAILED) {
>     perror("mmap 2");
>     exit(2);
>   }
>
>   /* map all of its pages to page 0 */
>   // NOTE: Remapping only the last page does not trigger
>   for (i = 0; i < MAP2_SIZE; i++) {
>     if (remap_file_pages(map2 + PAGE_SIZE * i, PAGE_SIZE, 0, 0, MAP_SHARED)) {
>       perror("remap_file_pages 3");
>       exit(2);
>     }
>   }
>
>   close(shmfd);
>
>   exit(0);
> }

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: PROBLEM: repeated remap_file_pages on tmpfs triggers bug on process exit
@ 2014-07-14 20:13     ` Konstantin Khlebnikov
  0 siblings, 0 replies; 23+ messages in thread
From: Konstantin Khlebnikov @ 2014-07-14 20:13 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Ingo Korb, Kirill A. Shutemov, Ning Qu, Dave Jones, Sasha Levin,
	Andrew Morton, linux-mm, Linux Kernel Mailing List

It seems the bounding logic in do_fault_around is wrong:

start_addr = max(address & fault_around_mask(), vma->vm_start);
off = ((address - start_addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
pte -= off;
pgoff -= off;

OK, off <= 511, but it might be bigger than the pte's offset within its
pte table. So after pte -= off, pte points into the previous page.

/*
*  max_pgoff is either end of page table or end of vma
*  or fault_around_pages() from pgoff, depending what is nearest.
*/
max_pgoff = pgoff - ((start_addr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)) +
PTRS_PER_PTE - 1;
max_pgoff = min3(max_pgoff, vma_pages(vma) + vma->vm_pgoff - 1,
pgoff + fault_around_pages() - 1);



On Mon, Jul 14, 2014 at 11:22 PM, Hugh Dickins <hughd@google.com> wrote:
> On Mon, 14 Jul 2014, Ingo Korb wrote:
>
>> Hi,
>>
>> repeated mapping of the same file on tmpfs using remap_file_pages
>> sometimes triggers a "BUG at mm/filemap.c:202" when the process exits, log
>> message below. The system is an x86_64 VirtualBox machine with 2GB of RAM
>> running Debian, but it could also be reproduced on a non-virtualized
>> laptop.
>>
>> The bug can be triggered in Linux 3.16-rc5, bisecting has located d7c17551
>> as the first failing commit (mm: implement ->map_pages for shmem/tmpfs).
>>
>> A test program for this has been attached (I don't trust this webmailer to
>> not mangle it). With the parameters set in the source code, the BUG
>> message should be triggered within a small number of tries (usually the
>> first or second). Changing the size of the memory map sometimes delays the
>> bug ("while true; do ./remap-demo; done" should still trigger it within a
>> few seconds) or avoids it completely - I don't see any patterns yet. Using
>> (at least) two different mappings for the file, each of which has been
>> remapped seem to be a requirement for triggering it.
>>
>> Implementing the same mappings using mmap() does not appear to cause any
>> problems, but I assume that someone might care about this problem while
>> remap_file_pages() is still in the kernel.
>
> This is very good news :)  Thank you so much for going to all this
> trouble over it.  If you didn't realize, yours is not the first report
> of an mm/filemap.c:202! BUG_ON(page_mapped(page)), but most of them
> have happened when using the Trinity fuzzer (known to be fond of tmpfs
> and remap_file_pages), and too rare to track down further.
>
> I have several times in recent months eyed the (old) remap_file_pages
> code, and the filemap_map_pages code, hoping to find the answer in one
> or the other; but had no success.
>
> Kirill, Konstantin, would either of you have a moment to try and track
> this down further?  I'd love to, but I am _still_ not finished with the
> fallocate hang business, then sealing review, then plenty beyond that.
> Ingo's remap-demo.c inline below.
>
> Of course, one option will be just to revert d7c17551; but I'd much
> rather track down the bug and fix it, if we can in the next couple of
> weeks - even if it does turn out to be in code removed in 3.17.
>
> Thanks!
> Hugh
>
>>
>> -ik
>>
>>
>> ------------[ cut here ]------------
>> kernel BUG at mm/filemap.c:202!
>> invalid opcode: 0000 [#1] SMP
>> Modules linked in: uinput nfsd auth_rpcgss oid_registry nfs_acl nfs lockd
>> fscache sunrpc ext3 jbd loop joydev hid_generic usbhid hid psmouse
>> parport_pc ohci_pci ohci_hcd ehci_hcd usbcore ac i2c_piix4 pcspkr
>> serio_raw evdev parport battery button processor i2c_core usb_common
>> microcode thermal_sys ext4 crc16 jbd2 mbcache sr_mod cdrom sg sd_mod
>> crc_t10dif crct10dif_common ata_generic e1000 ahci libahci ata_piix libata
>> scsi_mod
>> CPU: 3 PID: 2992 Comm: test Not tainted 3.16.0-rc5ik1 #37
>> Hardware name: innotek GmbH VirtualBox, BIOS VirtualBox 12/01/2006 task:
>> ffff88005a9363d0 ti: ffff880037968000 task.ti: ffff880037968000 RIP:
>> 0010:[<ffffffff810db4d3>]  [<ffffffff810db4d3>]
>> __delete_from_page_cache+0x16f/0x1f6
>> RSP: 0018:ffff88003796bba8  EFLAGS: 00010046
>> RAX: 0000000000000000 RBX: ffffea00012ee220 RCX: 00000000ffffffe2
>> RDX: 0000000000000018 RSI: 0000000000000018 RDI: ffff88005dbeb700
>> RBP: ffff8800378d1c10 R08: ffff88005dbeb700 R09: 0000000000000013
>> R10: 0000000000000013 R11: 0000000000000000 R12: 0000000000000000
>> R13: 0000000000000003 R14: ffff8800378d1c18 R15: 000000000000000f
>> FS:  0000000000000000(0000) GS:ffff88005d980000(0000)
>> knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007f69ad38fa30 CR3: 0000000001611000 CR4: 00000000000006e0
>> Stack:
>>  0000000000000002 000000000000000f ffff880059899008 ffff8800598990a8
>> ffff8800378d1c10 ffffea00012ee220 ffff8800378d1c28 0000000000000000
>> ffff8800378d1ac0 ffff8800374d0600 0000000000000001 ffffffff810db65b
>> Call Trace:
>>  [<ffffffff810db65b>] ? delete_from_page_cache+0x32/0x56
>>  [<ffffffff810e621d>] ? truncate_inode_page+0x62/0x69
>>  [<ffffffff810edf29>] ? shmem_undo_range+0x13f/0x3f3
>>  [<ffffffff810df855>] ? get_pfnblock_flags_mask+0x1d/0x4d
>>  [<ffffffff810e0bcb>] ? free_hot_cold_page+0x76/0x134
>>  [<ffffffff810e528a>] ? release_pages+0x171/0x180
>>  [<ffffffff810e4aa2>] ? hpage_nr_pages+0x1b/0x1b
>>  [<ffffffff811418df>] ? __inode_wait_for_writeback+0x67/0xae
>>  [<ffffffff810ee1e8>] ? shmem_truncate_range+0xb/0x25
>>  [<ffffffff810ee76d>] ? shmem_evict_inode+0x4f/0xed
>>  [<ffffffff810ee71e>] ? shmem_file_setup+0x7/0x7
>>  [<ffffffff81136947>] ? evict+0xa3/0x147
>>  [<ffffffff81133576>] ? __dentry_kill+0x103/0x173
>>  [<ffffffff81133983>] ? dput+0x133/0x150
>>  [<ffffffff8112489d>] ? __fput+0x163/0x184
>>  [<ffffffff8105f10c>] ? task_work_run+0x7b/0x8f
>>  [<ffffffff81049c69>] ? do_exit+0x3f6/0x904
>>  [<ffffffff8104a282>] ? do_group_exit+0x68/0x9a
>>  [<ffffffff8104a2c4>] ? SyS_exit_group+0x10/0x10
>>  [<ffffffff8138fb69>] ? system_call_fastpath+0x16/0x1b
>> Code: be 0a 00 00 00 48 89 df e8 96 5b 01 00 48 8b 03 a9 00 00 08 00 74 0d
>> be 18 00 00 00 48 89 df e8 7f 5b 01 00 8b 43 18 85 c0 78 02 <0f> 0b 48 8b
>> 03 a8 10 74 6f 48 8b 85 88 00 00 00 f6 40 20 01 75
>> RIP  [<ffffffff810db4d3>] __delete_from_page_cache+0x16f/0x1f6
>>  RSP <ffff88003796bba8>
>> ---[ end trace 79ae5bd27fcedca9 ]---
>> Fixing recursive fault but reboot is needed!
>> BUG: Bad rss-counter state mm:ffff88005aae60c0 idx:0 val:1
>
> And that "Bad rss-counter" report fits some of the reports too, good.
>
> Here's Ingo's remap-demo.c inline, but I've not tried it:
>
> #define _GNU_SOURCE
> #include <sys/mman.h>
> #include <sys/resource.h>
> #include <errno.h>
> #include <limits.h>
> #include <malloc.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <unistd.h>
>
> #define PAGE_SIZE 4096
> // NOTE: DATA=MAP2=16 seems to trigger in the first few tries
> // NOTE: 9/9 needs a loop and a few seconds to trigger
> // NOTE: DATA=9, MAP2=8 does not trigger
> #define DATA_SIZE 16
> #define MAP2_SIZE 16
>
> int shmfd;
> char shmpath[] = "/dev/shm/mmaptest-XXXXXX";
> unsigned char *map1, *map2;
> unsigned int i;
>
> int main(int argc, char *argv[]) {
>   /* create a data file on tmpfs */
>   shmfd = mkstemp(shmpath);
>   if (shmfd < 0) {
>     perror("mkstemp");
>     exit(2);
>   }
>
>   if (unlink(shmpath)) {
>     perror("unlink");
>     exit(2);
>   }
>
>   if (ftruncate(shmfd, DATA_SIZE * PAGE_SIZE)) {
>     perror("ftruncate");
>     exit(2);
>   }
>
>   /* map a single page from the file */
>   map1 = mmap(NULL, PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, shmfd, 0);
>   if (map1 == MAP_FAILED) {
>     perror("mmap 1");
>     exit(2);
>   }
>
>   /* remap it to another page in the file */
>   // NOTE: Does not trigger without remapping
>   // NOTE: Does not trigger for 7, but does trigger for 8 if both sizes are 16
>   //  (DATA_SIZE-2 is sufficiently generic here)
>   if (remap_file_pages(map1, PAGE_SIZE, 0, DATA_SIZE - 2, MAP_SHARED)) {
>     perror("remap_file_pages 1");
>     exit(2);
>   }
>
>   /* create a second mapping */
>   map2 = mmap(NULL, MAP2_SIZE * PAGE_SIZE, PROT_READ | PROT_WRITE,
>               MAP_SHARED, shmfd, 0);
>   if (map2 == MAP_FAILED) {
>     perror("mmap 2");
>     exit(2);
>   }
>
>   /* map all of its pages to page 0 */
>   // NOTE: Remapping only the last page does not trigger
>   for (i = 0; i < MAP2_SIZE; i++) {
>     if (remap_file_pages(map2 + PAGE_SIZE * i, PAGE_SIZE, 0, 0, MAP_SHARED)) {
>       perror("remap_file_pages 3");
>       exit(2);
>     }
>   }
>
>   close(shmfd);
>
>   exit(0);
> }

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH] mm: fix faulting range in do_fault_around
  2014-07-14 20:13     ` Konstantin Khlebnikov
@ 2014-07-15  9:55       ` Konstantin Khlebnikov
  -1 siblings, 0 replies; 23+ messages in thread
From: Konstantin Khlebnikov @ 2014-07-15  9:55 UTC (permalink / raw)
  To: linux-mm, Hugh Dickins, linux-kernel, Sasha Levin
  Cc: Ingo Korb, Kirill A. Shutemov, Dave Jones, Andrew Morton,
	Ning Qu, Konstantin Khlebnikov

From: Konstantin Khlebnikov <koct9i@gmail.com>

do_fault_around shouldn't cross pmd boundaries.

This patch does the calculation in terms of addresses rather than pgoff.
It looks much cleaner this way. It is probably worth replacing
vmf->max_pgoff with vmf->end_address as well.

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-by: "Ingo Korb" <ingo.korb@tu-dortmund.de>
---
 mm/memory.c |   26 +++++++++++---------------
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index d67fd9f..f27638a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2831,33 +2831,29 @@ late_initcall(fault_around_debugfs);
 static void do_fault_around(struct vm_area_struct *vma, unsigned long address,
 		pte_t *pte, pgoff_t pgoff, unsigned int flags)
 {
-	unsigned long start_addr;
+	unsigned long start_addr, end_addr;
 	pgoff_t max_pgoff;
 	struct vm_fault vmf;
 	int off;
 
-	start_addr = max(address & fault_around_mask(), vma->vm_start);
-	off = ((address - start_addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
+	start_addr = max3(vma->vm_start, address & PMD_MASK,
+			  address & fault_around_mask());
+
+	end_addr = min3(vma->vm_end, ALIGN(address, PMD_SIZE),
+			start_addr + PAGE_ALIGN(fault_around_bytes));
+
+	off = (address - start_addr) >> PAGE_SHIFT;
 	pte -= off;
 	pgoff -= off;
-
-	/*
-	 *  max_pgoff is either end of page table or end of vma
-	 *  or fault_around_pages() from pgoff, depending what is nearest.
-	 */
-	max_pgoff = pgoff - ((start_addr >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)) +
-		PTRS_PER_PTE - 1;
-	max_pgoff = min3(max_pgoff, vma_pages(vma) + vma->vm_pgoff - 1,
-			pgoff + fault_around_pages() - 1);
+	max_pgoff = pgoff + ((end_addr - start_addr) >> PAGE_SHIFT) - 1;
 
 	/* Check if it makes any sense to call ->map_pages */
 	while (!pte_none(*pte)) {
-		if (++pgoff > max_pgoff)
-			return;
 		start_addr += PAGE_SIZE;
-		if (start_addr >= vma->vm_end)
+		if (start_addr >= end_addr)
 			return;
 		pte++;
+		pgoff++;
 	}
 
 	vmf.virtual_address = (void __user *) start_addr;


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: PROBLEM: repeated remap_file_pages on tmpfs triggers bug on process exit
  2014-07-14 20:13     ` Konstantin Khlebnikov
@ 2014-07-15 10:55       ` Kirill A. Shutemov
  -1 siblings, 0 replies; 23+ messages in thread
From: Kirill A. Shutemov @ 2014-07-15 10:55 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Hugh Dickins, Ingo Korb, Kirill A. Shutemov, Ning Qu, Dave Jones,
	Sasha Levin, Andrew Morton, linux-mm, Linux Kernel Mailing List

Konstantin Khlebnikov wrote:
> It seems the bounding logic in do_fault_around is wrong:
> 
> start_addr = max(address & fault_around_mask(), vma->vm_start);
> off = ((address - start_addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
> pte -= off;
> pgoff -= off;
> 
> Ok, off  <= 511, but it might be bigger than pte offset in pte table.

I don't see how that is possible: fault_around_mask() cannot be more than
0x1ff000 (x86-64, fault_around_bytes == 2M). In that case start_addr is
aligned to a 2M boundary, which is the start of the page table that pte
belongs to.

Am I missing something?

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: PROBLEM: repeated remap_file_pages on tmpfs triggers bug on process exit
  2014-07-15 10:55       ` Kirill A. Shutemov
@ 2014-07-15 11:33         ` Konstantin Khlebnikov
  -1 siblings, 0 replies; 23+ messages in thread
From: Konstantin Khlebnikov @ 2014-07-15 11:33 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Hugh Dickins, Ingo Korb, Ning Qu, Dave Jones, Sasha Levin,
	Andrew Morton, linux-mm, Linux Kernel Mailing List

On Tue, Jul 15, 2014 at 2:55 PM, Kirill A. Shutemov
<kirill.shutemov@linux.intel.com> wrote:
> Konstantin Khlebnikov wrote:
>> It seems the bounding logic in do_fault_around is wrong:
>>
>> start_addr = max(address & fault_around_mask(), vma->vm_start);
>> off = ((address - start_addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
>> pte -= off;
>> pgoff -= off;
>>
>> Ok, off  <= 511, but it might be bigger than pte offset in pte table.
>
> I don't see how it possible: fault_around_mask() cannot be more than 0x1ff000
> (x86-64, fault_around_bytes == 2M). It means start_addr will be aligned to 2M
> boundary in this case which is start of the page table pte belong to.
>
> Do I miss something?

Nope, you're right. This fixes kernel crash but not the original problem.

The problem is caused by calling do_fault_around for a _non-linear_ fault.
In this case pgoff is shifted and might become negative during calculation.
I'll send another patch.

>
> --
>  Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: PROBLEM: repeated remap_file_pages on tmpfs triggers bug on process exit
  2014-07-15 11:33         ` Konstantin Khlebnikov
@ 2014-07-15 11:54           ` Kirill A. Shutemov
  -1 siblings, 0 replies; 23+ messages in thread
From: Kirill A. Shutemov @ 2014-07-15 11:54 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Kirill A. Shutemov, Hugh Dickins, Ingo Korb, Ning Qu, Dave Jones,
	Sasha Levin, Andrew Morton, linux-mm, Linux Kernel Mailing List

Konstantin Khlebnikov wrote:
> On Tue, Jul 15, 2014 at 2:55 PM, Kirill A. Shutemov
> <kirill.shutemov@linux.intel.com> wrote:
> > Konstantin Khlebnikov wrote:
> >> It seems the bounding logic in do_fault_around is wrong:
> >>
> >> start_addr = max(address & fault_around_mask(), vma->vm_start);
> >> off = ((address - start_addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
> >> pte -= off;
> >> pgoff -= off;
> >>
> >> Ok, off  <= 511, but it might be bigger than pte offset in pte table.
> >
> > I don't see how it possible: fault_around_mask() cannot be more than 0x1ff000
> > (x86-64, fault_around_bytes == 2M). It means start_addr will be aligned to 2M
> > boundary in this case which is start of the page table pte belong to.
> >
> > Do I miss something?
> 
> Nope, you're right. This fixes kernel crash but not the original problem.
> 
> The problem is caused by calling do_fault_around for a _non-linear_ fault.
> In this case pgoff is shifted and might become negative during calculation.
> I'll send another patch.

I've come to the same conclusion. My patch is below.

From dd761b693cd06c649499e913713ae5bc7c029f6e Mon Sep 17 00:00:00 2001
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Date: Tue, 15 Jul 2014 14:40:02 +0300
Subject: [PATCH] mm: avoid do_fault_around() on non-linear mappings

Originally, I wrongly assumed that non-linear mappings are always
populated with at least pte_file() entries, so the !pte_none() check
would catch them. That is not always the case: we can get there from
__mm_populate in remap_file_pages() and the pte will be clear.

Let's add an explicit check for non-linear mappings.

This is a root cause of the recent "kernel BUG at mm/filemap.c:202!".

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: stable@vger.kernel.org # 3.15+
---
 mm/memory.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index d67fd9fcf1f2..440ad48266d6 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2882,7 +2882,8 @@ static int do_read_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * if page by the offset is not ready to be mapped (cold cache or
 	 * something).
 	 */
-	if (vma->vm_ops->map_pages && fault_around_pages() > 1) {
+	if (vma->vm_ops->map_pages && fault_around_pages() > 1 &&
+			!(vma->vm_flags & VM_NONLINEAR)) {
 		pte = pte_offset_map_lock(mm, pmd, address, &ptl);
 		do_fault_around(vma, address, pte, pgoff, flags);
 		if (!pte_same(*pte, orig_pte))
-- 
2.0.1

-- 
 Kirill A. Shutemov

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH] mm: do not call do_fault_around for non-linear fault
  2014-07-14 20:13     ` Konstantin Khlebnikov
@ 2014-07-15 11:58       ` Konstantin Khlebnikov
  -1 siblings, 0 replies; 23+ messages in thread
From: Konstantin Khlebnikov @ 2014-07-15 11:58 UTC (permalink / raw)
  To: linux-mm, Hugh Dickins, linux-kernel, Sasha Levin
  Cc: Ingo Korb, Kirill A. Shutemov, Dave Jones, Andrew Morton,
	Ning Qu, Konstantin Khlebnikov

From: Konstantin Khlebnikov <koct9i@gmail.com>

Faulting around a non-linear page fault makes no sense and
breaks the logic in do_fault_around because pgoff is shifted.

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-by: "Ingo Korb" <ingo.korb@tu-dortmund.de>
---
 mm/memory.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index d67fd9f..7e8d820 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2882,7 +2882,8 @@ static int do_read_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * if page by the offset is not ready to be mapped (cold cache or
 	 * something).
 	 */
-	if (vma->vm_ops->map_pages && fault_around_pages() > 1) {
+	if (vma->vm_ops->map_pages && !(flags & FAULT_FLAG_NONLINEAR) &&
+	    fault_around_pages() > 1) {
 		pte = pte_offset_map_lock(mm, pmd, address, &ptl);
 		do_fault_around(vma, address, pte, pgoff, flags);
 		if (!pte_same(*pte, orig_pte))


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] mm: do not call do_fault_around for non-linear fault
  2014-07-15 11:58       ` Konstantin Khlebnikov
@ 2014-07-15 15:29         ` Ingo Korb
  -1 siblings, 0 replies; 23+ messages in thread
From: Ingo Korb @ 2014-07-15 15:29 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: linux-mm, Hugh Dickins, linux-kernel, Sasha Levin,
	Kirill A. Shutemov, Dave Jones, Andrew Morton, Ning Qu,
	Konstantin Khlebnikov

> Faulting around non-linear page-fault has no sense and
> breaks logic in do_fault_around because pgoff is shifted.

I can confirm that this patch fixes the bug here, not just for the test
program but also for the application where I originally noticed it.

-ik



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] mm: do not call do_fault_around for non-linear fault
  2014-07-15 11:58       ` Konstantin Khlebnikov
@ 2014-07-15 20:42         ` Andrew Morton
  -1 siblings, 0 replies; 23+ messages in thread
From: Andrew Morton @ 2014-07-15 20:42 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: linux-mm, Hugh Dickins, linux-kernel, Sasha Levin, Ingo Korb,
	Kirill A. Shutemov, Dave Jones, Ning Qu, Konstantin Khlebnikov

On Tue, 15 Jul 2014 15:58:32 +0400 Konstantin Khlebnikov <k.khlebnikov@samsung.com> wrote:

> From: Konstantin Khlebnikov <koct9i@gmail.com>
> 
> Faulting around non-linear page-fault has no sense and
> breaks logic in do_fault_around because pgoff is shifted.
> 

Please be a lot more careful with the changelogs?  This one failed to
describe the effects of the bug, failed to adequately describe the bug
itself, failed to describe the offending commits and failed to identify
which kernel versions need the patch.

Sigh.  I went back and assembled the necessary information, below. 
Please check it.



From: Konstantin Khlebnikov <koct9i@gmail.com>
Subject: mm: do not call do_fault_around for non-linear fault

Ingo Korb reported that "repeated mapping of the same file on tmpfs using
remap_file_pages sometimes triggers a BUG at mm/filemap.c:202 when the
process exits".  He bisected the bug to d7c1755179b82d ("mm: implement
->map_pages for shmem/tmpfs"), although the bug was actually added by
8c6e50b0290c4 ("mm: introduce vm_ops->map_pages()").

The problem is caused by calling do_fault_around for a _non-linear_ fault.
In this case pgoff is shifted and might become negative during calculation.

Faulting around a non-linear page fault makes no sense and breaks the logic
in do_fault_around because pgoff is shifted.

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-by: "Ingo Korb" <ingo.korb@tu-dortmund.de>
Tested-by: "Ingo Korb" <ingo.korb@tu-dortmund.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Dave Jones <davej@redhat.com>
Cc: Ning Qu <quning@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>	[3.15.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 mm/memory.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff -puN mm/memory.c~mm-do-not-call-do_fault_around-for-non-linear-fault mm/memory.c
--- a/mm/memory.c~mm-do-not-call-do_fault_around-for-non-linear-fault
+++ a/mm/memory.c
@@ -2882,7 +2882,8 @@ static int do_read_fault(struct mm_struc
 	 * if page by the offset is not ready to be mapped (cold cache or
 	 * something).
 	 */
-	if (vma->vm_ops->map_pages && fault_around_pages() > 1) {
+	if (vma->vm_ops->map_pages && !(flags & FAULT_FLAG_NONLINEAR) &&
+	    fault_around_pages() > 1) {
 		pte = pte_offset_map_lock(mm, pmd, address, &ptl);
 		do_fault_around(vma, address, pte, pgoff, flags);
 		if (!pte_same(*pte, orig_pte))
_


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: PROBLEM: repeated remap_file_pages on tmpfs triggers bug on process exit
  2014-07-15 11:54           ` Kirill A. Shutemov
@ 2014-07-15 20:46             ` Hugh Dickins
  -1 siblings, 0 replies; 23+ messages in thread
From: Hugh Dickins @ 2014-07-15 20:46 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Konstantin Khlebnikov, Hugh Dickins, Ingo Korb, Ning Qu,
	Dave Jones, Sasha Levin, Andrew Morton, linux-mm,
	Linux Kernel Mailing List

On Tue, 15 Jul 2014, Kirill A. Shutemov wrote:
> Konstantin Khlebnikov wrote:
> > On Tue, Jul 15, 2014 at 2:55 PM, Kirill A. Shutemov
> > <kirill.shutemov@linux.intel.com> wrote:
> > > Konstantin Khlebnikov wrote:
> > >> It seems bounding logic in do_fault_around is wrong:
> > >>
> > >> start_addr = max(address & fault_around_mask(), vma->vm_start);
> > >> off = ((address - start_addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
> > >> pte -= off;
> > >> pgoff -= off;
> > >>
> > >> Ok, off  <= 511, but it might be bigger than pte offset in pte table.
> > >
> > > I don't see how it is possible: fault_around_mask() cannot be more than 0x1ff000
> > > (x86-64, fault_around_bytes == 2M). It means start_addr will be aligned to a 2M
> > > boundary in this case, which is the start of the page table the pte belongs to.
> > >
> > > Do I miss something?
> > 
> > Nope, you're right. This fixes kernel crash but not the original problem.
> > 
> > Problem is caused by calling do_fault_around for _non-linear_ fault.
> > In this case pgoff is shifted and might become negative during calculation.
> > I'll send another patch.
> 
> I've got to the same conclusion. My patch is below.

Many thanks to Ingo and Konstantin and Kirill for nailing this.
So now we have two not-quite-identical patches to fix it.
I feel I have to judge a beauty contest.

I think my slight preference is for Kirill's below, because it has
a better description (mentions "kernel BUG at mm/filemap.c:202!" and
Ccs stable) and uses the familiar VM_NONLINEAR flag rather than the
never-heard-of-before-and-otherwise-unused FAULT_FLAG_NONLINEAR.

But please please add a credit to Ingo, who made the breakthrough for
us, and to Konstantin who analysed what was going on.  Ingo, this is
not quite the version you tested...

... ah, forget it, Andrew has just now gone for Konstantin's,
adding in more info from Kirill's: that's fine.

Thanks all,
Hugh

> 
> From dd761b693cd06c649499e913713ae5bc7c029f6e Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Date: Tue, 15 Jul 2014 14:40:02 +0300
> Subject: [PATCH] mm: avoid do_fault_around() on non-linear mappings
> 
> Originally, I've wrongly assumed that non-linear mapping are always
> populated at least with pte_file() entries there, so !pte_none() check
> will catch them. It's not always the case: we can get there from
> __mm_populte in remap_file_pages() and pte will be clear.

__mm_populate

> 
> Let's put explicit check for non-linear mapping.
> 
> This is a root cause of recent "kernel BUG at mm/filemap.c:202!".
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: stable@vger.kernel.org # 3.15+
> ---
>  mm/memory.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index d67fd9fcf1f2..440ad48266d6 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -2882,7 +2882,8 @@ static int do_read_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>  	 * if page by the offset is not ready to be mapped (cold cache or
>  	 * something).
>  	 */
> -	if (vma->vm_ops->map_pages && fault_around_pages() > 1) {
> +	if (vma->vm_ops->map_pages && fault_around_pages() > 1 &&
> +			!(vma->vm_flags & VM_NONLINEAR)) {
>  		pte = pte_offset_map_lock(mm, pmd, address, &ptl);
>  		do_fault_around(vma, address, pte, pgoff, flags);
>  		if (!pte_same(*pte, orig_pte))
> -- 
> 2.0.1
> 
> -- 
>  Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] mm: do not call do_fault_around for non-linear fault
  2014-07-15 20:42         ` Andrew Morton
@ 2014-07-15 21:07           ` Konstantin Khlebnikov
  -1 siblings, 0 replies; 23+ messages in thread
From: Konstantin Khlebnikov @ 2014-07-15 21:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Konstantin Khlebnikov, linux-mm, Hugh Dickins,
	Linux Kernel Mailing List, Sasha Levin, Ingo Korb,
	Kirill A. Shutemov, Dave Jones, Ning Qu

On Wed, Jul 16, 2014 at 12:42 AM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Tue, 15 Jul 2014 15:58:32 +0400 Konstantin Khlebnikov <k.khlebnikov@samsung.com> wrote:
>
>> From: Konstantin Khlebnikov <koct9i@gmail.com>
>>
>> Faulting around non-linear page-fault has no sense and
>> breaks logic in do_fault_around because pgoff is shifted.
>>
>
> Please be a lot more careful with the changelogs?  This one failed to
> describe the effects of the bug, failed to adequately describe the bug
> itself, failed to describe the offending commits and failed to identify
> which kernel versions need the patch.

Sorry for that. I thought I had already lost that bug-fixing race.

>
> Sigh.  I went back and assembled the necessary information, below.
> Please check it.
>
>
>
> From: Konstantin Khlebnikov <koct9i@gmail.com>
> Subject: mm: do not call do_fault_around for non-linear fault
>
> Ingo Korb reported that "repeated mapping of the same file on tmpfs using
> remap_file_pages sometimes triggers a BUG at mm/filemap.c:202 when the
> process exits".  He bisected the bug to d7c1755179b82d ("mm: implement
> ->map_pages for shmem/tmpfs"), although the bug was actually added by
> 8c6e50b0290c4 ("mm: introduce vm_ops->map_pages()").
>
> The problem is caused by calling do_fault_around for a _non-linear_ fault.  In
> this case pgoff is shifted and might become negative during calculation.
>
> Faulting around a non-linear page fault makes no sense and breaks the logic in
> do_fault_around because pgoff is shifted.
>
> Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
> Reported-by: "Ingo Korb" <ingo.korb@tu-dortmund.de>
> Tested-by: "Ingo Korb" <ingo.korb@tu-dortmund.de>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Sasha Levin <sasha.levin@oracle.com>
> Cc: Dave Jones <davej@redhat.com>
> Cc: Ning Qu <quning@google.com>
> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Cc: <stable@vger.kernel.org>    [3.15.x]
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>
>  mm/memory.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff -puN mm/memory.c~mm-do-not-call-do_fault_around-for-non-linear-fault mm/memory.c
> --- a/mm/memory.c~mm-do-not-call-do_fault_around-for-non-linear-fault
> +++ a/mm/memory.c
> @@ -2882,7 +2882,8 @@ static int do_read_fault(struct mm_struc
>          * if page by the offset is not ready to be mapped (cold cache or
>          * something).
>          */
> -       if (vma->vm_ops->map_pages && fault_around_pages() > 1) {
> +       if (vma->vm_ops->map_pages && !(flags & FAULT_FLAG_NONLINEAR) &&
> +           fault_around_pages() > 1) {
>                 pte = pte_offset_map_lock(mm, pmd, address, &ptl);
>                 do_fault_around(vma, address, pte, pgoff, flags);
>                 if (!pte_same(*pte, orig_pte))
> _
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2014-07-15 21:07 UTC | newest]

Thread overview: 23+ messages
2014-07-14 12:58 PROBLEM: repeated remap_file_pages on tmpfs triggers bug on process exit Ingo Korb
2014-07-14 19:22 ` Hugh Dickins
2014-07-14 19:22   ` Hugh Dickins
2014-07-14 20:13   ` Konstantin Khlebnikov
2014-07-14 20:13     ` Konstantin Khlebnikov
2014-07-15  9:55     ` [PATCH] mm: fix faulting range in do_fault_around Konstantin Khlebnikov
2014-07-15  9:55       ` Konstantin Khlebnikov
2014-07-15 10:55     ` PROBLEM: repeated remap_file_pages on tmpfs triggers bug on process exit Kirill A. Shutemov
2014-07-15 10:55       ` Kirill A. Shutemov
2014-07-15 11:33       ` Konstantin Khlebnikov
2014-07-15 11:33         ` Konstantin Khlebnikov
2014-07-15 11:54         ` Kirill A. Shutemov
2014-07-15 11:54           ` Kirill A. Shutemov
2014-07-15 20:46           ` Hugh Dickins
2014-07-15 20:46             ` Hugh Dickins
2014-07-15 11:58     ` [PATCH] mm: do not call do_fault_around for non-linear fault Konstantin Khlebnikov
2014-07-15 11:58       ` Konstantin Khlebnikov
2014-07-15 15:29       ` Ingo Korb
2014-07-15 15:29         ` Ingo Korb
2014-07-15 20:42       ` Andrew Morton
2014-07-15 20:42         ` Andrew Morton
2014-07-15 21:07         ` Konstantin Khlebnikov
2014-07-15 21:07           ` Konstantin Khlebnikov
