linux-mm.kvack.org archive mirror
* satisfying an mmap call with memory belonging to another process
@ 2019-05-30 16:05 Thanos Makatos
From: Thanos Makatos @ 2019-05-30 16:05 UTC (permalink / raw)
  To: linux-mm; +Cc: Felipe Franciosi, Swapnil Ingle

I'm prototyping a device driver that is backed by a userspace process (the server) instead of a physical device. In this use case, the client userspace process is supposed to mmap device memory. It does so by opening a custom device node and then calling mmap, for which I have provided my own .mmap callback. When the custom device driver receives such a call, it instructs the server process to allocate the necessary memory using mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_ANONYMOUS, -1, 0). The server then passes the resulting virtual address back to the driver, which figures out which pages back this memory by calling get_user_pages_fast() and finally inserts these pages into the VMA provided by the client's mmap call by calling vm_insert_page().
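
For illustration, the server-side allocation step could look roughly like the sketch below. The helper names server_alloc() and server_free() are hypothetical and not taken from the actual prototype; only the mmap() arguments come from the description above. How the resulting address is handed back to the driver is not shown, since that interface is specific to the prototype.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical server-side helper: allocate `size` bytes of shared
 * anonymous memory with the same mmap() arguments as described above.
 * Returns NULL on failure.
 */
static void *server_alloc(size_t size)
{
    void *addr = mmap(NULL, size, PROT_READ | PROT_WRITE,
                      MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED)
        return NULL;

    /* Optional: fault the pages in eagerly by zeroing them. */
    memset(addr, 0, size);
    return addr;
}

/* Hypothetical counterpart that releases the region again. */
static int server_free(void *addr, size_t size)
{
    return munmap(addr, size);
}
```

The address returned by server_alloc() is what the server would report to the driver; it is only meaningful inside the server's own address space.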

This implementation works; however, if one of the processes exits without cleanly unmapping this memory (e.g. it crashes), I get the following stack trace:

[  996.588022] ---[ end trace 6193ca2409940966 ]---
[  996.588770] BUG: Bad page cache in process a.out  pfn:1f3d38
[  996.589422] page:ffffdc4287cf4e00 count:5 mapcount:1 mapping:ffff9ce6981e5f20 index:0x0
[  996.590110] shmem_aops
[  996.590112] flags: 0x2ffff8000080037(locked|referenced|uptodate|lru|active|swapbacked)
[  996.591498] raw: 02ffff8000080037 ffffdc4287d22dc8 ffffdc4287d73748 ffff9ce6981e5f20
[  996.592179] raw: 0000000000000000 0000000000000000 0000000500000000 ffff9ce67c4d5000
[  996.592843] page dumped because: still mapped when deleted
[  996.593458] page->mem_cgroup:ffff9ce67c4d5000
[  996.594069] CPU: 1 PID: 670 Comm: a.out Tainted: G        W  O      5.1.0-rc4+ #3
[  996.594650] Hardware name: Nutanix AHV, BIOS 1.9.1-5.el6 04/01/2014
[  996.595260] Call Trace:
[  996.595794]  dump_stack+0x5c/0x7b
[  996.596358]  unaccount_page_cache_page+0x132/0x1c0
[  996.596868]  __delete_from_page_cache+0x39/0x200
[  996.597368]  ? xas_load+0x9/0x80
[  996.597882]  ? _cond_resched+0x16/0x40
[  996.598372]  ? down_write+0xe/0x40
[  996.598859]  ? unmap_mapping_pages+0x5e/0x130
[  996.599328]  delete_from_page_cache+0x45/0x70
[  996.599783]  truncate_inode_page+0x22/0x30
[  996.600275]  shmem_undo_range+0x1fd/0x840
[  996.600743]  ? native_usergs_sysret64+0xf/0x10
[  996.601207]  shmem_truncate_range+0x16/0x40
[  996.601671]  shmem_evict_inode+0xad/0x190
[  996.602130]  evict+0xc1/0x1c0
[  996.602561]  __dentry_kill+0xd3/0x180
[  996.602990]  dentry_kill+0x4d/0x1b0
[  996.603408]  dput+0xd7/0x130
[  996.603839]  __fput+0x108/0x230
[  996.604295]  task_work_run+0x8a/0xb0
[  996.604714]  do_exit+0x2df/0xbc0
[  996.605128]  ? vfs_write+0x148/0x190
[  996.605525]  do_group_exit+0x3a/0xa0
[  996.605946]  __x64_sys_exit_group+0x14/0x20
[  996.606358]  do_syscall_64+0x55/0x100
[  996.606763]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  996.607168] RIP: 0033:0x7f05826ca618
[  996.607573] Code: Bad RIP value.
[  996.608013] RSP: 002b:00007ffd59408798 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
[  996.608468] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f05826ca618
[  996.608913] RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
[  996.609338] RBP: 00007f05829a78e0 R08: 00000000000000e7 R09: ffffffffffffff98
[  996.609754] R10: 00007ffd59408718 R11: 0000000000000246 R12: 00007f05829a78e0
[  996.610165] R13: 00007f05829acc20 R14: 0000000000000000 R15: 0000000000000000
[  996.610576] Disabling lock debugging due to kernel taint

My understanding is that there is some mix-up in how the mmap'ed memory is set up for the client process, hence "still mapped when deleted" being given as the reason for the failure. My guess is that the VMA/page of the server mmap (the one the server process does using -1 as the fd) is associated with /dev/zero or shm (?), while the VMA/page of the client is associated with the device file (I'm not even sure I fully understand the problem).
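
One way to check this guess from userspace is to look at how the kernel labels the server's mapping in /proc/self/maps. The sketch below is only an illustration; mapping_name() is a hypothetical helper, not part of the prototype. On mainline kernels a MAP_SHARED | MAP_ANONYMOUS region is backed by a shmem file and usually shows up there as "/dev/zero (deleted)", which would be consistent with the shmem_aops line in the trace.

```c
#define _GNU_SOURCE
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: copy the pathname column of the /proc/self/maps
 * entry containing `addr` into `name`. Returns 0 if an entry was found
 * and -1 otherwise. `name` is left empty for unnamed mappings.
 */
static int mapping_name(const void *addr, char *name, size_t len)
{
    char line[PATH_MAX + 128];
    FILE *f = fopen("/proc/self/maps", "r");
    int ret = -1;

    if (!f)
        return -1;

    while (fgets(line, sizeof(line), f)) {
        unsigned long start, end;
        int off = 0;

        /* Line format: start-end perms offset dev inode pathname */
        if (sscanf(line, "%lx-%lx %*s %*s %*s %*s %n",
                   &start, &end, &off) < 2 || off == 0)
            continue;
        if ((unsigned long)addr < start || (unsigned long)addr >= end)
            continue;

        line[strcspn(line, "\n")] = '\0';   /* drop trailing newline */
        snprintf(name, len, "%s", line + off);
        ret = 0;
        break;
    }

    fclose(f);
    return ret;
}
```

Calling mapping_name() in the server on the address returned by its mmap, and comparing the result with the entry for the client's mapping of the device node, shows whether the two VMAs are associated with different files.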

One way I thought of fixing this is to make the server process allocate memory by mmap'ing using the device fd instead of -1, so that both VMAs/pages are against the same struct file. Obviously I'll have to somehow differentiate between an mmap belonging to the client and an mmap belonging to the server. Would this solve the problem? Is there another way of solving this?

