From: Greg KH <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: torvalds@linux-foundation.org, akpm@linux-foundation.org,
alan@lxorguk.ukuu.org.uk,
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
Hillf Danton <dhillf@gmail.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Al Viro <viro@zeniv.linux.org.uk>,
Hugh Dickins <hughd@google.com>
Subject: [ 045/108] hugetlbfs: avoid taking i_mutex from hugetlbfs_read()
Date: Fri, 30 Mar 2012 12:58:07 -0700
Message-ID: <20120330195728.553324494@linuxfoundation.org>
In-Reply-To: <20120330195812.GA31833@kroah.com>
3.0-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
commit a05b0855fd15504972dba2358e5faa172a1e50ba upstream.
Taking i_mutex in hugetlbfs_read() can result in a deadlock with mmap, as
explained below:
Thread A:
  read() on hugetlbfs
  hugetlbfs_read() called
  i_mutex grabbed
  hugetlbfs_read_actor() called
  __copy_to_user() called
  page fault is triggered
Thread B, sharing address space with A:
  mmap() the same file
  task_B->mm->mmap_sem is grabbed
  hugetlbfs_file_mmap() is called
  attempt to grab ->i_mutex and block waiting for A to give it up
Thread A:
  the page fault handler blocks on its attempt to grab
  task_A->mm->mmap_sem, which is the same lock as task_B->mm->mmap_sem,
  and waits for B to give it up.
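
The interleaving above is a classic lock-order inversion. As a rough
userspace illustration (not kernel code), the sketch below replays it with
two pthread mutexes and probes with trylock so that nothing actually
blocks. The names i_mutex, mmap_sem and would_deadlock() are hypothetical
stand-ins; in the kernel, mmap_sem is a reader/writer semaphore, which this
sketch flattens into a plain mutex.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical userspace stand-ins for inode->i_mutex and mm->mmap_sem. */
static pthread_mutex_t i_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t mmap_sem = PTHREAD_MUTEX_INITIALIZER;

/*
 * Replay the A/B interleaving from a single thread. trylock is used as a
 * probe: EBUSY means the real thread would have gone to sleep there.
 * Returns 1 if both sides would block on each other, 0 otherwise.
 */
int would_deadlock(int read_takes_i_mutex)
{
	int a_blocked, b_blocked;

	/* Thread A: read() path, optionally taking i_mutex (old behaviour). */
	if (read_takes_i_mutex)
		pthread_mutex_lock(&i_mutex);

	/* Thread B: mmap() takes mmap_sem first ... */
	pthread_mutex_lock(&mmap_sem);
	/* ... then the file's mmap handler wants i_mutex. */
	b_blocked = (pthread_mutex_trylock(&i_mutex) == EBUSY);

	/* Thread A: the page fault during the user copy wants mmap_sem. */
	a_blocked = (pthread_mutex_trylock(&mmap_sem) == EBUSY);

	/* Each mutex is held exactly once at this point; unwind. */
	pthread_mutex_unlock(&i_mutex);
	pthread_mutex_unlock(&mmap_sem);

	return a_blocked && b_blocked;
}
```

With read_takes_i_mutex set, both probes fail and neither side can make
progress. With it cleared, B acquires i_mutex and can finish, so A only
waits temporarily.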
AFAIU the i_mutex locking was added to hugetlbfs_read() as per
http://lkml.indiana.edu/hypermail/linux/kernel/0707.2/3066.html to take
care of the race between truncate and read. This patch fixes this by
looking at page->mapping under lock_page() (find_lock_page()) to ensure
that the inode didn't get truncated in the range during a parallel read.
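
The "recheck under the page lock" idea can be sketched in userspace as
follows. This is only an illustration under simplifying assumptions: struct
page, struct mapping and lock_page_if_mapped() are hypothetical, the
reference count is not atomic, and a stale page is reported as NULL instead
of repeating the lookup.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct mapping {
	int id;
};

struct page {
	pthread_mutex_t lock;
	struct mapping *mapping;	/* set to NULL once truncated */
	int refcount;			/* not atomic in this sketch */
};

/*
 * Take a reference and the page lock, then verify that the page still
 * belongs to the expected mapping. Returns the page locked and with an
 * extra reference, or NULL if it was truncated in the meantime.
 */
struct page *lock_page_if_mapped(struct page *page, struct mapping *mapping)
{
	if (page == NULL)
		return NULL;

	page->refcount++;
	pthread_mutex_lock(&page->lock);

	if (page->mapping != mapping) {
		/* Truncated while we were waiting for the lock. */
		pthread_mutex_unlock(&page->lock);
		page->refcount--;
		return NULL;
	}
	return page;
}
```

A caller would unlock the page before copying to user space and drop the
reference afterwards, which is the order the patch uses so that no page
lock is held across a possible page fault.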
Ideally we can extend the patch to make sure we don't increase i_size in
mmap. But that will break userspace, because applications will now have
to use truncate(2) to increase i_size in hugetlbfs.
Based on the original patch from Hillf Danton.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
fs/hugetlbfs/inode.c | 25 +++++++++----------------
1 file changed, 9 insertions(+), 16 deletions(-)
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -238,17 +238,10 @@ static ssize_t hugetlbfs_read(struct fil
loff_t isize;
ssize_t retval = 0;
- mutex_lock(&inode->i_mutex);
-
/* validate length */
if (len == 0)
goto out;
- isize = i_size_read(inode);
- if (!isize)
- goto out;
-
- end_index = (isize - 1) >> huge_page_shift(h);
for (;;) {
struct page *page;
unsigned long nr, ret;
@@ -256,18 +249,21 @@ static ssize_t hugetlbfs_read(struct fil
/* nr is the maximum number of bytes to copy from this page */
nr = huge_page_size(h);
+ isize = i_size_read(inode);
+ if (!isize)
+ goto out;
+ end_index = (isize - 1) >> huge_page_shift(h);
if (index >= end_index) {
if (index > end_index)
goto out;
nr = ((isize - 1) & ~huge_page_mask(h)) + 1;
- if (nr <= offset) {
+ if (nr <= offset)
goto out;
- }
}
nr = nr - offset;
/* Find the page */
- page = find_get_page(mapping, index);
+ page = find_lock_page(mapping, index);
if (unlikely(page == NULL)) {
/*
* We have a HOLE, zero out the user-buffer for the
@@ -279,17 +275,18 @@ static ssize_t hugetlbfs_read(struct fil
else
ra = 0;
} else {
+ unlock_page(page);
+
/*
* We have the page, copy it to user space buffer.
*/
ra = hugetlbfs_read_actor(page, offset, buf, len, nr);
ret = ra;
+ page_cache_release(page);
}
if (ra < 0) {
if (retval == 0)
retval = ra;
- if (page)
- page_cache_release(page);
goto out;
}
@@ -299,16 +296,12 @@ static ssize_t hugetlbfs_read(struct fil
index += offset >> huge_page_shift(h);
offset &= ~huge_page_mask(h);
- if (page)
- page_cache_release(page);
-
/* short read or no more work */
if ((ret != nr) || (len == 0))
break;
}
out:
*ppos = ((loff_t)index << huge_page_shift(h)) + offset;
- mutex_unlock(&inode->i_mutex);
return retval;
}