From: Greg KH <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: torvalds@linux-foundation.org, akpm@linux-foundation.org,
	alan@lxorguk.ukuu.org.uk, Nick Mathewson <nickm@freehaven.net>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	Alexey Moiseytsev <himeraster@gmail.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: [72/89] af_unix: fix EPOLLET regression for stream sockets
Date: Wed, 01 Feb 2012 13:00:36 -0800
Message-ID: <20120201210050.210614371@clark.kroah.org>
In-Reply-To: <20120201210505.GA26028@kroah.com>

3.2-stable review patch.  If anyone has any objections, please let me know.

------------------


From: Eric Dumazet <eric.dumazet@gmail.com>

[ Upstream commit 6f01fd6e6f6809061b56e78f1e8d143099716d70 ]

Commit 0884d7aa24 (AF_UNIX: Fix poll blocking problem when reading from
a stream socket) introduced a regression for epoll() in edge-triggered
mode (EPOLLET).

The appropriate fix is to use skb_peek()/skb_unlink() instead of
skb_dequeue(), and to call skb_unlink() only when the skb is fully
consumed.

This removes the need to requeue a partial skb at the head of
sk_receive_queue, along with the extra sk->sk_data_ready() calls that
caused the regression.

This is safe because once an skb is given to sk_receive_queue, it is not
modified by a writer, and readers are serialized by the u->readlock mutex.

This also reduces the number of spinlock acquisitions for small reads or
MSG_PEEK users, so it should improve overall performance.

Reported-by: Nick Mathewson <nickm@freehaven.net>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Alexey Moiseytsev <himeraster@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
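
As an illustration only (this is not kernel code), the peek-then-unlink
pattern described in the changelog can be modelled in user space roughly
as follows. struct buf, struct queue, and ready_calls are hypothetical
stand-ins for sk_buff, sk_receive_queue, and the sk->sk_data_ready()
notifications; all locking is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an skb: a payload with an unread window. */
struct buf {
	struct buf *next;
	const char *data;	/* next unread byte */
	size_t len;		/* unread bytes remaining */
};

/* Hypothetical stand-in for sk_receive_queue. */
struct queue {
	struct buf *head, *tail;
	int ready_calls;	/* simulated sk_data_ready() notifications */
};

/* Writer side: append a buffer and notify the reader once. */
static void queue_push(struct queue *q, struct buf *b,
		       const char *data, size_t len)
{
	b->next = NULL;
	b->data = data;
	b->len = len;
	if (q->tail)
		q->tail->next = b;
	else
		q->head = b;
	q->tail = b;
	q->ready_calls++;
}

/* Analogue of skb_peek(): look at the head without removing it. */
static struct buf *queue_peek(struct queue *q)
{
	return q->head;
}

/* Analogue of skb_unlink() for the head buffer. */
static void queue_unlink_head(struct queue *q)
{
	struct buf *b = q->head;

	assert(b != NULL);
	q->head = b->next;
	if (!q->head)
		q->tail = NULL;
	b->next = NULL;
}

/*
 * Reader side: consume in place and unlink only when a buffer is fully
 * used up. A partially read buffer simply stays at the head, so there is
 * nothing to requeue and no extra notification is generated.
 */
static size_t stream_read(struct queue *q, char *dst, size_t size)
{
	size_t copied = 0;

	while (size) {
		struct buf *b = queue_peek(q);
		size_t chunk;

		if (!b)
			break;
		chunk = b->len < size ? b->len : size;
		memcpy(dst + copied, b->data, chunk);
		b->data += chunk;
		b->len -= chunk;
		copied += chunk;
		size -= chunk;
		if (b->len)
			break;	/* partial read: leave it queued */
		queue_unlink_head(q);
	}
	return copied;
}
```

In this model, pushing a 5-byte buffer and then reading 3 bytes leaves
the same buffer at the head with 2 bytes unread, and ready_calls stays at
1: only the writer ever notifies. A dequeue-and-requeue reader would need
a second notification at that point, which is the kind of extra wakeup
the changelog attributes the regression to.
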
 net/unix/af_unix.c |   19 ++++---------------
 1 file changed, 4 insertions(+), 15 deletions(-)

--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1915,7 +1915,7 @@ static int unix_stream_recvmsg(struct ki
 		struct sk_buff *skb;
 
 		unix_state_lock(sk);
-		skb = skb_dequeue(&sk->sk_receive_queue);
+		skb = skb_peek(&sk->sk_receive_queue);
 		if (skb == NULL) {
 			unix_sk(sk)->recursion_level = 0;
 			if (copied >= target)
@@ -1955,11 +1955,8 @@ static int unix_stream_recvmsg(struct ki
 		if (check_creds) {
 			/* Never glue messages from different writers */
 			if ((UNIXCB(skb).pid  != siocb->scm->pid) ||
-			    (UNIXCB(skb).cred != siocb->scm->cred)) {
-				skb_queue_head(&sk->sk_receive_queue, skb);
-				sk->sk_data_ready(sk, skb->len);
+			    (UNIXCB(skb).cred != siocb->scm->cred))
 				break;
-			}
 		} else {
 			/* Copy credentials */
 			scm_set_cred(siocb->scm, UNIXCB(skb).pid, UNIXCB(skb).cred);
@@ -1974,8 +1971,6 @@ static int unix_stream_recvmsg(struct ki
 
 		chunk = min_t(unsigned int, skb->len, size);
 		if (memcpy_toiovec(msg->msg_iov, skb->data, chunk)) {
-			skb_queue_head(&sk->sk_receive_queue, skb);
-			sk->sk_data_ready(sk, skb->len);
 			if (copied == 0)
 				copied = -EFAULT;
 			break;
@@ -1990,13 +1985,10 @@ static int unix_stream_recvmsg(struct ki
 			if (UNIXCB(skb).fp)
 				unix_detach_fds(siocb->scm, skb);
 
-			/* put the skb back if we didn't use it up.. */
-			if (skb->len) {
-				skb_queue_head(&sk->sk_receive_queue, skb);
-				sk->sk_data_ready(sk, skb->len);
+			if (skb->len)
 				break;
-			}
 
+			skb_unlink(skb, &sk->sk_receive_queue);
 			consume_skb(skb);
 
 			if (siocb->scm->fp)
@@ -2007,9 +1999,6 @@ static int unix_stream_recvmsg(struct ki
 			if (UNIXCB(skb).fp)
 				siocb->scm->fp = scm_fp_dup(UNIXCB(skb).fp);
 
-			/* put message back and return */
-			skb_queue_head(&sk->sk_receive_queue, skb);
-			sk->sk_data_ready(sk, skb->len);
 			break;
 		}
 	} while (size);



