netdev.vger.kernel.org archive mirror
* Unix domain socket missing error code
@ 2019-11-11 13:38 Adeel Sharif
  2019-11-12  0:12 ` Cong Wang
  0 siblings, 1 reply; 5+ messages in thread
From: Adeel Sharif @ 2019-11-11 13:38 UTC (permalink / raw)
  To: netdev

Hello,

We are a group of people working on making Linux safe for everyone. To
that end, I have started testing the system calls. The one I am
currently working on is send/write.

If send() is used to send datagrams on a Unix socket and the receiver
has stopped receiving but is still connected, there is a high
possibility that the Linux kernel could eat up the whole system memory.
There is a system-wide limit on write memory from the wmem_max
parameter, but this is sometimes increased to the system memory size
in order to avoid packet drops.

After looking at the kernel implementation of
unix_dgram_sendmsg(), it is clear that user buffers are copied into
kernel socket buffers and queued on a linked list. This list
grows without any limit. There is a qlen parameter, but
it is never used to impose a limit on the list. Could we perhaps impose a
limit and return an error code such as Queue_Full
instead?
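
A minimal userspace sketch of the scenario described above, assuming Linux
and a socketpair() whose receiving end never reads. The helper name
count_queued_datagrams() and its parameters are hypothetical and are not
taken from any reproducer posted in this thread. It passes MSG_DONTWAIT so
that send() fails instead of blocking once the kernel refuses more data,
and it reports how many datagrams were accepted and which errno ended
the loop.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: queue datagrams on a connected AF_UNIX/SOCK_DGRAM
 * pair whose receiver never calls recv().
 *
 * Returns the number of datagrams the kernel accepted before the first
 * failed send() (or max_datagrams if none failed), or -1 if the socket
 * pair could not be created. *err_out holds the errno of the failing
 * send(), or 0 if the loop stopped at max_datagrams.
 */
long count_queued_datagrams(size_t payload_len, long max_datagrams,
                            int *err_out)
{
    int fds[2];
    char payload[1024] = {0};
    long sent = 0;

    *err_out = 0;

    /* Payloads are capped at the size of the local buffer. */
    if (payload_len > sizeof(payload))
        payload_len = sizeof(payload);

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, fds) != 0) {
        *err_out = errno;
        return -1;
    }

    while (sent < max_datagrams) {
        /* fds[1] is never read, so everything sent here stays queued. */
        ssize_t n = send(fds[0], payload, payload_len, MSG_DONTWAIT);

        if (n < 0) {
            *err_out = errno;
            break;
        }
        sent++;
    }

    close(fds[0]);
    close(fds[1]);
    return sent;
}
```

Calling count_queued_datagrams(1024, 100000, &err) and printing the result
together with the SO_SNDBUF value of the sender gives a rough picture of
how much data one socket can pin in the kernel while the peer is idle.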

I don't know who maintains Unix sockets. If someone knows,
please let me know and I will discuss this with them further.

Thank You.


* Re: Unix domain socket missing error code
  2019-11-11 13:38 Unix domain socket missing error code Adeel Sharif
@ 2019-11-12  0:12 ` Cong Wang
  2019-11-12  8:56   ` Adeel Sharif
  0 siblings, 1 reply; 5+ messages in thread
From: Cong Wang @ 2019-11-12  0:12 UTC (permalink / raw)
  To: Adeel Sharif; +Cc: Linux Kernel Network Developers

On Mon, Nov 11, 2019 at 5:41 AM Adeel Sharif
<madeel.sharif@googlemail.com> wrote:
>
> Hello,
>
> We are a group of people working on making Linux safe for everyone. To
> that end, I have started testing the system calls. The one I am
> currently working on is send/write.
>
> If send() is used to send datagrams on a Unix socket and the receiver
> has stopped receiving but is still connected, there is a high
> possibility that the Linux kernel could eat up the whole system memory.
> There is a system-wide limit on write memory from the wmem_max
> parameter, but this is sometimes increased to the system memory size
> in order to avoid packet drops.
>
> After looking at the kernel implementation of
> unix_dgram_sendmsg(), it is clear that user buffers are copied into
> kernel socket buffers and queued on a linked list. This list
> grows without any limit. There is a qlen parameter, but
> it is never used to impose a limit on the list. Could we perhaps impose a
> limit and return an error code such as Queue_Full
> instead?

Isn't unix_recvq_full() supposed to do what you said? It is called inside
unix_dgram_sendmsg() to determine whether to wake up the dst socket.

Thanks.


* Re: Unix domain socket missing error code
  2019-11-12  0:12 ` Cong Wang
@ 2019-11-12  8:56   ` Adeel Sharif
  2019-11-13 13:00     ` Adeel Sharif
  2019-11-19  5:43     ` Cong Wang
  0 siblings, 2 replies; 5+ messages in thread
From: Adeel Sharif @ 2019-11-12  8:56 UTC (permalink / raw)
  To: Cong Wang; +Cc: Linux Kernel Network Developers

It should, but it is not evaluated when two different sockets are
connected to each other. It is the third check in the if statement, and it
is never reached because the first check inside unlikely() is false:

if (other != sk &&
        unlikely(unix_peer(other) != sk && unix_recvq_full(other))) {
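
The short-circuit behaviour of the condition quoted above can be sketched
in plain C. This is a simplified model, not kernel code: struct model_sock,
model_recvq_full() and model_sender_must_wait() are hypothetical stand-ins,
and the model assumes the queue check compares the queue length against a
backlog limit. A call counter makes it visible when the queue check is
skipped.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a socket; not the kernel's struct sock. */
struct model_sock {
    struct model_sock *peer;   /* socket this one is connected to, or NULL */
    unsigned int qlen;         /* datagrams waiting in the receive queue */
    unsigned int max_backlog;  /* assumed limit for the receive queue */
    unsigned int full_checks;  /* number of times the queue check ran */
};

/* Models the queue-full check and counts how often it is evaluated. */
bool model_recvq_full(struct model_sock *sk)
{
    sk->full_checks++;
    return sk->qlen > sk->max_backlog;
}

/*
 * Same shape as the quoted condition. Because && short-circuits,
 * model_recvq_full() only runs when other->peer != sk, that is, when
 * the receiver is not connected back to the sender.
 */
bool model_sender_must_wait(struct model_sock *sk, struct model_sock *other)
{
    return other != sk &&
           (other->peer != sk && model_recvq_full(other));
}
```

With two model sockets that point at each other through peer, the function
returns false without ever touching the receiver's queue length, which is
the case described in this message. With a receiver whose peer is NULL, the
queue check runs and can report a full queue.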

Thanks.


* Re: Unix domain socket missing error code
  2019-11-12  8:56   ` Adeel Sharif
@ 2019-11-13 13:00     ` Adeel Sharif
  2019-11-19  5:43     ` Cong Wang
  1 sibling, 0 replies; 5+ messages in thread
From: Adeel Sharif @ 2019-11-13 13:00 UTC (permalink / raw)
  To: Cong Wang, Linux Kernel Network Developers

Eventually the kernel OOM killer kicks in and kills the process:

[  581.134746] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
[  581.134816] Call Trace:
[  581.135550]  dump_stack+0x46/0x5b
[  581.135580]  dump_header.isra.35+0x5b/0x23c
[  581.135590]  oom_kill_process+0x20f/0x3d0
[  581.135603]  ? has_intersects_mems_allowed+0x6b/0x90
[  581.135623]  out_of_memory+0xe9/0x580
[  581.135630]  __alloc_pages_slowpath+0x9c9/0xd10
[  581.135640]  __alloc_pages_nodemask+0x237/0x260
[  581.135647]  filemap_fault+0x1eb/0x560
[  581.135656]  ? __switch_to_asm+0x40/0x70
[  581.135662]  ? __switch_to_asm+0x34/0x70
[  581.135667]  ? __switch_to_asm+0x40/0x70
[  581.135672]  ? alloc_set_pte+0x252/0x2f0
[  581.135680]  ext4_filemap_fault+0x27/0x36
[  581.135689]  __do_fault+0x2b/0x90
[  581.135694]  __handle_mm_fault+0x67e/0xae0
[  581.135704]  __do_page_fault+0x239/0x4b0
[  581.135713]  ? page_fault+0x8/0x30
[  581.135719]  page_fault+0x1e/0x30
[  581.135867] RIP: 0033:0x561979ac3050
[  581.136120] Code: Bad RIP value.
[  581.136140] RSP: 002b:00007ffe528f8668 EFLAGS: 00000246
[  581.136162] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 00007fd49d85615d
[  581.136170] RDX: 000056197b832ac0 RSI: 000056197b832ae0 RDI: 000056197b82da30
[  581.136177] RBP: 000056197b82da30 R08: 00007ffe528f86e0 R09: 00007ffe5292d080
[  581.136184] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000245
[  581.136191] R13: 0000561979b357e0 R14: 0000000000000003 R15: 00007ffe528f86e0


* Re: Unix domain socket missing error code
  2019-11-12  8:56   ` Adeel Sharif
  2019-11-13 13:00     ` Adeel Sharif
@ 2019-11-19  5:43     ` Cong Wang
  1 sibling, 0 replies; 5+ messages in thread
From: Cong Wang @ 2019-11-19  5:43 UTC (permalink / raw)
  To: Adeel Sharif; +Cc: Linux Kernel Network Developers

On Tue, Nov 12, 2019 at 12:56 AM Adeel Sharif
<madeel.sharif@googlemail.com> wrote:
>
> It should, but it is not evaluated when two different sockets are
> connected to each other. It is the third check in the if statement, and it
> is never reached because the first check inside unlikely() is false:
>
> if (other != sk &&
>         unlikely(unix_peer(other) != sk && unix_recvq_full(other))) {
>

Good catch!

It seems you already have a reproducer that triggers this OOM? If so,
please share it, and I will see if and how I can fix it.

Thanks.
