Date: Thu, 14 Feb 2019 00:39:50 +0100
From: Phil Sutter
To: Stephen Hemminger
Cc: Stefano Brivio, Eric Dumazet, netdev@vger.kernel.org,
 Sabrina Dubroca, David Ahern
Subject: Re: [PATCH iproute2 net-next v2 3/4] ss: Buffer raw fields first, then render them as a table
Message-ID: <20190213233950.GQ26388@orbyte.nwl.cc>
In-Reply-To: <20190213135534.01dacee5@shemminger-XPS-13-9360>
References: <82f1bc98-df6d-2b0a-17e5-fa057563284e@gmail.com>
 <20190213093711.13ab560e@redhat.com>
 <20190213221716.5f958c2a@redhat.com>
 <20190213135534.01dacee5@shemminger-XPS-13-9360>

Hi Stephen,

On Wed, Feb 13, 2019 at 01:55:34PM -0800, Stephen Hemminger wrote:
> On Wed, 13 Feb 2019 22:17:16 +0100
> Stefano Brivio wrote:
>
> > On Wed, 13 Feb 2019 09:31:03 -0800
> > Eric Dumazet wrote:
> >
> > > On 02/13/2019 12:37 AM, Stefano Brivio wrote:
> > > > On Tue, 12 Feb 2019 16:42:04 -0800
> > > > Eric Dumazet wrote:
> > > >
> > > >> I do not get it.
> > > >>
> > > >> "ss -emoi " uses almost 1KB per socket.
> > > >>
> > > >> 10,000,000 sockets -> we need about 10GB of memory ???
> > > >>
> > > >> This is a serious regression.
> > > >
> > > > I guess this is rather subjective: the worst case I considered back then
> > > > was the output of 'ss -tei0' (less than 500 bytes) for one million
> > > > sockets, which gives 500M of memory, which should in turn be fine on a
> > > > machine handling one million sockets.
> > > >
> > > > Now, if 'ss -emoi' on 10 million sockets is an actual use case (out of
> > > > curiosity: how are you going to process that output? Would JSON help?),
> > > > I see two easy options to solve this:
> > >
> > > ss -temoi | parser (written in shell or awk or whatever...)
> > >
> > > This is a use case, I just got bitten because using ss command
> > > actually OOM my container, while trying to debug a busy GFE.
> > >
> > > The host itself can have 10,000,000 TCP sockets, but usually sysadmin shells
> > > run in a container with no more than 500 MB available.
> > >
> > > Otherwise, it would be too easy for a buggy program to OOM the whole machine
> > > and have angry customers.
> > >
> > > >
> > > > 1. flush the output every time we reach a given buffer size (1M
> > > >    perhaps). This might make the resulting blocks slightly unaligned,
> > > >    with occasional loss of readability on lines occurring every 1k to
> > > >    10k sockets approximately, even though after 1k sockets column sizes
> > > >    won't change much (it looks anyway better than the original), and I
> > > >    don't expect anybody to actually scroll that output
> > > >
> > > > 2. add a switch for unbuffered output, but then you need to remember to
> > > >    pass it manually, and the whole output would be as bad as the
> > > >    original in case you need the switch.
> > > >
> > > > I'd rather go with 1., it's easy to implement (we already have partial
> > > > flushing with '--events') and it looks like a good compromise on
> > > > usability. Thoughts?
> > >
> > > 1 seems fine, but a switch for 'please do not try to format' would be fine.
> > >
> > > I wonder why we try to 'format' when stdout is a pipe or a regular file.
> >
> > On a second thought: what about | less, or | grep [ports],
> > or > readable.log? I guess those might also be rather common use cases,
> > what do you think?
> >
> > I'm tempted to skip this for the moment and just go with option 1.
>
> What I would favor:
> * use big enough columns that for the common case everything lines up fine
> * if column is to wide just print that element wider (which is what print %Ns does)

This is pretty much the situation Stefano attempted to improve, minus
scaling the columns to max terminal width. ss output formatting being
quirky and unreadable with either small or large terminals was the
number one reason I heard so far why people prefer netstat.

> and
> * add json output for programs that want to parse
> * use print_uint etc for that

For Eric's use-case, skipping any buffering and tabular output if stdout
is not a TTY suffices. In fact, iproute2 does this already for colored
output (see check_enable_color() for reference).
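
Roughly what I have in mind, as a sketch only (every name below, such as
want_table_buffer or emit_line, is made up for illustration; this is not
the actual ss or iproute2 code):

/* Sketch: decide once at startup whether buffering for column alignment
 * is worthwhile, mirroring how check_enable_color() only turns on color
 * for a terminal. */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static bool want_table_buffer;
static char table_buf[1 << 20];	/* only ever filled on the TTY path */
static size_t table_len;

static void check_enable_buffering(void)
{
	/* Align columns only when a human is watching; pipes and regular
	 * files get plain streamed lines, so memory use stays flat no
	 * matter how many sockets are dumped. */
	want_table_buffer = isatty(STDOUT_FILENO);
}

static void emit_line(const char *line)
{
	if (!want_table_buffer) {
		puts(line);		/* stream it out, keep nothing */
		return;
	}
	if (table_len + strlen(line) + 2 < sizeof(table_buf))
		table_len += snprintf(table_buf + table_len,
				      sizeof(table_buf) - table_len,
				      "%s\n", line);
}

static void render_table(void)
{
	/* Stand-in for the real column-width pass over the buffer. */
	if (want_table_buffer)
		fputs(table_buf, stdout);
}

int main(void)
{
	check_enable_buffering();
	emit_line("ESTAB 0 0 192.0.2.1:22 198.51.100.7:50422");
	render_table();
	return 0;
}

With something like that, 'ss -emoi | parser' never accumulates the big
buffer in the first place, while interactive use keeps the aligned table.
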
Adding JSON output support everywhere is a nice feature when it comes to
scripting, but it won't help console users, unless you expect CLI
frontends to appear which turn that JSON back into human-readable
output.

IMHO, JSON output wouldn't even help in this case - unless Eric indeed
prefers to write/use a JSON parser for his analysis instead of something
along the lines of 'ss | grep'.

> The buffering patch (in iproute2-next) can/will be reverted.

It's not fair to claim that despite Stefano's commitment to fixing the
reported issues. His ss output rewrite has been there since v4.15.0,
and according to git history it has needed only two fixes so far. I've
had one-liners which required more follow-ups than that! Also, we're
still discovering issues introduced by all the jsonify patches. Allowing
people to get things right not on the first try but after a few
iterations is important.

If you want to revert something, start with features which have a
fundamental design issue in the exact situation they tried to improve,
like the MSG_PEEK | MSG_TRUNC thing Hangbin and I wrote.

Thanks, Phil