linux-kernel.vger.kernel.org archive mirror
* increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
@ 2013-09-13  0:51 Fengguang Wu
  2013-09-13  1:09 ` Fengguang Wu
  2013-09-13  3:17 ` Greg KH
  0 siblings, 2 replies; 29+ messages in thread
From: Fengguang Wu @ 2013-09-13  0:51 UTC (permalink / raw)
  To: Peter Hurley; +Cc: Fengguang Wu, Greg KH, LKML, lkp, Tejun Heo

[-- Attachment #1: Type: text/plain, Size: 5651 bytes --]

Hi Peter,

FYI, we noticed significantly increased vmap_area_lock contention since
this commit:

commit 20bafb3d23d108bc0a896eb8b7c1501f4f649b77
Author: Peter Hurley <peter@hurleysoftware.com>
Date:   Sat Jun 15 10:21:19 2013 -0400

    n_tty: Move buffers into n_tty_data
    
    Reduce pointer reloading and improve locality-of-reference;
    allocate read_buf and echo_buf within struct n_tty_data.
    
    Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

:040000 040000 96d92e4e242c4b2ff11b25c005bccd093865b350 2822d87b2425c3e7adc7b722a20d739c9d4a3046 M	drivers
bisect run success

# bad: [9cdfea961eeaf9b6b939fb9dc822a204a1d94d08] Merge remote-tracking branch 'ipvs-next/master' into devel-hourly-2013090112
# good: [d4e4ab86bcba5a72779c43dc1459f71fea3d89c8] Linux 3.11-rc5
git bisect start '9cdfea961eeaf9b6b939fb9dc822a204a1d94d08' 'd4e4ab86bcba5a72779c43dc1459f71fea3d89c8' '--'
# good: [903bc7e8c2a4f9ef8e40aed615ed48023f9460e3] Merge remote-tracking branch 'spi/for-next'
git bisect good 903bc7e8c2a4f9ef8e40aed615ed48023f9460e3
# bad: [d926318427c1fa8df182d3dcd7f802d7d0609225] Merge remote-tracking branch 'bcon/master'
git bisect bad d926318427c1fa8df182d3dcd7f802d7d0609225
# bad: [adcb2bcd0dd2bb88f21180c8ffb9ec2549f4d883] Merge remote-tracking branch 'usb/usb-next'
git bisect bad adcb2bcd0dd2bb88f21180c8ffb9ec2549f4d883
# good: [a83605266423437d8afe73cbd538df33cf8fb32b] Merge remote-tracking branch 'driver-core/driver-core-next'
git bisect good a83605266423437d8afe73cbd538df33cf8fb32b
# good: [224563b6ce034b82f8511969d9496113da34fb2c] Merge tag 'for-usb-next-2013-08-15' of git://git.kernel.org/pub/scm/linux/kernel/git/sarah/xhci into usb-next
git bisect good 224563b6ce034b82f8511969d9496113da34fb2c
# bad: [5284eba7b37dfeffe75bfdf81a13e9efebe0480a] serial/arc-uart: Handle Rx Error Interrupts w/o any data
git bisect bad 5284eba7b37dfeffe75bfdf81a13e9efebe0480a
# bad: [7d88d637a3c75257b7611f9feee0c17fa187ecc9] n_tty: Factor standard per-char i/o into separate fn
git bisect bad 7d88d637a3c75257b7611f9feee0c17fa187ecc9
# good: [11b9faa44df76189b8346ff602a2c01c610c37eb] tty: Merge tty_buffer_find() into tty_buffer_alloc()
git bisect good 11b9faa44df76189b8346ff602a2c01c610c37eb
# good: [019ebdf9f26fd2e43b9e1af576835183e95dc82e] n_tty: Eliminate echo_commit memory barrier
git bisect good 019ebdf9f26fd2e43b9e1af576835183e95dc82e
# bad: [1bb9d562856279a217a4e581a803dee9bb48a708] n_tty: Rename process_char_map to char_map
git bisect bad 1bb9d562856279a217a4e581a803dee9bb48a708
# good: [9dfd16ddea9bdbc8343340e543732db0a467ae32] n_tty: Avoid false-sharing echo buffer indices
git bisect good 9dfd16ddea9bdbc8343340e543732db0a467ae32
# good: [8cb06c983822103da1cfe57b9901e60a00e61f67] n_tty: Remove alias ptrs in __receive_buf()
git bisect good 8cb06c983822103da1cfe57b9901e60a00e61f67
# bad: [20bafb3d23d108bc0a896eb8b7c1501f4f649b77] n_tty: Move buffers into n_tty_data
git bisect bad 20bafb3d23d108bc0a896eb8b7c1501f4f649b77
# first bad commit: [20bafb3d23d108bc0a896eb8b7c1501f4f649b77] n_tty: Move buffers into n_tty_data
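
The log above was produced by an automated `git bisect run`. A miniature,
self-contained sketch of the same flow (a hypothetical five-commit toy
repository, not the kernel tree; the "regression" is simulated by a value
crossing a threshold) looks like:

```shell
#!/bin/sh
# Toy reproduction of an automated bisect: build a throwaway repo,
# then let "git bisect run" find the first bad commit unattended.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q repo && cd repo
git config user.email robot@example.com
git config user.name "bisect robot"

# Five commits; the simulated regression lands in commit 4.
for i in 1 2 3 4 5; do
    echo "$i" > val
    git add val
    git commit -qm "commit $i"
done

# bad = HEAD (commit 5), good = HEAD~4 (commit 1).  The test script
# exits 0 (good) as long as the regression is absent.
git bisect start HEAD HEAD~4 >/dev/null
git bisect run sh -c 'test "$(cat val)" -lt 4' >/dev/null 2>&1

# Last line of the bisect log names the first bad commit.
git bisect log | tail -1
```

`git bisect run` repeats the "git bisect good/bad" steps shown in the log
above automatically, driving the search by the exit status of the test
script.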


Comparison of all good/bad commits in this bisect:

                        lock_stat.vmap_area_lock.contentions

   22000 ++-----------------------------------------------------------------+
   20000 ++O    O O O    O O O O OO O O   OO O O O OO   O O O O  O O O O    |
         O   OO       OO                O             O        O            |
   18000 ++                                                                 |
   16000 ++                                                                 |
         |                                                                  |
   14000 ++                                                                 |
   12000 ++                                                                 |
   10000 ++                                                                 |
         |                                                                  |
    8000 ++                                                                 |
    6000 ++                                                                 |
         |                                                                  |
    4000 *+*.**.*.*.*.**.*.*.*.*.**.*.*.*.**.*.*.*.**.*.*.*.*.**.*.*.*.**.*.*
    2000 ++-----------------------------------------------------------------+


                lock_stat.vmap_area_lock.contentions.get_vmalloc_info

   45000 ++-----------------------------------------------------------------+
         | O                   O    O            O      O   O      O        |
   40000 O+  OO O O O OO O O O   OO   O O OO O O   OO O   O   OO O   O O    |
   35000 ++                                                                 |
         |                                                                  |
   30000 ++                                                                 |
         |                                                                  |
   25000 ++                                                                 |
         |                                                                  |
   20000 ++                                                                 |
   15000 ++                                                                 |
         |                                                                  |
   10000 ++                                                                 |
         |                        *.*.     *. .*.          .*.  .*.*.       |
    5000 *+*-**-*-*-*-**-*-*-*-*-*----*-*-*--*---*-**-*-*-*---**-----*-**-*-*


[-- Attachment #2: 0001-n_tty-Move-buffers-into-n_tty_data.patch --]
[-- Type: text/x-diff, Size: 2468 bytes --]

From 20bafb3d23d108bc0a896eb8b7c1501f4f649b77 Mon Sep 17 00:00:00 2001
From: Peter Hurley <peter@hurleysoftware.com>
Date: Sat, 15 Jun 2013 10:21:19 -0400
Subject: [PATCH] n_tty: Move buffers into n_tty_data

Reduce pointer reloading and improve locality-of-reference;
allocate read_buf and echo_buf within struct n_tty_data.

Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/tty/n_tty.c | 25 +++++++++----------------
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index d6eba99..51cbdd2 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -105,9 +105,9 @@ struct n_tty_data {
 	unsigned char lnext:1, erasing:1, raw:1, real_raw:1, icanon:1;
 
 	/* shared by producer and consumer */
-	char *read_buf;
+	char read_buf[N_TTY_BUF_SIZE];
 	DECLARE_BITMAP(read_flags, N_TTY_BUF_SIZE);
-	unsigned char *echo_buf;
+	unsigned char echo_buf[N_TTY_BUF_SIZE];
 
 	int minimum_to_wake;
 
@@ -1695,9 +1695,7 @@ static void n_tty_close(struct tty_struct *tty)
 	if (tty->link)
 		n_tty_packet_mode_flush(tty);
 
-	kfree(ldata->read_buf);
-	kfree(ldata->echo_buf);
-	kfree(ldata);
+	vfree(ldata);
 	tty->disc_data = NULL;
 }
 
@@ -1715,7 +1713,8 @@ static int n_tty_open(struct tty_struct *tty)
 {
 	struct n_tty_data *ldata;
 
-	ldata = kzalloc(sizeof(*ldata), GFP_KERNEL);
+	/* Currently a malloc failure here can panic */
+	ldata = vmalloc(sizeof(*ldata));
 	if (!ldata)
 		goto err;
 
@@ -1723,16 +1722,14 @@ static int n_tty_open(struct tty_struct *tty)
 	mutex_init(&ldata->atomic_read_lock);
 	mutex_init(&ldata->output_lock);
 
-	/* These are ugly. Currently a malloc failure here can panic */
-	ldata->read_buf = kzalloc(N_TTY_BUF_SIZE, GFP_KERNEL);
-	ldata->echo_buf = kzalloc(N_TTY_BUF_SIZE, GFP_KERNEL);
-	if (!ldata->read_buf || !ldata->echo_buf)
-		goto err_free_bufs;
-
 	tty->disc_data = ldata;
 	reset_buffer_flags(tty->disc_data);
 	ldata->column = 0;
+	ldata->canon_column = 0;
 	ldata->minimum_to_wake = 1;
+	ldata->num_overrun = 0;
+	ldata->no_room = 0;
+	ldata->lnext = 0;
 	tty->closing = 0;
 	/* indicate buffer work may resume */
 	clear_bit(TTY_LDISC_HALTED, &tty->flags);
@@ -1740,10 +1737,6 @@ static int n_tty_open(struct tty_struct *tty)
 	tty_unthrottle(tty);
 
 	return 0;
-err_free_bufs:
-	kfree(ldata->read_buf);
-	kfree(ldata->echo_buf);
-	kfree(ldata);
 err:
 	return -ENOMEM;
 }
-- 
1.8.3.2



* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-13  0:51 increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data" Fengguang Wu
@ 2013-09-13  1:09 ` Fengguang Wu
  2013-09-17 15:34   ` Peter Hurley
  2013-09-18  0:49   ` Peter Hurley
  2013-09-13  3:17 ` Greg KH
  1 sibling, 2 replies; 29+ messages in thread
From: Fengguang Wu @ 2013-09-13  1:09 UTC (permalink / raw)
  To: Peter Hurley; +Cc: Greg KH, LKML, lkp, Tejun Heo

On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote:
> Hi Peter,
> 
> FYI, we noticed significantly increased vmap_area_lock contention since
> this commit:
> 
> commit 20bafb3d23d108bc0a896eb8b7c1501f4f649b77
> Author: Peter Hurley <peter@hurleysoftware.com>
> Date:   Sat Jun 15 10:21:19 2013 -0400
> 
>     n_tty: Move buffers into n_tty_data
>     
>     Reduce pointer reloading and improve locality-of-reference;
>     allocate read_buf and echo_buf within struct n_tty_data.

Here are some comparisons between this commit [o] and its parent commit [*].

                          slabinfo.kmalloc-128.active_objs

   10400 ++--------O---------------O----------------------------------------+
         |                 O    O       O  O      O    O  O  O O  O O  O O  O
   10200 ++      O    O O     O      O       O  O    O                      |
   10000 ++   O                                                             |
         O  O                                                               |
    9800 ++                                                                 |
         |                                                                  |
    9600 ++                                                                 |
         |                                                                  |
    9400 ++                                                                 |
    9200 ++                                                                 |
         |                                                                  |
    9000 *+.*.                                                              |
         |    *                                                             |
    8800 ++-----------------------------------------------------------------+


                          slabinfo.kmalloc-128.active_objs

   2400 ++------------------------------------------------------------------+
        |                                      O  O        O     O         O|
   2200 ++     O  OO    O    O   O    O O           O    OO         O   O   O
   2000 O+O OOO O   OOO  OOOO  OO OO O O O OOOO  O O O OO   O OOO O  OOO  O |
        |O                                                                  |
   1800 ++                                                                  |
        |                                                                   |
   1600 ++                                                                  |
        |                                                                   |
   1400 ++                                                                  |
   1200 ++                                                                  |
        |*                                                                  |
   1000 ++*.*                                                               |
        *    **                                                             |
    800 ++------------------------------------------------------------------+

    8cb06c983822103da1cf      20bafb3d23d108bc0a89
------------------------  ------------------------
                  976.67      +108.3%      2034.67  lkp-a04/micro/netperf/120s-200%-TCP_STREAM
                 8971.36       +11.4%      9997.05  nhm-white/micro/aim7/exec_test
                 9948.03       +20.9%     12031.72  TOTAL slabinfo.kmalloc-128.active_objs

    8cb06c983822103da1cf      20bafb3d23d108bc0a89  
------------------------  ------------------------  
                  976.67      +108.3%      2034.67  lkp-a04/micro/netperf/120s-200%-TCP_STREAM
                 9127.64       +11.4%     10164.15  nhm-white/micro/aim7/exec_test
                10104.31       +20.7%     12198.82  TOTAL slabinfo.kmalloc-128.num_objs

                        lock_stat.vmap_area_lock.holdtime-total

   1.8e+07 ++---------------------------------------------------------------+
           O  O   O      O         O                O  O                 O  |
   1.6e+07 ++        O      O   O     O   O  O   O         O  O   O  O      O
   1.4e+07 ++                                                               |
           |                                                                |
   1.2e+07 ++                                                               |
     1e+07 ++                                                               |
           |                                                                |
     8e+06 ++                                                               |
     6e+06 ++                                                               |
           |                                                                |
     4e+06 ++                                                               |
     2e+06 ++                                                               |
           *..*...*..*...*..*...*..*..*...*                                 |
         0 ++---------------------------------------------------------------+



                       lock_stat.vmap_area_lock.holdtime-total

   70000 ++-----------------------------------------------------------------+
         |                O                        O                    O   |
   60000 ++  O   O    O       O   O   O        O           O   O    O       |
         O                                 O           O                    O
   50000 ++                                                                 |
         |                                                                  |
   40000 ++                                                                 |
         |                                                                  |
   30000 ++                                                                 |
         |                                                                  |
   20000 ++                                                                 |
         |                                                                  |
   10000 ++                                                                 |
         *...*...*....*...*                                                 |
       0 ++-----------------------------------------------------------------+


                       lock_stat.vmap_area_lock.holdtime-total

   80000 ++-----------------------------------------------------------------+
         |                                                                  |
   70000 ++                                         O                       |
   60000 ++                                                                 |
         |                       O                                          |
   50000 ++                                               O     O           |
         |                                                                  O
   40000 ++                                   O                             |
         |                                                                  |
   30000 O+                O                                                |
   20000 ++    O     O                 O                              O     |
         |                                                                  |
   10000 ++                                                                 |
         |                                                                  |
       0 *+----*-----*-----*-----*------------------------------------------+


                       lock_stat.vmap_area_lock.holdtime-total

   90000 ++-----------------------------------------------------------------+
         |                                                                  |
   80000 ++                                             O                   |
   70000 ++                                                                 |
         |                                                                  |
   60000 ++                                                                 O
   50000 ++                                                          O      |
         |      O     O                          O                          |
   40000 O+                                                    O            |
   30000 ++                         O                                       |
         |                   O             O                                |
   20000 ++                                                                 |
   10000 ++                                                                 |
         *......                                                            |
       0 ++-----*-----*------*------*------*-----*--------------------------+


                       lock_stat.vmap_area_lock.holdtime-total

   80000 ++-----------------------------------------------------------------+
         |                             O              O                     |
   70000 ++                                                                 |
   60000 ++                                                                 |
         |                                                                  |
   50000 ++                    O                             O       O      |
         |                                                                  |
   40000 ++                                                                 |
         |                                                                  |
   30000 O+     O       O                                                   |
   20000 ++                                   O                             O
         |                                                                  |
   10000 ++                                                                 |
         *......*.......    ...*                                            |
       0 ++-------------*---------------------------------------------------+


                       lock_stat.vmap_area_lock.holdtime-total

   80000 ++-------------------------------------------------------------O---+
         |                                 O           O   O   O            O
   70000 ++                   O                                             |
   60000 ++                                                         O       |
         |                        O            O                            |
   50000 ++                                                                 |
         |                                                                  |
   40000 ++                                                                 |
         |                O                        O                        |
   30000 ++           O               O                                     |
   20000 O+  O   O                                                          |
         |                                                                  |
   10000 ++                                                                 |
         |                                                                  |
       0 *+--*---*----*-----------------------------------------------------+


                       lock_stat.vmap_area_lock.holdtime-total

   80000 ++-----------------------------------------------------------------+
         |                                                     O     O      |
   70000 ++                         O                                       O
   60000 ++                                                                 |
         |                                                                  |
   50000 ++                                                                 |
         |                                                                  |
   40000 ++                                                                 |
         O            O                                                     |
   30000 ++     O                          O     O                          |
   20000 ++                                             O                   |
         |                   O                                              |
   10000 ++                                                                 |
         |                                                                  |
       0 *+-----*-----*------*----------------------------------------------+


                        lock_stat.vmap_area_lock.holdtime-total

   120000 ++----------------------------------------------------------------+
          |                                                                 |
   100000 ++                                                O               O
          |                                O                                |
          |                                        O                        |
    80000 ++                                                                |
          |                                                                 |
    60000 ++                                                        O       |
          |       O                                                         |
    40000 O+               O                                                |
          |                                                                 |
          |                        O                                        |
    20000 ++                                                                |
          |                                                                 |
        0 *+------*--------*-------*----------------------------------------+


                        lock_stat.vmap_area_lock.holdtime-total

   1.6e+07 ++---------------------------------------------------------------+
           |  O                                       O    O    O           O
   1.4e+07 O+   O  O O  O O  O O  O O  O O  O O  O O    O    O    O  O O  O |
   1.2e+07 ++                                                               |
           |                                                                |
     1e+07 ++                                                               |
           |                                                                |
     8e+06 ++                                                               |
           |                                                                |
     6e+06 ++                                                               |
     4e+06 ++                                                               |
           |                                                                |
     2e+06 ++                                                               |
           *..*.*                                                           |
         0 ++---------------------------------------------------------------+


                        lock_stat.vmap_area_lock.holdtime-total

   400000 ++----------------------------------------------------------------+
          |                               O                      O          O
   350000 ++ O  O  O O           O     O    O                               |
          |                O  O     O          O     O  O  O  O    O  O     |
   300000 O+            O                         O                      O  |
   250000 ++                                                                |
          |                                                                 |
   200000 ++                                                                |
          |                                                                 |
   150000 ++                                                                |
   100000 ++                                                                |
          |                                                                 |
    50000 ++                                                                |
          *..*..*                                                           |
        0 ++----------------------------------------------------------------+


                       lock_stat.vmap_area_lock.holdtime-total

   60000 +O-----------O---------O-O--------O-----------------O-----------O--+
         OOOO O OOO OO OOOOOOOOO O OOOOOOOOOOO  O O  OO OOOOO O OOOO OOOOO OO
   50000 ++  O O   O   O O           O      O OO O OO OOO O O OOO  OOO O  O |
         |                                                                  |
         |                                                                  |
   40000 ++                                                                 |
         |                                                                  |
   30000 ++                                                                 |
         |                                                                  |
   20000 ++                                                                 |
         |                                                                  |
         |                                                                  |
   10000 ++                                                                 |
         **                                                                 |
       0 ++-----------------------------------------------------------------+


                        lock_stat.vmap_area_lock.holdtime-total

   350000 ++----------------------------------------------------------------+
          |    O            O                                    O  O  O    O
   300000 O+         O    O         O       O  O  O       O O               |
          |  O    O    O       O  O    O  O         O  O       O         O  |
   250000 ++                                                                |
          |                                                                 |
   200000 ++                                                                |
          |                                                                 |
   150000 ++                                                                |
          |                                                                 |
   100000 ++                                                                |
          |                                                                 |
    50000 ++                                                                |
          *..*.*..*                                                         |
        0 ++----------------------------------------------------------------+


                        lock_stat.vmap_area_lock.holdtime-total

   300000 ++--------------------------O----O--------------O-------O---------+
          O  O O    O              O          O O  O    O    O      O       |
   250000 ++      O           O  O      O            O         O       O    O
          |            O O  O                                            O  |
          |                                                                 |
   200000 ++                                                                |
          |                                                                 |
   150000 ++                                                                |
          |                                                                 |
   100000 ++                                                                |
          |                                                                 |
          |                                                                 |
    50000 ++                                                                |
          *..*.*..*.*                                                       |
        0 ++----------------------------------------------------------------+


                       lock_stat.vmap_area_lock.holdtime-total

   70000 ++-----------------------------------------------------------------+
         |                          O       O                         O     |
   60000 +O  O  OOO   OO   O   O O O O  OOOO O O O OO O OO   O O   O O  O   |
         O O  OO    OO  OO  OOO   O   O         O O    O  O O O OO  O  O  OOO
   50000 ++                                                                 |
         |                                                                  |
   40000 ++                                                                 |
         |                                                                  |
   30000 ++                                                                 |
         |                                                                  |
   20000 ++                                                                 |
         |                                                                  |
   10000 ++                                                                 |
         ***.***                                                            |
       0 ++-----------------------------------------------------------------+


                        lock_stat.vmap_area_lock.holdtime-total

   400000 ++----------------------------------------------------------------+
          |                     O                   O           O    O      |
   350000 O+   O    O    O O                O  O O    O O  O O              O
          | O    O    O       O   O  O O  O                       O    O  O |
   300000 ++                                                                |
   250000 ++                                                                |
          |                                                                 |
   200000 ++                                                                |
          |                                                                 |
   150000 ++                                                                |
   100000 ++                                                                |
          |                                                                 |
    50000 ++                                                                |
          *.*..*.*..*.*                                                     |
        0 ++----------------------------------------------------------------+


                       lock_stat.vmap_area_lock.holdtime-total

   80000 ++--------------------------O--------------------------------------+
         |                      O                        O            O     |
   70000 ++O    O    OO   OOO     OO  OO O OO O O  OOO  O OOO    OO OO OO O |
   60000 ++   OO O OO  O O   O   O      O    O O  O   O       OOO          OO
         OO  O                 O                                            |
   50000 ++                                                                 |
         |                                                                  |
   40000 ++                                                                 |
         |                                                                  |
   30000 ++                                                                 |
   20000 ++                                                                 |
         |                                                                  |
   10000 ++                                                                 |
         ***.***                                                            |
       0 ++-----------------------------------------------------------------+


                       lock_stat.vmap_area_lock.holdtime-total

   50000 ++-------------------------------O---------------------------------O
   45000 OOO OOO O   O OOO  OO O  O   O    OO  OO   O  O O      OO OOOO    O|
         |      O O O O    O  O    OOO   O       OOO  O O O OO O       OO O |
   40000 ++                      O      O    O                O             |
   35000 ++                                                                 |
         |                                                                  |
   30000 ++                                                                 |
   25000 ++                                                                 |
   20000 ++                                                                 |
         |                                                                  |
   15000 ++                                                                 |
   10000 ++                                                                 |
         |                                                                  |
    5000 ***.***                                                            |
       0 ++-----------------------------------------------------------------+


                       lock_stat.vmap_area_lock.holdtime-total

   45000 ++--------------O-----------O---O----------------------------------+
         | O OO  O  OOOO   O O OO             O O  O      OOO               |
   40000 O+    O   O      O O     OO   OO  OO  O  O OO  OO    OO OO OOOOO OOO
   35000 +O     O                O    O      O        O         O           |
         |                                                                  |
   30000 ++                                                                 |
   25000 ++                                                                 |
         |                                                                  |
   20000 ++                                                                 |
   15000 ++                                                                 |
         |                                                                  |
   10000 ++                                                                 |
    5000 ++                                                                 |
         ***.***                                                            |
       0 ++-----------------------------------------------------------------+


                       lock_stat.vmap_area_lock.holdtime-total

   45000 ++------------------------------O------O--O------------------------+
         |  O   O  O   O O   OOOOO  O  O     O   O  O   O  O O OO   O OO O  |
   40000 OO   OO OO  OO    O      O  OO O  OO OO     OOO  O O O   OO O    OOO
   35000 ++O            O O                                                 |
         |                                                                  |
   30000 ++                                                                 |
   25000 ++                                                                 |
         |                                                                  |
   20000 ++                                                                 |
   15000 ++                                                                 |
         |                                                                  |
   10000 ++                                                                 |
    5000 ++                                                                 |
         ****.**                                                            |
       0 ++-----------------------------------------------------------------+


                        lock_stat.vmap_area_lock.holdtime-total

   5.5e+06 O+---------------------------O------O----------------------------+
     5e+06 ++     O      O       O                    O       O      O      |
           |                                                                O
   4.5e+06 ++                                                               |
     4e+06 ++                                                               |
   3.5e+06 ++                                                               |
     3e+06 ++                                                               |
           |                                                                |
   2.5e+06 ++                                                               |
     2e+06 ++                                                               |
   1.5e+06 ++                                                               |
     1e+06 ++                                                               |
           |                                                                |
    500000 *+.....*......*.......*......*                                   |
         0 ++---------------------------------------------------------------+


                       lock_stat.vmap_area_lock.holdtime-total

   7e+06 ++-----------------------------------------------------------------+
         |             O                                                    |
   6e+06 ++O O O O O O   O O O O O   O O O  O O O O O O O O O O O O O O O O O
         O                         O                                        |
   5e+06 ++                                                                 |
         |                                                                  |
   4e+06 ++                                                                 |
         |                                                                  |
   3e+06 ++                                                                 |
         |                                                                  |
   2e+06 ++                                                                 |
         |                                                                  |
   1e+06 ++                                                                 |
         *.*.*.*.*.*.*.*.*.*                                                |
       0 ++-----------------------------------------------------------------+


                       lock_stat.vmap_area_lock.holdtime-total

   45000 ++--O--------------------------------------------------------------+
         |          O   O      O         O    O O O O O      O O O O    O   |
   40000 O+O   O  O       O O    O O O O    O            O O          O   O |
   35000 ++           O                                                     O
         |                                                                  |
   30000 ++                                                                 |
   25000 ++                                                                 |
         |                                                                  |
   20000 ++                                                                 |
   15000 ++                                                                 |
         |                                                                  |
   10000 ++                                                                 |
    5000 ++                                                                 |
         *.*.*                                                              |
       0 ++-----------------------------------------------------------------+


                       lock_stat.vmap_area_lock.holdtime-total

   70000 ++-----------------------------------------------------------------+
         |                                       O                          |
   60000 O+  O    O   O        O    O   O    O        O   O    O   O    O   O
         |                 O                                                |
   50000 ++                                                                 |
         |                                                                  |
   40000 ++                                                                 |
         |                                                                  |
   30000 ++                                                                 |
         |                                                                  |
   20000 ++                                                                 |
         |                                                                  |
   10000 ++                                                                 |
         *...*....*                                                         |
       0 ++-----------------------------------------------------------------+


                       lock_stat.vmap_area_lock.holdtime-total

   45000 ++-----------------------------------------------------------------+
         |         O                                                      O |
   40000 O+    O O   O O O O   O O O O   O  O O     O O     O O O O O   O   O
   35000 ++O O               O         O        O O     O O           O     |
         |                                                                  |
   30000 ++                                                                 |
   25000 ++                                                                 |
         |                                                                  |
   20000 ++                                                                 |
   15000 ++                                                                 |
         |                                                                  |
   10000 ++                                                                 |
    5000 ++                                                                 |
         *.*.*.*                                                            |
       0 ++-----------------------------------------------------------------+


                        lock_stat.vmap_area_lock.holdtime-total

   400000 ++----------------------------------------------------------------+
          |               O        O             O O                 O      |
   350000 ++       O O  O   O O  O   O  O O O O      O  O O O O  O O   O  O O
          O    O O                                                          |
   300000 ++O                                                               |
   250000 ++                                                                |
          |                                                                 |
   200000 ++                                                                |
          |                                                                 |
   150000 ++                                                                |
   100000 ++                                                                |
          |                                                                 |
    50000 ++                                                                |
          |                                                                 |
        0 *+*--*-*----------------------------------------------------------+


                       lock_stat.vmap_area_lock.holdtime-total

   50000 ++-----------------O-------O----------------------------O--------O-+
   45000 O+   O O O   OOO    OO   O   O  O OO  O OOOO   O O   OOO   OO OO  O|
         |OO O   O  OO   O O   O O O O  O O  O  O     OO O         O  O     |
   40000 ++    O                                            OO              O
   35000 ++                                                                 |
         |                                                                  |
   30000 ++                                                                 |
   25000 ++                                                                 |
   20000 ++                                                                 |
         |                                                                  |
   15000 ++                                                                 |
   10000 ++                                                                 |
         |                                                                  |
    5000 ***.***                                                            |
       0 ++-----------------------------------------------------------------+


                        lock_stat.vmap_area_lock.holdtime-total

   120000 ++----------------------------------------------------------------+
          |                   O    O O           O   O                      |
   100000 ++              O      O          O O           O            O  O O
          |                               O        O    O   O    O   O      |
          |          O  O               O                                   |
    80000 ++                O                                 O             |
          |    O   O                                               O        |
    60000 O+O                                                               |
          |      O                                                          |
    40000 ++                                                                |
          |                                                                 |
          |                                                                 |
    20000 ++                                                                |
          |                                                                 |
        0 *+*--*-*-*--------------------------------------------------------+


                       lock_stat.vmap_area_lock.holdtime-total

   10000 ++-----------------------------------------------------------------+
    9000 ++             O    O    O O                 O         O    O      |
         O                O    O       O O  O O  O O    O  O O    O    O  O O
    8000 ++O  O O  O O                                                      |
    7000 ++                                                                 |
         |                                                                  |
    6000 ++                                                                 |
    5000 ++                                                                 |
    4000 ++                                                                 |
         |                                                                  |
    3000 ++                                                                 |
    2000 ++                                                                 |
         |                                                                  |
    1000 ++                                                                 |
       0 *+*--*-*--*-*------------------------------------------------------+


                        lock_stat.vmap_area_lock.holdtime-total

   300000 ++------------------O----------------------------------O-O--------O
          O O  O O O O  O O O    O O O    O O O  O O O  O O O O      O O  O |
   250000 ++                            O                                   |
          |                                                                 |
          |                                                                 |
   200000 ++                                                                |
          |                                                                 |
   150000 ++                                                                |
          |                                                                 |
   100000 ++                                                                |
          |                                                                 |
          |                                                                 |
    50000 ++                                                                |
          |                                                                 |
        0 *+*--*-*-*-*------------------------------------------------------+


                       lock_stat.vmap_area_lock.holdtime-total

   70000 ++-----------------------------------------O-----------------------+
         |      O  O  O           O O    O                   O              |
   60000 OO O  O O   O OO     OOO    O O   OOO OOO   O     OO   O OO  O  OOO|
         | O  O   O      OOO O   O    O O     O    O  OOO O   OO    OO O    O
   50000 ++                                                                 |
         |                                                                  |
   40000 ++                                                                 |
         |                                                                  |
   30000 ++                                                                 |
         |                                                                  |
   20000 ++                                                                 |
         |                                                                  |
   10000 ++                                                                 |
         ****.**                                                            |
       0 ++-----------------------------------------------------------------+


                       lock_stat.vmap_area_lock.holdtime-total

   50000 ++-----------------------------------------------------------------+
   45000 ++  O        O   O     O O        O         O O         OO O O   O O
         O O     O OO   O   O       O O O O  O O O O     O O O O        O   |
   40000 ++                   O                                             |
   35000 ++    O                                                            |
         |                                                                  |
   30000 ++                                                                 |
   25000 ++                                                                 |
   20000 ++                                                                 |
         |                                                                  |
   15000 ++                                                                 |
   10000 ++                                                                 |
         |                                                                  |
    5000 *+*.*.*.*.*                                                        |
       0 ++-----------------------------------------------------------------+


                       lock_stat.vmap_area_lock.holdtime-total

   60000 ++---------O----O------------O--------O--------------------------O-+
         O O    O O   O    O O      O   O    O   O O  O O O O  O O O O      O
   50000 ++  O                 O  O        O                            O   |
         |                                                                  |
         |                                                                  |
   40000 ++                                                                 |
         |                                                                  |
   30000 ++                                                                 |
         |                                                                  |
   20000 ++                                                                 |
         |                                                                  |
         |                                                                  |
   10000 ++                                                                 |
         *.*.*..*.*.*                                                       |
       0 ++-----------------------------------------------------------------+

    8cb06c983822103da1cf      20bafb3d23d108bc0a89  
------------------------  ------------------------  
                 3470.31     +1631.0%     60070.49  fat/micro/dd-write/1HDD-cfq-ext4-10dd
                 3339.57     +1676.9%     59340.71  fat/micro/dd-write/1HDD-cfq-ext4-1dd
                 2848.79     +1489.1%     45269.54  lkp-a04/micro/netperf/120s-200%-TCP_CRR
                 3563.01     +1515.2%     57548.50  lkp-a04/micro/netperf/120s-200%-TCP_MAERTS
                 2678.64     +1346.0%     38733.80  lkp-a04/micro/netperf/120s-200%-TCP_RR
                 2839.84     +1450.2%     44022.08  lkp-a04/micro/netperf/120s-200%-TCP_SENDFILE
                 3417.37     +1571.4%     57116.34  lkp-a04/micro/netperf/120s-200%-TCP_STREAM
                 2558.59     +1450.8%     39677.58  lkp-a04/micro/netperf/120s-200%-UDP_RR
                 3737.24     +1558.0%     61963.62  lkp-a04/micro/netperf/120s-200%-UDP_STREAM
                20219.50     +1488.7%    321218.02  lkp-a06/crypto/tcrypt/2s-200-204
                21017.17     +1457.1%    327257.41  lkp-a06/crypto/tcrypt/2s-205-210
                22109.84     +1240.3%    296346.33  lkp-a06/crypto/tcrypt/2s-401-417
                17909.60     +1467.3%    280693.71  lkp-a06/micro/dbench/100%
               489739.50      +978.5%   5281916.05  lkp-ne04/micro/aim7/shell_rtns_1
              1601675.63      +906.7%  16123642.52  lkp-snb01/micro/aim7/exec_test
                12105.00     +2453.6%    309110.42  nhm-white/micro/aim7/dbase
               822461.02     +1585.0%  13858430.62  nhm-white/micro/aim7/exec_test
                 9858.11     +2715.9%    277595.41  nhm-white/micro/aim7/fork_test
                 3452.91     +1685.5%     61650.74  nhm-white/micro/aim7/fserver
                  300.14     +2621.5%      8168.53  nhm-white/micro/aim7/misc_rtns_1
               345479.21     +1624.5%   5957828.25  nhm-white/micro/aim7/shell_rtns_1
                 2694.48     +1974.4%     55894.19  nhm-white/sysbench/oltp/100%-600-1000000
                 4415.67     +1202.2%     57501.52  nhm8/micro/dbench/100%
                 2284.65     +1505.2%     36672.75  snb-drag/crypto/tcrypt/2s-200-204
                 2446.02     +1537.1%     40042.87  snb-drag/crypto/tcrypt/2s-205-210
                 2484.11     +1599.6%     42219.71  snb-drag/crypto/tcrypt/2s-500-504
                 2118.55     +1155.8%     26604.99  vpx/crypto/tcrypt/2s-200-204
                 2713.48     +1198.5%     35234.77  vpx/crypto/tcrypt/2s-205-210
                 2711.31      +973.8%     29114.07  vpx/crypto/tcrypt/2s-301-319
                 2369.23      +940.3%     24648.12  vpx/crypto/tcrypt/2s-401-417
                 2620.64     +1428.7%     40060.71  vpx/crypto/tcrypt/2s-500-504
                 1713.98     +1624.3%     29553.72  vpx/crypto/tcrypt/2s-505-509
              3423353.12     +1184.9%  43985148.08  TOTAL lock_stat.vmap_area_lock.holdtime-total

                 lock_stat.vmap_area_lock.contentions.get_vmalloc_info

   200000 ++----------------------------------------------------------------+
          O  O   O  O   O      O  O   O  O                                  |
   180000 ++               O                 O      O  O   O  O   O  O   O  O
   160000 ++                                    O                           |
          |                                                                 |
   140000 ++                                                                |
   120000 ++                                                                |
          |                                                                 |
   100000 ++                                                                |
    80000 ++                                                                |
          |                                                                 |
    60000 ++                                                                |
    40000 ++                                                                |
          *..*...*..*...*..*...*..*...*..*                                  |
    20000 ++----------------------------------------------------------------+



                lock_stat.vmap_area_lock.contentions.get_vmalloc_info

   70000 ++-----------------------------------------------------------------+
         |                                                                  |
   60000 O+ O                                        O    O    O            O
         |    O  O O  O O  O  O O  O O  O  O O  O O    O     O    O O  O O  |
   50000 ++                                                                 |
         |                                                                  |
   40000 ++                                                                 |
         |                                                                  |
   30000 ++                                                                 |
         |                                                                  |
   20000 ++                                                                 |
         |                                                                  |
   10000 *+.*.*                                                             |
         |                                                                  |
       0 ++-----------------------------------------------------------------+



                lock_stat.vmap_area_lock.contentions.get_vmalloc_info

   30000 ++-----------------------------------------------------------------+
         O      O              O       O      O       O              O      |
   25000 ++             O                                    O              O
         |                                                                  |
         |                                                                  |
   20000 ++                                                                 |
         |                                                                  |
   15000 ++                                                                 |
         |                                                                  |
   10000 ++                                                                 |
         |                                                                  |
         |                                                                  |
    5000 *+.....*.......*......*.......*                                    |
         |                                                                  |
       0 ++-----------------------------------------------------------------+



                lock_stat.vmap_area_lock.contentions.get_vmalloc_info

   14000 ++-----------------------------------------------------------------+
         |             O                                                    |
   12000 ++O O O O O O   O O   O            O   O O O O O       O O O O     |
         O                   O   O O O O O    O           O O O         O O O
   10000 ++                                                                 |
         |                                                                  |
    8000 ++                                                                 |
         |                                                                  |
    6000 ++                                                                 |
         |                                                                  |
    4000 ++                                                                 |
         |                                                                  |
    2000 ++                                                                 |
         *.*.*.*.*.*.*.*.*.*                                                |
       0 ++-----------------------------------------------------------------+

    8cb06c983822103da1cf      20bafb3d23d108bc0a89  
------------------------  ------------------------  
                 4952.40      +447.0%     27090.40  lkp-ne04/micro/aim7/shell_rtns_1
                28410.80      +556.2%    186423.00  lkp-snb01/micro/aim7/exec_test
                 8142.00      +615.4%     58247.33  nhm-white/micro/aim7/exec_test
                 1386.00      +762.6%     11955.20  nhm-white/micro/aim7/shell_rtns_1
                42891.20      +561.5%    283715.93  TOTAL lock_stat.vmap_area_lock.contentions.get_vmalloc_info

                        lock_stat.vmap_area_lock.waittime-total

   1.2e+06 ++---------------------------------------------------------------+
           |             O                                                  |
     1e+06 O+ O   O                O  O   O                                 |
           |         O      O   O            O      O  O      O          O  O
           |                                     O         O      O  O      |
    800000 ++                                                               |
           |                                                                |
    600000 ++                                                               |
           |                                                                |
    400000 ++                                                               |
           |                                                                |
           |                                                                |
    200000 ++                                                               |
           |                                                                |
         0 *+-*---*--*---*--*---*--*--*---*---------------------------------+



                        lock_stat.vmap_area_lock.waittime-total

   250000 ++----------------------------------------------------------------+
          |      O  O                                                       |
          O             O  O   O  O   O  O   *..*...     ..*..*...  .*...*..|
   200000 ++ O                               :      *..*.         *.        *
          |                                 :                               |
          |                                 :                               |
   150000 ++                                :                               |
          |                                :                                |
   100000 ++                               :                                |
          |                                :                                |
          |                               :                                 |
    50000 ++                              :                                 |
          |                               :                                 |
          |                              :                                  |
        0 *+-*---*--*---*--*---*--*---*--*----------------------------------+



                        lock_stat.vmap_area_lock.waittime-total

   250000 ++----------------------------------------------------------------+
          |  O                                       O    O    O            O
          O    O  O O  O         O    O    O                 O      O    O  |
   200000 ++             O  O O    O    O     O O  O    O         O    O    |
          |                                                                 |
          |                                                                 |
   150000 ++                                                                |
          |                                                                 |
   100000 ++                                                                |
          |                                                                 |
          |                                                                 |
    50000 ++                                                                |
          |                                                                 |
          |                                                                 |
        0 *+-*-*------------------------------------------------------------+



                        lock_stat.vmap_area_lock.waittime-total

   120000 ++----------------------------------------------------------------+
          O      O              O      O       O      O              O      |
   100000 ++             O                                   O              |
          |                                                                 O
          |                                                                 |
    80000 ++                                                                |
          |                                                                 |
    60000 ++                                                                |
          |                                                                 |
    40000 ++                                                                |
          |                                                                 |
          |                                                                 |
    20000 ++                                                                |
          |                                                                 |
        0 *+-----*-------*------*------*------------------------------------+



                       lock_stat.vmap_area_lock.waittime-total

   60000 ++-----------------------------------------------------------------+
         |             O                                                    |
   50000 ++  O O O O O   O                  O   O O   O O       O O O O     |
         O O               O O O O   O O O    O     O     O O O         O O O
         |                         O                                        |
   40000 ++                                                                 |
         |                                                                  |
   30000 ++                                                                 |
         |                                                                  |
   20000 ++                                                                 |
         |                                                                  |
         |                                                                  |
   10000 ++                                                                 |
         |                                                                  |
       0 *+*-*-*-*-*-*-*-*-*------------------------------------------------+



                       lock_stat.vmap_area_lock.waittime-total

   3500 ++------------------------------------------------------------------+
        |             O                                                     |
   3000 ++                            O                                     |
        |                                                                   |
   2500 ++                                                                O |
        |                                                                   |
   2000 ++O                              O                                  |
        |                      O  O             O                  O O      |
   1500 ++   O          O  O                      O                         |
        O                                              O    O   O      O    |
   1000 ++     O    O        O             O                                O
        |        O                            O          O    O             |
    500 ++                          O                O                      |
        |                                                                   |
      0 *+*--*--------------------------------------------------------------+

    8cb06c983822103da1cf      20bafb3d23d108bc0a89  
------------------------  ------------------------  
                 1990.41     +5378.7%    109049.26  lkp-ne04/micro/aim7/shell_rtns_1
                18611.42     +5349.4%   1014205.59  lkp-snb01/micro/aim7/exec_test
                 4171.17     +5040.7%    214427.74  lkp-snb01/micro/aim7/shell_rtns_1
                   10.18    +14277.7%      1463.17  nhm-white/micro/aim7/dbase
                 1715.53    +12791.3%    221153.70  nhm-white/micro/aim7/exec_test
                  325.21    +15376.6%     50331.77  nhm-white/micro/aim7/shell_rtns_1
                26823.92     +5904.5%   1610631.24  TOTAL lock_stat.vmap_area_lock.waittime-total



                         lock_stat.vmap_area_lock.contentions

   100000 ++----------------------------------------------------------------+
          O  O   O  O   O      O  O   O  O                                  |
    90000 ++               O                 O      O  O   O  O   O  O   O  O
    80000 ++                                    O                           |
          |                                                                 |
    70000 ++                                                                |
    60000 ++                                                                |
          |                                                                 |
    50000 ++                                                                |
    40000 ++                                                                |
          |                                                                 |
    30000 ++                                                                |
    20000 ++                                                                |
          *..*...*..*...*..*...*..*...*..*                                  |
    10000 ++----------------------------------------------------------------+



                        lock_stat.vmap_area_lock.contentions

   35000 ++-----------------------------------------------------------------+
         |                                                                  |
   30000 O+ O                                        O    O    O            O
         |    O  O O  O O  O  O O  O O  O  O O  O O    O     O    O O  O O  |
   25000 ++                                                                 |
         |                                                                  |
   20000 ++                                                                 |
         |                                                                  |
   15000 ++                                                                 |
         |                                                                  |
   10000 ++                                                                 |
         |                                                                  |
    5000 *+.*.*                                                             |
         |                                                                  |
       0 ++-----------------------------------------------------------------+



                        lock_stat.vmap_area_lock.contentions

   14000 O+-----O-----------------------------O-------O--------------O------+
         |              O      O       O                     O              O
   12000 ++                                                                 |
         |                                                                  |
         |                                                                  |
   10000 ++                                                                 |
         |                                                                  |
    8000 ++                                                                 |
         |                                                                  |
    6000 ++                                                                 |
         |                                                                  |
         |                                                                  |
    4000 ++                                                                 |
         *......    ....*......*.......*                                    |
    2000 ++-----*-----------------------------------------------------------+

    8cb06c983822103da1cf      20bafb3d23d108bc0a89  
------------------------  ------------------------  
                 2476.20      +447.0%     13545.20  lkp-ne04/micro/aim7/shell_rtns_1
                14205.40      +556.2%     93211.50  lkp-snb01/micro/aim7/exec_test
                 4071.00      +615.4%     29123.67  nhm-white/micro/aim7/exec_test
                20752.60      +554.8%    135880.37  TOTAL lock_stat.vmap_area_lock.contentions



                         lock_stat.vmap_area_lock.con-bounces

   100000 ++----------------------------------------------------------------+
          O  O   O  O   O      O  O   O  O                                  |
    90000 ++               O                 O      O  O   O  O   O  O   O  O
    80000 ++                                    O                           |
          |                                                                 |
    70000 ++                                                                |
    60000 ++                                                                |
          |                                                                 |
    50000 ++                                                                |
    40000 ++                                                                |
          |                                                                 |
    30000 ++                                                                |
    20000 ++                                                                |
          *..*...*..*...*..*...*..*...*..*                                  |
    10000 ++----------------------------------------------------------------+



                        lock_stat.vmap_area_lock.con-bounces

   35000 ++-----------------------------------------------------------------+
         |                                                                  |
   30000 O+ O                                        O    O    O            O
         |    O  O O  O O  O  O O  O O  O  O O  O O    O     O    O O  O O  |
   25000 ++                                                                 |
         |                                                                  |
   20000 ++                                                                 |
         |                                                                  |
   15000 ++                                                                 |
         |                                                                  |
   10000 ++                                                                 |
         |                                                                  |
    5000 *+.*.*                                                             |
         |                                                                  |
       0 ++-----------------------------------------------------------------+



                        lock_stat.vmap_area_lock.con-bounces

   14000 O+-----O-----------------------------O-------O--------------O------+
         |              O      O       O                     O              O
   12000 ++                                                                 |
         |                                                                  |
         |                                                                  |
   10000 ++                                                                 |
         |                                                                  |
    8000 ++                                                                 |
         |                                                                  |
    6000 ++                                                                 |
         |                                                                  |
         |                                                                  |
    4000 ++                                                                 |
         *......    ....*......*.......*                                    |
    2000 ++-----*-----------------------------------------------------------+

    8cb06c983822103da1cf      20bafb3d23d108bc0a89  
------------------------  ------------------------  
                 2476.20      +447.0%     13545.20  lkp-ne04/micro/aim7/shell_rtns_1
                14205.40      +556.2%     93211.50  lkp-snb01/micro/aim7/exec_test
                 4071.00      +615.4%     29123.67  nhm-white/micro/aim7/exec_test
                20752.60      +554.8%    135880.37  TOTAL lock_stat.vmap_area_lock.con-bounces

                                 time.system_time

   550 ++--------------------------------------O------O----------O----------+
   540 ++                                          O      O  O       O  O   O
       |                                   O                                |
   530 ++                                                                   |
   520 ++                                                                   |
       |                                                                    |
   510 ++                                                                   |
   500 ++                                                                   |
   490 ++                                                                   |
       |                                                                    |
   480 O+                       O                                           |
   470 *+..           O  O   O      O   O                                   |
       |   *..*.. O  .*..         ..*...                                    |
   460 ++        . ..    *...*..*.      *                                   |
   450 ++---------*---------------------------------------------------------+



                                 time.system_time

   30 ++--------------------------------------------------------------------+
      |                           O                                         |
      |                                                                     |
   25 ++                                                                    |
      |                                                                     |
      |                                                                     |
   20 ++                                                                    |
      |                                                                     |
   15 O+                   O                           O      O             O
      |                                                                     |
      |      O      O                           O                           |
   10 ++                                                                    |
      *......    ...*......*             O                           O      |
      |      *...                                                           |
    5 ++--------------------------------------------------------------------+



                                 time.system_time

   740 ++-------------------------------------------------------------------+
       |              O               O                                     |
   730 ++                     O                                             |
   720 O+                                                                   |
       |       O                                                            |
   710 ++                                                                   |
       |                                                                    |
   700 ++                                                                   |
       |                                                     O      O       |
   690 ++                                                                   |
   680 ++  ....*......    ....*.....         O       O                      |
       *...           *...          ..                                      |
   670 ++                             *                                     O
       |                                                                    |
   660 ++-------------------------------------------------------------------+

    8cb06c983822103da1cf      20bafb3d23d108bc0a89  
------------------------  ------------------------  
                  677.85        +7.4%       728.28  lkp-ne04/micro/aim7/shell_rtns_1
                  459.71        +2.4%       470.66  lkp-snb01/micro/aim7/exec_test
                    8.05       +80.1%        14.49  vpx/crypto/tcrypt/2s-505-509
                 1145.61        +5.9%      1213.42  TOTAL time.system_time



                                  time.user_time

   48 ++---------------------------------------------------------O----------+
      |                                        O      O                     |
   47 ++                                          O                         |
   46 ++                                                  O  O       O  O   O
      |                                    O                                |
   45 ++                                                                    |
      O                                                                     |
   44 ++                    O      O   O                                    |
      |              O  O       O                                           |
   43 *+..*..O   O                                                          |
   42 ++     *..                *..*..                                      |
      |         .  ..*..*..   ..      .                                     |
   41 ++         *.        . .         *                                    |
      |                     *                                               |
   40 ++--------------------------------------------------------------------+



                                  time.user_time

   162 ++-----------------------------O-------------------------------------+
       O                                                                    |
   160 ++             O       O                                             |
   158 ++                                                                   |
       |                                                                    |
   156 ++      O                                                            |
       |                                                                    |
   154 ++                                                    O              |
       |                                                                    |
   152 ++                                    O                              |
   150 ++                    .*...                   O              O       |
       |     ..*...        ..     ...                                       |
   148 ++....      ..    ..          .                                      O
       *.            . ..             *                                     |
   146 ++-------------*-----------------------------------------------------+

    8cb06c983822103da1cf      20bafb3d23d108bc0a89  
------------------------  ------------------------  
                  148.19        +7.8%       159.75  lkp-ne04/micro/aim7/shell_rtns_1
                   41.96        +3.6%        43.49  lkp-snb01/micro/aim7/exec_test
                  190.15        +6.9%       203.23  TOTAL time.user_time



                                   iostat.cpu.idle

     34 ++---------------------*--------------------------------------------+
        |                ....        .                                      |
        *.......*......*.             *                                     |
   33.5 O+             O                                                    |
        |                      O      O                                     |
        |       O                                                           |
     33 ++                                                                  |
        |                                                                   |
   32.5 ++                                                                  |
        |                                                                   |
        |                                            O                      |
     32 ++                                                                  O
        |                                                                   |
        |                                     O              O      O       |
   31.5 ++------------------------------------------------------------------+

    8cb06c983822103da1cf      20bafb3d23d108bc0a89  
------------------------  ------------------------  
                   33.75        -1.3%        33.31  lkp-ne04/micro/aim7/shell_rtns_1
                   33.75        -1.3%        33.31  TOTAL iostat.cpu.idle



                         time.percent_of_cpu_this_job_got

   270 ++--------------------------------------O------O----------O------O---+
       |                                           O      O  O       O      O
   265 ++                                  O                                |
       |                                                                    |
       |                                                                    |
   260 ++                                                                   |
       |                                                                    |
   255 ++                                                                   |
       O                     O  O       O                                   |
   250 ++             O                                                     |
       *...   O          O          O                                       |
       |   *..    O   *..         ..*..                                     |
   245 ++     *..   ..         .*.     .                                    |
       |         . .     *...*.         *                                   |
   240 ++---------*---------------------------------------------------------+



                         time.percent_of_cpu_this_job_got

   555 ++-------------------------------------------------------------------+
   550 ++                             O                                     |
       |              O       O                                             |
   545 O+                                                                   |
   540 ++      O                                                            |
       |                                                     O              |
   535 ++                                                           O       |
   530 ++                                    O                              |
   525 ++                                                                   |
       |                                             O                      |
   520 ++                                                                   |
   515 ++                   ..*.....                                        O
       *.......*......  ....        ..                                      |
   510 ++             *.              *                                     |
   505 ++-------------------------------------------------------------------+

    8cb06c983822103da1cf      20bafb3d23d108bc0a89  
------------------------  ------------------------  
                  510.40        +7.1%       546.60  lkp-ne04/micro/aim7/shell_rtns_1
                  244.60        +2.1%       249.80  lkp-snb01/micro/aim7/exec_test
                  755.00        +5.5%       796.40  TOTAL time.percent_of_cpu_this_job_got



                                  iostat.cpu.system

   34.2 ++------------------------------------------O-----------------------O
        |                             O                                  O  |
     34 ++       O                 O                            O           |
        |              O                               O  O  O     O  O     |
   33.8 ++          O        O  O          O  O                             |
        |                 O                      O                          |
   33.6 ++                               O                                  |
        |                                                                   |
   33.4 ++                                                                  |
        |                                                                   |
   33.2 ++                                                                  |
        |  O  O                                                             |
     33 *+.                                                                 |
        |  *..*                                                             |
   32.8 ++------------------------------------------------------------------+



                                  iostat.cpu.system

   34.4 ++------------------------------------------------------------------+
        |                           O                                       O
   34.2 ++                            O  O      O    O    O    O    O       |
        |              O  O    O O         O       O              O    O O  |
     34 ++                                              O    O              |
        |                   O                 O                             |
   33.8 ++                                                                  |
        |                                                                   |
   33.6 ++                                                                  |
        *..O      O  O                                                      |
   33.4 O+   O    *..                                                       |
        |  *.*..O:   *                                                      |
   33.2 ++       :                                                          |
        |       *                                                           |
     33 ++------------------------------------------------------------------+



                                  iostat.cpu.system

   56.6 ++----------------------------------------------------------O-------+
   56.4 ++                                    O              O              |
        |                                                                   O
   56.2 ++                                           O                      |
     56 ++                                                                  |
        |                                                                   |
   55.8 ++                                                                  |
   55.6 ++                                                                  |
   55.4 ++      O                                                           |
        |              O       O      O                                     |
   55.2 O+                                                                  |
     55 *+......*......                                                     |
        |              *.......    ...*                                     |
   54.8 ++                     *...                                         |
   54.6 ++------------------------------------------------------------------+

    8cb06c983822103da1cf      20bafb3d23d108bc0a89  
------------------------  ------------------------  
                   32.94        +0.4%        33.07  lkp-a06/crypto/tcrypt/2s-200-204
                   33.31        +0.4%        33.44  lkp-a06/crypto/tcrypt/2s-205-210
                   54.91        +0.8%        55.33  lkp-ne04/micro/aim7/shell_rtns_1
                  121.17        +0.6%       121.83  TOTAL iostat.cpu.system



                                  vmstat.system.in

   65500 ++-----------------------------------------------------------------+
         |     .*...         .*...  .*..                                    |
         *...*.     *..*...*.     *.    .                                   |
   65000 ++                              *                                  |
         |   O  O   O                                                       |
         |                                                                  |
   64500 O+            O   O  O      O   O                                  |
         |                        O                                         |
   64000 ++                                                                 |
         |                                                                  |
         |                                                                  |
   63500 ++                                 O                               |
         |                                         O   O  O   O      O      |
         |                                      O                       O   O
   63000 ++------------------------------------------------------O----------+



                                  vmstat.system.in

   20600 ++-----*-------*---------------------------------------------------+
         | .... O              *.......*                                    |
   20500 *+                                                                 |
   20400 ++                                                                 |
         O              O      O                                            |
   20300 ++                            O                                    |
         |                                                                  |
   20200 ++                                                                 |
         |                                                                  |
   20100 ++                                                                 |
   20000 ++                                                                 |
         |                                                                  |
   19900 ++                                                                 |
         |                                                           O      |
   19800 ++-----------------------------------O-------O------O--------------O



                                  vmstat.system.in

   1090 ++------------------------------------------------------------------+
        |  .*.*                                                             |
   1088 *+*                                                             O   |
   1086 ++                                     O                            |
        |         O        O                                                |
   1084 ++    O                O     O     O                      O         |
   1082 ++      O       O                    O   O            O             |
        |                              O           O        O               |
   1080 ++O                        O                     O      O   O       |
   1078 ++                                                                  |
        |   O                            O           O                      O
   1076 ++          O            O                     O                  O |
   1074 O+            O                                                     |
        |                    O                                        O     |
   1072 ++------------------------------------------------------------------+

    8cb06c983822103da1cf      20bafb3d23d108bc0a89  
------------------------  ------------------------  
                20557.20        -0.8%     20384.60  lkp-ne04/micro/aim7/shell_rtns_1
                65159.10        -1.0%     64507.71  lkp-snb01/micro/aim7/exec_test
                 1088.00        -0.9%      1078.50  snb-drag/crypto/tcrypt/2s-200-204
                86804.30        -1.0%     85970.81  TOTAL vmstat.system.in



                                 time.elapsed_time

   222 ++-------------------------------------------------------------------+
       |                                   O       O  O          O          |
   220 ++                                      O          O  O       O  O   O
   218 ++                                                                   |
       |                                                                    |
   216 ++                                                                   |
   214 ++                                                                   |
       |                                                                    |
   212 ++                                                                   |
   210 ++                                                                   |
       |                                                                    |
   208 ++                                                                   |
   206 ++                                                                   |
       *...*..*...O ..*..*...O .*...*...*                                   |
   204 ++---------*----------*----------------------------------------------+



                                  time.elapsed_time

   160.5 ++-----------------------------------------------------------------+
     160 ++                   O   O                                         |
         O ..*..O   O               .*...O                                  |
   159.5 *+     *...*..*...*..*...*. O   *                                  |
     159 ++                              :                                  |
         |                                :                                 |
   158.5 ++                               :                                 |
     158 ++                                :                                |
   157.5 ++                                :                                |
         |                                 :                                |
     157 ++                                 :                    *...       |
   156.5 ++                                 *...          *... ..    *..    |
         |                                      *..*... ..    *         *...|
     156 ++                                            *                    *
   155.5 ++-----------------------------------------------------------------+



                                 time.elapsed_time

   280 ++-------------------------------------------------------------------+
       |       O  O    O          O         O    O  O    O     O            |
   279 ++           O     O O       O  O  O    O            O    O  O  O O  O
   278 ++                      O                       O                    |
       |                                                                    |
   277 ++                                                                   |
   276 ++                                                                   |
       |                                                                    |
   275 ++                                                                   |
   274 ++                                                                   |
       |                                                                    |
   273 ++ O                                                                 |
   272 O+                                                                   |
       *..*.*                                                               |
   271 ++-------------------------------------------------------------------+



                                  time.elapsed_time

     602 ++--O--------------------------------------------------------------+
         |                                                                  |
   601.8 ++                                                                 |
   601.6 ++                                                                 O
         |                                                                  |
   601.4 ++                                  O                              |
   601.2 ++           O                                   O                 |
         |                     O                                            |
     601 ++       O                 O   O                      O            |
   600.8 O+                                           O            O        |
         *..               O                                                |
   600.6 ++ .                                    O                          |
   600.4 ++  *....                                                      O   |
         |        *                                                         |
   600.2 ++-----------------------------------------------------------------+

    8cb06c983822103da1cf      20bafb3d23d108bc0a89  
------------------------  ------------------------  
                  600.51        +0.1%       601.28  fat/micro/dd-write/1HDD-cfq-ext4-10dd
                  204.81        +0.3%       205.38  lkp-snb01/micro/aim7/exec_test
                  159.50        +0.2%       159.77  lkp-snb01/micro/aim7/shell_rtns_1
                  271.47        +0.4%       272.49  nhm-white/micro/aim7/exec_test
                 1236.29        +0.2%      1238.92  TOTAL time.elapsed_time



                                  vmstat.system.cs

   61600 ++-----------------------------------------------------------------+
         |                                                                  |
   61400 ++                                                                 *
   61200 ++                   *..                                          +|
         |                  ..   ..        ..*....                        + |
   61000 ++                .          ..*..       *....         ..*...   +  |
         |               ..        *..                 *.....*..      . +   |
   60800 ++       .*....*                                              *    |
         *....  ..                                                          |
   60600 ++   *.                             O         O                    |
   60400 ++   O                                                             |
         O                    O         O         O                         |
   60200 ++        O    O          O                                        |
         |                                                                  |
   60000 ++-----------------------------------------------------------------+



                                  vmstat.system.cs

   19200 ++--------------------O--------------------------------------------+
         |                                                                  |
   19100 ++                                                                 |
         |                                                                  |
         |                                                                  |
   19000 ++                                                                 |
         |                                                                  |
   18900 ++            O                              O                     |
         |          OO   O                                                  |
   18800 ++     O                       O   O      O     O  O        O      O
         | O     O        O     OOOO OOO       OO O  O     O  O   O   OOO  O|
         |O  O     O       OO              O OO     O   O                 O |
   18700 *+   OO      O                  O                O    OOO  O       |
         O**.***             O                                              |
   18600 ++-----------------------------------------------------------------+



                                  vmstat.system.cs

   49100 ++-----------------------------------------------------------------+
         |          ....*..                                                 |
   49000 ++    .*...       ..                                               |
         | ....              ..                                             |
   48900 *+                        ....*                             O      |
         |      O              *...                                         |
   48800 ++                                                  O              |
         |                                    O                             O
   48700 ++                                                                 |
         |                                            O                     |
   48600 ++                                                                 |
         |              O                                                   |
   48500 O+                                                                 |
         |                                                                  |
   48400 ++--------------------O-------O------------------------------------+



                                 vmstat.system.cs

   732 ++--*----------------------------------------------------------------+
       *.*  +                                                  O            |
   730 ++    *   O                    O        O                    O       |
       | O O O            O         O              O              O       O |
   728 ++                     O                                         O   |
       |            O   O                        O       O   O        O     O
   726 O+      O      O           O     O    O       O     O                |
       |                                                                    |
   724 ++                                  O                                |
       |                                               O                    |
   722 ++                                                                   |
       |                        O                                           |
   720 ++                   O                                               |
       |                                                                    |
   718 ++-------------------------------------------------------------------+

    8cb06c983822103da1cf      20bafb3d23d108bc0a89  
------------------------  ------------------------  
                18644.50        +0.3%     18702.67  lkp-a04/micro/netperf/120s-200%-UDP_RR
                48938.00        -0.8%     48539.80  lkp-ne04/micro/aim7/shell_rtns_1
                60751.18        -0.6%     60359.04  lkp-nex05/micro/aim7/exec_test
                  731.00        -0.4%       728.25  snb-drag/crypto/tcrypt/2s-200-204
               129064.68        -0.6%    128329.76  TOTAL vmstat.system.cs

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-13  0:51 increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data" Fengguang Wu
  2013-09-13  1:09 ` Fengguang Wu
@ 2013-09-13  3:17 ` Greg KH
  2013-09-13  3:38   ` Fengguang Wu
  1 sibling, 1 reply; 29+ messages in thread
From: Greg KH @ 2013-09-13  3:17 UTC (permalink / raw)
  To: Fengguang Wu; +Cc: Peter Hurley, LKML, lkp, Tejun Heo

On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote:
> Hi Peter,
> 
> FYI, we noticed much increased vmap_area_lock contentions since this
> commit:

What does that mean?  What is happening, are we allocating/removing more
memory now?

What type of load were you running that showed this problem?

thanks,

greg k-h


* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-13  3:17 ` Greg KH
@ 2013-09-13  3:38   ` Fengguang Wu
  2013-09-13  3:44     ` Greg KH
  2013-09-17  2:42     ` Peter Hurley
  0 siblings, 2 replies; 29+ messages in thread
From: Fengguang Wu @ 2013-09-13  3:38 UTC (permalink / raw)
  To: Greg KH; +Cc: Peter Hurley, LKML, lkp, Tejun Heo

On Thu, Sep 12, 2013 at 08:17:00PM -0700, Greg KH wrote:
> On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote:
> > Hi Peter,
> > 
> > FYI, we noticed much increased vmap_area_lock contentions since this
> > commit:
> 
> What does that mean?  What is happening, are we allocating/removing more
> memory now?

// leave this question to Peter and Tejun. :)

> What type of load were you running that showed this problem?

The increased contentions and lock hold/wait time showed up in a
number of test cases.

The second email has the details, and this section of data is the most
illustrative.

    8cb06c983822103da1cf      20bafb3d23d108bc0a89
------------------------  ------------------------  
                 3470.31     +1631.0%     60070.49  fat/micro/dd-write/1HDD-cfq-ext4-10dd
                 3339.57     +1676.9%     59340.71  fat/micro/dd-write/1HDD-cfq-ext4-1dd
                 2848.79     +1489.1%     45269.54  lkp-a04/micro/netperf/120s-200%-TCP_CRR
                 3563.01     +1515.2%     57548.50  lkp-a04/micro/netperf/120s-200%-TCP_MAERTS
                 2678.64     +1346.0%     38733.80  lkp-a04/micro/netperf/120s-200%-TCP_RR
                 2839.84     +1450.2%     44022.08  lkp-a04/micro/netperf/120s-200%-TCP_SENDFILE
                 3417.37     +1571.4%     57116.34  lkp-a04/micro/netperf/120s-200%-TCP_STREAM
                 2558.59     +1450.8%     39677.58  lkp-a04/micro/netperf/120s-200%-UDP_RR
                 3737.24     +1558.0%     61963.62  lkp-a04/micro/netperf/120s-200%-UDP_STREAM
                20219.50     +1488.7%    321218.02  lkp-a06/crypto/tcrypt/2s-200-204
                21017.17     +1457.1%    327257.41  lkp-a06/crypto/tcrypt/2s-205-210
                22109.84     +1240.3%    296346.33  lkp-a06/crypto/tcrypt/2s-401-417
                17909.60     +1467.3%    280693.71  lkp-a06/micro/dbench/100%
               489739.50      +978.5%   5281916.05  lkp-ne04/micro/aim7/shell_rtns_1
              1601675.63      +906.7%  16123642.52  lkp-snb01/micro/aim7/exec_test
                12105.00     +2453.6%    309110.42  nhm-white/micro/aim7/dbase
               822461.02     +1585.0%  13858430.62  nhm-white/micro/aim7/exec_test
                 9858.11     +2715.9%    277595.41  nhm-white/micro/aim7/fork_test
                 3452.91     +1685.5%     61650.74  nhm-white/micro/aim7/fserver
                  300.14     +2621.5%      8168.53  nhm-white/micro/aim7/misc_rtns_1
               345479.21     +1624.5%   5957828.25  nhm-white/micro/aim7/shell_rtns_1
                 2694.48     +1974.4%     55894.19  nhm-white/sysbench/oltp/100%-600-1000000
                 4415.67     +1202.2%     57501.52  nhm8/micro/dbench/100%
                 2284.65     +1505.2%     36672.75  snb-drag/crypto/tcrypt/2s-200-204
                 2446.02     +1537.1%     40042.87  snb-drag/crypto/tcrypt/2s-205-210
                 2484.11     +1599.6%     42219.71  snb-drag/crypto/tcrypt/2s-500-504
                 2118.55     +1155.8%     26604.99  vpx/crypto/tcrypt/2s-200-204
                 2713.48     +1198.5%     35234.77  vpx/crypto/tcrypt/2s-205-210
                 2711.31      +973.8%     29114.07  vpx/crypto/tcrypt/2s-301-319
                 2369.23      +940.3%     24648.12  vpx/crypto/tcrypt/2s-401-417
                 2620.64     +1428.7%     40060.71  vpx/crypto/tcrypt/2s-500-504
                 1713.98     +1624.3%     29553.72  vpx/crypto/tcrypt/2s-505-509
              3423353.12     +1184.9%  43985148.08  TOTAL lock_stat.vmap_area_lock.holdtime-total

The format of the report is

           parent commit                    commit
------------------------  ------------------------  
                  number   %change          number     testbox/testcase/test-params
                  ...
                  ...
                  ...
            total number   %change    total number TOTAL perf-metric-name being compared
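The %change column in these reports is just the relative delta of the commit's metric over the parent's; a minimal sketch, using row values from the table above:

```python
def pct_change(parent: float, commit: float) -> float:
    """Relative change of the commit's metric vs. the parent's, in percent."""
    return (commit - parent) / parent * 100.0

# First row above (fat/micro/dd-write/1HDD-cfq-ext4-10dd):
print(f"{pct_change(3470.31, 60070.49):+.1f}%")  # +1631.0%
```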

Thanks,
Fengguang


* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-13  3:38   ` Fengguang Wu
@ 2013-09-13  3:44     ` Greg KH
  2013-09-13  9:55       ` Peter Hurley
  2013-09-17  2:42     ` Peter Hurley
  1 sibling, 1 reply; 29+ messages in thread
From: Greg KH @ 2013-09-13  3:44 UTC (permalink / raw)
  To: Fengguang Wu; +Cc: Peter Hurley, LKML, lkp, Tejun Heo

On Fri, Sep 13, 2013 at 11:38:04AM +0800, Fengguang Wu wrote:
> On Thu, Sep 12, 2013 at 08:17:00PM -0700, Greg KH wrote:
> > On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote:
> > > Hi Peter,
> > > 
> > > FYI, we noticed much increased vmap_area_lock contentions since this
> > > commit:
> > 
> > What does that mean?  What is happening, are we allocating/removing more
> > memory now?
> 
> // leave this question to Peter and Tejun. :)
> 
> > What type of load were you running that showed this problem?
> 
> The increased contentions and lock hold/wait time showed up in a
> number of test cases.
> 
> The second email has the details, and this section of data is most
> illustrating.
> 
>     8cb06c983822103da1cf      20bafb3d23d108bc0a89
> ------------------------  ------------------------  
>                  3470.31     +1631.0%     60070.49  fat/micro/dd-write/1HDD-cfq-ext4-10dd
>                  3339.57     +1676.9%     59340.71  fat/micro/dd-write/1HDD-cfq-ext4-1dd
>                  2848.79     +1489.1%     45269.54  lkp-a04/micro/netperf/120s-200%-TCP_CRR
>                  3563.01     +1515.2%     57548.50  lkp-a04/micro/netperf/120s-200%-TCP_MAERTS
>                  2678.64     +1346.0%     38733.80  lkp-a04/micro/netperf/120s-200%-TCP_RR
>                  2839.84     +1450.2%     44022.08  lkp-a04/micro/netperf/120s-200%-TCP_SENDFILE
>                  3417.37     +1571.4%     57116.34  lkp-a04/micro/netperf/120s-200%-TCP_STREAM
>                  2558.59     +1450.8%     39677.58  lkp-a04/micro/netperf/120s-200%-UDP_RR
>                  3737.24     +1558.0%     61963.62  lkp-a04/micro/netperf/120s-200%-UDP_STREAM
>                 20219.50     +1488.7%    321218.02  lkp-a06/crypto/tcrypt/2s-200-204
>                 21017.17     +1457.1%    327257.41  lkp-a06/crypto/tcrypt/2s-205-210
>                 22109.84     +1240.3%    296346.33  lkp-a06/crypto/tcrypt/2s-401-417
>                 17909.60     +1467.3%    280693.71  lkp-a06/micro/dbench/100%
>                489739.50      +978.5%   5281916.05  lkp-ne04/micro/aim7/shell_rtns_1
>               1601675.63      +906.7%  16123642.52  lkp-snb01/micro/aim7/exec_test
>                 12105.00     +2453.6%    309110.42  nhm-white/micro/aim7/dbase
>                822461.02     +1585.0%  13858430.62  nhm-white/micro/aim7/exec_test
>                  9858.11     +2715.9%    277595.41  nhm-white/micro/aim7/fork_test
>                  3452.91     +1685.5%     61650.74  nhm-white/micro/aim7/fserver
>                   300.14     +2621.5%      8168.53  nhm-white/micro/aim7/misc_rtns_1
>                345479.21     +1624.5%   5957828.25  nhm-white/micro/aim7/shell_rtns_1
>                  2694.48     +1974.4%     55894.19  nhm-white/sysbench/oltp/100%-600-1000000
>                  4415.67     +1202.2%     57501.52  nhm8/micro/dbench/100%
>                  2284.65     +1505.2%     36672.75  snb-drag/crypto/tcrypt/2s-200-204
>                  2446.02     +1537.1%     40042.87  snb-drag/crypto/tcrypt/2s-205-210
>                  2484.11     +1599.6%     42219.71  snb-drag/crypto/tcrypt/2s-500-504
>                  2118.55     +1155.8%     26604.99  vpx/crypto/tcrypt/2s-200-204
>                  2713.48     +1198.5%     35234.77  vpx/crypto/tcrypt/2s-205-210
>                  2711.31      +973.8%     29114.07  vpx/crypto/tcrypt/2s-301-319
>                  2369.23      +940.3%     24648.12  vpx/crypto/tcrypt/2s-401-417
>                  2620.64     +1428.7%     40060.71  vpx/crypto/tcrypt/2s-500-504
>                  1713.98     +1624.3%     29553.72  vpx/crypto/tcrypt/2s-505-509
>               3423353.12     +1184.9%  43985148.08  TOTAL lock_stat.vmap_area_lock.holdtime-total
> 
> The format of report is
> 
>            parent commit                    commit
> ------------------------  ------------------------  
>                   number   %change          number     testbox/testcase/test-params
>                   ...
>                   ...
>                   ...
>             total number   %change    total number TOTAL perf-metric-name being compared

That's a lot of slowdowns, especially for such a simple patch.

Peter, any ideas?



* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-13  3:44     ` Greg KH
@ 2013-09-13  9:55       ` Peter Hurley
  2013-09-13 12:34         ` Greg KH
  0 siblings, 1 reply; 29+ messages in thread
From: Peter Hurley @ 2013-09-13  9:55 UTC (permalink / raw)
  To: Greg KH; +Cc: Fengguang Wu, LKML, lkp, Tejun Heo

On 09/12/2013 11:44 PM, Greg KH wrote:
> On Fri, Sep 13, 2013 at 11:38:04AM +0800, Fengguang Wu wrote:
>> On Thu, Sep 12, 2013 at 08:17:00PM -0700, Greg KH wrote:
>>> On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote:
>>>> Hi Peter,
>>>>
>>>> FYI, we noticed much increased vmap_area_lock contentions since this
>>>> commit:
>>>
>>> What does that mean?  What is happening, are we allocating/removing more
>>> memory now?

No. Same amount of memory, allocated and freed with the same frequency as
before.

>>> What type of load were you running that showed this problem?
>>
>> The increased contentions and lock hold/wait time showed up in a
>> number of test cases.

[...]

> That's a lot of slowdowns, especially for such a simple patch.
>
> Peter, any ideas?

Looks like this patch incidentally triggers some worst-case behavior in
the memory manager. I'm not sure how this is possible with two 4k buffers,
but the evidence is substantial.

This patch isn't critical, so I suggest we back it out of mainline but keep
using it to find out what's wrong in the vmap area.

Unfortunately, I'm on my way out the door and won't be back til Sunday pm (EST)
so I'll get a revert to you then. Sorry 'bout that.

Regards,
Peter Hurley



* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-13  9:55       ` Peter Hurley
@ 2013-09-13 12:34         ` Greg KH
  0 siblings, 0 replies; 29+ messages in thread
From: Greg KH @ 2013-09-13 12:34 UTC (permalink / raw)
  To: Peter Hurley; +Cc: Fengguang Wu, LKML, lkp, Tejun Heo

On Fri, Sep 13, 2013 at 05:55:47AM -0400, Peter Hurley wrote:
> On 09/12/2013 11:44 PM, Greg KH wrote:
> > On Fri, Sep 13, 2013 at 11:38:04AM +0800, Fengguang Wu wrote:
> >> On Thu, Sep 12, 2013 at 08:17:00PM -0700, Greg KH wrote:
> >>> On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote:
> >>>> Hi Peter,
> >>>>
> >>>> FYI, we noticed much increased vmap_area_lock contentions since this
> >>>> commit:
> >>>
> >>> What does that mean?  What is happening, are we allocating/removing more
> >>> memory now?
> 
> No. Same amount of memory, allocated and freed with the same frequency as
> before.

Ok, I thought so.

> >>> What type of load were you running that showed this problem?
> >>
> >> The increased contentions and lock hold/wait time showed up in a
> >> number of test cases.
> 
> [...]
> 
> > That's a lot of slowdowns, especially for such a simple patch.
> >
> > Peter, any ideas?
> 
> Looks like this patch incidentally triggers some worst-case behavior in
> the memory manager. I'm not sure how this is possible with two 4k buffers,
> but the evidence is substantial.
> 
> This patch isn't critical so I suggest we back out this patch for mainline
> but use the patch to find out what's wrong in the vmap area.
> 
> Unfortunately, I'm on my way out the door and won't be back til Sunday pm (EST)
> so I'll get a revert to you then. Sorry 'bout that.

No rush, we have plenty of time.

I think it would be good to track down the real root cause of this, so
that the allocators don't run into this problem again with some other
innocuous change.

thanks,

greg k-h

> 
> Regards,
> Peter Hurley


* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-13  3:38   ` Fengguang Wu
  2013-09-13  3:44     ` Greg KH
@ 2013-09-17  2:42     ` Peter Hurley
  2013-09-17  2:56       ` Fengguang Wu
  1 sibling, 1 reply; 29+ messages in thread
From: Peter Hurley @ 2013-09-17  2:42 UTC (permalink / raw)
  To: Fengguang Wu; +Cc: Greg KH, LKML, lkp, Tejun Heo

On 09/12/2013 11:38 PM, Fengguang Wu wrote:
> On Thu, Sep 12, 2013 at 08:17:00PM -0700, Greg KH wrote:
>> On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote:
>>> Hi Peter,
>>>
>>> FYI, we noticed much increased vmap_area_lock contentions since this
>>> commit:
>>
>> What does that mean?  What is happening, are we allocating/removing more
>> memory now?
>
> // leave this question to Peter and Tejun. :)
>
>> What type of load were you running that showed this problem?
>
> The increased contentions and lock hold/wait time showed up in a
> number of test cases.

How is the raw data acquired?

For example, is the test apparatus reading from /proc/vmallocinfo
(which claims the vmap_area_lock) during the test execution?

I'm wondering if a Heisenberg effect is happening here.

Regards,
Peter Hurley


> The second email has the details, and this section of data is most
> illustrating.
>
>      8cb06c983822103da1cf      20bafb3d23d108bc0a89
> ------------------------  ------------------------
>                   3470.31     +1631.0%     60070.49  fat/micro/dd-write/1HDD-cfq-ext4-10dd
>                   3339.57     +1676.9%     59340.71  fat/micro/dd-write/1HDD-cfq-ext4-1dd
>                   2848.79     +1489.1%     45269.54  lkp-a04/micro/netperf/120s-200%-TCP_CRR
>                   3563.01     +1515.2%     57548.50  lkp-a04/micro/netperf/120s-200%-TCP_MAERTS
>                   2678.64     +1346.0%     38733.80  lkp-a04/micro/netperf/120s-200%-TCP_RR
>                   2839.84     +1450.2%     44022.08  lkp-a04/micro/netperf/120s-200%-TCP_SENDFILE
>                   3417.37     +1571.4%     57116.34  lkp-a04/micro/netperf/120s-200%-TCP_STREAM
>                   2558.59     +1450.8%     39677.58  lkp-a04/micro/netperf/120s-200%-UDP_RR
>                   3737.24     +1558.0%     61963.62  lkp-a04/micro/netperf/120s-200%-UDP_STREAM
>                  20219.50     +1488.7%    321218.02  lkp-a06/crypto/tcrypt/2s-200-204
>                  21017.17     +1457.1%    327257.41  lkp-a06/crypto/tcrypt/2s-205-210
>                  22109.84     +1240.3%    296346.33  lkp-a06/crypto/tcrypt/2s-401-417
>                  17909.60     +1467.3%    280693.71  lkp-a06/micro/dbench/100%
>                 489739.50      +978.5%   5281916.05  lkp-ne04/micro/aim7/shell_rtns_1
>                1601675.63      +906.7%  16123642.52  lkp-snb01/micro/aim7/exec_test
>                  12105.00     +2453.6%    309110.42  nhm-white/micro/aim7/dbase
>                 822461.02     +1585.0%  13858430.62  nhm-white/micro/aim7/exec_test
>                   9858.11     +2715.9%    277595.41  nhm-white/micro/aim7/fork_test
>                   3452.91     +1685.5%     61650.74  nhm-white/micro/aim7/fserver
>                    300.14     +2621.5%      8168.53  nhm-white/micro/aim7/misc_rtns_1
>                 345479.21     +1624.5%   5957828.25  nhm-white/micro/aim7/shell_rtns_1
>                   2694.48     +1974.4%     55894.19  nhm-white/sysbench/oltp/100%-600-1000000
>                   4415.67     +1202.2%     57501.52  nhm8/micro/dbench/100%
>                   2284.65     +1505.2%     36672.75  snb-drag/crypto/tcrypt/2s-200-204
>                   2446.02     +1537.1%     40042.87  snb-drag/crypto/tcrypt/2s-205-210
>                   2484.11     +1599.6%     42219.71  snb-drag/crypto/tcrypt/2s-500-504
>                   2118.55     +1155.8%     26604.99  vpx/crypto/tcrypt/2s-200-204
>                   2713.48     +1198.5%     35234.77  vpx/crypto/tcrypt/2s-205-210
>                   2711.31      +973.8%     29114.07  vpx/crypto/tcrypt/2s-301-319
>                   2369.23      +940.3%     24648.12  vpx/crypto/tcrypt/2s-401-417
>                   2620.64     +1428.7%     40060.71  vpx/crypto/tcrypt/2s-500-504
>                   1713.98     +1624.3%     29553.72  vpx/crypto/tcrypt/2s-505-509
>                3423353.12     +1184.9%  43985148.08  TOTAL lock_stat.vmap_area_lock.holdtime-total
>
> The format of report is
>
>             parent commit                    commit
> ------------------------  ------------------------
>                    number   %change          number     testbox/testcase/test-params
>                    ...
>                    ...
>                    ...
>              total number   %change    total number TOTAL perf-metric-name being compared
>
> Thanks,
> Fengguang
>



* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-17  2:42     ` Peter Hurley
@ 2013-09-17  2:56       ` Fengguang Wu
  0 siblings, 0 replies; 29+ messages in thread
From: Fengguang Wu @ 2013-09-17  2:56 UTC (permalink / raw)
  To: Peter Hurley; +Cc: Greg KH, LKML, lkp, Tejun Heo

On Mon, Sep 16, 2013 at 10:42:11PM -0400, Peter Hurley wrote:
> On 09/12/2013 11:38 PM, Fengguang Wu wrote:
> >On Thu, Sep 12, 2013 at 08:17:00PM -0700, Greg KH wrote:
> >>On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote:
> >>>Hi Peter,
> >>>
> >>>FYI, we noticed much increased vmap_area_lock contentions since this
> >>>commit:
> >>
> >>What does that mean?  What is happening, are we allocating/removing more
> >>memory now?
> >
> >// leave this question to Peter and Tejun. :)
> >
> >>What type of load were you running that showed this problem?
> >
> >The increased contentions and lock hold/wait time showed up in a
> >number of test cases.
> 
> How is the raw data acquired?
> 
> For example, is the test apparatus reading from /proc/vmallocinfo
> (which claims the vmap_area_lock) during the test execution?

We didn't read /proc/vmallocinfo; we read /proc/lock_stat only once, when
the test ended.
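For reference, pulling per-lock numbers out of a /proc/lock_stat snapshot can be sketched as below. The column layout is an assumption based on the 3.11-era format (class name followed by numeric fields such as contentions and wait/hold times, which vary by lock_stat version), and the sample line is illustrative, not taken from these runs:

```python
def parse_lock_stat_line(line: str):
    """Split a '<class name>: <numeric fields...>' line from /proc/lock_stat.

    Returns (lock_name, [floats]); the meaning of each field (contentions,
    waittime-*, holdtime-*, ...) depends on the kernel's lock_stat version.
    """
    name, _, rest = line.partition(":")
    return name.strip(), [float(tok) for tok in rest.split()]

# Illustrative sample line (not real data from these runs):
sample = "  vmap_area_lock:  1234  1300  0.50  120.00  3470.31"
name, fields = parse_lock_stat_line(sample)
print(name, fields[-1])  # vmap_area_lock 3470.31
```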

> I'm wondering if a Heisenberg effect is happening here.

Thanks,
Fengguang


* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-13  1:09 ` Fengguang Wu
@ 2013-09-17 15:34   ` Peter Hurley
  2013-09-17 23:22     ` Fengguang Wu
  2013-09-18  0:49   ` Peter Hurley
  1 sibling, 1 reply; 29+ messages in thread
From: Peter Hurley @ 2013-09-17 15:34 UTC (permalink / raw)
  To: Fengguang Wu; +Cc: Greg KH, LKML, lkp, Tejun Heo

On 09/12/2013 09:09 PM, Fengguang Wu wrote:
> On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote:
>> Hi Peter,
>>
>> FYI, we noticed much increased vmap_area_lock contentions since this
>> commit:
>>
>> commit 20bafb3d23d108bc0a896eb8b7c1501f4f649b77
>> Author: Peter Hurley <peter@hurleysoftware.com>
>> Date:   Sat Jun 15 10:21:19 2013 -0400
>>
>>      n_tty: Move buffers into n_tty_data
>>
>>      Reduce pointer reloading and improve locality-of-reference;
>>      allocate read_buf and echo_buf within struct n_tty_data.
>
> Here are some comparison between this commit [o] with its parent commit [*].

Hi Fenguang,

Can you give the particulars of the aim7 test runs below?
I ask because I get _no_ added contention on the vmap_area_lock when I run
these tests on a dual-socket xeon.

What is the machine configuration(s)?
Are you using the aim7 'multitask' test driver or your own custom driver?
What is the load configuration (ie., constant, linearly increasing, convergence)?
How many loads are you simulating?

                        lock_stat.vmap_area_lock.holdtime-total
[...]
>                 489739.50      +978.5%   5281916.05  lkp-ne04/micro/aim7/shell_rtns_1
>                1601675.63      +906.7%  16123642.52  lkp-snb01/micro/aim7/exec_test
[...]
>                 822461.02     +1585.0%  13858430.62  nhm-white/micro/aim7/exec_test
>                   9858.11     +2715.9%    277595.41  nhm-white/micro/aim7/fork_test
[...]
>                    300.14     +2621.5%      8168.53  nhm-white/micro/aim7/misc_rtns_1
>                 345479.21     +1624.5%   5957828.25  nhm-white/micro/aim7/shell_rtns_1


None of the tests below execute a code path that leads to get_vmalloc_info().
The only in-kernel user of get_vmalloc_info() is a procfs read of /proc/meminfo,
which none of the tests below perform.

What is reading /proc/meminfo?

Regards,
Peter Hurley

>                   lock_stat.vmap_area_lock.contentions.get_vmalloc_info
>
>      8cb06c983822103da1cf      20bafb3d23d108bc0a89
> ------------------------  ------------------------
>                   4952.40      +447.0%     27090.40  lkp-ne04/micro/aim7/shell_rtns_1
>                  28410.80      +556.2%    186423.00  lkp-snb01/micro/aim7/exec_test
>                   8142.00      +615.4%     58247.33  nhm-white/micro/aim7/exec_test
>                   1386.00      +762.6%     11955.20  nhm-white/micro/aim7/shell_rtns_1
>                  42891.20      +561.5%    283715.93  TOTAL lock_stat.vmap_area_lock.contentions.get_vmalloc_info



* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-17 15:34   ` Peter Hurley
@ 2013-09-17 23:22     ` Fengguang Wu
  2013-09-18  0:22       ` Peter Hurley
  0 siblings, 1 reply; 29+ messages in thread
From: Fengguang Wu @ 2013-09-17 23:22 UTC (permalink / raw)
  To: Peter Hurley; +Cc: Greg KH, LKML, lkp, Tejun Heo

On Tue, Sep 17, 2013 at 11:34:21AM -0400, Peter Hurley wrote:
> On 09/12/2013 09:09 PM, Fengguang Wu wrote:
> >On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote:
> >>Hi Peter,
> >>
> >>FYI, we noticed much increased vmap_area_lock contentions since this
> >>commit:
> >>
> >>commit 20bafb3d23d108bc0a896eb8b7c1501f4f649b77
> >>Author: Peter Hurley <peter@hurleysoftware.com>
> >>Date:   Sat Jun 15 10:21:19 2013 -0400
> >>
> >>     n_tty: Move buffers into n_tty_data
> >>
> >>     Reduce pointer reloading and improve locality-of-reference;
> >>     allocate read_buf and echo_buf within struct n_tty_data.
> >
> >Here are some comparison between this commit [o] with its parent commit [*].
> 
> Hi Fenguang,
> 
> Can you give the particulars of the aim7 test runs below?
> I ask because I get _no_ added contention on the vmap_area_lock when I run
> these tests on a dual-socket xeon.
> 
> What is the machine configuration(s)?
> Are you using the aim7 'multitask' test driver or your own custom driver?
> What is the load configuration (ie., constant, linearly increasing, convergence)?
> How many loads are you simulating?

The aim7 tests are basically

        (       
                echo $HOSTNAME
                echo $workfile
                
                echo 1
                echo 2000
                echo 2
                echo 2000
                echo 1
        ) | ./multitask -t

>                        lock_stat.vmap_area_lock.holdtime-total
> [...]
> >                489739.50      +978.5%   5281916.05  lkp-ne04/micro/aim7/shell_rtns_1
> >               1601675.63      +906.7%  16123642.52  lkp-snb01/micro/aim7/exec_test
> [...]
> >                822461.02     +1585.0%  13858430.62  nhm-white/micro/aim7/exec_test
> >                  9858.11     +2715.9%    277595.41  nhm-white/micro/aim7/fork_test
> [...]
> >                   300.14     +2621.5%      8168.53  nhm-white/micro/aim7/misc_rtns_1
> >                345479.21     +1624.5%   5957828.25  nhm-white/micro/aim7/shell_rtns_1
> 
> 
> None of the tests below execute a code path that leads to get_vmalloc_info().
> The only in-kernel user of get_vmalloc_info() is a sysfs read of /proc/meminfo,
> which none of the tests below perform.
> 
> What is reading /proc/meminfo?

Good point! That may explain it:  I'm running a

loop:
        cat /proc/meminfo
        sleep 1
 
in all the tests.

Thanks,
Fengguang

> >                  lock_stat.vmap_area_lock.contentions.get_vmalloc_info
> >
> >     8cb06c983822103da1cf      20bafb3d23d108bc0a89
> >------------------------  ------------------------
> >                  4952.40      +447.0%     27090.40  lkp-ne04/micro/aim7/shell_rtns_1
> >                 28410.80      +556.2%    186423.00  lkp-snb01/micro/aim7/exec_test
> >                  8142.00      +615.4%     58247.33  nhm-white/micro/aim7/exec_test
> >                  1386.00      +762.6%     11955.20  nhm-white/micro/aim7/shell_rtns_1
> >                 42891.20      +561.5%    283715.93  TOTAL lock_stat.vmap_area_lock.contentions.get_vmalloc_info


* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-17 23:22     ` Fengguang Wu
@ 2013-09-18  0:22       ` Peter Hurley
  2013-09-25  9:04         ` Lin Ming
  2013-09-26  7:33         ` Andrew Morton
  0 siblings, 2 replies; 29+ messages in thread
From: Peter Hurley @ 2013-09-18  0:22 UTC (permalink / raw)
  To: Fengguang Wu; +Cc: Greg KH, LKML, lkp, Tejun Heo

On 09/17/2013 07:22 PM, Fengguang Wu wrote:
> On Tue, Sep 17, 2013 at 11:34:21AM -0400, Peter Hurley wrote:
>> On 09/12/2013 09:09 PM, Fengguang Wu wrote:
>>> On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote:
>>>> Hi Peter,
>>>>
>>>> FYI, we noticed much increased vmap_area_lock contentions since this
>>>> commit:
>>>>
>>>> commit 20bafb3d23d108bc0a896eb8b7c1501f4f649b77
>>>> Author: Peter Hurley <peter@hurleysoftware.com>
>>>> Date:   Sat Jun 15 10:21:19 2013 -0400
>>>>
>>>>      n_tty: Move buffers into n_tty_data
>>>>
>>>>      Reduce pointer reloading and improve locality-of-reference;
>>>>      allocate read_buf and echo_buf within struct n_tty_data.
>>>
>>> Here are some comparison between this commit [o] with its parent commit [*].
>>
>> Hi Fengguang,

Sorry for misspelling your name earlier. Fixed.

>> Can you give the particulars of the aim7 test runs below?
>> I ask because I get _no_ added contention on the vmap_area_lock when I run
>> these tests on a dual-socket xeon.
>>
>> What is the machine configuration(s)?
>> Are you using the aim7 'multitask' test driver or your own custom driver?
>> What is the load configuration (ie., constant, linearly increasing, convergence)?
>> How many loads are you simulating?
>
> The aim7 tests are basically
>
>          (
>                  echo $HOSTNAME
>                  echo $workfile
>
>                  echo 1
>                  echo 2000
>                  echo 2
>                  echo 2000
>                  echo 1
>          ) | ./multitask -t

Thanks for the profile. I ran the aim7 tests with these load parameters (2000!)
and didn't have any significant contention with vmap_area_lock (162).

I had to run a subset of the aim7 tests (just those below) because I don't have
anything fast enough to simulate 2000 loads on the entire workfile.shared testsuite.


>>                         lock_stat.vmap_area_lock.holdtime-total
>> [...]
>>>                 489739.50      +978.5%   5281916.05  lkp-ne04/micro/aim7/shell_rtns_1
>>>                1601675.63      +906.7%  16123642.52  lkp-snb01/micro/aim7/exec_test
>> [...]
>>>                 822461.02     +1585.0%  13858430.62  nhm-white/micro/aim7/exec_test
>>>                   9858.11     +2715.9%    277595.41  nhm-white/micro/aim7/fork_test
>> [...]
>>>                    300.14     +2621.5%      8168.53  nhm-white/micro/aim7/misc_rtns_1
>>>                 345479.21     +1624.5%   5957828.25  nhm-white/micro/aim7/shell_rtns_1
>>
>>
>> None of the tests below execute a code path that leads to get_vmalloc_info().
>> The only in-kernel user of get_vmalloc_info() is a sysfs read of /proc/meminfo,
>> which none of the tests below perform.
>>
>> What is reading /proc/meminfo?
>
> Good point! That may explain it:  I'm running a
>
> loop:
>          cat /proc/meminfo
>          sleep 1
>
> in all the tests.

Yep. That's what's creating the contention -- while the aim7 test is creating
ttys for each and every process (exec_test, shell_rtns_1, ...), the read of
/proc/meminfo is contending with the allocations/frees of 2000 tty ldisc buffers.

Looking over vmalloc.c, the critical section footprint of the vmap_area_lock
could definitely be reduced (even nearly eliminated), but that's a project for
another day :)

Regards,
Peter Hurley


>>>                   lock_stat.vmap_area_lock.contentions.get_vmalloc_info
>>>
>>>      8cb06c983822103da1cf      20bafb3d23d108bc0a89
>>> ------------------------  ------------------------
>>>                   4952.40      +447.0%     27090.40  lkp-ne04/micro/aim7/shell_rtns_1
>>>                  28410.80      +556.2%    186423.00  lkp-snb01/micro/aim7/exec_test
>>>                   8142.00      +615.4%     58247.33  nhm-white/micro/aim7/exec_test
>>>                   1386.00      +762.6%     11955.20  nhm-white/micro/aim7/shell_rtns_1
>>>                  42891.20      +561.5%    283715.93  TOTAL lock_stat.vmap_area_lock.contentions.get_vmalloc_info



* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-13  1:09 ` Fengguang Wu
  2013-09-17 15:34   ` Peter Hurley
@ 2013-09-18  0:49   ` Peter Hurley
  1 sibling, 0 replies; 29+ messages in thread
From: Peter Hurley @ 2013-09-18  0:49 UTC (permalink / raw)
  To: Fengguang Wu; +Cc: Greg KH, LKML, lkp, Tejun Heo

On 09/12/2013 09:09 PM, Fengguang Wu wrote:
> On Fri, Sep 13, 2013 at 08:51:33AM +0800, Fengguang Wu wrote:
>> Hi Peter,
>>
>> FYI, we noticed much increased vmap_area_lock contentions since this
>> commit:
>>
>> commit 20bafb3d23d108bc0a896eb8b7c1501f4f649b77
>> Author: Peter Hurley <peter@hurleysoftware.com>
>> Date:   Sat Jun 15 10:21:19 2013 -0400
>>
>>      n_tty: Move buffers into n_tty_data
>>
>>      Reduce pointer reloading and improve locality-of-reference;
>>      allocate read_buf and echo_buf within struct n_tty_data.
>
> Here are some comparison between this commit [o] with its parent commit [*].

[...]

>      8cb06c983822103da1cf      20bafb3d23d108bc0a89
> ------------------------  ------------------------
>                    976.67      +108.3%      2034.67  lkp-a04/micro/netperf/120s-200%-TCP_STREAM
>                   8971.36       +11.4%      9997.05  nhm-white/micro/aim7/exec_test
>                   9948.03       +20.9%     12031.72  TOTAL slabinfo.kmalloc-128.active_objs
>
>      8cb06c983822103da1cf      20bafb3d23d108bc0a89
> ------------------------  ------------------------
>                    976.67      +108.3%      2034.67  lkp-a04/micro/netperf/120s-200%-TCP_STREAM
>                   9127.64       +11.4%     10164.15  nhm-white/micro/aim7/exec_test
>                  10104.31       +20.7%     12198.82  TOTAL slabinfo.kmalloc-128.num_objs

The dramatic increase in 128-byte kmalloc blocks is from the vmalloc overhead
associated with each allocation. On x86_64, struct vmap_area is 104 bytes,
rounded up to 128, and is allocated with every vmalloc allocation. That is approx. 1%
overhead (which seems high to me).

The reason this is still visible after the test completes is that the vmap area
is lazily reclaimed (see mm/vmalloc.c:__purge_vmap_area_lazy()).

Given the 1% memory overhead coupled with the unwanted vmap_area_lock contention
(even though it is test-induced), I might revert this anyway.

Regards,
Peter Hurley


* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-18  0:22       ` Peter Hurley
@ 2013-09-25  9:04         ` Lin Ming
  2013-09-25 11:30           ` Peter Hurley
  2013-09-26  7:33         ` Andrew Morton
  1 sibling, 1 reply; 29+ messages in thread
From: Lin Ming @ 2013-09-25  9:04 UTC (permalink / raw)
  To: Peter Hurley; +Cc: Fengguang Wu, Greg KH, LKML, lkp, Tejun Heo

On Wed, Sep 18, 2013 at 8:22 AM, Peter Hurley <peter@hurleysoftware.com> wrote:
[snip]
>
> Looking over vmalloc.c, the critical section footprint of the vmap_area_lock
> could definitely be reduced (even nearly eliminated), but that's a project
> for
> another day :)

Hi Peter,

I also looked over vmalloc.c, but didn't find an obvious way to reduce
the critical section footprint.
Could you share some hints on how to do this?

Thanks,
Lin Ming

>
> Regards,
> Peter Hurley


* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-25  9:04         ` Lin Ming
@ 2013-09-25 11:30           ` Peter Hurley
  2013-09-25 14:53             ` Lin Ming
  2013-09-25 16:02             ` Lin Ming
  0 siblings, 2 replies; 29+ messages in thread
From: Peter Hurley @ 2013-09-25 11:30 UTC (permalink / raw)
  To: Lin Ming; +Cc: Fengguang Wu, Greg KH, LKML, lkp, Tejun Heo

On 09/25/2013 05:04 AM, Lin Ming wrote:
> On Wed, Sep 18, 2013 at 8:22 AM, Peter Hurley <peter@hurleysoftware.com> wrote:
> [snip]
>>
>> Looking over vmalloc.c, the critical section footprint of the vmap_area_lock
>> could definitely be reduced (even nearly eliminated), but that's a project
>> for
>> another day :)
>
> Hi Peter,
>
> I also looked over vmallo.c, but didn't find obvious way to reduce the
> critical section footprint.
> Could you share some hints how to do this?

vmap_area_list is an RCU list.
get_vmalloc_info() doesn't need to take the vmap_area_lock at all.
Look at __purge_vmap_area_lazy() for a howto.

Regards,
Peter Hurley



* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-25 11:30           ` Peter Hurley
@ 2013-09-25 14:53             ` Lin Ming
  2013-09-25 16:02             ` Lin Ming
  1 sibling, 0 replies; 29+ messages in thread
From: Lin Ming @ 2013-09-25 14:53 UTC (permalink / raw)
  To: Peter Hurley; +Cc: Fengguang Wu, Greg KH, LKML, lkp, Tejun Heo

On Wed, Sep 25, 2013 at 7:30 PM, Peter Hurley <peter@hurleysoftware.com> wrote:
> On 09/25/2013 05:04 AM, Lin Ming wrote:
>>
>> On Wed, Sep 18, 2013 at 8:22 AM, Peter Hurley <peter@hurleysoftware.com>
>> wrote:
>> [snip]
>>>
>>>
>>> Looking over vmalloc.c, the critical section footprint of the
>>> vmap_area_lock
>>> could definitely be reduced (even nearly eliminated), but that's a
>>> project
>>> for
>>> another day :)
>>
>>
>> Hi Peter,
>>
>> I also looked over vmallo.c, but didn't find obvious way to reduce the
>> critical section footprint.
>> Could you share some hints how to do this?
>
>
> vmap_area_list is an RCU list.
> get_vmalloc_info() doesn't need to take the vmap_area_lock at all.
> Look at __purge_vmap_area_lazy() for a howto.

Looking at it now ...

Thanks,
Lin Ming

>
> Regards,
> Peter Hurley
>


* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-25 11:30           ` Peter Hurley
  2013-09-25 14:53             ` Lin Ming
@ 2013-09-25 16:02             ` Lin Ming
  2013-09-26  3:20               ` Andi Kleen
  1 sibling, 1 reply; 29+ messages in thread
From: Lin Ming @ 2013-09-25 16:02 UTC (permalink / raw)
  To: Peter Hurley; +Cc: Fengguang Wu, Greg KH, LKML, lkp, Tejun Heo

On Wed, 2013-09-25 at 07:30 -0400, Peter Hurley wrote:
> On 09/25/2013 05:04 AM, Lin Ming wrote:
> > On Wed, Sep 18, 2013 at 8:22 AM, Peter Hurley <peter@hurleysoftware.com> wrote:
> > [snip]
> >>
> >> Looking over vmalloc.c, the critical section footprint of the vmap_area_lock
> >> could definitely be reduced (even nearly eliminated), but that's a project
> >> for
> >> another day :)
> >
> > Hi Peter,
> >
> > I also looked over vmallo.c, but didn't find obvious way to reduce the
> > critical section footprint.
> > Could you share some hints how to do this?
> 
> vmap_area_list is an RCU list.
> get_vmalloc_info() doesn't need to take the vmap_area_lock at all.
> Look at __purge_vmap_area_lazy() for a howto.

Would you like below patch?

From: Lin Ming <minggr@gmail.com>
Date: Wed, 25 Sep 2013 23:48:19 +0800
Subject: [PATCH] mm/vmalloc.c: eliminate vmap_area_lock in get_vmalloc_info()

vmap_area_list is an RCU list.
get_vmalloc_info() doesn't need to take the vmap_area_lock at all.
Use RCU to protect list scan.

Signed-off-by: Lin Ming <minggr@gmail.com>
---
 mm/vmalloc.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 13a5495..4523c9c 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2684,14 +2684,14 @@ void get_vmalloc_info(struct vmalloc_info *vmi)
 
 	prev_end = VMALLOC_START;
 
-	spin_lock(&vmap_area_lock);
+	rcu_read_lock();
 
 	if (list_empty(&vmap_area_list)) {
 		vmi->largest_chunk = VMALLOC_TOTAL;
 		goto out;
 	}
 
-	list_for_each_entry(va, &vmap_area_list, list) {
+	list_for_each_entry_rcu(va, &vmap_area_list, list) {
 		unsigned long addr = va->va_start;
 
 		/*
@@ -2718,7 +2718,7 @@ void get_vmalloc_info(struct vmalloc_info *vmi)
 		vmi->largest_chunk = VMALLOC_END - prev_end;
 
 out:
-	spin_unlock(&vmap_area_lock);
+	rcu_read_unlock();
 }
 #endif
 
-- 
1.7.2.5



> 
> Regards,
> Peter Hurley
> 




* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-25 16:02             ` Lin Ming
@ 2013-09-26  3:20               ` Andi Kleen
  2013-09-26 11:52                 ` Peter Hurley
  0 siblings, 1 reply; 29+ messages in thread
From: Andi Kleen @ 2013-09-26  3:20 UTC (permalink / raw)
  To: Lin Ming; +Cc: Peter Hurley, Fengguang Wu, Greg KH, LKML, lkp, Tejun Heo

Lin Ming <minggr@gmail.com> writes:
>
> Would you like below patch?

The loop body keeps rather complex state. It could easily 
get confused by parallel RCU changes.

So if the list changes in parallel you may suddenly
report very bogus values, as the va_start - prev_end 
computation may be bogus.

Perhaps it's ok (may report bogus gaps), but it seems a bit risky.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only


* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-18  0:22       ` Peter Hurley
  2013-09-25  9:04         ` Lin Ming
@ 2013-09-26  7:33         ` Andrew Morton
  2013-09-26 11:31           ` Peter Hurley
  1 sibling, 1 reply; 29+ messages in thread
From: Andrew Morton @ 2013-09-26  7:33 UTC (permalink / raw)
  To: Peter Hurley; +Cc: Fengguang Wu, Greg KH, LKML, lkp, Tejun Heo

On Tue, 17 Sep 2013 20:22:42 -0400 Peter Hurley <peter@hurleysoftware.com> wrote:

> Looking over vmalloc.c, the critical section footprint of the vmap_area_lock
> could definitely be reduced (even nearly eliminated), but that's a project for
> another day :)

20bafb3d23d10 ("n_tty: Move buffers into n_tty_data") switched a
kmalloc (which is very fast) to a vmalloc (which is very slow) without
so much as mentioning it in the changelog.  This should have been
picked up at review, btw.

Revert that part of the patch and the problem will be solved.

If we are really really worried that a ~9k kmalloc might fail or will
be slow, then implement a fallback to vmalloc() if kmalloc(__GFP_NOWARN)
failed.  This kinda sucks, but is practical; it really should only be
done if necessary - ie, if problems with using plain old kmalloc are
demonstrable.

Or just revert all of 20bafb3d23d10 - it was supposed to be a small
performance improvement but turned out to be a significant performance
loss.  Therefore zap.



* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-26  7:33         ` Andrew Morton
@ 2013-09-26 11:31           ` Peter Hurley
  2013-09-26 15:04             ` Greg KH
  0 siblings, 1 reply; 29+ messages in thread
From: Peter Hurley @ 2013-09-26 11:31 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Fengguang Wu, Greg KH, LKML, lkp, Tejun Heo

On 09/26/2013 03:33 AM, Andrew Morton wrote:
> On Tue, 17 Sep 2013 20:22:42 -0400 Peter Hurley <peter@hurleysoftware.com> wrote:
>
>> Looking over vmalloc.c, the critical section footprint of the vmap_area_lock
>> could definitely be reduced (even nearly eliminated), but that's a project for
>> another day :)
>
> 20bafb3d23d10 ("n_tty: Move buffers into n_tty_data") switched a
> kmalloc (which is very fast) to a vmalloc (which is very slow) without
> so much as mentioning it in the changelog.  This should have been
> picked up at review, btw.
>
> Revert that part of the patch and the problem will be solved.
>
> If we are really really worried that a ~9k kmalloc might fail or will
> be slow, then implement a fallback to vmalloc() if kmalloc(GFP_NOWARN)
> failed.  This kinda sucks, but is practical, but really should only be
> done if necessary - ie, if problems with using plain old kmalloc are
> demonstrable.
>
> Or just revert all of 20bafb3d23d10 - it was supposed to be a small
> performance improvement but turned out to be a significant performance
> loss.  Therefore zap.

I have no particular objection to reverting the entire patch.

However, it's a mischaracterization to suggest that the reason is
that vmalloc() is very slow;  without reading /proc/meminfo,
there is no performance loss.

IOW, the lock contention this patch precipitated needs to get fixed
regardless.


Regards,
Peter Hurley


* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-26  3:20               ` Andi Kleen
@ 2013-09-26 11:52                 ` Peter Hurley
  2013-09-26 15:32                   ` Andi Kleen
  0 siblings, 1 reply; 29+ messages in thread
From: Peter Hurley @ 2013-09-26 11:52 UTC (permalink / raw)
  To: Andi Kleen, Lin Ming; +Cc: Fengguang Wu, Greg KH, LKML, lkp, Tejun Heo

On 09/25/2013 11:20 PM, Andi Kleen wrote:
> Lin Ming <minggr@gmail.com> writes:
>>
>> Would you like below patch?
>
> The loop body keeps rather complex state. It could easily
> get confused by parallel RCU changes.
>
> So if the list changes in parallel you may suddenly
> report very bogus values, as the va_start - prev_end
> computation may be bogus.
>
> Perhaps it's ok (may report bogus gaps), but it seems a bit risky.

I don't understand how the computed gap would be bogus; there
_was_ a list state in which that particular gap existed. The fact
that it may not exist anymore can also happen in the existing
algorithm the instant get_vmalloc_info() drops the vmap_area_lock.

OTOH, parallel list changes could cause an rcu-based get_vmalloc_info()
to over-report or under-report used memory due to parallel list
changes.

If this is a problem in practice, then usage and largest chunk
should be tracked by the allocator instead, obviating the need for
get_vmalloc_info() to traverse the vmap_area_list at all.

Regards,
Peter Hurley


* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-26 11:31           ` Peter Hurley
@ 2013-09-26 15:04             ` Greg KH
  2013-09-26 17:35               ` Peter Hurley
  0 siblings, 1 reply; 29+ messages in thread
From: Greg KH @ 2013-09-26 15:04 UTC (permalink / raw)
  To: Peter Hurley; +Cc: Andrew Morton, Fengguang Wu, LKML, lkp, Tejun Heo

On Thu, Sep 26, 2013 at 07:31:47AM -0400, Peter Hurley wrote:
> On 09/26/2013 03:33 AM, Andrew Morton wrote:
> >On Tue, 17 Sep 2013 20:22:42 -0400 Peter Hurley <peter@hurleysoftware.com> wrote:
> >
> >>Looking over vmalloc.c, the critical section footprint of the vmap_area_lock
> >>could definitely be reduced (even nearly eliminated), but that's a project for
> >>another day :)
> >
> >20bafb3d23d10 ("n_tty: Move buffers into n_tty_data") switched a
> >kmalloc (which is very fast) to a vmalloc (which is very slow) without
> >so much as mentioning it in the changelog.  This should have been
> >picked up at review, btw.
> >
> >Revert that part of the patch and the problem will be solved.
> >
> >If we are really really worried that a ~9k kmalloc might fail or will
> >be slow, then implement a fallback to vmalloc() if kmalloc(GFP_NOWARN)
> >failed.  This kinda sucks, but is practical, but really should only be
> >done if necessary - ie, if problems with using plain old kmalloc are
> >demonstrable.
> >
> >Or just revert all of 20bafb3d23d10 - it was supposed to be a small
> >performance improvement but turned out to be a significant performance
> >loss.  Therefore zap.
> 
> I have no particular objection to reverting the entire patch.

How about just switching the call to vmalloc to kmalloc?  Yes, it's a
larger size that is being allocated here, but we were allocating that
much memory anyway before, so it should be the same "speed", if not
faster than before (1 call to kmalloc instead of 3).

thanks,

greg k-h


* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-26 11:52                 ` Peter Hurley
@ 2013-09-26 15:32                   ` Andi Kleen
  2013-09-26 17:22                     ` Peter Hurley
  0 siblings, 1 reply; 29+ messages in thread
From: Andi Kleen @ 2013-09-26 15:32 UTC (permalink / raw)
  To: Peter Hurley
  Cc: Andi Kleen, Lin Ming, Fengguang Wu, Greg KH, LKML, lkp, Tejun Heo

On Thu, Sep 26, 2013 at 07:52:23AM -0400, Peter Hurley wrote:
> On 09/25/2013 11:20 PM, Andi Kleen wrote:
> >Lin Ming <minggr@gmail.com> writes:
> >>
> >>Would you like below patch?
> >
> >The loop body keeps rather complex state. It could easily
> >get confused by parallel RCU changes.
> >
> >So if the list changes in parallel you may suddenly
> >report very bogus values, as the va_start - prev_end
> >computation may be bogus.
> >
> >Perhaps it's ok (may report bogus gaps), but it seems a bit risky.
> 
> I don't understand how the computed gap would be bogus; there
> _was_ a list state in which that particular gap existed. The fact

It could change any time as you don't have an atomic view
of vm_end / vm_start. It is valid to change the fields
with the lock held.

-Andi


* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-26 15:32                   ` Andi Kleen
@ 2013-09-26 17:22                     ` Peter Hurley
  0 siblings, 0 replies; 29+ messages in thread
From: Peter Hurley @ 2013-09-26 17:22 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Lin Ming, Fengguang Wu, Greg KH, LKML, lkp, Tejun Heo

On 09/26/2013 11:32 AM, Andi Kleen wrote:
> On Thu, Sep 26, 2013 at 07:52:23AM -0400, Peter Hurley wrote:
>> On 09/25/2013 11:20 PM, Andi Kleen wrote:
>>> Lin Ming <minggr@gmail.com> writes:
>>>>
>>>> Would you like below patch?
>>>
>>> The loop body keeps rather complex state. It could easily
>>> get confused by parallel RCU changes.
>>>
>>> So if the list changes in parallel you may suddenly
>>> report very bogus values, as the va_start - prev_end
>>> computation may be bogus.
>>>
>>> Perhaps it's ok (may report bogus gaps), but it seems a bit risky.
>>
>> I don't understand how the computed gap would be bogus; there
>> _was_ a list state in which that particular gap existed. The fact
>
> It could change any time as you don't have an atomic view
> of vm_end / vm_start. It is valid to change the fields
> with the lock held.

va_start and va_end are constant for the lifetime of their vmap_area
(if it's accessible by traversing the vmap_area_list), so it is
not possible for an rcu-based list traversal to see different
values of these individual fields than the spin-locked version.

In addition, for the rcu-based traversal to have arrived at any given
vmap_area requires that the previous vmap_area was its adjacent
lower range at the instant in time when the list cursor was advanced;
again, this is no different than if the spin-locked version had
happened to begin at that same instant.

Regards,
Peter Hurley





* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-26 15:04             ` Greg KH
@ 2013-09-26 17:35               ` Peter Hurley
  2013-09-26 18:05                 ` Andrew Morton
  0 siblings, 1 reply; 29+ messages in thread
From: Peter Hurley @ 2013-09-26 17:35 UTC (permalink / raw)
  To: Greg KH; +Cc: Andrew Morton, Fengguang Wu, LKML, lkp, Tejun Heo

On 09/26/2013 11:04 AM, Greg KH wrote:
> On Thu, Sep 26, 2013 at 07:31:47AM -0400, Peter Hurley wrote:
>> On 09/26/2013 03:33 AM, Andrew Morton wrote:
>>> On Tue, 17 Sep 2013 20:22:42 -0400 Peter Hurley <peter@hurleysoftware.com> wrote:
>>>
>>>> Looking over vmalloc.c, the critical section footprint of the vmap_area_lock
>>>> could definitely be reduced (even nearly eliminated), but that's a project for
>>>> another day :)
>>>
>>> 20bafb3d23d10 ("n_tty: Move buffers into n_tty_data") switched a
>>> kmalloc (which is very fast) to a vmalloc (which is very slow) without
>>> so much as mentioning it in the changelog.  This should have been
>>> picked up at review, btw.
>>>
>>> Revert that part of the patch and the problem will be solved.
>>>
>>> If we are really really worried that a ~9k kmalloc might fail or will
>>> be slow, then implement a fallback to vmalloc() if kmalloc(GFP_NOWARN)
>>> failed.  This kinda sucks, but is practical, but really should only be
>>> done if necessary - ie, if problems with using plain old kmalloc are
>>> demonstrable.
>>>
>>> Or just revert all of 20bafb3d23d10 - it was supposed to be a small
>>> performance improvement but turned out to be a significant performance
>>> loss.  Therefore zap.
>>
>> I have no particular objection to reverting the entire patch.
>
> How about just switching the call to vmalloc to kmalloc?  Yes, it's a
> larger size that is being allocated here, but we were allocating that
> much memory anyway before, so it should be the same "speed", if not
> faster than before (1 call to kmalloc instead of 3).

The allocation itself isn't performance-critical. The speed difference
between 1 kmalloc and 3 kmallocs here will be unmeasurable from any
user-space test.

And the only reason vmalloc has any measurable impact stems from the way
reads of /proc/meminfo behave (to which there are a number of appropriate
solutions).

The issue with a single large kmalloc is that it may fail where
3 separate, page-or-less kmallocs would not have.

Regards,
Peter Hurley


* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-26 17:35               ` Peter Hurley
@ 2013-09-26 18:05                 ` Andrew Morton
  2013-09-26 21:42                   ` Peter Hurley
  0 siblings, 1 reply; 29+ messages in thread
From: Andrew Morton @ 2013-09-26 18:05 UTC (permalink / raw)
  To: Peter Hurley; +Cc: Greg KH, Fengguang Wu, LKML, lkp, Tejun Heo

On Thu, 26 Sep 2013 13:35:32 -0400 Peter Hurley <peter@hurleysoftware.com> wrote:

> The issue with a single large kmalloc is that it may fail where
> 3 separate, page-or-less kmallocs would not have.

Or vmalloc fails first, because of internal fragmentation of the vmap
arena.  This problem plus vmalloc's slowness are the reasons why
vmalloc should be avoided.

A tremendous number of places in the kernel perform higher-order
allocations nowadays.  The page allocator works damn hard to service
them and I expect that switching to kmalloc here will be OK.


* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-26 18:05                 ` Andrew Morton
@ 2013-09-26 21:42                   ` Peter Hurley
  2013-09-26 21:58                     ` Andrew Morton
  0 siblings, 1 reply; 29+ messages in thread
From: Peter Hurley @ 2013-09-26 21:42 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Greg KH, Fengguang Wu, LKML, lkp, Tejun Heo

On 09/26/2013 02:05 PM, Andrew Morton wrote:
> On Thu, 26 Sep 2013 13:35:32 -0400 Peter Hurley <peter@hurleysoftware.com> wrote:
>
>> The issue with a single large kmalloc is that it may fail where
>> 3 separate, page-or-less kmallocs would not have.
>
> Or vmalloc fails first, because of internal fragmentation of the vmap
> arena.  This problem plus vmalloc's slowness are the reasons why
> vmalloc should be avoided.

Ok, no vmalloc.

> A tremendous number of places in the kernel perform higher-order
> allocations nowadays.  The page allocator works damn hard to service
> them and I expect that switching to kmalloc here will be OK.

I've had order-4 allocation failures before on 10Gb. In fact, the
nouveau driver switched to vmalloc for that very reason (commit
d005f51eb93d71cd40ebd11dd377453fa8c8a42a, drm/nouveau: use vmalloc
for pgt allocation).

Regards,
Peter Hurley


* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-26 21:42                   ` Peter Hurley
@ 2013-09-26 21:58                     ` Andrew Morton
  2013-09-26 22:21                       ` Peter Hurley
  0 siblings, 1 reply; 29+ messages in thread
From: Andrew Morton @ 2013-09-26 21:58 UTC (permalink / raw)
  To: Peter Hurley; +Cc: Greg KH, Fengguang Wu, LKML, lkp, Tejun Heo

On Thu, 26 Sep 2013 17:42:52 -0400 Peter Hurley <peter@hurleysoftware.com> wrote:

> On 09/26/2013 02:05 PM, Andrew Morton wrote:
> > On Thu, 26 Sep 2013 13:35:32 -0400 Peter Hurley <peter@hurleysoftware.com> wrote:
> >
> >> The issue with a single large kmalloc is that it may fail where
> >> 3 separate, page-or-less kmallocs would not have.
> >
> > Or vmalloc fails first, because of internal fragmentation of the vmap
> > arena.  This problem plus vmalloc's slowness are the reasons why
> > vmalloc should be avoided.
> 
> Ok, no vmalloc.
> 
> > A tremendous number of places in the kernel perform higher-order
> > allocations nowadays.  The page allocator works damn hard to service
> > them and I expect that switching to kmalloc here will be OK.
> 
> I've had order-4 allocation failures before on 10Gb.

Yep.  But this allocation will be order=2, yes?  And
PAGE_ALLOC_COSTLY_ORDER=3.  So if that thing is working correctly,
order=2 will do a lot better than order=4.

> In fact, the
> nouveau driver switched to vmalloc for that very reason (commit
> d005f51eb93d71cd40ebd11dd377453fa8c8a42a, drm/nouveau: use vmalloc
> for pgt allocation).

Sigh.  I'm not aware of any reports of anyone hitting arena
fragmentation problems yet, so it remains a theoretical thing.  But the
more we use vmalloc, the more likely it becomes.  And because the usage
sites are so disparate, fixing it will be pretty horrid.

For this reason (plus vmalloc is slow), I do think it's better to do
the old

	foo = kmalloc(size, GFP_KERNEL | __GFP_NOWARN);
	if (!foo)
		foo = vmalloc(size);

thing.  It's ugly, but will greatly reduce the amount of vmallocing
which happens.

Someone had a patch a while back which wraps this operation (and the
corresponding free) into library functions.  I said yuk and it wasn't
merged.  Perhaps that was a mistake.


* Re: increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data"
  2013-09-26 21:58                     ` Andrew Morton
@ 2013-09-26 22:21                       ` Peter Hurley
  0 siblings, 0 replies; 29+ messages in thread
From: Peter Hurley @ 2013-09-26 22:21 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Greg KH, Fengguang Wu, LKML, lkp, Tejun Heo

On 09/26/2013 05:58 PM, Andrew Morton wrote:
> On Thu, 26 Sep 2013 17:42:52 -0400 Peter Hurley <peter@hurleysoftware.com> wrote:
>
>> On 09/26/2013 02:05 PM, Andrew Morton wrote:
>>> On Thu, 26 Sep 2013 13:35:32 -0400 Peter Hurley <peter@hurleysoftware.com> wrote:
>>>
>>>> The issue with a single large kmalloc is that it may fail where
>>>> 3 separate, page-or-less kmallocs would not have.
>>>
>>> Or vmalloc fails first, because of internal fragmentation of the vmap
>>> arena.  This problem plus vmalloc's slowness are the reasons why
>>> vmalloc should be avoided.
>>
>> Ok, no vmalloc.
>>
>>> A tremendous number of places in the kernel perform higher-order
>>> allocations nowadays.  The page allocator works damn hard to service
>>> them and I expect that switching to kmalloc here will be OK.
>>
>> I've had order-4 allocation failures before on 10Gb.
>
> Yep.  But this allocation will be order=2, yes?  And
> PAGE_ALLOC_COSTLY_ORDER=3.  So if that thing is working correctly,
> order=2 will do a lot better than order=4.

PAGE_ALLOC_COSTLY_ORDER was a subtlety I wasn't aware of; thanks
for the info.

>> In fact, the
>> nouveau driver switched to vmalloc for that very reason (commit
>> d005f51eb93d71cd40ebd11dd377453fa8c8a42a, drm/nouveau: use vmalloc
>> for pgt allocation).
>
> Sigh.  I'm not aware of any reports of anyone hitting arena
> fragmentation problems yet, so it remains a theoretical thing.  But the
> more we use vmalloc, the more likely it becomes.  And because the usage
> sites are so disparate, fixing it will be pretty horrid.
>
> For this reason (plus vmalloc is slow), I do think it's better to do
> the old
>
> 	foo = kmalloc(size, GFP_KERNEL | __GFP_NOWARN);
> 	if (!foo)
> 		foo = vmalloc(size);
>
> thing.  It's ugly, but will greatly reduce the amount of vmallocing
> which happens.
>
> Someone had a patch a while back which wraps this operation (and the
> corresponding free) into library functions.  I said yuk and it wasn't
> merged.  Perhaps that was a mistake.

I would suggest either
1. documenting the bulk of our conversation in either/both
    mm/vmalloc.c:vmalloc() and include/linux/slab.h
or
2. requiring that new vmalloc() users get your ack.

Regards,
Peter Hurley


end of thread, other threads:[~2013-09-26 22:21 UTC | newest]

Thread overview: 29+ messages
2013-09-13  0:51 increased vmap_area_lock contentions on "n_tty: Move buffers into n_tty_data" Fengguang Wu
2013-09-13  1:09 ` Fengguang Wu
2013-09-17 15:34   ` Peter Hurley
2013-09-17 23:22     ` Fengguang Wu
2013-09-18  0:22       ` Peter Hurley
2013-09-25  9:04         ` Lin Ming
2013-09-25 11:30           ` Peter Hurley
2013-09-25 14:53             ` Lin Ming
2013-09-25 16:02             ` Lin Ming
2013-09-26  3:20               ` Andi Kleen
2013-09-26 11:52                 ` Peter Hurley
2013-09-26 15:32                   ` Andi Kleen
2013-09-26 17:22                     ` Peter Hurley
2013-09-26  7:33         ` Andrew Morton
2013-09-26 11:31           ` Peter Hurley
2013-09-26 15:04             ` Greg KH
2013-09-26 17:35               ` Peter Hurley
2013-09-26 18:05                 ` Andrew Morton
2013-09-26 21:42                   ` Peter Hurley
2013-09-26 21:58                     ` Andrew Morton
2013-09-26 22:21                       ` Peter Hurley
2013-09-18  0:49   ` Peter Hurley
2013-09-13  3:17 ` Greg KH
2013-09-13  3:38   ` Fengguang Wu
2013-09-13  3:44     ` Greg KH
2013-09-13  9:55       ` Peter Hurley
2013-09-13 12:34         ` Greg KH
2013-09-17  2:42     ` Peter Hurley
2013-09-17  2:56       ` Fengguang Wu
