* Re: HugePages shared memory support in LLTng
[not found] <CAO+PNdHdLFk=Q0L2BLGnz8xCvdgMw3aYpZuAZumBOWKraKTnAw@mail.gmail.com>
@ 2019-07-15 14:33 ` Jonathan Rajotte-Julien
  [not found] ` <20190715143302.GA2017@joraj-alpa>
  1 sibling, 0 replies; 13+ messages in thread
From: Jonathan Rajotte-Julien @ 2019-07-15 14:33 UTC (permalink / raw)
To: Yiteng Guo; +Cc: lttng-dev

Hi Yiteng,

On Fri, Jul 12, 2019 at 06:18:44PM -0400, Yiteng Guo wrote:
> Hello,
>
> I am wondering if there is any way for lttng-ust to create its shm on
> hugepages. I noticed that there was an option `--shm-path` which can be
> used to change the location of the shm. However, if I specified the path to
> a `hugetlbfs` mount such as /dev/hugepages, I would get errors in
> lttng-sessiond and no trace data were generated.
>
> The error I got was
> ```
> PERROR - 17:54:56.740674 [8163/8168]: Error appending to metadata file:
> Invalid argument (in lttng_metadata_printf() at ust-metadata.c:176)
> Error: Failed to generate session metadata (errno = -1)
> ```
> I took a look at the lttng code base and found that lttng used `write` to
> generate a metadata file under `--shm-path`. However, it looks like
> `hugetlbfs` does not support the `write` operation. I wrote a simple patch
> using `mmap` to get around this problem. Then, I got another error:

Would you be interested in sharing this patch so we can help you figure out
the problem?

A github branch would be perfect.

> ```
> Error: Error creating UST channel "my-channel" on the consumer daemon
> ```

Make sure to pass "--verbose-consumer" to lttng-sessiond. It will ensure
that the lttng-consumerd output is present in the lttng-sessiond logs. It
should help a bit. I suspect that we fail on buffer allocation.

> This time, I could not locate the problem anymore :(. Do you have any idea
> of how to get hugepages shm working in lttng?
>
> To give you more context here, I was tracing a performance-sensitive
> program. I didn't want to suffer from the sub-buffer switch cost, so I
> created a very large sub-buffer (1MB).

If you don't mind, how many cores are present? How much memory is available
on the host?

Could you share with us the complete sequence of commands you used to set up
your tracing session?

If it is not too much trouble, could you also share the steps you took to
set up and mount your hugetlbfs path?

> I did a benchmark on my tracepoint and noticed that after running a
> certain number of tracepoints, I got a noticeably larger overhead (1200ns
> larger than the others) every ~130 tracepoints. It turned out that this
> large overhead was due to a page fault. The numbers matched up (130 * 32
> bytes = 4160 bytes, which is approximately the size of a normal 4kB page)
> and I also used the lttng perf page-fault counters to verify it.
> Therefore, I am looking for a solution to have lttng create its shm on
> hugepages.

Quite interesting!

> Thank you very much! I look forward to hearing from you.
>
> Best,
> Yiteng

> _______________________________________________
> lttng-dev mailing list
> lttng-dev@lists.lttng.org
> https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev

--
Jonathan Rajotte-Julien
EfficiOS

^ permalink raw reply	[flat|nested] 13+ messages in thread
[parent not found: <20190715143302.GA2017@joraj-alpa>]
* Re: HugePages shared memory support in LLTng [not found] ` <20190715143302.GA2017@joraj-alpa> @ 2019-07-15 19:21 ` Yiteng Guo [not found] ` <CAO+PNdHW7O98QSdWyA5U6e=gtLmdFt77wHOT=eHb-Py1W3A-oQ@mail.gmail.com> 1 sibling, 0 replies; 13+ messages in thread From: Yiteng Guo @ 2019-07-15 19:21 UTC (permalink / raw) To: Jonathan Rajotte-Julien; +Cc: lttng-dev Hi Jonathan, Thank you for your prompt reply. On Mon, Jul 15, 2019 at 10:33 AM Jonathan Rajotte-Julien <jonathan.rajotte-julien@efficios.com> wrote: > > Hi Yiteng, > > On Fri, Jul 12, 2019 at 06:18:44PM -0400, Yiteng Guo wrote: > > Hello, > > > > I am wondering if there is any way for lttng-ust to create its shm on > > hugepages. I noticed that there was an option `--shm-path` which can be > > used to change the location of shm. However, if I specified the path to a > > `hugetlbfs` such as /dev/hugepages, I would get errors in lttng-sessiond > > and no trace data were generated. > > > > The error I got was > > ``` > > PERROR - 17:54:56.740674 [8163/8168]: Error appending to metadata file: > > Invalid argument (in lttng_metadata_printf() at ust-metadata.c:176) > > Error: Failed to generate session metadata (errno = -1) > > ``` > > I took a look at lttng code base and found that lttng used `write` to > > generate a metadata file under `--shm-path`. However, it looks like > > `hugetlbfs` does not support `write` operation. I did a simple patch with > > `mmap` to get around this problem. Then, I got another error: > > Would you be interested in sharing this patch so we can help you figure out the > problem? > > A github branch would be perfect. > You can check out my patch here: https://github.com/guoyiteng/lttng-tools/compare/master...guoyiteng:hugepage > > ``` > > Error: Error creating UST channel "my-channel" on the consumer daemon > > ``` > > Make sure to pass "--verbose-consumer" to lttng-sessiond. It will ensure that > the lttng-consumerd output is present in lttng-sesssiond logs. 
It should help a > bit > > I suspect that we fail on buffers allocation. > After I passed "--verbose-consumer", I got the following logs. DEBUG1 - 18:59:55.773387304 [8844/8844]: Health check time delta in seconds set to 20 (in health_init() at health.c:73) DEBUG3 - 18:59:55.773544396 [8844/8844]: Created hashtable size 4 at 0x5625c427d4c0 of type 2 (in lttng_ht_new() at hashtable.c:145) DEBUG3 - 18:59:55.773560630 [8844/8844]: Created hashtable size 4 at 0x5625c427dc00 of type 2 (in lttng_ht_new() at hashtable.c:145) DEBUG3 - 18:59:55.773566277 [8844/8844]: Created hashtable size 4 at 0x5625c427df30 of type 2 (in lttng_ht_new() at hashtable.c:145) DEBUG3 - 18:59:55.773572450 [8844/8844]: Created hashtable size 4 at 0x5625c427ead0 of type 2 (in lttng_ht_new() at hashtable.c:145) DEBUG3 - 18:59:55.773576515 [8844/8844]: Created hashtable size 4 at 0x5625c427f210 of type 2 (in lttng_ht_new() at hashtable.c:145) DEBUG3 - 18:59:55.773582290 [8844/8844]: Created hashtable size 4 at 0x5625c427f950 of type 2 (in lttng_ht_new() at hashtable.c:145) DEBUG1 - 18:59:55.773605669 [8844/8844]: TCP inet operation timeout set to 216 sec (in lttcomm_inet_init() at inet.c:546) DEBUG1 - 18:59:55.773627249 [8844/8844]: Connecting to error socket /home/vagrant/.lttng/ustconsumerd64/error (in main() at lttng-consumerd.c:464) DEBUG1 - 18:59:55.773767487 [8844/8848]: [thread] Manage health check started (in thread_manage_health() at health-consumerd.c:167) DEBUG1 - 18:59:55.773849241 [8844/8848]: epoll set max size is 1659863 (in compat_epoll_set_max_size() at compat-epoll.c:337) DEBUG1 - 18:59:55.773884368 [8844/8848]: Health check ready (in thread_manage_health() at health-consumerd.c:247) DEBUG3 - 18:59:55.883547291 [8844/8850]: Created hashtable size 4 at 0x7ff5ec000b40 of type 2 (in lttng_ht_new() at hashtable.c:145) DEBUG1 - 18:59:55.884682278 [8844/8850]: Thread channel poll started (in consumer_thread_channel_poll() at consumer.c:2941) DEBUG1 - 18:59:55.883573028 [8844/8853]: 
Creating command socket /home/vagrant/.lttng/ustconsumerd64/command (in consumer_thread_sessiond_poll() at consumer.c:3204) DEBUG1 - 18:59:55.885435478 [8844/8853]: Sending ready command to lttng-sessiond (in consumer_thread_sessiond_poll() at consumer.c:3217) DEBUG1 - 18:59:55.883646301 [8844/8852]: Updating poll fd array (in update_poll_array() at consumer.c:1103) DEBUG1 - 18:59:55.885574718 [8844/8853]: Connection on client_socket (in consumer_thread_sessiond_poll() at consumer.c:3239) DEBUG1 - 18:59:55.885583183 [8844/8852]: polling on 2 fd (in consumer_thread_data_poll() at consumer.c:2630) DEBUG1 - 18:59:55.885596572 [8844/8853]: Metadata connection on client_socket (in set_metadata_socket() at consumer.c:3165) DEBUG1 - 18:59:55.885612073 [8844/8853]: Incoming command on sock (in consumer_thread_sessiond_poll() at consumer.c:3285) DEBUG1 - 18:59:55.883553158 [8844/8851]: Thread metadata poll started (in consumer_thread_metadata_poll() at consumer.c:2351) DEBUG1 - 18:59:55.885714717 [8844/8851]: Metadata main loop started (in consumer_thread_metadata_poll() at consumer.c:2367) DEBUG1 - 18:59:55.885726270 [8844/8851]: Metadata poll wait (in consumer_thread_metadata_poll() at consumer.c:2373) DEBUG1 - 18:59:55.885781919 [8844/8853]: Received channel monitor pipe (29) (in lttng_ustconsumer_recv_cmd() at ust-consumer.c:1903) DEBUG1 - 18:59:55.885803340 [8844/8853]: Channel monitor pipe set as non-blocking (in lttng_ustconsumer_recv_cmd() at ust-consumer.c:1924) DEBUG1 - 18:59:55.885810860 [8844/8853]: received command on sock (in consumer_thread_sessiond_poll() at consumer.c:3301) DEBUG1 - 18:59:55.887146328 [8844/8850]: Channel main loop started (in consumer_thread_channel_poll() at consumer.c:2956) DEBUG1 - 18:59:55.887497303 [8844/8850]: Channel poll wait (in consumer_thread_channel_poll() at consumer.c:2961) DEBUG1 - 18:59:55.892440821 [8844/8853]: Incoming command on sock (in consumer_thread_sessiond_poll() at consumer.c:3285) DEBUG1 - 18:59:55.892479711 
[8844/8853]: Consumer mkdir /home/vagrant/lttng-traces/auto-20190715-185955//ust in session 0 (in lttng_ustconsumer_recv_cmd() at ust-consumer.c:2093) DEBUG3 - 18:59:55.892486547 [8844/8853]: mkdirat() recursive fd = -100 (AT_FDCWD), path = /home/vagrant/lttng-traces/auto-20190715-185955//ust, mode = 504, uid = 1000, gid = 1000 (in run_as_mkdirat_recursive() at runas.c:1147) DEBUG1 - 18:59:55.892500964 [8844/8853]: Using run_as worker (in run_as() at runas.c:1100) DEBUG1 - 18:59:55.892852801 [8844/8853]: received command on sock (in consumer_thread_sessiond_poll() at consumer.c:3301) DEBUG1 - 18:59:57.964977091 [8844/8853]: Incoming command on sock (in consumer_thread_sessiond_poll() at consumer.c:3285) DEBUG1 - 18:59:57.965041124 [8844/8853]: Allocated channel (key 1) (in consumer_allocate_channel() at consumer.c:1043) DEBUG3 - 18:59:57.965052309 [8844/8853]: Creating channel to ustctl with attr: [overwrite: 0, subbuf_size: 524288, num_subbuf: 4, switch_timer_interval: 0, read_timer_interval: 0, output: 0, type: 0 (in create_ust_channel() at ust-consumer.c:457) DEBUG3 - 18:59:57.965104805 [8844/8853]: open() /dev/hugepages/auto-20190715-185955/ust/uid/1000/64-bit/channel0_0 with flags C2 mode 384 for uid 1000 and gid 1000 (in run_as_open() at runas.c:1212) DEBUG1 - 18:59:57.965120609 [8844/8853]: Using run_as worker (in run_as() at runas.c:1100) DEBUG3 - 18:59:57.965317517 [8844/8853]: open() /dev/hugepages/auto-20190715-185955/ust/uid/1000/64-bit/channel0_1 with flags C2 mode 384 for uid 1000 and gid 1000 (in run_as_open() at runas.c:1212) DEBUG1 - 18:59:57.965335148 [8844/8853]: Using run_as worker (in run_as() at runas.c:1100) DEBUG3 - 18:59:57.965445811 [8844/8853]: open() /dev/hugepages/auto-20190715-185955/ust/uid/1000/64-bit/channel0_2 with flags C2 mode 384 for uid 1000 and gid 1000 (in run_as_open() at runas.c:1212) DEBUG1 - 18:59:57.965461438 [8844/8853]: Using run_as worker (in run_as() at runas.c:1100) DEBUG3 - 18:59:57.966116363 [8844/8853]: open() 
/dev/hugepages/auto-20190715-185955/ust/uid/1000/64-bit/channel0_3 with flags C2 mode 384 for uid 1000 and gid 1000 (in run_as_open() at runas.c:1212) DEBUG1 - 18:59:57.966145191 [8844/8853]: Using run_as worker (in run_as() at runas.c:1100) DEBUG3 - 18:59:57.966341799 [8844/8853]: open() /dev/hugepages/auto-20190715-185955/ust/uid/1000/64-bit/channel0_4 with flags C2 mode 384 for uid 1000 and gid 1000 (in run_as_open() at runas.c:1212) DEBUG1 - 18:59:57.966420313 [8844/8853]: Using run_as worker (in run_as() at runas.c:1100) DEBUG3 - 18:59:57.966548533 [8844/8853]: open() /dev/hugepages/auto-20190715-185955/ust/uid/1000/64-bit/channel0_5 with flags C2 mode 384 for uid 1000 and gid 1000 (in run_as_open() at runas.c:1212) DEBUG1 - 18:59:57.966567778 [8844/8853]: Using run_as worker (in run_as() at runas.c:1100) DEBUG3 - 18:59:57.966932907 [8844/8853]: unlink() /dev/hugepages/auto-20190715-185955/ust/uid/1000/64-bit/channel0_5 with for uid 1000 and gid 1000 (in run_as_unlink() at runas.c:1233) DEBUG1 - 18:59:57.966950256 [8844/8853]: Using run_as worker (in run_as() at runas.c:1100) DEBUG3 - 18:59:57.967061802 [8844/8853]: unlink() /dev/hugepages/auto-20190715-185955/ust/uid/1000/64-bit/channel0_4 with for uid 1000 and gid 1000 (in run_as_unlink() at runas.c:1233) DEBUG1 - 18:59:57.967081332 [8844/8853]: Using run_as worker (in run_as() at runas.c:1100) DEBUG3 - 18:59:57.967366982 [8844/8853]: unlink() /dev/hugepages/auto-20190715-185955/ust/uid/1000/64-bit/channel0_3 with for uid 1000 and gid 1000 (in run_as_unlink() at runas.c:1233) DEBUG1 - 18:59:57.967419957 [8844/8853]: Using run_as worker (in run_as() at runas.c:1100) DEBUG3 - 18:59:57.967562353 [8844/8853]: unlink() /dev/hugepages/auto-20190715-185955/ust/uid/1000/64-bit/channel0_2 with for uid 1000 and gid 1000 (in run_as_unlink() at runas.c:1233) DEBUG1 - 18:59:57.967587355 [8844/8853]: Using run_as worker (in run_as() at runas.c:1100) DEBUG3 - 18:59:57.968008237 [8844/8853]: unlink() 
/dev/hugepages/auto-20190715-185955/ust/uid/1000/64-bit/channel0_1 with for uid 1000 and gid 1000 (in run_as_unlink() at runas.c:1233) DEBUG1 - 18:59:57.968104447 [8844/8853]: Using run_as worker (in run_as() at runas.c:1100) DEBUG3 - 18:59:57.968327138 [8844/8853]: unlink() /dev/hugepages/auto-20190715-185955/ust/uid/1000/64-bit/channel0_0 with for uid 1000 and gid 1000 (in run_as_unlink() at runas.c:1233) DEBUG1 - 18:59:57.968349750 [8844/8853]: Using run_as worker (in run_as() at runas.c:1100) DEBUG3 - 18:59:57.968562473 [8844/8853]: rmdir_recursive() /dev/hugepages/auto-20190715-185955 with for uid 1000 and gid 1000 (in run_as_rmdir_recursive() at runas.c:1251) DEBUG1 - 18:59:57.968582498 [8844/8853]: Using run_as worker (in run_as() at runas.c:1100) DEBUG1 - 18:59:57.968934753 [8844/8853]: UST consumer cleaning stream list (in destroy_channel() at ust-consumer.c:67) DEBUG1 - 18:59:57.969019502 [8844/8853]: received command on sock (in consumer_thread_sessiond_poll() at consumer.c:3301) Error: ask_channel_creation consumer command failed Error: Error creating UST channel "channel0" on the consumer daemon > > This time, I could not locate the problem anymore :(. Do you have any idea > > of how to get hugepages shm work in lttng? > > > > To give you more context here, I was tracing a performance sensitive > > program. I didn't want to suffer from the sub-buffer switch cost so I > > created a very large sub-buffer (1MB). > > If you don't mind, how many core are present? How much memory is available on > the host? I compiled and played around with lttng source codes on my vagrant vm environment. I assigned 6 cores and 7.8G memory to it. My vm OS is Ubuntu 18.04.2 LTS (GNU/Linux 4.15.0-51-generic x86_64). > > Could you share with us the complete sequence of command you use to setup your > tracing session? > I used the following commands to test if lttng works with hugepages. 
```
lttng create --shm-path=/dev/hugepages
lttng enable-event --userspace hello_world:my_first_tracepoint
lttng start
```
And the binary program I traced was the hello_world example from the lttng
documentation page.

> If it is not much trouble could you also share the step you took to setup/mount
> your hugetlbfs path?

I followed the first section of https://wiki.debian.org/Hugepages to set up
my hugetlbfs, except that I used /dev/hugepages instead of /hugepages.

> > I did a benchmark on my tracepoint
> > and noticed that after running a certain number of tracepoints, I got a
> > noticeably larger overhead (1200ns larger than other) for every ~130
> > tracepoints. It turned out that this large overhead was due to a page
> > fault. The numbers were matched up (130 * 32 bytes = 4160 bytes, which is
> > approximately the size of a normal page 4kB) and I also used lttng perf
> > page fault counters to verify it. Therefore, I am looking for a solution to
> > have lttng create shm on hugepages.
>
> Quite interesting!
>
> > Thank you very much! I look forward to hearing from you.
> >
> > Best,
> > Yiteng
>
> > _______________________________________________
> > lttng-dev mailing list
> > lttng-dev@lists.lttng.org
> > https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev
>
> --
> Jonathan Rajotte-Julien
> EfficiOS

Best,
Yiteng

^ permalink raw reply	[flat|nested] 13+ messages in thread
[parent not found: <CAO+PNdHW7O98QSdWyA5U6e=gtLmdFt77wHOT=eHb-Py1W3A-oQ@mail.gmail.com>]
* Re: HugePages shared memory support in LLTng
[not found] ` <CAO+PNdHW7O98QSdWyA5U6e=gtLmdFt77wHOT=eHb-Py1W3A-oQ@mail.gmail.com>
@ 2019-07-22 18:44 ` Yiteng Guo
  [not found] ` <CAO+PNdFotFk6uCF1dySZi9dV6PYpAazWoQpsnU+N58F2b-73FQ@mail.gmail.com>
  1 sibling, 0 replies; 13+ messages in thread
From: Yiteng Guo @ 2019-07-22 18:44 UTC (permalink / raw)
To: Jonathan Rajotte-Julien; +Cc: lttng-dev

Hi Jonathan,

I spent these days on this problem and finally figured it out. Here are the
patches I've written:

https://github.com/lttng/lttng-ust/compare/master...guoyiteng:hugepages
https://github.com/lttng/lttng-tools/compare/master...guoyiteng:hugepages

These two patches are just ad-hoc support for hugepages and are not intended
as a pull request. If you want to support hugepages in a future lttng
release, I am glad to help you with that. What I did here is replace
`shm_open` with `open` on a hugetlbfs directory. I also modified other parts
of the code (such as memory alignment) to make them compatible with huge
pages. I didn't use the `shm-path` option because I noticed that this option
would relocate not only the shm of the ring buffer but also the other shm
and metadata files; however, we only want to use huge pages for the ring
buffer here. Here are the commands I used to launch an lttng session.

```
lttng create
lttng enable-channel --userspace --subbuf-size=4M --num-subbuf=2 --buffers-pid my-channel
lttng add-context --userspace --type=perf:thread:page-fault
lttng enable-event --userspace -c my-channel hello_world:my_first_tracepoint
lttng start
```

My patches worked very well and I didn't get page faults anymore. However,
the one caveat of this patch is that the ring buffers are not destroyed
correctly. This leads to a problem where every new lttng session acquires
some hugepages but never releases them. After I created and destroyed
several sessions, I would get an error telling me there were not enough
hugepages available. I got around this problem by restarting the session
daemon.
But there should be some way to have the ring buffers (or their channel)
destroyed elegantly when their session is destroyed.

In the meantime, I am also trying another way to get rid of these page
faults, which is to prefault the ring-buffer shared memory in my program.
This solution does not need any modification of the lttng source code,
which, I think, is a safer way to go. However, to prefault the ring-buffer
shm, I need to know the address (and size) of the ring buffer. Is there any
way to learn this piece of information from the user program?

I wish you could have a plan to support hugepages in the future. I am more
than happy to help you with that. Thank you very much and I look forward to
hearing from you.

Best,
Yiteng

On Mon, Jul 15, 2019 at 3:21 PM Yiteng Guo <guoyiteng@gmail.com> wrote:
[... full quote of the previous message snipped ...]

^ permalink raw reply	[flat|nested] 13+ messages in thread
[parent not found: <CAO+PNdFotFk6uCF1dySZi9dV6PYpAazWoQpsnU+N58F2b-73FQ@mail.gmail.com>]
* Re: HugePages shared memory support in LLTng [not found] ` <CAO+PNdFotFk6uCF1dySZi9dV6PYpAazWoQpsnU+N58F2b-73FQ@mail.gmail.com> @ 2019-07-22 19:23 ` Jonathan Rajotte-Julien [not found] ` <20190722192308.GA803@joraj-alpa> 1 sibling, 0 replies; 13+ messages in thread From: Jonathan Rajotte-Julien @ 2019-07-22 19:23 UTC (permalink / raw) To: Yiteng Guo; +Cc: lttng-dev Hi Yiteng, On Mon, Jul 22, 2019 at 02:44:09PM -0400, Yiteng Guo wrote: > Hi Jonathan, > > I spent these days on this problem and finally figured it out. Here > are patches I've written. Sorry for that, I had other stuff ongoing. I had a brief discussion about this with Mathieu Desnoyers. Mathieu mentioned that the page faults you are seeing might be related to qemu/kvm usage of KSM [1]. I did not have time to play around with it and see if this indeed has an effect. You might be better off trying it since you are already all set up. You might want to disable it and retry your experiment (if you are only doing this in a VM). [1] https://www.linux-kvm.org/page/KSM > > https://github.com/lttng/lttng-ust/compare/master...guoyiteng:hugepages > https://github.com/lttng/lttng-tools/compare/master...guoyiteng:hugepages I'll have a look as soon as possible. > > These two patches are just ad-hoc support for hugepages, and are > not intended to be a pull request. If you want to support hugepages in > future lttng releases, I am glad to help you with that. What I did > here is to replace `shm_open` with `open` on a hugetlbfs directory. I > also modified other parts of the code (such as memory alignment) to make > them compatible with huge pages. I didn't use the `--shm-path` option > because I noticed that this option would relocate not only the shm of > the ring buffer but also other shm and metadata files. However, we only > wanted to use huge pages for the ring buffer here. Here are the commands I > used to launch an lttng session.
> > ``` > lttng create > lttng enable-channel --userspace --subbuf-size=4M --num-subbuf=2 > --buffers-pid my-channel Any particular reason to use per-pid buffering? We normally recommend per-uid tracing + lttng track when possible. Depends on the final use case. > lttng add-context --userspace --type=perf:thread:page-fault > lttng enable-event --userspace -c my-channel hello_world:my_first_tracepoint > lttng start > ``` > > My patches worked very well and I didn't get page faults anymore. > However, the only caveat of this patch is that ringbuffers are not > destroyed correctly. It leads to a problem that every new lttng > session acquires some hugepages but never releases them. After I > created and destroyed several sessions, I would get an error that told > me there were not enough hugepages to be used. I got around this > problem by restarting the session daemon. But there should be some way > to have ringbuffers (or their channel) destroyed elegantly when their > session is destroyed. That is weird. I would expect the cleanup code to get rid of the ringbuffers as needed. Or at least try and fail. > > In the meantime, I am also trying another way to get rid of these page > faults, which is to prefault the ringbuffer shared memory in my > program. This solution does not need any modification of the lttng source > code, which, I think, is a safer way to go. However, to prefault the > ringbuffer shm, I need to know the address (and size) of the > ringbuffer. Is there any way to learn this piece of information from > the user program? AFAIK, we do not expose the address. I might be wrong here. How do you plan on prefaulting the pages? MAP_POPULATE? Cheers ^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <20190722192308.GA803@joraj-alpa>]
* Re: HugePages shared memory support in LLTng [not found] ` <20190722192308.GA803@joraj-alpa> @ 2019-07-23 15:07 ` Jonathan Rajotte-Julien [not found] ` <20190723150744.GC803@joraj-alpa> 1 sibling, 0 replies; 13+ messages in thread From: Jonathan Rajotte-Julien @ 2019-07-23 15:07 UTC (permalink / raw) To: Yiteng Guo; +Cc: lttng-dev [-- Attachment #1: Type: text/plain, Size: 1081 bytes --] Hi Yiteng, On Mon, Jul 22, 2019 at 03:23:08PM -0400, Jonathan Rajotte-Julien wrote: > Hi Yiteng, > > On Mon, Jul 22, 2019 at 02:44:09PM -0400, Yiteng Guo wrote: > > Hi Jonathan, > > > > I spent these days on this problem and finally figured it out. Here > > are patches I've written. > > Sorry for that, I had other stuff ongoing. > > I had a brief discussion about this with Mathieu Desnoyers. > > Mathieu mentioned that the page faults you are seeing might be related to > qemu/kvm usage of KSM [1]. I did not have time to play around with it and see if > this indeed have an effect. You might be better off trying it since you are > already all setup. Might want to disable it and retry your experiment (if only > doing this on a vm). Disregard all of this for now. I think we misunderstood the first email and got too far too fast. I modified lttng-ust to use MAP_POPULATE, and based on the result from the page_fault perf counter it seems to achieve what you are looking for. See attached patch. Let me know if this helps. Cheers.
-- Jonathan Rajotte-Julien EfficiOS [-- Attachment #2: 0001-Use-MAP_POPULATE-to-reduce-pagefault.patch --] [-- Type: text/x-diff, Size: 1222 bytes --] From 2b065e5988067291e3367f413571248f4551acb2 Mon Sep 17 00:00:00 2001 From: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com> Date: Mon, 22 Jul 2019 17:37:43 -0400 Subject: [PATCH lttng-ust] Use MAP_POPULATE to reduce pagefault Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com> --- libringbuffer/shm.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/libringbuffer/shm.c b/libringbuffer/shm.c index 10b3bcef..489322e6 100644 --- a/libringbuffer/shm.c +++ b/libringbuffer/shm.c @@ -154,7 +154,7 @@ struct shm_object *_shm_object_table_alloc_shm(struct shm_object_table *table, /* memory_map: mmap */ memory_map = mmap(NULL, memory_map_size, PROT_READ | PROT_WRITE, - MAP_SHARED, shmfd, 0); + MAP_SHARED | MAP_POPULATE, shmfd, 0); if (memory_map == MAP_FAILED) { PERROR("mmap"); goto error_mmap; @@ -341,7 +341,7 @@ struct shm_object *shm_object_table_append_shm(struct shm_object_table *table, /* memory_map: mmap */ memory_map = mmap(NULL, memory_map_size, PROT_READ | PROT_WRITE, - MAP_SHARED, shm_fd, 0); + MAP_SHARED | MAP_POPULATE, shm_fd, 0); if (memory_map == MAP_FAILED) { PERROR("mmap"); goto error_mmap; -- 2.17.1 [-- Attachment #3: Type: text/plain, Size: 156 bytes --] _______________________________________________ lttng-dev mailing list lttng-dev@lists.lttng.org https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev ^ permalink raw reply related [flat|nested] 13+ messages in thread
[parent not found: <20190723150744.GC803@joraj-alpa>]
[parent not found: <CAO+PNdGhEgeTo35du4ysMcCOUQ0PKE4tuyGg593AE5feZZ4_JQ@mail.gmail.com>]
* Re: HugePages shared memory support in LLTng [not found] ` <CAO+PNdGhEgeTo35du4ysMcCOUQ0PKE4tuyGg593AE5feZZ4_JQ@mail.gmail.com> @ 2019-07-23 20:27 ` Jonathan Rajotte-Julien [not found] ` <20190723202723.GD803@joraj-alpa> 1 sibling, 0 replies; 13+ messages in thread From: Jonathan Rajotte-Julien @ 2019-07-23 20:27 UTC (permalink / raw) To: Yiteng Guo; +Cc: lttng-dev CC'ing the mailing list back. On Tue, Jul 23, 2019 at 03:58:09PM -0400, Yiteng Guo wrote: > Hi Jonathan, > > Thank you for the patch! It is really helpful. Were you able to observe a positive impact? This is something we might be interested in upstreaming if we have good feedback. > > Is there any disadvantage of per-pid buffering? I don't want to have > processes interfere with each other so I choose per-pid buffering. The main downside is that each registered application will get its own subbuffers, resulting in a lot of memory usage depending on your session configuration. This can get out of hand quickly, especially on systems with lots of cores and an unknown number of instrumented applications. If you completely control the runtime, for example when doing benchmarking or simple analysis, feel free to use what makes more sense to you as long as you understand the pitfalls of each mode. Cheers -- Jonathan Rajotte-Julien EfficiOS ^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <20190723202723.GD803@joraj-alpa>]
* Re: HugePages shared memory support in LLTng [not found] ` <20190723202723.GD803@joraj-alpa> @ 2019-07-24 15:54 ` Yiteng Guo [not found] ` <CAO+PNdEfTq5vAqWJAoWK_hyxdjUuQgPPf0sqJXNO9jw1J6RoNg@mail.gmail.com> ` (2 subsequent siblings) 3 siblings, 0 replies; 13+ messages in thread From: Yiteng Guo @ 2019-07-24 15:54 UTC (permalink / raw) To: Jonathan Rajotte-Julien; +Cc: lttng-dev (Forgot to cc mailing list in the previous email) Hi Jonathan, On Tue, Jul 23, 2019 at 4:27 PM Jonathan Rajotte-Julien <jonathan.rajotte-julien@efficios.com> wrote: > > CC'ing the mailing list back. > > On Tue, Jul 23, 2019 at 03:58:09PM -0400, Yiteng Guo wrote: > > Hi Jonathan, > > > > Thank you for the patch! It is really helpful. > > Were you able to observe a positive impact? > > This is something we might be interested in upstreaming if we have good > feedback. Yes, page faults disappeared and I didn't get those periodic overheads anymore. And I also solved the problem that hugepages were not released correctly in my patch. It was my fault: I forgot to unmap the mmap'd pointer. I updated the patch here: https://github.com/lttng/lttng-ust/compare/master...guoyiteng:hugepages The current prefault solution works well for me and I will use that for now. In my opinion, using hugepages could further reduce the TLB misses, but that involved more changes to the source code than the prefault solution. Best, Yiteng ^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <CAO+PNdEfTq5vAqWJAoWK_hyxdjUuQgPPf0sqJXNO9jw1J6RoNg@mail.gmail.com>]
* Re: HugePages shared memory support in LLTng [not found] ` <CAO+PNdEfTq5vAqWJAoWK_hyxdjUuQgPPf0sqJXNO9jw1J6RoNg@mail.gmail.com> @ 2019-07-24 15:59 ` Jonathan Rajotte-Julien 0 siblings, 0 replies; 13+ messages in thread From: Jonathan Rajotte-Julien @ 2019-07-24 15:59 UTC (permalink / raw) To: Yiteng Guo; +Cc: lttng-dev Hi Yiteng, Make sure to always CC the mailing list. > > Were you able to observe a positive impact? > > > > This is something we might be interested in upstreaming if we have good > > feedback. > > Yes, page faults disappeared and I didn't get those periodic overheads anymore. Good. We will have to discuss this with Mathieu Desnoyers when he is back from vacation and see if always using MAP_POPULATE makes sense. > > And I also solved the problem that hugepages are not closed correctly > in my patch. It is my fault that I forgot to close the mmap pointer. I > updated the patch here: > https://github.com/lttng/lttng-ust/compare/master...guoyiteng:hugepages Good. Would you be interested in posting those patches as an RFC on the mailing list so that we have a trace of this work in the future? GitHub cannot give us the persistence needed for this. It might also lead to a broader discussion. > The current prefault solution works well for me and I will use that > for now. In my opinion, using hugepages could further reduce the TLB > misses, but that involved more changes in source codes than the > prefault solution. > > Best, > Yiteng -- Jonathan Rajotte-Julien EfficiOS ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: HugePages shared memory support in LLTng [not found] ` <20190723202723.GD803@joraj-alpa> 2019-07-24 15:54 ` Yiteng Guo [not found] ` <CAO+PNdEfTq5vAqWJAoWK_hyxdjUuQgPPf0sqJXNO9jw1J6RoNg@mail.gmail.com> @ 2019-07-25 15:40 ` Mathieu Desnoyers [not found] ` <1962899258.11638.1564069223526.JavaMail.zimbra@efficios.com> 3 siblings, 0 replies; 13+ messages in thread From: Mathieu Desnoyers @ 2019-07-25 15:40 UTC (permalink / raw) To: Jonathan Rajotte, Yiteng Guo; +Cc: lttng-dev ----- On Jul 23, 2019, at 9:27 PM, Jonathan Rajotte jonathan.rajotte-julien@efficios.com wrote: > CC'ing the mailing list back. > > On Tue, Jul 23, 2019 at 03:58:09PM -0400, Yiteng Guo wrote: >> Hi Jonathan, >> >> Thank you for the patch! It is really helpful. > > Were you able to observe a positive impact? > > This is something we might be interested in upstreaming if we have good > feedback. > >> >> Is there any disadvantage of per-pid buffering? I don't want to have >> processes interfere with each other so I choose per-pid buffering. > > The main downside is that each registered applications will get their own > subbuffers resulting in a lot of memory usage depending on your session > configuration. This can get out of hand quickly, especially on systems withs > lots > of cores and unknown number of instrumented applications. I can add 2 extra cents (or actually a few more) to this answer: There are a few reasons for using per-uid buffers over per-pid: - Lower memory consumption for use-cases with many processes, - Faster process launch time: no need to allocate buffers for each process. Useful for use-cases with short-lived processes. - Keep a flight recorder "snapshot" available for all processes, including those which recently exited. Indeed, the per-pid buffers don't stay around for snapshot after a process exits or is killed. 
There are however a few advantages for per-pid buffers: - Isolation: if one PID generates corrupted trace data, it does not interfere with other PIDs' buffers, - If one PID is killed between reserve and commit, it does not make that specific per-cpu ring buffer unusable for the rest of the tracing session lifetime. Hoping this information helps you make the right choice for your deployment! Thanks, Mathieu > > If you completely control the runtime, for example when doing benchmarking or > simple analysis, feel free to use what make more sense to you as long as you > understand the pitfalls of each mode. > > Cheers > > -- > Jonathan Rajotte-Julien > EfficiOS > _______________________________________________ > lttng-dev mailing list > lttng-dev@lists.lttng.org > https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev -- Mathieu Desnoyers EfficiOS Inc. http://www.efficios.com ^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <1962899258.11638.1564069223526.JavaMail.zimbra@efficios.com>]
* Re: HugePages shared memory support in LLTng [not found] ` <1962899258.11638.1564069223526.JavaMail.zimbra@efficios.com> @ 2019-07-25 17:59 ` Trent Piepho via lttng-dev [not found] ` <1564077561.2343.121.camel@impinj.com> 1 sibling, 0 replies; 13+ messages in thread From: Trent Piepho via lttng-dev @ 2019-07-25 17:59 UTC (permalink / raw) To: mathieu.desnoyers, guoyiteng, jonathan.rajotte-julien; +Cc: lttng-dev On Thu, 2019-07-25 at 11:40 -0400, Mathieu Desnoyers wrote: > There are a few reasons for using per-uid buffers over per-pid: > > - Lower memory consumption for use-cases with many processes, > - Faster process launch time: no need to allocate buffers for each process. > Useful for use-cases with short-lived processes. > - Keep a flight recorder "snapshot" available for all processes, including > those which recently exited. Indeed, the per-pid buffers don't stay around > for snapshot after a process exits or is killed. > > There are however a few advantages for per-pid buffers: > > - Isolation: if one PID generates corrupted trace data, it does not interfere > with other PIDs buffers, > - If one PID is killed between reserve and commit, it does not make that specific > per-cpu ring buffer unusable for the rest of the tracing session lifetime. > > Hoping this information helps making the right choice for your deployment! We recently had this discussion for an embedded product that uses LTTng to gather trace data during operation. In our case, we want to have a flight recorder of the last X seconds of trace data, for the entire device. X seconds times Y byte/sec data generation rate ends up being a very large portion (~30%) of the total memory available. This has to be in RAM, using flash memory for this is not a good idea. If we use per-PID buffers, then the buffer size needed for the largest producer of trace data times the total number of processes is too large: far larger than the device's memory size. 
Some processes produce trace data at a much higher rate than others. A buffer for X seconds of data on one process ends up being a buffer for 10*X seconds of data on another. There's not enough RAM for 10*X second buffers. If we use per-UID buffers, then we must run everything as one UID. Which, on an embedded system, is not that bad, but negatively impacts the security of the software. Now all processes, which generate data at different rates, can share one buffer. Much more efficient than having to reserve the same space for the largest and smallest producers. But there ends up being another problem: the flight recorder data needs to be saved somewhere to be of use. To tmpfs in RAM, since the device's flash is not suitable and used elsewhere anyway. So one needs 2x the RAM: one copy for the ring buffer and one for the dump of the ring buffer to tmpfs. So what we did was not use flight recorder mode. We configured lttng to use a limited number of smaller trace files and trace file rotation. And we used small ring buffers, which it turned out did not need to be very large to avoid overflow (I imagine saving the data to tmpfs is fast). The trace files are in effect a per-session buffer, which is what we want for greatest efficiency in space utilization. And we can archive those and download them when "something happens" without paying an extra cost for space. ^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <1564077561.2343.121.camel@impinj.com>]
* Re: HugePages shared memory support in LLTng [not found] ` <1564077561.2343.121.camel@impinj.com> @ 2019-07-26 4:37 ` Yiteng Guo [not found] ` <CAO+PNdGRD3BkfEOgjCLo+kgXreZD_GqYnW6LB7-ELjnowk+GjQ@mail.gmail.com> 1 sibling, 0 replies; 13+ messages in thread From: Yiteng Guo @ 2019-07-26 4:37 UTC (permalink / raw) To: Trent Piepho, mathieu.desnoyers, jonathan.rajotte-julien; +Cc: lttng-dev Hello, Thank you very much for all the information about per-uid and per-pid buffers. It is really helpful for me to make a decision. @Jonathan: This is my first time getting involved in an open-source project through a mailing list. I don't quite know how RFCs work. Should I just do `git format-patch` and copy-paste the diff into an email? Is there any specific format for the RFC email and its title? Best, Yiteng ^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <CAO+PNdGRD3BkfEOgjCLo+kgXreZD_GqYnW6LB7-ELjnowk+GjQ@mail.gmail.com>]
* Re: HugePages shared memory support in LLTng [not found] ` <CAO+PNdGRD3BkfEOgjCLo+kgXreZD_GqYnW6LB7-ELjnowk+GjQ@mail.gmail.com> @ 2019-07-29 19:05 ` Jonathan Rajotte-Julien 0 siblings, 0 replies; 13+ messages in thread From: Jonathan Rajotte-Julien @ 2019-07-29 19:05 UTC (permalink / raw) To: Yiteng Guo; +Cc: lttng-dev > @Jonathan: This is my first time to get involved in an open-source > project on the mailing list. I don't quite know how RFC works. Should > I just do `git format-patch` and copy-paste the diff to an email? Is > there any specific format for the RFC email and its title? Copy-pasting into an email is the way to go. You can also look up "git send-email" if you want [1]. Make sure to have the following prefix in the email title: [RFC PATCH <project name>] Hugepages ... <project name> would be either lttng-ust or lttng-tools for your patches. Also make sure to give all the necessary details to test the patches and also the pitfalls/advantages you know regarding the use of hugepages. We do not expect the patchset to be "integrated" into lttng (command line switch etc.) but one should be able to take your patch and at least get a version of lttng working. Make sure to indicate the current commit for each project you are basing this work on. As I said before, this is more a way of archiving the work you have done than actively working on making lttng support hugepages. [1] https://git-send-email.io Cheers -- Jonathan Rajotte-Julien EfficiOS ^ permalink raw reply [flat|nested] 13+ messages in thread
* HugePages shared memory support in LLTng @ 2019-07-12 22:18 Yiteng Guo 0 siblings, 0 replies; 13+ messages in thread From: Yiteng Guo @ 2019-07-12 22:18 UTC (permalink / raw) To: lttng-dev [-- Attachment #1.1: Type: text/plain, Size: 1781 bytes --] Hello, I am wondering if there is any way for lttng-ust to create its shm on hugepages. I noticed that there was an option `--shm-path` which can be used to change the location of shm. However, if I specified the path to a `hugetlbfs` such as /dev/hugepages, I would get errors in lttng-sessiond and no trace data were generated. The error I got was ``` PERROR - 17:54:56.740674 [8163/8168]: Error appending to metadata file: Invalid argument (in lttng_metadata_printf() at ust-metadata.c:176) Error: Failed to generate session metadata (errno = -1) ``` I took a look at lttng code base and found that lttng used `write` to generate a metadata file under `--shm-path`. However, it looks like `hugetlbfs` does not support `write` operation. I did a simple patch with `mmap` to get around this problem. Then, I got another error: ``` Error: Error creating UST channel "my-channel" on the consumer daemon ``` This time, I could not locate the problem anymore :(. Do you have any idea of how to get hugepages shm work in lttng? To give you more context here, I was tracing a performance sensitive program. I didn't want to suffer from the sub-buffer switch cost so I created a very large sub-buffer (1MB). I did a benchmark on my tracepoint and noticed that after running a certain number of tracepoints, I got a noticeably larger overhead (1200ns larger than other) for every ~130 tracepoints. It turned out that this large overhead was due to a page fault. The numbers were matched up (130 * 32 bytes = 4160 bytes, which is approximately the size of a normal page 4kB) and I also used lttng perf page fault counters to verify it. Therefore, I am looking for a solution to have lttng create shm on hugepages. Thank you very much! 
I look forward to hearing from you. Best, Yiteng [-- Attachment #1.2: Type: text/html, Size: 2041 bytes --] [-- Attachment #2: Type: text/plain, Size: 156 bytes --] _______________________________________________ lttng-dev mailing list lttng-dev@lists.lttng.org https://lists.lttng.org/cgi-bin/mailman/listinfo/lttng-dev ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2019-07-29 19:05 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <CAO+PNdHdLFk=Q0L2BLGnz8xCvdgMw3aYpZuAZumBOWKraKTnAw@mail.gmail.com> 2019-07-15 14:33 ` HugePages shared memory support in LLTng Jonathan Rajotte-Julien [not found] ` <20190715143302.GA2017@joraj-alpa> 2019-07-15 19:21 ` Yiteng Guo [not found] ` <CAO+PNdHW7O98QSdWyA5U6e=gtLmdFt77wHOT=eHb-Py1W3A-oQ@mail.gmail.com> 2019-07-22 18:44 ` Yiteng Guo [not found] ` <CAO+PNdFotFk6uCF1dySZi9dV6PYpAazWoQpsnU+N58F2b-73FQ@mail.gmail.com> 2019-07-22 19:23 ` Jonathan Rajotte-Julien [not found] ` <20190722192308.GA803@joraj-alpa> 2019-07-23 15:07 ` Jonathan Rajotte-Julien [not found] ` <20190723150744.GC803@joraj-alpa> [not found] ` <CAO+PNdGhEgeTo35du4ysMcCOUQ0PKE4tuyGg593AE5feZZ4_JQ@mail.gmail.com> 2019-07-23 20:27 ` Jonathan Rajotte-Julien [not found] ` <20190723202723.GD803@joraj-alpa> 2019-07-24 15:54 ` Yiteng Guo [not found] ` <CAO+PNdEfTq5vAqWJAoWK_hyxdjUuQgPPf0sqJXNO9jw1J6RoNg@mail.gmail.com> 2019-07-24 15:59 ` Jonathan Rajotte-Julien 2019-07-25 15:40 ` Mathieu Desnoyers [not found] ` <1962899258.11638.1564069223526.JavaMail.zimbra@efficios.com> 2019-07-25 17:59 ` Trent Piepho via lttng-dev [not found] ` <1564077561.2343.121.camel@impinj.com> 2019-07-26 4:37 ` Yiteng Guo [not found] ` <CAO+PNdGRD3BkfEOgjCLo+kgXreZD_GqYnW6LB7-ELjnowk+GjQ@mail.gmail.com> 2019-07-29 19:05 ` Jonathan Rajotte-Julien 2019-07-12 22:18 Yiteng Guo