* netipmid consumes much CPU when obmc-console socket is shutdown
From: Heyi Guo @ 2022-01-05  2:25 UTC
  To: openbmc; +Cc: Vernon Mauery, Tom Joseph

Hi all,

We found that netipmid consumes a lot of CPU when SOL is activated but
the obmc-console socket has been shut down for some reason (this can be
reproduced simply by stopping obmc-console with systemctl stop ....).

After the obmc-console socket is closed, the async_wait() in
startHostConsole() fires continuously, and consoleInputHandler() reads
empty data (readSize == 0 and readDataLen == 0), yet none of the ec
error checks is hit!

According to the Boost reference, read_some() behaves as follows:

The function call will block until one or more bytes of data has been
read successfully, or until an error occurs.

Is this a bug in Boost? Or is something wrong in ipmi-net? And how can
we make netipmid more robust when the obmc-console socket is shut down?

Thanks,

Heyi
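
Editorial illustration, not part of the original message: the reported
symptom (a wait that always fires, readSize == 0, no error) looks
consistent with ordinary end-of-file behavior on a stream socket. The
sketch below is Python against a local socket pair, not netipmid or Asio
code. The helper name probe_closed_peer is made up for this note, and it
assumes a Linux-like system where termios.FIONREAD is available.

```python
import array
import fcntl
import select
import socket
import termios


def probe_closed_peer():
    """Return (readable, bytes_available, data) for a socket whose peer closed."""
    ours, peer = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        peer.close()  # stands in for the console server going away

        # Readiness: end-of-file counts as "readable", so a wait-for-read
        # completes immediately instead of blocking.
        ready, _, _ = select.select([ours], [], [], 1.0)
        readable = ours in ready

        # FIONREAD reports how many bytes are queued; at end-of-file that is 0.
        pending = array.array("i", [0])
        fcntl.ioctl(ours, termios.FIONREAD, pending)
        bytes_available = pending[0]

        # A read at end-of-file returns an empty result rather than raising.
        data = ours.recv(64)
        return readable, bytes_available, data
    finally:
        ours.close()
```

If a handler sizes its read buffer from that queued-byte count (an
assumption here, the thread does not say so), it would end up asking for
zero bytes, and a zero-length read on a stream may return 0 without
reporting end-of-file. That is only one possible explanation and is not
confirmed anywhere in this thread.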



* Re: netipmid consumes much CPU when obmc-console socket is shutdown
From: Heyi Guo @ 2022-01-06  1:54 UTC
  To: openbmc; +Cc: Vernon Mauery, Tom Joseph

Hi all,

Any comments?

Thanks,

Heyi


* Re: netipmid consumes much CPU when obmc-console socket is shutdown
From: Ed Tanous @ 2022-01-06  4:45 UTC
  To: Heyi Guo; +Cc: Vernon Mauery, openbmc, Tom Joseph

On Tue, Jan 4, 2022 at 6:31 PM Heyi Guo <guoheyi@linux.alibaba.com> wrote:
>
> Hi all,
>
> We found that netipmid consumes a lot of CPU when SOL is activated but
> the obmc-console socket has been shut down for some reason (this can be
> reproduced simply by stopping obmc-console with systemctl stop ....).
>
> After the obmc-console socket is closed, the async_wait() in
> startHostConsole() fires continuously, and consoleInputHandler() reads
> empty data (readSize == 0 and readDataLen == 0), yet none of the ec
> error checks is hit!
>
> According to the Boost reference, read_some() behaves as follows:
>
> The function call will block until one or more bytes of data has been
> read successfully, or until an error occurs.
>
> Is this a bug in Boost? Or is something wrong in ipmi-net? And how can
> we make netipmid more robust when the obmc-console socket is shut down?
>

I don't know much about IPMI, but I do have a lot of experience with
Boost and Asio, and that usage looks odd.  The code does a
consoleSocket.async_wait here:
https://github.com/openbmc/phosphor-net-ipmid/blob/12d199b27764496bfff8a45661239b1e509c336f/sol/sol_manager.cpp#L92
and the handler then calls a blocking read_some on the socket.  I would
have expected a single consoleSocket.async_read_some with a given buffer,
which reduces the number of system calls and reads out partial data as
it becomes available.  Whether that would behave differently in this
case, I can't say, but doing things the expected way and letting Asio
handle the details has netted us good results in other applications.
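
Editorial illustration of the shape being suggested, one asynchronous
read into a fixed-size buffer instead of a readiness wait followed by a
separate blocking read. This is a Python asyncio analogue, not Asio code;
drain_until_eof and demo are hypothetical names, and the 4096-byte chunk
size is arbitrary. The read itself reports end-of-file as an empty
result, so the loop has an obvious place to stop re-arming.

```python
import asyncio
import socket


async def drain_until_eof(sock, chunk_size=4096):
    """Collect data with one async read per iteration; stop at end-of-file."""
    loop = asyncio.get_running_loop()
    sock.setblocking(False)  # required by the loop's socket helpers
    received = bytearray()
    while True:
        # Suspends until data arrives or the peer closes; no separate wait step.
        chunk = await loop.sock_recv(sock, chunk_size)
        if not chunk:
            # Empty result means the peer closed: do not issue another read.
            break
        received += chunk
    return bytes(received)


def demo():
    """Send a short message, close the sender, and drain the receiver."""
    ours, peer = socket.socketpair()
    try:
        peer.sendall(b"console output")
        peer.close()
        return asyncio.run(drain_until_eof(ours))
    finally:
        ours.close()
```

Calling demo() returns the bytes that were sent before the peer closed,
and the coroutine then finishes instead of spinning.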

Another interesting thing is the use of std::deque for the console
buffer type here.
https://github.com/openbmc/phosphor-net-ipmid/blob/d4a4bed525f79c39705fa526b20ab663bb2c2069/sol/console_buffer.hpp#L12

I would have expected one of the streaming buffer types, such as
flat_buffer (https://www.boost.org/doc/libs/develop/libs/beast/doc/html/beast/ref/boost__beast__flat_buffer.html)
or multi_buffer
(https://www.boost.org/doc/libs/1_78_0/libs/beast/doc/html/beast/ref/boost__beast__multi_buffer.html),
which are designed for exactly this job: streaming variable-length data
into and out of a pipe.  They can be read into and written out of
directly, without the extra copy.  Additionally, deque<uint8_t> is going
to have a lot of memory overhead compared to a flat buffer type.
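
Editorial illustration of the "read directly into a contiguous buffer"
idea. The class below is a hypothetical Python stand-in loosely modeled
on the prepare/commit/consume pattern of a flat buffer; it is not the
Beast API and not what netipmid does. The capacity of 16 bytes in the
demo is arbitrary.

```python
import socket


class FlatBuffer:
    """Hypothetical fixed-capacity contiguous buffer (prepare/commit/consume shape)."""

    def __init__(self, capacity):
        self._storage = bytearray(capacity)
        self._size = 0  # number of readable bytes at the front of the storage

    def writable(self):
        # View of the unused tail; the socket can write into it with no copy.
        return memoryview(self._storage)[self._size:]

    def commit(self, count):
        # Mark `count` bytes of the writable area as readable.
        self._size += count

    def data(self):
        return bytes(self._storage[:self._size])

    def consume(self, count):
        # Drop `count` bytes from the front and shift the remainder down.
        count = min(count, self._size)
        remaining = self._size - count
        self._storage[:remaining] = self._storage[count:self._size]
        self._size = remaining


def demo():
    """Read from a socket straight into the buffer, then consume a prefix."""
    ours, peer = socket.socketpair()
    try:
        peer.sendall(b"hello")
        buf = FlatBuffer(16)
        buf.commit(ours.recv_into(buf.writable()))
        buf.consume(2)
        return buf.data()
    finally:
        ours.close()
        peer.close()
```

The point of the shape is that received bytes land in their final storage
in one step, where a per-byte container would need a copy from a
temporary buffer.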

Not sure if any of the above is helpful to you or not, but it might
give you some things to try.

> Thanks,
>
> Heyi
>


* Re: netipmid consumes much CPU when obmc-console socket is shutdown
From: Heyi Guo @ 2022-01-14 14:07 UTC
  To: Ed Tanous; +Cc: Vernon Mauery, openbmc, Tom Joseph

Hi Ed,

Thanks for your advice. I'll give it a try later. But I'm still curious why
Boost's read_some() returns 0 bytes of data with no error code,
which clearly seems to contradict the reference.

Thanks,

Heyi


* Re: netipmid consumes much CPU when obmc-console socket is shutdown
From: Ed Tanous @ 2022-01-28 18:33 UTC
  To: Heyi Guo; +Cc: Vernon Mauery, openbmc, Tom Joseph

On Fri, Jan 14, 2022 at 6:08 AM Heyi Guo <guoheyi@linux.alibaba.com> wrote:
>
> Hi Ed,
>
> Thanks for your advice. I'll give it a try later. But I'm still curious why
> Boost's read_some() returns 0 bytes of data with no error code,
> which clearly seems to contradict the reference.

Like I said before, my guess is it's related to the fact that you're
combining an async_wait with a read_some in a way that asio didn't
intend in an evented system.
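
Editorial illustration of why a wait-then-read loop can burn CPU once the
peer is gone, and what stopping on an empty read changes. This is Python
with select, not Asio; count_wakeups is a made-up helper and the
iteration cap only exists to keep the demonstration finite.

```python
import select
import socket


def count_wakeups(max_iterations, stop_on_eof):
    """Count how often a wait-then-read loop wakes up after the peer has closed."""
    ours, peer = socket.socketpair()
    peer.close()
    wakeups = 0
    try:
        for _attempt in range(max_iterations):
            # Wait for readiness; at end-of-file this returns immediately.
            ready, _, _ = select.select([ours], [], [], 0.1)
            if not ready:
                break
            wakeups += 1
            data = ours.recv(64)
            # An empty read is end-of-file. Ignoring it means the next wait
            # fires again right away, which is the busy loop.
            if not data and stop_on_eof:
                break
        return wakeups
    finally:
        ours.close()
```

With stop_on_eof set to False the loop wakes on every iteration up to the
cap; with it set to True the loop wakes once and exits.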


* Re: netipmid consumes much CPU when obmc-console socket is shutdown
From: Heyi Guo @ 2022-02-01  3:06 UTC
  To: Ed Tanous; +Cc: Vernon Mauery, openbmc, Tom Joseph


On 2022/1/29 2:33 AM, Ed Tanous wrote:
> On Fri, Jan 14, 2022 at 6:08 AM Heyi Guo <guoheyi@linux.alibaba.com> wrote:
>> Hi Ed,
>>
>> Thanks for your advice. I'll give it a try later. But I'm still curious why
>> Boost's read_some() returns 0 bytes of data with no error code,
>> which clearly seems to contradict the reference.
> Like I said before, my guess is it's related to the fact that you're
> combining an async_wait with a read_some in a way that asio didn't
> intend in an evented system.

Thanks, I'll give this a try.

Heyi


