linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] ipmi: fix ipmi memleak and unload bug
@ 2022-10-07  9:26 Zhang Yuchen
  2022-10-07  9:26 ` [PATCH 1/3] ipmi: fix msg stack when IPMI is disconnected Zhang Yuchen
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Zhang Yuchen @ 2022-10-07  9:26 UTC (permalink / raw)
  To: minyard; +Cc: openipmi-developer, linux-kernel, qi.zheng, Zhang Yuchen

Our users have reported a memory leak. The IPMI driver consumes a lot of
memory.

We found this problem because IPMI was slow to release messages after
being disconnected, causing what appeared to be a memory leak.

Fix commit as:
  ipmi: fix msg stack when IPMI is disconnected

Two other issues were solved in the process of solving this problem.

1. Unloading the driver takes a long time.

Fix commit as:
  ipmi: fix long wait in unload when IPMI disconnect

2. Memory leaks during driver unloading.

Fix commit as:
  ipmi: fix memleak when unload ipmi driver


Zhang Yuchen (3):
  ipmi: fix msg stack when IPMI is disconnected
  ipmi: fix long wait in unload when IPMI disconnect
  ipmi: fix memleak when unload ipmi driver

 drivers/char/ipmi/ipmi_kcs_sm.c     | 14 ++++++++++----
 drivers/char/ipmi/ipmi_msghandler.c |  5 ++++-
 drivers/char/ipmi/ipmi_si_intf.c    | 27 +++++++++++++++++++--------
 3 files changed, 33 insertions(+), 13 deletions(-)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/3] ipmi: fix msg stack when IPMI is disconnected
  2022-10-07  9:26 [PATCH 0/3] ipmi: fix ipmi memleak and unload bug Zhang Yuchen
@ 2022-10-07  9:26 ` Zhang Yuchen
  2022-10-07 19:43   ` Corey Minyard
  2022-10-07  9:26 ` [PATCH 2/3] ipmi: fix long wait in unload when IPMI disconnect Zhang Yuchen
  2022-10-07  9:26 ` [PATCH 3/3] ipmi: fix memleak when unload ipmi driver Zhang Yuchen
  2 siblings, 1 reply; 10+ messages in thread
From: Zhang Yuchen @ 2022-10-07  9:26 UTC (permalink / raw)
  To: minyard; +Cc: openipmi-developer, linux-kernel, qi.zheng, Zhang Yuchen

If you continue to access and send messages at a high frequency (once
every 55s) when the IPMI is disconnected, messages will accumulate in
intf->[hp_]xmit_msg. If it lasts long enough, it takes up a lot of
memory.

The reason is that if IPMI is disconnected, each message will be set to
IDLE after it returns to HOSED through IDLE->ERROR0->HOSED. The next
message goes through the same process when it comes in. This process
needs to wait for IBF_TIMEOUT * (MAX_ERROR_RETRIES + 1) = 55s.

Each message takes 55S to destroy. This results in a continuous increase
in memory.

I find that if I wait 5 seconds after the first message fails, the
status changes to ERROR0 in smi_timeout(). The next message will return
the error code IPMI_NOT_IN_MY_STATE_ERR directly without wait.

This is more in line with our needs.

So instead of setting each message state to IDLE after it reaches the
state HOSED, set state to ERROR0.

After testing, the problem has been solved, no matter how many
consecutive sends, will not cause continuous memory growth. It also
returns to normal immediately after the IPMI is restored.

The verification operations are as follows:

1. Use BPF to record the ipmi_alloc/free_smi_msg().

  $ bpftrace -e 'kretprobe:ipmi_alloc_recv_msg {printf("alloc
      %p\n",retval);} kprobe:free_recv_msg {printf("free  %p\n",arg0)}'

2. Exec `date; time for x in $(seq 1 2); do ipmitool mc info; done`.
3. Record the output of `time` and when free all msgs.

Before:

`time` takes 120s, This is because `ipmitool mc info` send 4 msgs and
waits only 15 seconds for each message. Last msg is free after 440s.

  $ bpftrace -e 'kretprobe:ipmi_alloc_recv_msg {printf("alloc
      %p\n",retval);} kprobe:free_recv_msg {printf("free  %p\n",arg0)}'
  Oct 05 11:40:55 Attaching 2 probes...
  Oct 05 11:41:12 alloc 0xffff9558a05f0c00
  Oct 05 11:41:27 alloc 0xffff9558a05f1a00
  Oct 05 11:41:42 alloc 0xffff9558a05f0000
  Oct 05 11:41:57 alloc 0xffff9558a05f1400
  Oct 05 11:42:07 free  0xffff9558a05f0c00
  Oct 05 11:42:07 alloc 0xffff9558a05f7000
  Oct 05 11:42:22 alloc 0xffff9558a05f2a00
  Oct 05 11:42:37 alloc 0xffff9558a05f5a00
  Oct 05 11:42:52 alloc 0xffff9558a05f3a00
  Oct 05 11:43:02 free  0xffff9558a05f1a00
  Oct 05 11:43:57 free  0xffff9558a05f0000
  Oct 05 11:44:52 free  0xffff9558a05f1400
  Oct 05 11:45:47 free  0xffff9558a05f7000
  Oct 05 11:46:42 free  0xffff9558a05f2a00
  Oct 05 11:47:37 free  0xffff9558a05f5a00
  Oct 05 11:48:32 free  0xffff9558a05f3a00

  $ root@dc00-pb003-t106-n078:~# date;time for x in $(seq 1 2); do
  ipmitool mc info; done

  Wed Oct  5 11:41:12 CST 2022
  No data available
  Get Device ID command failed
  No data available
  No data available
  No valid response received
  Get Device ID command failed: Unspecified error
  No data available
  Get Device ID command failed
  No data available
  No data available
  No valid response received
  No data available
  Get Device ID command failed

  real        1m55.052s
  user        0m0.001s
  sys        0m0.001s

After:

`time` takes 55s, all msgs is returned and free after 55s.

  $ bpftrace -e 'kretprobe:ipmi_alloc_recv_msg {printf("alloc
      %p\n",retval);} kprobe:free_recv_msg {printf("free  %p\n",arg0)}'

  Oct 07 16:30:35 Attaching 2 probes...
  Oct 07 16:30:45 alloc 0xffff955943aa9800
  Oct 07 16:31:00 alloc 0xffff955943aacc00
  Oct 07 16:31:15 alloc 0xffff955943aa8c00
  Oct 07 16:31:30 alloc 0xffff955943aaf600
  Oct 07 16:31:40 free  0xffff955943aa9800
  Oct 07 16:31:40 free  0xffff955943aacc00
  Oct 07 16:31:40 free  0xffff955943aa8c00
  Oct 07 16:31:40 free  0xffff955943aaf600
  Oct 07 16:31:40 alloc 0xffff9558ec8f7e00
  Oct 07 16:31:40 free  0xffff9558ec8f7e00
  Oct 07 16:31:40 alloc 0xffff9558ec8f7800
  Oct 07 16:31:40 free  0xffff9558ec8f7800
  Oct 07 16:31:40 alloc 0xffff9558ec8f7e00
  Oct 07 16:31:40 free  0xffff9558ec8f7e00
  Oct 07 16:31:40 alloc 0xffff9558ec8f7800
  Oct 07 16:31:40 free  0xffff9558ec8f7800

  root@dc00-pb003-t106-n078:~# date;time for x in $(seq 1 2); do
  ipmitool mc info; done
  Fri Oct  7 16:30:45 CST 2022
  No data available
  Get Device ID command failed
  No data available
  No data available
  No valid response received
  Get Device ID command failed: Unspecified error
  Get Device ID command failed: 0xd5 Command not supported in present state
  Get Device ID command failed: Command not supported in present state

  real        0m55.038s
  user        0m0.001s
  sys        0m0.001s

Signed-off-by: Zhang Yuchen <zhangyuchen.lcr@bytedance.com>
---
 drivers/char/ipmi/ipmi_kcs_sm.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/char/ipmi/ipmi_kcs_sm.c b/drivers/char/ipmi/ipmi_kcs_sm.c
index efda90dcf5b3..e7f2cd10e5a6 100644
--- a/drivers/char/ipmi/ipmi_kcs_sm.c
+++ b/drivers/char/ipmi/ipmi_kcs_sm.c
@@ -122,10 +122,10 @@ struct si_sm_data {
 	unsigned long  error0_timeout;
 };
 
-static unsigned int init_kcs_data(struct si_sm_data *kcs,
-				  struct si_sm_io *io)
+static unsigned int init_kcs_data_with_state(struct si_sm_data *kcs,
+				  struct si_sm_io *io, enum kcs_states state)
 {
-	kcs->state = KCS_IDLE;
+	kcs->state = state;
 	kcs->io = io;
 	kcs->write_pos = 0;
 	kcs->write_count = 0;
@@ -140,6 +140,12 @@ static unsigned int init_kcs_data(struct si_sm_data *kcs,
 	return 2;
 }
 
+static unsigned int init_kcs_data(struct si_sm_data *kcs,
+				  struct si_sm_io *io)
+{
+	return init_kcs_data_with_state(kcs, io, KCS_IDLE);
+}
+
 static inline unsigned char read_status(struct si_sm_data *kcs)
 {
 	return kcs->io->inputb(kcs->io, 1);
@@ -495,7 +501,7 @@ static enum si_sm_result kcs_event(struct si_sm_data *kcs, long time)
 	}
 
 	if (kcs->state == KCS_HOSED) {
-		init_kcs_data(kcs, kcs->io);
+		init_kcs_data_with_state(kcs, kcs->io, KCS_ERROR0);
 		return SI_SM_HOSED;
 	}
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/3] ipmi: fix long wait in unload when IPMI disconnect
  2022-10-07  9:26 [PATCH 0/3] ipmi: fix ipmi memleak and unload bug Zhang Yuchen
  2022-10-07  9:26 ` [PATCH 1/3] ipmi: fix msg stack when IPMI is disconnected Zhang Yuchen
@ 2022-10-07  9:26 ` Zhang Yuchen
  2022-10-07 19:48   ` Corey Minyard
  2022-10-07  9:26 ` [PATCH 3/3] ipmi: fix memleak when unload ipmi driver Zhang Yuchen
  2 siblings, 1 reply; 10+ messages in thread
From: Zhang Yuchen @ 2022-10-07  9:26 UTC (permalink / raw)
  To: minyard; +Cc: openipmi-developer, linux-kernel, qi.zheng, Zhang Yuchen

When fixing the problem mentioned in PATCH1, we also found
the following problem:

If the IPMI is disconnected and in the sending process, the
uninstallation driver will be stuck for a long time.

The main problem is that uninstalling the driver waits for curr_msg to
be sent or HOSED. After stopping tasklet, the only place to trigger the
timeout mechanism is the circular poll in shutdown_smi.

The poll function delays 10us and calls smi_event_handler(smi_info,10).
Smi_event_handler deducts 10us from kcs->ibf_timeout.

But the poll func is followed by schedule_timeout_uninterruptible(1).
The time consumed here is not counted in kcs->ibf_timeout.

So when 10us is deducted from kcs->ibf_timeout, at least 1 jiffies has
actually passed. The waiting time has increased by more than a
hundredfold.

Now instead of calling poll(). call smi_event_handler() directly and
calculate the elapsed time.

For verification, you can directly use ebpf to check the kcs->
ibf_timeout for each call to kcs_event() when IPMI is disconnected.
Decrement at normal rate before unloading. The decrement rate becomes
very slow after unloading.

  $ bpftrace -e 'kprobe:kcs_event {printf("kcs->ibftimeout : %d\n",
      *(arg0+584));}'

Signed-off-by: Zhang Yuchen <zhangyuchen.lcr@bytedance.com>
---
 drivers/char/ipmi/ipmi_si_intf.c | 27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
index 6e357ad76f2e..abddd7e43a9a 100644
--- a/drivers/char/ipmi/ipmi_si_intf.c
+++ b/drivers/char/ipmi/ipmi_si_intf.c
@@ -2153,6 +2153,20 @@ static int __init init_ipmi_si(void)
 }
 module_init(init_ipmi_si);
 
+static void wait_msg_processed(struct smi_info *smi_info)
+{
+	unsigned long jiffies_now;
+	long time_diff;
+
+	while (smi_info->curr_msg || (smi_info->si_state != SI_NORMAL)) {
+		jiffies_now = jiffies;
+		time_diff = (((long)jiffies_now - (long)smi_info->last_timeout_jiffies)
+		     * SI_USEC_PER_JIFFY);
+		smi_event_handler(smi_info, time_diff);
+		schedule_timeout_uninterruptible(1);
+	}
+}
+
 static void shutdown_smi(void *send_info)
 {
 	struct smi_info *smi_info = send_info;
@@ -2187,16 +2201,13 @@ static void shutdown_smi(void *send_info)
 	 * in the BMC.  Note that timers and CPU interrupts are off,
 	 * so no need for locks.
 	 */
-	while (smi_info->curr_msg || (smi_info->si_state != SI_NORMAL)) {
-		poll(smi_info);
-		schedule_timeout_uninterruptible(1);
-	}
+	wait_msg_processed(smi_info);
+
 	if (smi_info->handlers)
 		disable_si_irq(smi_info);
-	while (smi_info->curr_msg || (smi_info->si_state != SI_NORMAL)) {
-		poll(smi_info);
-		schedule_timeout_uninterruptible(1);
-	}
+
+	wait_msg_processed(smi_info);
+
 	if (smi_info->handlers)
 		smi_info->handlers->cleanup(smi_info->si_sm);
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/3] ipmi: fix memleak when unload ipmi driver
  2022-10-07  9:26 [PATCH 0/3] ipmi: fix ipmi memleak and unload bug Zhang Yuchen
  2022-10-07  9:26 ` [PATCH 1/3] ipmi: fix msg stack when IPMI is disconnected Zhang Yuchen
  2022-10-07  9:26 ` [PATCH 2/3] ipmi: fix long wait in unload when IPMI disconnect Zhang Yuchen
@ 2022-10-07  9:26 ` Zhang Yuchen
  2022-10-07 19:51   ` Corey Minyard
  2 siblings, 1 reply; 10+ messages in thread
From: Zhang Yuchen @ 2022-10-07  9:26 UTC (permalink / raw)
  To: minyard; +Cc: openipmi-developer, linux-kernel, qi.zheng, Zhang Yuchen

After the IPMI disconnect problem, the memory kept rising and we tried
to unload the driver to free the memory. However, only part of the
free memory is recovered after the driver is uninstalled. Using
ebpf to hook free functions, we find that neither ipmi_user nor
ipmi_smi_msg is free, only ipmi_recv_msg is free.

We find that the deliver_smi_err_response call in clean_smi_msgs does
the destroy processing on each message from the xmit_msg queue without
checking the return value and free ipmi_smi_msg.

deliver_smi_err_response is called only at this location. Adding the
free handling has no effect.

To verify, try using ebpf to trace the free function.

  $ bpftrace -e 'kretprobe:ipmi_alloc_recv_msg {printf("alloc rcv
      %p\n",retval);} kprobe:free_recv_msg {printf("free recv %p\n",
      arg0)} kretprobe:ipmi_alloc_smi_msg {printf("alloc smi %p\n",
        retval);} kprobe:free_smi_msg {printf("free smi  %p\n",arg0)}'

Signed-off-by: Zhang Yuchen <zhangyuchen.lcr@bytedance.com>
---
 drivers/char/ipmi/ipmi_msghandler.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
index c8a3b208f923..7a7534046b5b 100644
--- a/drivers/char/ipmi/ipmi_msghandler.c
+++ b/drivers/char/ipmi/ipmi_msghandler.c
@@ -3710,12 +3710,15 @@ static void deliver_smi_err_response(struct ipmi_smi *intf,
 				     struct ipmi_smi_msg *msg,
 				     unsigned char err)
 {
+	int rv;
 	msg->rsp[0] = msg->data[0] | 4;
 	msg->rsp[1] = msg->data[1];
 	msg->rsp[2] = err;
 	msg->rsp_size = 3;
 	/* It's an error, so it will never requeue, no need to check return. */
-	handle_one_recv_msg(intf, msg);
+	rv = handle_one_recv_msg(intf, msg);
+	if (rv == 0)
+		ipmi_free_smi_msg(msg);
 }
 
 static void cleanup_smi_msgs(struct ipmi_smi *intf)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] ipmi: fix msg stack when IPMI is disconnected
  2022-10-07  9:26 ` [PATCH 1/3] ipmi: fix msg stack when IPMI is disconnected Zhang Yuchen
@ 2022-10-07 19:43   ` Corey Minyard
  2022-10-08  1:36     ` Yuchen Zhang
  0 siblings, 1 reply; 10+ messages in thread
From: Corey Minyard @ 2022-10-07 19:43 UTC (permalink / raw)
  To: Zhang Yuchen; +Cc: openipmi-developer, linux-kernel, qi.zheng

On Fri, Oct 07, 2022 at 05:26:15PM +0800, Zhang Yuchen wrote:
> If you continue to access and send messages at a high frequency (once
> every 55s) when the IPMI is disconnected, messages will accumulate in
> intf->[hp_]xmit_msg. If it lasts long enough, it takes up a lot of
> memory.

The IPMI driver really wasn't designed to handle this sort of thing.  If
there is a BMC there, it should be there except when it's rebooting,
which should only take a few seconds.  Which is what this is all
designed to handle.

> 
> The reason is that if IPMI is disconnected, each message will be set to
> IDLE after it returns to HOSED through IDLE->ERROR0->HOSED. The next
> message goes through the same process when it comes in. This process
> needs to wait for IBF_TIMEOUT * (MAX_ERROR_RETRIES + 1) = 55s.
> 
> Each message takes 55S to destroy. This results in a continuous increase
> in memory.
> 
> I find that if I wait 5 seconds after the first message fails, the
> status changes to ERROR0 in smi_timeout(). The next message will return
> the error code IPMI_NOT_IN_MY_STATE_ERR directly without wait.

So basically, you will stay in error state until the BMC recovers.  The
KCS state machine will reject messages until the state machine detects
that the BMC is working again.  I think this is ok.

Have you tested that if the BMC comes back that the driver recovers and
works?  Looking at the code it seems to be the case, but can you test to
be sure, if you have not already?

Also, the following is in start_kcs_transaction():

	if ((kcs->state != KCS_IDLE) && (kcs->state != KCS_HOSED)) {
		dev_warn(kcs->io->dev, "KCS in invalid state %d\n", kcs->state);
		return IPMI_NOT_IN_MY_STATE_ERR;
	}

You probably need to remove the (kcs->state != KCS_HOSED) part of this
now.  Would you agree?

-corey

> 
> This is more in line with our needs.
> 
> So instead of setting each message state to IDLE after it reaches the
> state HOSED, set state to ERROR0.
> 
> After testing, the problem has been solved, no matter how many
> consecutive sends, will not cause continuous memory growth. It also
> returns to normal immediately after the IPMI is restored.
> 
> The verification operations are as follows:
> 
> 1. Use BPF to record the ipmi_alloc/free_smi_msg().
> 
>   $ bpftrace -e 'kretprobe:ipmi_alloc_recv_msg {printf("alloc
>       %p\n",retval);} kprobe:free_recv_msg {printf("free  %p\n",arg0)}'
> 
> 2. Exec `date; time for x in $(seq 1 2); do ipmitool mc info; done`.
> 3. Record the output of `time` and when free all msgs.
> 
> Before:
> 
> `time` takes 120s, This is because `ipmitool mc info` send 4 msgs and
> waits only 15 seconds for each message. Last msg is free after 440s.
> 
>   $ bpftrace -e 'kretprobe:ipmi_alloc_recv_msg {printf("alloc
>       %p\n",retval);} kprobe:free_recv_msg {printf("free  %p\n",arg0)}'
>   Oct 05 11:40:55 Attaching 2 probes...
>   Oct 05 11:41:12 alloc 0xffff9558a05f0c00
>   Oct 05 11:41:27 alloc 0xffff9558a05f1a00
>   Oct 05 11:41:42 alloc 0xffff9558a05f0000
>   Oct 05 11:41:57 alloc 0xffff9558a05f1400
>   Oct 05 11:42:07 free  0xffff9558a05f0c00
>   Oct 05 11:42:07 alloc 0xffff9558a05f7000
>   Oct 05 11:42:22 alloc 0xffff9558a05f2a00
>   Oct 05 11:42:37 alloc 0xffff9558a05f5a00
>   Oct 05 11:42:52 alloc 0xffff9558a05f3a00
>   Oct 05 11:43:02 free  0xffff9558a05f1a00
>   Oct 05 11:43:57 free  0xffff9558a05f0000
>   Oct 05 11:44:52 free  0xffff9558a05f1400
>   Oct 05 11:45:47 free  0xffff9558a05f7000
>   Oct 05 11:46:42 free  0xffff9558a05f2a00
>   Oct 05 11:47:37 free  0xffff9558a05f5a00
>   Oct 05 11:48:32 free  0xffff9558a05f3a00
> 
>   $ root@dc00-pb003-t106-n078:~# date;time for x in $(seq 1 2); do
>   ipmitool mc info; done
> 
>   Wed Oct  5 11:41:12 CST 2022
>   No data available
>   Get Device ID command failed
>   No data available
>   No data available
>   No valid response received
>   Get Device ID command failed: Unspecified error
>   No data available
>   Get Device ID command failed
>   No data available
>   No data available
>   No valid response received
>   No data available
>   Get Device ID command failed
> 
>   real        1m55.052s
>   user        0m0.001s
>   sys        0m0.001s
> 
> After:
> 
> `time` takes 55s, all msgs is returned and free after 55s.
> 
>   $ bpftrace -e 'kretprobe:ipmi_alloc_recv_msg {printf("alloc
>       %p\n",retval);} kprobe:free_recv_msg {printf("free  %p\n",arg0)}'
> 
>   Oct 07 16:30:35 Attaching 2 probes...
>   Oct 07 16:30:45 alloc 0xffff955943aa9800
>   Oct 07 16:31:00 alloc 0xffff955943aacc00
>   Oct 07 16:31:15 alloc 0xffff955943aa8c00
>   Oct 07 16:31:30 alloc 0xffff955943aaf600
>   Oct 07 16:31:40 free  0xffff955943aa9800
>   Oct 07 16:31:40 free  0xffff955943aacc00
>   Oct 07 16:31:40 free  0xffff955943aa8c00
>   Oct 07 16:31:40 free  0xffff955943aaf600
>   Oct 07 16:31:40 alloc 0xffff9558ec8f7e00
>   Oct 07 16:31:40 free  0xffff9558ec8f7e00
>   Oct 07 16:31:40 alloc 0xffff9558ec8f7800
>   Oct 07 16:31:40 free  0xffff9558ec8f7800
>   Oct 07 16:31:40 alloc 0xffff9558ec8f7e00
>   Oct 07 16:31:40 free  0xffff9558ec8f7e00
>   Oct 07 16:31:40 alloc 0xffff9558ec8f7800
>   Oct 07 16:31:40 free  0xffff9558ec8f7800
> 
>   root@dc00-pb003-t106-n078:~# date;time for x in $(seq 1 2); do
>   ipmitool mc info; done
>   Fri Oct  7 16:30:45 CST 2022
>   No data available
>   Get Device ID command failed
>   No data available
>   No data available
>   No valid response received
>   Get Device ID command failed: Unspecified error
>   Get Device ID command failed: 0xd5 Command not supported in present state
>   Get Device ID command failed: Command not supported in present state
> 
>   real        0m55.038s
>   user        0m0.001s
>   sys        0m0.001s
> 
> Signed-off-by: Zhang Yuchen <zhangyuchen.lcr@bytedance.com>
> ---
>  drivers/char/ipmi/ipmi_kcs_sm.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_kcs_sm.c b/drivers/char/ipmi/ipmi_kcs_sm.c
> index efda90dcf5b3..e7f2cd10e5a6 100644
> --- a/drivers/char/ipmi/ipmi_kcs_sm.c
> +++ b/drivers/char/ipmi/ipmi_kcs_sm.c
> @@ -122,10 +122,10 @@ struct si_sm_data {
>  	unsigned long  error0_timeout;
>  };
>  
> -static unsigned int init_kcs_data(struct si_sm_data *kcs,
> -				  struct si_sm_io *io)
> +static unsigned int init_kcs_data_with_state(struct si_sm_data *kcs,
> +				  struct si_sm_io *io, enum kcs_states state)
>  {
> -	kcs->state = KCS_IDLE;
> +	kcs->state = state;
>  	kcs->io = io;
>  	kcs->write_pos = 0;
>  	kcs->write_count = 0;
> @@ -140,6 +140,12 @@ static unsigned int init_kcs_data(struct si_sm_data *kcs,
>  	return 2;
>  }
>  
> +static unsigned int init_kcs_data(struct si_sm_data *kcs,
> +				  struct si_sm_io *io)
> +{
> +	return init_kcs_data_with_state(kcs, io, KCS_IDLE);
> +}
> +
>  static inline unsigned char read_status(struct si_sm_data *kcs)
>  {
>  	return kcs->io->inputb(kcs->io, 1);
> @@ -495,7 +501,7 @@ static enum si_sm_result kcs_event(struct si_sm_data *kcs, long time)
>  	}
>  
>  	if (kcs->state == KCS_HOSED) {
> -		init_kcs_data(kcs, kcs->io);
> +		init_kcs_data_with_state(kcs, kcs->io, KCS_ERROR0);
>  		return SI_SM_HOSED;
>  	}
>  
> -- 
> 2.30.2
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/3] ipmi: fix long wait in unload when IPMI disconnect
  2022-10-07  9:26 ` [PATCH 2/3] ipmi: fix long wait in unload when IPMI disconnect Zhang Yuchen
@ 2022-10-07 19:48   ` Corey Minyard
  0 siblings, 0 replies; 10+ messages in thread
From: Corey Minyard @ 2022-10-07 19:48 UTC (permalink / raw)
  To: Zhang Yuchen; +Cc: openipmi-developer, linux-kernel, qi.zheng

On Fri, Oct 07, 2022 at 05:26:16PM +0800, Zhang Yuchen wrote:
> When fixing the problem mentioned in PATCH1, we also found
> the following problem:
> 
> If the IPMI is disconnected and in the sending process, the
> uninstallation driver will be stuck for a long time.
> 
> The main problem is that uninstalling the driver waits for curr_msg to
> be sent or HOSED. After stopping tasklet, the only place to trigger the
> timeout mechanism is the circular poll in shutdown_smi.
> 
> The poll function delays 10us and calls smi_event_handler(smi_info,10).
> Smi_event_handler deducts 10us from kcs->ibf_timeout.
> 
> But the poll func is followed by schedule_timeout_uninterruptible(1).
> The time consumed here is not counted in kcs->ibf_timeout.
> 
> So when 10us is deducted from kcs->ibf_timeout, at least 1 jiffies has
> actually passed. The waiting time has increased by more than a
> hundredfold.
> 
> Now instead of calling poll(). call smi_event_handler() directly and
> calculate the elapsed time.

Yes, you are correct.

I've included this for 6.2, and added:

  Cc: stable@vger.kernel.org

I would like it to soak for a bit.

-corey

> 
> For verification, you can directly use ebpf to check the kcs->
> ibf_timeout for each call to kcs_event() when IPMI is disconnected.
> Decrement at normal rate before unloading. The decrement rate becomes
> very slow after unloading.
> 
>   $ bpftrace -e 'kprobe:kcs_event {printf("kcs->ibftimeout : %d\n",
>       *(arg0+584));}'
> 
> Signed-off-by: Zhang Yuchen <zhangyuchen.lcr@bytedance.com>
> ---
>  drivers/char/ipmi/ipmi_si_intf.c | 27 +++++++++++++++++++--------
>  1 file changed, 19 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_si_intf.c b/drivers/char/ipmi/ipmi_si_intf.c
> index 6e357ad76f2e..abddd7e43a9a 100644
> --- a/drivers/char/ipmi/ipmi_si_intf.c
> +++ b/drivers/char/ipmi/ipmi_si_intf.c
> @@ -2153,6 +2153,20 @@ static int __init init_ipmi_si(void)
>  }
>  module_init(init_ipmi_si);
>  
> +static void wait_msg_processed(struct smi_info *smi_info)
> +{
> +	unsigned long jiffies_now;
> +	long time_diff;
> +
> +	while (smi_info->curr_msg || (smi_info->si_state != SI_NORMAL)) {
> +		jiffies_now = jiffies;
> +		time_diff = (((long)jiffies_now - (long)smi_info->last_timeout_jiffies)
> +		     * SI_USEC_PER_JIFFY);
> +		smi_event_handler(smi_info, time_diff);
> +		schedule_timeout_uninterruptible(1);
> +	}
> +}
> +
>  static void shutdown_smi(void *send_info)
>  {
>  	struct smi_info *smi_info = send_info;
> @@ -2187,16 +2201,13 @@ static void shutdown_smi(void *send_info)
>  	 * in the BMC.  Note that timers and CPU interrupts are off,
>  	 * so no need for locks.
>  	 */
> -	while (smi_info->curr_msg || (smi_info->si_state != SI_NORMAL)) {
> -		poll(smi_info);
> -		schedule_timeout_uninterruptible(1);
> -	}
> +	wait_msg_processed(smi_info);
> +
>  	if (smi_info->handlers)
>  		disable_si_irq(smi_info);
> -	while (smi_info->curr_msg || (smi_info->si_state != SI_NORMAL)) {
> -		poll(smi_info);
> -		schedule_timeout_uninterruptible(1);
> -	}
> +
> +	wait_msg_processed(smi_info);
> +
>  	if (smi_info->handlers)
>  		smi_info->handlers->cleanup(smi_info->si_sm);
>  
> -- 
> 2.30.2
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 3/3] ipmi: fix memleak when unload ipmi driver
  2022-10-07  9:26 ` [PATCH 3/3] ipmi: fix memleak when unload ipmi driver Zhang Yuchen
@ 2022-10-07 19:51   ` Corey Minyard
  0 siblings, 0 replies; 10+ messages in thread
From: Corey Minyard @ 2022-10-07 19:51 UTC (permalink / raw)
  To: Zhang Yuchen; +Cc: openipmi-developer, linux-kernel, qi.zheng

On Fri, Oct 07, 2022 at 05:26:17PM +0800, Zhang Yuchen wrote:
> After the IPMI disconnect problem, the memory kept rising and we tried
> to unload the driver to free the memory. However, only part of the
> free memory is recovered after the driver is uninstalled. Using
> ebpf to hook free functions, we find that neither ipmi_user nor
> ipmi_smi_msg is free, only ipmi_recv_msg is free.
> 
> We find that the deliver_smi_err_response call in clean_smi_msgs does
> the destroy processing on each message from the xmit_msg queue without
> checking the return value and free ipmi_smi_msg.
> 
> deliver_smi_err_response is called only at this location. Adding the
> free handling has no effect.
> 
> To verify, try using ebpf to trace the free function.
> 
>   $ bpftrace -e 'kretprobe:ipmi_alloc_recv_msg {printf("alloc rcv
>       %p\n",retval);} kprobe:free_recv_msg {printf("free recv %p\n",
>       arg0)} kretprobe:ipmi_alloc_smi_msg {printf("alloc smi %p\n",
>         retval);} kprobe:free_smi_msg {printf("free smi  %p\n",arg0)}'
> 
> Signed-off-by: Zhang Yuchen <zhangyuchen.lcr@bytedance.com>
> ---
>  drivers/char/ipmi/ipmi_msghandler.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/char/ipmi/ipmi_msghandler.c b/drivers/char/ipmi/ipmi_msghandler.c
> index c8a3b208f923..7a7534046b5b 100644
> --- a/drivers/char/ipmi/ipmi_msghandler.c
> +++ b/drivers/char/ipmi/ipmi_msghandler.c
> @@ -3710,12 +3710,15 @@ static void deliver_smi_err_response(struct ipmi_smi *intf,
>  				     struct ipmi_smi_msg *msg,
>  				     unsigned char err)
>  {
> +	int rv;
>  	msg->rsp[0] = msg->data[0] | 4;
>  	msg->rsp[1] = msg->data[1];
>  	msg->rsp[2] = err;
>  	msg->rsp_size = 3;
>  	/* It's an error, so it will never requeue, no need to check return. */

The above comment is wrong, but yes, this is correct.  I'll queue this
and remove the comment.

Thanks,

-corey

> -	handle_one_recv_msg(intf, msg);
> +	rv = handle_one_recv_msg(intf, msg);
> +	if (rv == 0)
> +		ipmi_free_smi_msg(msg);
>  }
>  
>  static void cleanup_smi_msgs(struct ipmi_smi *intf)
> -- 
> 2.30.2
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] ipmi: fix msg stack when IPMI is disconnected
  2022-10-07 19:43   ` Corey Minyard
@ 2022-10-08  1:36     ` Yuchen Zhang
  2022-10-08 11:49       ` Corey Minyard
  0 siblings, 1 reply; 10+ messages in thread
From: Yuchen Zhang @ 2022-10-08  1:36 UTC (permalink / raw)
  To: minyard; +Cc: openipmi-developer, linux-kernel, qi.zheng

Thanks for your reply.

On 10/8/22 3:43 AM, Corey Minyard wrote:
> On Fri, Oct 07, 2022 at 05:26:15PM +0800, Zhang Yuchen wrote:
>> If you continue to access and send messages at a high frequency (once
>> every 55s) when the IPMI is disconnected, messages will accumulate in
>> intf->[hp_]xmit_msg. If it lasts long enough, it takes up a lot of
>> memory.
> 
> The IPMI driver really wasn't designed to handle this sort of thing.  If
> there is a BMC there, it should be there except when it's rebooting,
> which should only take a few seconds.  Which is what this is all
> designed to handle.
> 
I think no matter what machine it is, there is always a chance that it
will make mistakes. IPMI drivers should consider this.
>>
>> The reason is that if IPMI is disconnected, each message will be set to
>> IDLE after it returns to HOSED through IDLE->ERROR0->HOSED. The next
>> message goes through the same process when it comes in. This process
>> needs to wait for IBF_TIMEOUT * (MAX_ERROR_RETRIES + 1) = 55s.
>>
>> Each message takes 55S to destroy. This results in a continuous increase
>> in memory.
>>
>> I find that if I wait 5 seconds after the first message fails, the
>> status changes to ERROR0 in smi_timeout(). The next message will return
>> the error code IPMI_NOT_IN_MY_STATE_ERR directly without wait.
> 
> So basically, you will stay in error state until the BMC recovers.  The
> KCS state machine will reject messages until the state machine detects
> that the BMC is working again.  I think this is ok.
> 
> Have you tested that if the BMC comes back that the driver recovers and
> works?  Looking at the code it seems to be the case, but can you test to
> be sure, if you have not already?
> 
I have tested this case. IPMI will be restored immediately when BMC is
restored.
> Also, the following is in start_kcs_transaction():
> 
> 	if ((kcs->state != KCS_IDLE) && (kcs->state != KCS_HOSED)) {
> 		dev_warn(kcs->io->dev, "KCS in invalid state %d\n", kcs->state);
> 		return IPMI_NOT_IN_MY_STATE_ERR;
> 	}
> 
> You probably need to remove the (kcs->state != KCS_HOSED) part of this
> now.  Would you agree?
> 
> -corey
> 
I agree. KCS_HOSED state should be an invalid state.
>>
>> This is more in line with our needs.
>>
>> So instead of setting each message state to IDLE after it reaches the
>> state HOSED, set state to ERROR0.
>>
>> After testing, the problem has been solved, no matter how many
>> consecutive sends, will not cause continuous memory growth. It also
>> returns to normal immediately after the IPMI is restored.
>>
>> The verification operations are as follows:
>>
>> 1. Use BPF to record the ipmi_alloc/free_smi_msg().
>>
>>    $ bpftrace -e 'kretprobe:ipmi_alloc_recv_msg {printf("alloc
>>        %p\n",retval);} kprobe:free_recv_msg {printf("free  %p\n",arg0)}'
>>
>> 2. Exec `date; time for x in $(seq 1 2); do ipmitool mc info; done`.
>> 3. Record the output of `time` and when free all msgs.
>>
>> Before:
>>
>> `time` takes 120s, This is because `ipmitool mc info` send 4 msgs and
>> waits only 15 seconds for each message. Last msg is free after 440s.
>>
>>    $ bpftrace -e 'kretprobe:ipmi_alloc_recv_msg {printf("alloc
>>        %p\n",retval);} kprobe:free_recv_msg {printf("free  %p\n",arg0)}'
>>    Oct 05 11:40:55 Attaching 2 probes...
>>    Oct 05 11:41:12 alloc 0xffff9558a05f0c00
>>    Oct 05 11:41:27 alloc 0xffff9558a05f1a00
>>    Oct 05 11:41:42 alloc 0xffff9558a05f0000
>>    Oct 05 11:41:57 alloc 0xffff9558a05f1400
>>    Oct 05 11:42:07 free  0xffff9558a05f0c00
>>    Oct 05 11:42:07 alloc 0xffff9558a05f7000
>>    Oct 05 11:42:22 alloc 0xffff9558a05f2a00
>>    Oct 05 11:42:37 alloc 0xffff9558a05f5a00
>>    Oct 05 11:42:52 alloc 0xffff9558a05f3a00
>>    Oct 05 11:43:02 free  0xffff9558a05f1a00
>>    Oct 05 11:43:57 free  0xffff9558a05f0000
>>    Oct 05 11:44:52 free  0xffff9558a05f1400
>>    Oct 05 11:45:47 free  0xffff9558a05f7000
>>    Oct 05 11:46:42 free  0xffff9558a05f2a00
>>    Oct 05 11:47:37 free  0xffff9558a05f5a00
>>    Oct 05 11:48:32 free  0xffff9558a05f3a00
>>
>>    $ root@dc00-pb003-t106-n078:~# date;time for x in $(seq 1 2); do
>>    ipmitool mc info; done
>>
>>    Wed Oct  5 11:41:12 CST 2022
>>    No data available
>>    Get Device ID command failed
>>    No data available
>>    No data available
>>    No valid response received
>>    Get Device ID command failed: Unspecified error
>>    No data available
>>    Get Device ID command failed
>>    No data available
>>    No data available
>>    No valid response received
>>    No data available
>>    Get Device ID command failed
>>
>>    real        1m55.052s
>>    user        0m0.001s
>>    sys        0m0.001s
>>
>> After:
>>
>> `time` takes 55s, all msgs is returned and free after 55s.
>>
>>    $ bpftrace -e 'kretprobe:ipmi_alloc_recv_msg {printf("alloc
>>        %p\n",retval);} kprobe:free_recv_msg {printf("free  %p\n",arg0)}'
>>
>>    Oct 07 16:30:35 Attaching 2 probes...
>>    Oct 07 16:30:45 alloc 0xffff955943aa9800
>>    Oct 07 16:31:00 alloc 0xffff955943aacc00
>>    Oct 07 16:31:15 alloc 0xffff955943aa8c00
>>    Oct 07 16:31:30 alloc 0xffff955943aaf600
>>    Oct 07 16:31:40 free  0xffff955943aa9800
>>    Oct 07 16:31:40 free  0xffff955943aacc00
>>    Oct 07 16:31:40 free  0xffff955943aa8c00
>>    Oct 07 16:31:40 free  0xffff955943aaf600
>>    Oct 07 16:31:40 alloc 0xffff9558ec8f7e00
>>    Oct 07 16:31:40 free  0xffff9558ec8f7e00
>>    Oct 07 16:31:40 alloc 0xffff9558ec8f7800
>>    Oct 07 16:31:40 free  0xffff9558ec8f7800
>>    Oct 07 16:31:40 alloc 0xffff9558ec8f7e00
>>    Oct 07 16:31:40 free  0xffff9558ec8f7e00
>>    Oct 07 16:31:40 alloc 0xffff9558ec8f7800
>>    Oct 07 16:31:40 free  0xffff9558ec8f7800
>>
>>    root@dc00-pb003-t106-n078:~# date;time for x in $(seq 1 2); do
>>    ipmitool mc info; done
>>    Fri Oct  7 16:30:45 CST 2022
>>    No data available
>>    Get Device ID command failed
>>    No data available
>>    No data available
>>    No valid response received
>>    Get Device ID command failed: Unspecified error
>>    Get Device ID command failed: 0xd5 Command not supported in present state
>>    Get Device ID command failed: Command not supported in present state
>>
>>    real        0m55.038s
>>    user        0m0.001s
>>    sys        0m0.001s
>>
>> Signed-off-by: Zhang Yuchen <zhangyuchen.lcr@bytedance.com>
>> ---
>>   drivers/char/ipmi/ipmi_kcs_sm.c | 14 ++++++++++----
>>   1 file changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/char/ipmi/ipmi_kcs_sm.c b/drivers/char/ipmi/ipmi_kcs_sm.c
>> index efda90dcf5b3..e7f2cd10e5a6 100644
>> --- a/drivers/char/ipmi/ipmi_kcs_sm.c
>> +++ b/drivers/char/ipmi/ipmi_kcs_sm.c
>> @@ -122,10 +122,10 @@ struct si_sm_data {
>>   	unsigned long  error0_timeout;
>>   };
>>   
>> -static unsigned int init_kcs_data(struct si_sm_data *kcs,
>> -				  struct si_sm_io *io)
>> +static unsigned int init_kcs_data_with_state(struct si_sm_data *kcs,
>> +				  struct si_sm_io *io, enum kcs_states state)
>>   {
>> -	kcs->state = KCS_IDLE;
>> +	kcs->state = state;
>>   	kcs->io = io;
>>   	kcs->write_pos = 0;
>>   	kcs->write_count = 0;
>> @@ -140,6 +140,12 @@ static unsigned int init_kcs_data(struct si_sm_data *kcs,
>>   	return 2;
>>   }
>>   
>> +static unsigned int init_kcs_data(struct si_sm_data *kcs,
>> +				  struct si_sm_io *io)
>> +{
>> +	return init_kcs_data_with_state(kcs, io, KCS_IDLE);
>> +}
>> +
>>   static inline unsigned char read_status(struct si_sm_data *kcs)
>>   {
>>   	return kcs->io->inputb(kcs->io, 1);
>> @@ -495,7 +501,7 @@ static enum si_sm_result kcs_event(struct si_sm_data *kcs, long time)
>>   	}
>>   
>>   	if (kcs->state == KCS_HOSED) {
>> -		init_kcs_data(kcs, kcs->io);
>> +		init_kcs_data_with_state(kcs, kcs->io, KCS_ERROR0);
>>   		return SI_SM_HOSED;
>>   	}
>>   
>> -- 
>> 2.30.2
>>

-- 
Thanks,
Zhang Yuchen

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] ipmi: fix msg stack when IPMI is disconnected
  2022-10-08  1:36     ` Yuchen Zhang
@ 2022-10-08 11:49       ` Corey Minyard
  2022-10-09  0:53         ` Yuchen Zhang
  0 siblings, 1 reply; 10+ messages in thread
From: Corey Minyard @ 2022-10-08 11:49 UTC (permalink / raw)
  To: Yuchen Zhang; +Cc: openipmi-developer, linux-kernel, qi.zheng

On Sat, Oct 08, 2022 at 09:36:16AM +0800, Yuchen Zhang wrote:
> > Also, the following is in start_kcs_transaction():
> > 
> > 	if ((kcs->state != KCS_IDLE) && (kcs->state != KCS_HOSED)) {
> > 		dev_warn(kcs->io->dev, "KCS in invalid state %d\n", kcs->state);
> > 		return IPMI_NOT_IN_MY_STATE_ERR;
> > 	}
> > 
> > You probably need to remove the (kcs->state != KCS_HOSED) part of this
> > now.  Would you agree?
> > 
> > -corey
> > 
> I agree. KCS_HOSED state should be an invalid state.

Can you make this change, run a quick test, and re-submit this one
patch?  With that, I can include this.

Thanks,

-corey

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 1/3] ipmi: fix msg stack when IPMI is disconnected
  2022-10-08 11:49       ` Corey Minyard
@ 2022-10-09  0:53         ` Yuchen Zhang
  0 siblings, 0 replies; 10+ messages in thread
From: Yuchen Zhang @ 2022-10-09  0:53 UTC (permalink / raw)
  To: minyard; +Cc: openipmi-developer, linux-kernel, qi.zheng

Ok, I will send v2 today.

On 10/8/22 7:49 PM, Corey Minyard wrote:
> On Sat, Oct 08, 2022 at 09:36:16AM +0800, Yuchen Zhang wrote:
>>> Also, the following is in start_kcs_transaction():
>>>
>>> 	if ((kcs->state != KCS_IDLE) && (kcs->state != KCS_HOSED)) {
>>> 		dev_warn(kcs->io->dev, "KCS in invalid state %d\n", kcs->state);
>>> 		return IPMI_NOT_IN_MY_STATE_ERR;
>>> 	}
>>>
>>> You probably need to remove the (kcs->state != KCS_HOSED) part of this
>>> now.  Would you agree?
>>>
>>> -corey
>>>
>> I agree. KCS_HOSED state should be an invalid state.
> 
> Can you make this change, run a quick test, and re-submit this one
> patch?  With that, I can include this.
> 
> Thanks,
> 
> -corey

--
Thanks,
Zhang Yuchen

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-10-09  0:53 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-10-07  9:26 [PATCH 0/3] ipmi: fix ipmi memleak and unload bug Zhang Yuchen
2022-10-07  9:26 ` [PATCH 1/3] ipmi: fix msg stack when IPMI is disconnected Zhang Yuchen
2022-10-07 19:43   ` Corey Minyard
2022-10-08  1:36     ` Yuchen Zhang
2022-10-08 11:49       ` Corey Minyard
2022-10-09  0:53         ` Yuchen Zhang
2022-10-07  9:26 ` [PATCH 2/3] ipmi: fix long wait in unload when IPMI disconnect Zhang Yuchen
2022-10-07 19:48   ` Corey Minyard
2022-10-07  9:26 ` [PATCH 3/3] ipmi: fix memleak when unload ipmi driver Zhang Yuchen
2022-10-07 19:51   ` Corey Minyard

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).