All of lore.kernel.org
 help / color / mirror / Atom feed
From: Li Fei1 <fei1.li@intel.com>
To: Jakob Koschel <jakobkoschel@gmail.com>
Cc: linux-kernel@vger.kernel.org, rppt@kernel.org,
	bjohannesmeyer@gmail.com, c.giuffrida@vu.nl, h.j.bos@vu.nl,
	fei1.li@intel.com
Subject: Re: [PATCH] virt: acrn: fix invalid check past list iterator
Date: Fri, 1 Apr 2022 17:05:45 +0800	[thread overview]
Message-ID: <20220401090545.GA32168@louislifei-OptiPlex-7050> (raw)
In-Reply-To: <65660B0C-8888-4283-80E4-D1A334EEF166@gmail.com>

On Fri, Apr 01, 2022 at 10:50:51AM +0200, Jakob Koschel wrote:
> 
> 
> > On 1. Apr 2022, at 09:57, Li Fei1 <fei1.li@intel.com> wrote:
> > 
> > On Fri, Apr 01, 2022 at 09:16:48AM +0200, Jakob Koschel wrote:
> >> 
> >> 
> >>> On 1. Apr 2022, at 05:57, Li Fei1 <fei1.li@intel.com> wrote:
> >>> 
> >>> On Fri, Apr 01, 2022 at 05:22:36AM +0200, Jakob Koschel wrote:
> >>>> 
> >>>>> On 1. Apr 2022, at 03:15, Li Fei1 <fei1.li@intel.com> wrote:
> >>>>> 
> >>>>> On Thu, Mar 31, 2022 at 01:20:50PM +0200, Jakob Koschel wrote:
> >>>>>> 
> >>>>>>> On 30. Mar 2022, at 09:57, Li Fei1 <fei1.li@intel.com> wrote:
> >>>>>>> 
> >>>>>>> On Sat, Mar 19, 2022 at 09:38:19PM +0100, Jakob Koschel wrote:
> >>>>>>>> The condition retry == 0 is theoretically possible even if 'client'
> >>>>>>>> does not point to a valid element because no break was hit.
> >>>>>>>> 
> >>>>>>>> To only execute the dev_warn if actually a break within the loop was
> >>>>>>>> hit, a separate variable is used that is only set if it is ensured to
> >>>>>>>> point to a valid client struct.
> >>>>>>>> 
> >>>>>>> Hi Koschel
> >>>>>>> 
> >>>>>>> Thanks for you to help us to try to improve the code. Maybe you don't get the point.
> >>>>>>> The dev_warn should only been called when has_pending = true && retry == 0
> >>>>>> 
> >>>>>> Maybe I don't understand but looking isolated at this function I could see a way to call
> >>>>>> the dev_warn() with has_pending = false && retry == 0.
> >>>>> Yes, even has_pending = true && retry == 0 at the beginning, we could not make sure
> >>>>> has_pending is true after schedule_timeout_interruptible and we even didn't check
> >>>>> there're other pending client on the ioreq_clients (because we can't make sure
> >>>>> when we dev_warn this clent is still pending). So we just use dev_warn not higher log level.
> >>>> 
> >>>> I'm sorry, I don't quite understand what you mean by that.
> >>>> Do you agree that has_pending = false && retry == 0 is possible when calling the dev_warn()
> >>>> or not?
> >>>> 
> >>> Yes, so what ? It just a hint there may have pending request.
> >> 
> >> if has_pending == false && retry == 0 when calling dev_warn() there are very clear
> >> dependencies met:
> >> 
> >> * has_pending == false means that the list_for_each_entry() macro it *not* exit early.
> >> * since list_for_each_entry() did *not* exit early, client will not hold a valid list entry
> >> * using client->name is not safe and will not point to a valid string (perhaps not even an address)
> > So you'd better to check when the client in ioreq_clients would been destroyed, right ?
> 
> I'm afraid I don't know exactly what you mean here.
> 
> More specifically:
> to check what? and I'm also not sure what you mean with "when client in ioreq_clients would been destroyed".
> 
> Actually thinking about it more the check should be
> 
> 	if (client && retry == 0)
> 
> to be correct.
The client in ioreq_clients would always been "valid" (here valid means the client struct would not
been destroyed) when this function been called. That's guaranteed by the code logic.
> 
> if has_pending == false I read the code as if no client was found that has a pending request 
> so I don't follow how:
> 
> 	"%s cannot flush pending request!\n", client->name);
> 
> can be valid since no client has a pending request.
> 
> > 
> >> 
> >> I'm *only* talking about the case where has_pending == false, in case that's not clear.
> >> 
> >> 
> >>> Even retry == 0 && has_pending = true,
> >>> When we call dev_warn, the clent may not is pending.
> >>> 
> >>> 
> >>>>>> 
> >>>>>>> 		list_for_each_entry(client, &vm->ioreq_clients, list) {
> >>>>>>> 			has_pending = has_pending_request(client);
> >>>>>>> 			if (has_pending)
> >>>>>>> 		}
> >>>>>>> 		spin_unlock_bh(&vm->ioreq_clients_lock);
> >>>>>> 
> >>>>>> imagine has_pending == false && retry == 1 here, then client will not hold a valid list entry.
> >>>>> What do you mean "client will not hold a valid list entry" ? 
> >>>> 
> >>>> Imagine a very simple example:
> >>>> 
> >>>> 	struct acrn_ioreq_client *client;
> >>>> 	list_for_each_entry(client, &vm->ioreq_clients, list) {
> >>>> 		continue;
> >>>> 	}
> >>>> 
> >>>> 	dev_warn(acrn_dev.this_device,
> >>>> 		 "%s cannot flush pending request!\n", client->name); /* NOT GOOD */
> >>>> 
> >>> If there're pending request, we would call schedule to schedule out then schedule back
> >>> to check the list from the beginning. If there's no pending client, we point to the last
> >>> client and break out the while loop.
> >>> 
> >>> The code doesn't want to find the pending client and break out the while loop and call
> >>> dev_warn. Please see the function comment.
> >>> 
> >>> 
> >>>> 
> >>>> Since there is no break for the list_for_each_entry() iterator to return early,
> >>>> client *cannot* be a valid entry of the list. In fact if you look at the list_for_each_entry()
> >>>> macro, it will be an 'bogus' pointer, pointing somewhere into 'vm'.
> >>>> Essentially before the terminating condition of the list traversal the following code is called:
> >>>> 
> >>>> 	list_entry(&vm->ioreq_clients, struct acrn_ioreq_client *, list);
> >>>> 
> >>>> resulting in a:
> >>>> 
> >>>> 	container_of(&vm->ioreq_clients, struct acrn_ioreq_client *, list);
> >>>> 
> >>>> &vm->ioreq_clients however is not contained in a struct acrn_ioreq_client, making
> >>>> this call compute an invalid pointer.
> >>>> Therefore using 'client' as in the example above (e.g. client->name) after the loop is
> >>>> not safe. Since the loop can never return early in the simple example above it will
> >>>> always break. On other cases (the one we are discussing here) there might be a chance that
> >>>> there is one code path (in theory) where the loop did not exit early and 'client'
> >>>> holds that 'invalid entry'.
> >>>> 
> >>>> This would be the case with has_pending = false && retry == 0.
> >>>> 
> >>>> I hope this makes sense.
> >>>> 
> >>>>> 
> >>>>>> 
> >>>>>>> 
> >>>>>>> 		if (has_pending)
> >>>>>>> 			schedule_timeout_interruptible(HZ / 100);
> >>>>>>> 	} while (has_pending && --retry > 0);
> >>>>>> 
> >>>>>> since has_pending && --retry > 0 is no longer true the loop stops.
> >>>>>> 
> >>>>>>> 	if (retry == 0)
> >>>>>>> 		dev_warn(acrn_dev.this_device,
> >>>>>>> 			 "%s cannot flush pending request!\n", client->name);
> >>>>>> client->name is accessed since retry == 0 now, but client is not a valid struct ending up
> >>>>>> in a type confusion.
> >>>>>> 
> >>>>>>> 
> >>>>>>> If retry > 0 and has_pending is true, we would call schedule_timeout_interruptible
> >>>>>>> to schedule out to wait all the pending I/O requests would been completed.
> >>>>>>> 
> >>>>>>> Thanks.
> >>>>>> 
> >>>>>> Again, I'm not sure if this is realistically possible. I'm trying to remove
> >>>>>> any use of the list iterator after the loop to make such potentially issues detectable
> >>>>> You may think we still in the loop (could we ?), even that we didn't write the list iterator then.
> >>>> 
> >>>> I'm not exactly sure which loop you are referring to but no, I don't think we are still in
> >>>> the do while loop.
> >>>> 
> >>>> The only thing we know after the do while loop is that: !has_pending || retry == 0
> >>>> And iff has_pending && retry == 0, then we shouldn't call the dev_warn().
> >>>> 
> >>>>>> at compile time instead of relying on certain (difficult to maintain) conditions to be met
> >>>>>> to avoid the type confusion.
> >>>>>> 
> >>>>>> Thanks,
> >>>>>> Jakob
> >>>> 
> >>>> 	Jakob
> >> 
> >> 	Jakob
> 

  reply	other threads:[~2022-04-01  9:06 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-19 20:38 [PATCH] virt: acrn: fix invalid check past list iterator Jakob Koschel
2022-03-30  7:57 ` Li Fei1
2022-03-31 11:20   ` Jakob Koschel
2022-04-01  1:15     ` Li Fei1
2022-04-01  3:22       ` Jakob Koschel
2022-04-01  3:57         ` Li Fei1
2022-04-01  7:16           ` Jakob Koschel
2022-04-01  7:57             ` Li Fei1
2022-04-01  8:50               ` Jakob Koschel
2022-04-01  9:05                 ` Li Fei1 [this message]
2022-04-01  9:08                   ` Jakob Koschel
2022-04-01  9:12                     ` Li Fei1
2022-04-01  9:24                       ` Jakob Koschel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220401090545.GA32168@louislifei-OptiPlex-7050 \
    --to=fei1.li@intel.com \
    --cc=bjohannesmeyer@gmail.com \
    --cc=c.giuffrida@vu.nl \
    --cc=h.j.bos@vu.nl \
    --cc=jakobkoschel@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rppt@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.