All of lore.kernel.org
 help / color / mirror / Atom feed
From: Luis Chamberlain <mcgrof@kernel.org>
To: Bart Van Assche <bvanassche@acm.org>
Cc: osandov@fb.com, linux-block@vger.kernel.org
Subject: Re: [PATCH v2] blktests: replace module removal with patient module removal
Date: Mon, 18 Apr 2022 08:57:07 -0700	[thread overview]
Message-ID: <Yl2KU6vLxawrIXi/@bombadil.infradead.org> (raw)
In-Reply-To: <c584cf40-2181-2617-92aa-bcdbc56a5ab8@acm.org>

On Fri, Apr 15, 2022 at 09:01:03PM -0700, Bart Van Assche wrote:
> On 4/15/22 18:49, Luis Chamberlain wrote:
> > -	if ! modprobe -r null_blk || ! modprobe null_blk "$@" "${zoned}" ; then
> > -		return 1
> > -	fi
> > +	_patient_rmmod null_blk || return 1
> > +	modprobe null_blk "$@" "${zoned}" || 1
> 
> "1" is not a valid command. Should "|| 1" perhaps be changed into "|| return
> 1"?

That was a typo, fixed.

> > +_has_modprobe_patient()
> > +{
> > +	modprobe --help >& /dev/null || return 1
> > +	modprobe --help | grep -q -1 "remove-patiently" || return 1
> > +	return 0
> > +}
> 
> I can't find the meaning of "-1" in the grep man page. Did I perhaps
> overlook something?

That's shorthand for -C 1, but we can remove it as it is not needed.

> > +# checks the refcount and returns 0 if we can safely remove the module. rmmod
> > +# does this check for us, but we can use this to also iterate checking for this
> > +# refcount before we even try to remove the module. This is useful when using
> > +# debug test modules which take a while to quiesce.
> > +_patient_rmmod_check_refcnt()
> > +{
> > +	local module=$1
> > +	local refcnt=0
> > +
> > +	if [[ -f "/sys/module/$module/refcnt" ]]; then
> > +		refcnt=$(cat "/sys/module/$module/refcnt" 2>/dev/null)
> > +		if [[ $? -ne 0 || $refcnt -eq 0 ]]; then
> > +			return 0
> > +		fi
> > +		return 1
> > +	fi
> > +	return 0
> > +}
> 
> Hmm ... why is the check for existence of the refcnt separate from reading
> the refcnt? I think that just reading the refcnt should be sufficient.
> Additionally, that will avoid the race where the module is unloaded after
> the check and before the refcnt is read.

We can certainly simplify it as follows:

_patient_rmmod_check_refcnt()
{
	local module=$1
	local refcnt=0

	refcnt=$(cat "/sys/module/$module/refcnt" 2>/dev/null)
	if [[ $? -ne 0 || $refcnt -eq 0 ]]; then
		return 0
	fi
	return 1
}

> > -	modprobe -r nvme-"${nvme_trtype}" 2>/dev/null
> > -	if [[ "${nvme_trtype}" != "loop" ]]; then
> > -		modprobe -r nvmet-"${nvme_trtype}" 2>/dev/null
> > -	fi
> > -	modprobe -r nvmet 2>/dev/null
> > +	if [[ "${nvme_trtype}" == "loop" ]]; then
> > +		_patient_rmmod nvme_"${nvme_trtype}"
> > +        else
> > +                _patient_rmmod nvme-"${nvme_trtype}"
> > +                _patient_rmmod nvmet-"${nvme_trtype}"
> > +        fi
> > +	_patient_rmmod nvmet 2>/dev/null
> 
> The statement _patient_rmmod nvme-"${nvme_trtype}" occurs twice in the above
> code. How about preserving the structure of the existing code such that that
> statement only occurs once?

There is one call for nvme-"${nvme_trtype}", the other is for the
underscore version, so there are no two calls.

Did I miss something?

> >   # Unload the SRP initiator driver.
> >   stop_srp_ini() {
> > -	local i
> > -
> >   	log_out
> > -	for ((i=40;i>=0;i--)); do
> > -		remove_mpath_devs || return $?
> > -		unload_module ib_srp >/dev/null 2>&1 && break
> > -		sleep 1
> > -	done
> > -	if [ -e /sys/module/ib_srp ]; then
> > -		echo "Error: unloading kernel module ib_srp failed"
> > -		return 1
> > -	fi
> > -	unload_module scsi_transport_srp || return $?
> > +	remove_mpath_devs || return $?
> > +	_patient_rmmod ib_srp || return 1
> > +	_patient_rmmod scsi_transport_srp || return $?
> >   }
> 
> Removing the loop from around remove_mpath_devs is wrong. It is important
> that that loop is preserved.

Why ? Can you test and verify?

I can explain why I'm removing it. Code which typically used to try to
insist on a module removal were just running into situations where the
refcnt got bumped but they could not explain why, and the reason is
that module refcnt's are finicky. *Anything* in userspace can easily
trigger this, and this is module specific. While doing a module removal,
if you are running into this, the only thing you can do is to patiently
wait until userspace is done with whatever it may try to do. The old
re-try attempts are buggy because of this.

A future mechansim for kmod will be to allow userspace to try to remove
first the modules which hold a refcnt, but that does not do anything for
whatever userspace might do. These entry points from userspace are
completely module specific and cannot be generalized (nor do I think we
want to add semantics to the kernelf or it).

So the only thing one can sensibly do, specially in test frameworks, is
to wait and use a timeout for it.

  Luis

  reply	other threads:[~2022-04-18 15:57 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-16  1:49 [PATCH v2] blktests: replace module removal with patient module removal Luis Chamberlain
2022-04-16  4:01 ` Bart Van Assche
2022-04-18 15:57   ` Luis Chamberlain [this message]
2022-04-18 16:39     ` Bart Van Assche
2022-05-11 17:44       ` Luis Chamberlain
2022-05-11 18:22         ` Bart Van Assche
2022-12-20 21:28           ` Luis Chamberlain

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Yl2KU6vLxawrIXi/@bombadil.infradead.org \
    --to=mcgrof@kernel.org \
    --cc=bvanassche@acm.org \
    --cc=linux-block@vger.kernel.org \
    --cc=osandov@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.