* [PATCH] kobject: record and send PREV_SEQNUM with uevents @ 2017-10-16 12:35 Peter Rajnoha 2017-10-16 13:55 ` Greg KH 0 siblings, 1 reply; 8+ messages in thread From: Peter Rajnoha @ 2017-10-16 12:35 UTC (permalink / raw) To: gregkh; +Cc: linux-kernel, Peter Rajnoha Record last uevent seqnum that was used with kobject's uevent and send it with next uevent as PREV_SEQNUM=<num> variable. This enhances the way we are able to track and handle uevents in userspace if we are trying to catch up with all the uevents that were generated while we were not listening or the uevents which were not delivered. This may happen if the userspace listener is not running yet when the uevent is triggered or there's a period of time when it's not listening (e.g. because the userspace listener is being restarted while a new uevent fires). A userspace listener can compare the last uevent seqnum it knows about with the last uevent seqnum the kernel reports through uevents delivered to userspace, both genuine and synthetic ones (the ones generated by writing uevent action name to /sys/.../uevent file). Then, if needed, userspace may execute appropriate actions based on that. We don't need to reevaluate and recheck all items, instead, we can concentrate only on items for which we really and actually need to reflect changed state in kernel while the userspace listener was not listening for uevents. Signed-off-by: Peter Rajnoha <prajnoha@redhat.com> --- include/linux/kobject.h | 1 + lib/kobject.c | 1 + lib/kobject_uevent.c | 15 +++++++++++++-- 3 files changed, 15 insertions(+), 2 deletions(-) diff --git a/include/linux/kobject.h b/include/linux/kobject.h index e0a6205caa71..ee6db28b6186 100644 --- a/include/linux/kobject.h +++ b/include/linux/kobject.h @@ -73,6 +73,7 @@ struct kobject { #ifdef CONFIG_DEBUG_KOBJECT_RELEASE struct delayed_work release; #endif + u64 last_uevent_seqnum; unsigned int state_initialized:1; unsigned int state_in_sysfs:1; unsigned int state_add_uevent_sent:1; diff --git a/lib/kobject.c b/lib/kobject.c index 763d70a18941..2cc809f10ae7 100644 --- a/lib/kobject.c +++ b/lib/kobject.c @@ -190,6 +190,7 @@ static void kobject_init_internal(struct kobject *kobj) return; kref_init(&kobj->kref); INIT_LIST_HEAD(&kobj->entry); + kobj->last_uevent_seqnum = 0; kobj->state_in_sysfs = 0; kobj->state_add_uevent_sent = 0; kobj->state_remove_uevent_sent = 0; diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c index f237a09a5862..f1cbed06a395 100644 --- a/lib/kobject_uevent.c +++ b/lib/kobject_uevent.c @@ -454,8 +454,19 @@ int kobject_uevent_env(struct kobject *kobj, enum kobject_action action, } mutex_lock(&uevent_sock_mutex); - /* we will send an event, so request a new sequence number */ - retval = add_uevent_var(env, "SEQNUM=%llu", (unsigned long long)++uevent_seqnum); + /* + * We will send an event, so request a new sequence number. + * Also, record the number for next uevent's PREV_SEQNUM. + */ + kobj->last_uevent_seqnum = ++uevent_seqnum; + retval = add_uevent_var(env, "SEQNUM=%llu", + (unsigned long long)uevent_seqnum); + if (retval) { + mutext_unlock(&uevent_sock_mutex); + goto exit; + } + retval = add_uevent_var(env, "PREV_SEQNUM=%llu", + (unsigned long long)kobj->last_uevent_seqnum); if (retval) { mutex_unlock(&uevent_sock_mutex); goto exit; -- 2.13.6 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] kobject: record and send PREV_SEQNUM with uevents 2017-10-16 12:35 [PATCH] kobject: record and send PREV_SEQNUM with uevents Peter Rajnoha @ 2017-10-16 13:55 ` Greg KH 2017-10-16 14:00 ` Peter Rajnoha 0 siblings, 1 reply; 8+ messages in thread From: Greg KH @ 2017-10-16 13:55 UTC (permalink / raw) To: Peter Rajnoha; +Cc: linux-kernel On Mon, Oct 16, 2017 at 02:35:44PM +0200, Peter Rajnoha wrote: > Record last uevent seqnum that was used with kobject's uevent and send > it with next uevent as PREV_SEQNUM=<num> variable. > > This enhances the way we are able to track and handle uevents in userspace > if we are trying to catch up with all the uevents that were generated > while we were not listening or the uevents which were not delivered. > This may happen if the userspace listener is not running yet when the > uevent is triggered or there's a period of time when it's not listening > (e.g. because the userspace listener is being restarted while a new > uevent fires). > > A userspace listener can compare the last uevent seqnum it knows about > with the last uevent seqnum the kernel reports through uevents delivered > to userspace, both genuine and synthetic ones (the ones generated by > writing uevent action name to /sys/.../uevent file). Then, if needed, > userspace may execute appropriate actions based on that. We don't need > to reevaluate and recheck all items, instead, we can concentrate only > on items for which we really and actually need to reflect changed state > in kernel while the userspace listener was not listening for uevents. > > Signed-off-by: Peter Rajnoha <prajnoha@redhat.com> > --- > include/linux/kobject.h | 1 + > lib/kobject.c | 1 + > lib/kobject_uevent.c | 15 +++++++++++++-- > 3 files changed, 15 insertions(+), 2 deletions(-) > > diff --git a/include/linux/kobject.h b/include/linux/kobject.h > index e0a6205caa71..ee6db28b6186 100644 > --- a/include/linux/kobject.h > +++ b/include/linux/kobject.h > @@ -73,6 +73,7 @@ struct kobject { > #ifdef CONFIG_DEBUG_KOBJECT_RELEASE > struct delayed_work release; > #endif > + u64 last_uevent_seqnum; > unsigned int state_initialized:1; > unsigned int state_in_sysfs:1; > unsigned int state_add_uevent_sent:1; > diff --git a/lib/kobject.c b/lib/kobject.c > index 763d70a18941..2cc809f10ae7 100644 > --- a/lib/kobject.c > +++ b/lib/kobject.c > @@ -190,6 +190,7 @@ static void kobject_init_internal(struct kobject *kobj) > return; > kref_init(&kobj->kref); > INIT_LIST_HEAD(&kobj->entry); > + kobj->last_uevent_seqnum = 0; > kobj->state_in_sysfs = 0; > kobj->state_add_uevent_sent = 0; > kobj->state_remove_uevent_sent = 0; > diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c > index f237a09a5862..f1cbed06a395 100644 > --- a/lib/kobject_uevent.c > +++ b/lib/kobject_uevent.c > @@ -454,8 +454,19 @@ int kobject_uevent_env(struct kobject *kobj, enum kobject_action action, > } > > mutex_lock(&uevent_sock_mutex); > - /* we will send an event, so request a new sequence number */ > - retval = add_uevent_var(env, "SEQNUM=%llu", (unsigned long long)++uevent_seqnum); > + /* > + * We will send an event, so request a new sequence number. > + * Also, record the number for next uevent's PREV_SEQNUM. > + */ > + kobj->last_uevent_seqnum = ++uevent_seqnum; > + retval = add_uevent_var(env, "SEQNUM=%llu", > + (unsigned long long)uevent_seqnum); > + if (retval) { > + mutext_unlock(&uevent_sock_mutex); > + goto exit; > + } > + retval = add_uevent_var(env, "PREV_SEQNUM=%llu", > + (unsigned long long)kobj->last_uevent_seqnum); I'm confused, why are we storing "last_uevent_seqnum" here at all, if it is only just the current one-1? What else do you do with that value? thanks, greg k-h ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] kobject: record and send PREV_SEQNUM with uevents 2017-10-16 13:55 ` Greg KH @ 2017-10-16 14:00 ` Peter Rajnoha 2017-10-16 14:49 ` Greg KH 0 siblings, 1 reply; 8+ messages in thread From: Peter Rajnoha @ 2017-10-16 14:00 UTC (permalink / raw) To: Greg KH; +Cc: linux-kernel On 10/16/2017 03:55 PM, Greg KH wrote: > On Mon, Oct 16, 2017 at 02:35:44PM +0200, Peter Rajnoha wrote: >> Record last uevent seqnum that was used with kobject's uevent and send >> it with next uevent as PREV_SEQNUM=<num> variable. >> >> This enhances the way we are able to track and handle uevents in userspace >> if we are trying to catch up with all the uevents that were generated >> while we were not listening or the uevents which were not delivered. >> This may happen if the userspace listener is not running yet when the >> uevent is triggered or there's a period of time when it's not listening >> (e.g. because the userspace listener is being restarted while a new >> uevent fires). >> >> A userspace listener can compare the last uevent seqnum it knows about >> with the last uevent seqnum the kernel reports through uevents delivered >> to userspace, both genuine and synthetic ones (the ones generated by >> writing uevent action name to /sys/.../uevent file). Then, if needed, >> userspace may execute appropriate actions based on that. We don't need >> to reevaluate and recheck all items, instead, we can concentrate only >> on items for which we really and actually need to reflect changed state >> in kernel while the userspace listener was not listening for uevents. >> >> Signed-off-by: Peter Rajnoha <prajnoha@redhat.com> >> --- >> include/linux/kobject.h | 1 + >> lib/kobject.c | 1 + >> lib/kobject_uevent.c | 15 +++++++++++++-- >> 3 files changed, 15 insertions(+), 2 deletions(-) >> >> diff --git a/include/linux/kobject.h b/include/linux/kobject.h >> index e0a6205caa71..ee6db28b6186 100644 >> --- a/include/linux/kobject.h >> +++ b/include/linux/kobject.h >> @@ -73,6 +73,7 @@ struct kobject { >> #ifdef CONFIG_DEBUG_KOBJECT_RELEASE >> struct delayed_work release; >> #endif >> + u64 last_uevent_seqnum; >> unsigned int state_initialized:1; >> unsigned int state_in_sysfs:1; >> unsigned int state_add_uevent_sent:1; >> diff --git a/lib/kobject.c b/lib/kobject.c >> index 763d70a18941..2cc809f10ae7 100644 >> --- a/lib/kobject.c >> +++ b/lib/kobject.c >> @@ -190,6 +190,7 @@ static void kobject_init_internal(struct kobject *kobj) >> return; >> kref_init(&kobj->kref); >> INIT_LIST_HEAD(&kobj->entry); >> + kobj->last_uevent_seqnum = 0; >> kobj->state_in_sysfs = 0; >> kobj->state_add_uevent_sent = 0; >> kobj->state_remove_uevent_sent = 0; >> diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c >> index f237a09a5862..f1cbed06a395 100644 >> --- a/lib/kobject_uevent.c >> +++ b/lib/kobject_uevent.c >> @@ -454,8 +454,19 @@ int kobject_uevent_env(struct kobject *kobj, enum kobject_action action, >> } >> >> mutex_lock(&uevent_sock_mutex); >> - /* we will send an event, so request a new sequence number */ >> - retval = add_uevent_var(env, "SEQNUM=%llu", (unsigned long long)++uevent_seqnum); >> + /* >> + * We will send an event, so request a new sequence number. >> + * Also, record the number for next uevent's PREV_SEQNUM. >> + */ >> + kobj->last_uevent_seqnum = ++uevent_seqnum; >> + retval = add_uevent_var(env, "SEQNUM=%llu", >> + (unsigned long long)uevent_seqnum); >> + if (retval) { >> + mutext_unlock(&uevent_sock_mutex); >> + goto exit; >> + } >> + retval = add_uevent_var(env, "PREV_SEQNUM=%llu", >> + (unsigned long long)kobj->last_uevent_seqnum); > > I'm confused, why are we storing "last_uevent_seqnum" here at all, if it > is only just the current one-1? What else do you do with that value? I'm sorry, I've sent incorrect patch by mistake. This is the correct one: >From 91fa0d0f5c52b959a5ca4cc79d1a4f5970db75e2 Mon Sep 17 00:00:00 2001 From: Peter Rajnoha <prajnoha@redhat.com> Date: Thu, 12 Oct 2017 11:33:48 +0200 Subject: [PATCH] kobject: record and send PREV_SEQNUM with uevents Record last uevent seqnum that was used with kobject's uevent and send it with next uevent as PREV_SEQNUM=<num> variable. This enhances the way we are able to track and handle uevents in userspace if we are trying to catch up with all the uevents that were generated while we were not listening or the uevents which were not delivered. This may happen if the userspace listener is not running yet when the uevent is triggered or there's a period of time when it's not listening (e.g. because the userspace listener is being restarted while a new uevent fires). A userspace listener can compare the last uevent seqnum it knows about with the last uevent seqnum the kernel reports through uevents delivered to userspace, both genuine and synthetic ones (the ones generated by writing uevent action name to /sys/.../uevent file). Then, if needed, userspace may execute appropriate actions based on that. We don't need to reevaluate and recheck all items, instead, we can concentrate only on items for which we really and actually need to reflect changed state in kernel while the userspace listener was not listening for uevents. Signed-off-by: Peter Rajnoha <prajnoha@redhat.com> --- include/linux/kobject.h | 1 + lib/kobject.c | 1 + lib/kobject_uevent.c | 12 +++++++++++- 3 files changed, 13 insertions(+), 1 deletion(-) diff --git a/include/linux/kobject.h b/include/linux/kobject.h index e0a6205caa71..ee6db28b6186 100644 --- a/include/linux/kobject.h +++ b/include/linux/kobject.h @@ -73,6 +73,7 @@ struct kobject { #ifdef CONFIG_DEBUG_KOBJECT_RELEASE struct delayed_work release; #endif + u64 last_uevent_seqnum; unsigned int state_initialized:1; unsigned int state_in_sysfs:1; unsigned int state_add_uevent_sent:1; diff --git a/lib/kobject.c b/lib/kobject.c index 763d70a18941..2cc809f10ae7 100644 --- a/lib/kobject.c +++ b/lib/kobject.c @@ -190,6 +190,7 @@ static void kobject_init_internal(struct kobject *kobj) return; kref_init(&kobj->kref); INIT_LIST_HEAD(&kobj->entry); + kobj->last_uevent_seqnum = 0; kobj->state_in_sysfs = 0; kobj->state_add_uevent_sent = 0; kobj->state_remove_uevent_sent = 0; diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c index f237a09a5862..b1d301bab3e6 100644 --- a/lib/kobject_uevent.c +++ b/lib/kobject_uevent.c @@ -454,12 +454,22 @@ int kobject_uevent_env(struct kobject *kobj, enum kobject_action action, } mutex_lock(&uevent_sock_mutex); - /* we will send an event, so request a new sequence number */ + /* + * We will send an event, so request a new sequence number. + * Also, record the number for next uevent's PREV_SEQNUM. + */ retval = add_uevent_var(env, "SEQNUM=%llu", (unsigned long long)++uevent_seqnum); if (retval) { mutex_unlock(&uevent_sock_mutex); goto exit; } + retval = add_uevent_var(env, "PREV_SEQNUM=%llu", + (unsigned long long)kobj->last_uevent_seqnum); + kobj->last_uevent_seqnum = uevent_seqnum; + if (retval) { + mutex_unlock(&uevent_sock_mutex); + goto exit; + } #if defined(CONFIG_NET) /* send netlink message */ -- 2.13.6 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] kobject: record and send PREV_SEQNUM with uevents 2017-10-16 14:00 ` Peter Rajnoha @ 2017-10-16 14:49 ` Greg KH 2017-10-17 10:04 ` Peter Rajnoha 0 siblings, 1 reply; 8+ messages in thread From: Greg KH @ 2017-10-16 14:49 UTC (permalink / raw) To: Peter Rajnoha; +Cc: linux-kernel On Mon, Oct 16, 2017 at 04:00:26PM +0200, Peter Rajnoha wrote: > On 10/16/2017 03:55 PM, Greg KH wrote: > > On Mon, Oct 16, 2017 at 02:35:44PM +0200, Peter Rajnoha wrote: > >> Record last uevent seqnum that was used with kobject's uevent and send > >> it with next uevent as PREV_SEQNUM=<num> variable. > >> > >> This enhances the way we are able to track and handle uevents in userspace > >> if we are trying to catch up with all the uevents that were generated > >> while we were not listening or the uevents which were not delivered. > >> This may happen if the userspace listener is not running yet when the > >> uevent is triggered or there's a period of time when it's not listening > >> (e.g. because the userspace listener is being restarted while a new > >> uevent fires). > >> > >> A userspace listener can compare the last uevent seqnum it knows about > >> with the last uevent seqnum the kernel reports through uevents delivered > >> to userspace, both genuine and synthetic ones (the ones generated by > >> writing uevent action name to /sys/.../uevent file). Then, if needed, > >> userspace may execute appropriate actions based on that. We don't need > >> to reevaluate and recheck all items, instead, we can concentrate only > >> on items for which we really and actually need to reflect changed state > >> in kernel while the userspace listener was not listening for uevents. > >> > >> Signed-off-by: Peter Rajnoha <prajnoha@redhat.com> > >> --- > >> include/linux/kobject.h | 1 + > >> lib/kobject.c | 1 + > >> lib/kobject_uevent.c | 15 +++++++++++++-- > >> 3 files changed, 15 insertions(+), 2 deletions(-) > >> > >> diff --git a/include/linux/kobject.h b/include/linux/kobject.h > >> index e0a6205caa71..ee6db28b6186 100644 > >> --- a/include/linux/kobject.h > >> +++ b/include/linux/kobject.h > >> @@ -73,6 +73,7 @@ struct kobject { > >> #ifdef CONFIG_DEBUG_KOBJECT_RELEASE > >> struct delayed_work release; > >> #endif > >> + u64 last_uevent_seqnum; > >> unsigned int state_initialized:1; > >> unsigned int state_in_sysfs:1; > >> unsigned int state_add_uevent_sent:1; > >> diff --git a/lib/kobject.c b/lib/kobject.c > >> index 763d70a18941..2cc809f10ae7 100644 > >> --- a/lib/kobject.c > >> +++ b/lib/kobject.c > >> @@ -190,6 +190,7 @@ static void kobject_init_internal(struct kobject *kobj) > >> return; > >> kref_init(&kobj->kref); > >> INIT_LIST_HEAD(&kobj->entry); > >> + kobj->last_uevent_seqnum = 0; > >> kobj->state_in_sysfs = 0; > >> kobj->state_add_uevent_sent = 0; > >> kobj->state_remove_uevent_sent = 0; > >> diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c > >> index f237a09a5862..f1cbed06a395 100644 > >> --- a/lib/kobject_uevent.c > >> +++ b/lib/kobject_uevent.c > >> @@ -454,8 +454,19 @@ int kobject_uevent_env(struct kobject *kobj, enum kobject_action action, > >> } > >> > >> mutex_lock(&uevent_sock_mutex); > >> - /* we will send an event, so request a new sequence number */ > >> - retval = add_uevent_var(env, "SEQNUM=%llu", (unsigned long long)++uevent_seqnum); > >> + /* > >> + * We will send an event, so request a new sequence number. > >> + * Also, record the number for next uevent's PREV_SEQNUM. > >> + */ > >> + kobj->last_uevent_seqnum = ++uevent_seqnum; > >> + retval = add_uevent_var(env, "SEQNUM=%llu", > >> + (unsigned long long)uevent_seqnum); > >> + if (retval) { > >> + mutext_unlock(&uevent_sock_mutex); > >> + goto exit; > >> + } > >> + retval = add_uevent_var(env, "PREV_SEQNUM=%llu", > >> + (unsigned long long)kobj->last_uevent_seqnum); > > > > I'm confused, why are we storing "last_uevent_seqnum" here at all, if it > > is only just the current one-1? What else do you do with that value? > I'm sorry, I've sent incorrect patch by mistake. This is the correct one: Ok, this is a bit better (but it's not in a format I could apply it in, if I wanted to...) > From 91fa0d0f5c52b959a5ca4cc79d1a4f5970db75e2 Mon Sep 17 00:00:00 2001 > From: Peter Rajnoha <prajnoha@redhat.com> > Date: Thu, 12 Oct 2017 11:33:48 +0200 > Subject: [PATCH] kobject: record and send PREV_SEQNUM with uevents > > Record last uevent seqnum that was used with kobject's uevent and send > it with next uevent as PREV_SEQNUM=<num> variable. > > This enhances the way we are able to track and handle uevents in userspace > if we are trying to catch up with all the uevents that were generated > while we were not listening or the uevents which were not delivered. > This may happen if the userspace listener is not running yet when the > uevent is triggered or there's a period of time when it's not listening > (e.g. because the userspace listener is being restarted while a new > uevent fires). > > A userspace listener can compare the last uevent seqnum it knows about > with the last uevent seqnum the kernel reports through uevents delivered > to userspace, both genuine and synthetic ones (the ones generated by > writing uevent action name to /sys/.../uevent file). Then, if needed, > userspace may execute appropriate actions based on that. We don't need > to reevaluate and recheck all items, instead, we can concentrate only > on items for which we really and actually need to reflect changed state > in kernel while the userspace listener was not listening for uevents. > > Signed-off-by: Peter Rajnoha <prajnoha@redhat.com> I don't really understand what userspace can do with this. Why does it matter what the sequence number was for a specific kobject? What can happen because of this? Do you have any example libudev code changes to show this in action so I can maybe understand the need? thanks, greg k-h ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] kobject: record and send PREV_SEQNUM with uevents 2017-10-16 14:49 ` Greg KH @ 2017-10-17 10:04 ` Peter Rajnoha 2017-10-17 13:26 ` Greg KH 0 siblings, 1 reply; 8+ messages in thread From: Peter Rajnoha @ 2017-10-17 10:04 UTC (permalink / raw) To: Greg KH; +Cc: linux-kernel On 10/16/2017 04:49 PM, Greg KH wrote: > On Mon, Oct 16, 2017 at 04:00:26PM +0200, Peter Rajnoha wrote: >> On 10/16/2017 03:55 PM, Greg KH wrote: >>> On Mon, Oct 16, 2017 at 02:35:44PM +0200, Peter Rajnoha wrote: >>>> Record last uevent seqnum that was used with kobject's uevent and send >>>> it with next uevent as PREV_SEQNUM=<num> variable. >>>> >>>> This enhances the way we are able to track and handle uevents in userspace >>>> if we are trying to catch up with all the uevents that were generated >>>> while we were not listening or the uevents which were not delivered. >>>> This may happen if the userspace listener is not running yet when the >>>> uevent is triggered or there's a period of time when it's not listening >>>> (e.g. because the userspace listener is being restarted while a new >>>> uevent fires). >>>> >>>> A userspace listener can compare the last uevent seqnum it knows about >>>> with the last uevent seqnum the kernel reports through uevents delivered >>>> to userspace, both genuine and synthetic ones (the ones generated by >>>> writing uevent action name to /sys/.../uevent file). Then, if needed, >>>> userspace may execute appropriate actions based on that. We don't need >>>> to reevaluate and recheck all items, instead, we can concentrate only >>>> on items for which we really and actually need to reflect changed state >>>> in kernel while the userspace listener was not listening for uevents. >>>> >>>> Signed-off-by: Peter Rajnoha <prajnoha@redhat.com> >>>> --- >>>> include/linux/kobject.h | 1 + >>>> lib/kobject.c | 1 + >>>> lib/kobject_uevent.c | 15 +++++++++++++-- >>>> 3 files changed, 15 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/include/linux/kobject.h b/include/linux/kobject.h >>>> index e0a6205caa71..ee6db28b6186 100644 >>>> --- a/include/linux/kobject.h >>>> +++ b/include/linux/kobject.h >>>> @@ -73,6 +73,7 @@ struct kobject { >>>> #ifdef CONFIG_DEBUG_KOBJECT_RELEASE >>>> struct delayed_work release; >>>> #endif >>>> + u64 last_uevent_seqnum; >>>> unsigned int state_initialized:1; >>>> unsigned int state_in_sysfs:1; >>>> unsigned int state_add_uevent_sent:1; >>>> diff --git a/lib/kobject.c b/lib/kobject.c >>>> index 763d70a18941..2cc809f10ae7 100644 >>>> --- a/lib/kobject.c >>>> +++ b/lib/kobject.c >>>> @@ -190,6 +190,7 @@ static void kobject_init_internal(struct kobject *kobj) >>>> return; >>>> kref_init(&kobj->kref); >>>> INIT_LIST_HEAD(&kobj->entry); >>>> + kobj->last_uevent_seqnum = 0; >>>> kobj->state_in_sysfs = 0; >>>> kobj->state_add_uevent_sent = 0; >>>> kobj->state_remove_uevent_sent = 0; >>>> diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c >>>> index f237a09a5862..f1cbed06a395 100644 >>>> --- a/lib/kobject_uevent.c >>>> +++ b/lib/kobject_uevent.c >>>> @@ -454,8 +454,19 @@ int kobject_uevent_env(struct kobject *kobj, enum kobject_action action, >>>> } >>>> >>>> mutex_lock(&uevent_sock_mutex); >>>> - /* we will send an event, so request a new sequence number */ >>>> - retval = add_uevent_var(env, "SEQNUM=%llu", (unsigned long long)++uevent_seqnum); >>>> + /* >>>> + * We will send an event, so request a new sequence number. >>>> + * Also, record the number for next uevent's PREV_SEQNUM. >>>> + */ >>>> + kobj->last_uevent_seqnum = ++uevent_seqnum; >>>> + retval = add_uevent_var(env, "SEQNUM=%llu", >>>> + (unsigned long long)uevent_seqnum); >>>> + if (retval) { >>>> + mutext_unlock(&uevent_sock_mutex); >>>> + goto exit; >>>> + } >>>> + retval = add_uevent_var(env, "PREV_SEQNUM=%llu", >>>> + (unsigned long long)kobj->last_uevent_seqnum); >>> >>> I'm confused, why are we storing "last_uevent_seqnum" here at all, if it >>> is only just the current one-1? What else do you do with that value? >> I'm sorry, I've sent incorrect patch by mistake. This is the correct one: > > Ok, this is a bit better (but it's not in a format I could apply it in, > if I wanted to...) > >> From 91fa0d0f5c52b959a5ca4cc79d1a4f5970db75e2 Mon Sep 17 00:00:00 2001 >> From: Peter Rajnoha <prajnoha@redhat.com> >> Date: Thu, 12 Oct 2017 11:33:48 +0200 >> Subject: [PATCH] kobject: record and send PREV_SEQNUM with uevents >> >> Record last uevent seqnum that was used with kobject's uevent and send >> it with next uevent as PREV_SEQNUM=<num> variable. >> >> This enhances the way we are able to track and handle uevents in userspace >> if we are trying to catch up with all the uevents that were generated >> while we were not listening or the uevents which were not delivered. >> This may happen if the userspace listener is not running yet when the >> uevent is triggered or there's a period of time when it's not listening >> (e.g. because the userspace listener is being restarted while a new >> uevent fires). >> >> A userspace listener can compare the last uevent seqnum it knows about >> with the last uevent seqnum the kernel reports through uevents delivered >> to userspace, both genuine and synthetic ones (the ones generated by >> writing uevent action name to /sys/.../uevent file). Then, if needed, >> userspace may execute appropriate actions based on that. We don't need >> to reevaluate and recheck all items, instead, we can concentrate only >> on items for which we really and actually need to reflect changed state >> in kernel while the userspace listener was not listening for uevents. >> >> Signed-off-by: Peter Rajnoha <prajnoha@redhat.com> > > I don't really understand what userspace can do with this. Why does it > matter what the sequence number was for a specific kobject? What can > happen because of this? > > Do you have any example libudev code changes to show this in action so I > can maybe understand the need? > Not a libudev code change for this, at least I don't have this in plan at this moment. It's more related to the way the udev rules are processed and, in general, the way any other uevent listener can react to uevents. So the change is in how "end consumer" of the uevent does its processing. At the moment, the sequence number is used globally, we don't have per-kobject sequence numbers. So we can't use current SEQNUM that is attached to each uevent to tell whether we have missed previous uevent(s) for concrete device or not, we just know that some uevents for some devices were not processed. Therefore, when we're triggering uevents from userspace (udevadm trigger), we need to reevaluate all the rules for all devices, no matter if it's really needed or not. For example, when we're switching from initramfs to root fs, we stop the first udevd instance which is in initramfs, we keep the database (at the moment, only for selected devices where we know that it's safe to do so - same rules used in initramfs as in root fs). Then we switch over to root fs and start the udevd again and we trigger uevents to fully refresh the state, taking also the existing database into account. Now, it would help if we knew whether there were any uevents issued after we stopped udevd in initramfs and before we started a new one from root fs. If there are no uevents in between, we don't need to reevaluate all the rules, we can just keep the existing database content. This can become prominent if we have more devices in the system. It helps to minimize the number of execution during boot, the overall impact of triggering uevents and related processing, hence also making it more straightforward to watch the state of the device and possibly debug related problems. The same principle applies to any uevent listener in userspace, not just udevd: - recording last seen SEQNUM - when new instance of the listener is running - trigger synthetic uevents to refresh state - comparing the PREV_SEQNUM from uevent generated with stored SEQNUM: - If they differ, refresh the state fully, considering the state we know of as obsolete as there were already other uevents before we haven't processed. This may include deeper scans which may take longer, executing more helper tools to get the state of the device. - If the are same, reuse the state we already know about is up-to-date (there were no uevents we haven't seen), don't reevaluate everything. With the new synthetic uevent interface (the extention to /sys/.../uevent) that accepts arguments (UUID and key=value pairs), we can make it possible to specify the nature of the triggered uevent, we can mark it through the key=value pairs which the uevent listener in userspace can check for, so if we can also: - either refresh only the state for devices where we missed uevents (that could be used during bootup, only to refresh the state for devices where we have no current state yet), comparing current PREV_SEQNUM and stored last SEQNUM in userspace - or to force full refresh if that's needed for a reason. Also, some devices may have more complex activation scheme than usual "device addition" to the system. There can be a sequence of uevents which are expected for a successful change of device state, so we have a state machine that tracks uevents (as a confirmation from kernel for userspace that we can hook on in udev rules and uevent listeners). And here, it's very helpful to know that we have missed uevent(s) before or not, either they were not delivered or the uevent listener itself has been restarted and during this period, it may have missed uevents. In summary, it's adding more robustness to the uevent interface where we can directly and quickly detect we missed the uevent or not for a specific device and hence we can optimize udev rules/uevent listeners based on that so we don't dully refresh all the state and reexecute all rules and code for any triggered uevents. It's really not needed all the time. -- Peter ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] kobject: record and send PREV_SEQNUM with uevents 2017-10-17 10:04 ` Peter Rajnoha @ 2017-10-17 13:26 ` Greg KH 2017-10-18 7:21 ` Peter Rajnoha 0 siblings, 1 reply; 8+ messages in thread From: Greg KH @ 2017-10-17 13:26 UTC (permalink / raw) To: Peter Rajnoha; +Cc: linux-kernel On Tue, Oct 17, 2017 at 12:04:41PM +0200, Peter Rajnoha wrote: > On 10/16/2017 04:49 PM, Greg KH wrote: > > On Mon, Oct 16, 2017 at 04:00:26PM +0200, Peter Rajnoha wrote: > >> On 10/16/2017 03:55 PM, Greg KH wrote: > >>> On Mon, Oct 16, 2017 at 02:35:44PM +0200, Peter Rajnoha wrote: > >>>> Record last uevent seqnum that was used with kobject's uevent and send > >>>> it with next uevent as PREV_SEQNUM=<num> variable. > >>>> > >>>> This enhances the way we are able to track and handle uevents in userspace > >>>> if we are trying to catch up with all the uevents that were generated > >>>> while we were not listening or the uevents which were not delivered. > >>>> This may happen if the userspace listener is not running yet when the > >>>> uevent is triggered or there's a period of time when it's not listening > >>>> (e.g. because the userspace listener is being restarted while a new > >>>> uevent fires). > >>>> > >>>> A userspace listener can compare the last uevent seqnum it knows about > >>>> with the last uevent seqnum the kernel reports through uevents delivered > >>>> to userspace, both genuine and synthetic ones (the ones generated by > >>>> writing uevent action name to /sys/.../uevent file). Then, if needed, > >>>> userspace may execute appropriate actions based on that. We don't need > >>>> to reevaluate and recheck all items, instead, we can concentrate only > >>>> on items for which we really and actually need to reflect changed state > >>>> in kernel while the userspace listener was not listening for uevents. > >>>> > >>>> Signed-off-by: Peter Rajnoha <prajnoha@redhat.com> > >>>> --- > >>>> include/linux/kobject.h | 1 + > >>>> lib/kobject.c | 1 + > >>>> lib/kobject_uevent.c | 15 +++++++++++++-- > >>>> 3 files changed, 15 insertions(+), 2 deletions(-) > >>>> > >>>> diff --git a/include/linux/kobject.h b/include/linux/kobject.h > >>>> index e0a6205caa71..ee6db28b6186 100644 > >>>> --- a/include/linux/kobject.h > >>>> +++ b/include/linux/kobject.h > >>>> @@ -73,6 +73,7 @@ struct kobject { > >>>> #ifdef CONFIG_DEBUG_KOBJECT_RELEASE > >>>> struct delayed_work release; > >>>> #endif > >>>> + u64 last_uevent_seqnum; > >>>> unsigned int state_initialized:1; > >>>> unsigned int state_in_sysfs:1; > >>>> unsigned int state_add_uevent_sent:1; > >>>> diff --git a/lib/kobject.c b/lib/kobject.c > >>>> index 763d70a18941..2cc809f10ae7 100644 > >>>> --- a/lib/kobject.c > >>>> +++ b/lib/kobject.c > >>>> @@ -190,6 +190,7 @@ static void kobject_init_internal(struct kobject *kobj) > >>>> return; > >>>> kref_init(&kobj->kref); > >>>> INIT_LIST_HEAD(&kobj->entry); > >>>> + kobj->last_uevent_seqnum = 0; > >>>> kobj->state_in_sysfs = 0; > >>>> kobj->state_add_uevent_sent = 0; > >>>> kobj->state_remove_uevent_sent = 0; > >>>> diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c > >>>> index f237a09a5862..f1cbed06a395 100644 > >>>> --- a/lib/kobject_uevent.c > >>>> +++ b/lib/kobject_uevent.c > >>>> @@ -454,8 +454,19 @@ int kobject_uevent_env(struct kobject *kobj, enum kobject_action action, > >>>> } > >>>> > >>>> mutex_lock(&uevent_sock_mutex); > >>>> - /* we will send an event, so request a new sequence number */ > >>>> - retval = add_uevent_var(env, "SEQNUM=%llu", (unsigned long long)++uevent_seqnum); > >>>> + /* > >>>> + * We will send an event, so request a new sequence number. > >>>> + * Also, record the number for next uevent's PREV_SEQNUM. > >>>> + */ > >>>> + kobj->last_uevent_seqnum = ++uevent_seqnum; > >>>> + retval = add_uevent_var(env, "SEQNUM=%llu", > >>>> + (unsigned long long)uevent_seqnum); > >>>> + if (retval) { > >>>> + mutext_unlock(&uevent_sock_mutex); > >>>> + goto exit; > >>>> + } > >>>> + retval = add_uevent_var(env, "PREV_SEQNUM=%llu", > >>>> + (unsigned long long)kobj->last_uevent_seqnum); > >>> > >>> I'm confused, why are we storing "last_uevent_seqnum" here at all, if it > >>> is only just the current one-1? What else do you do with that value? > >> I'm sorry, I've sent incorrect patch by mistake. This is the correct one: > > > > Ok, this is a bit better (but it's not in a format I could apply it in, > > if I wanted to...) > > > >> From 91fa0d0f5c52b959a5ca4cc79d1a4f5970db75e2 Mon Sep 17 00:00:00 2001 > >> From: Peter Rajnoha <prajnoha@redhat.com> > >> Date: Thu, 12 Oct 2017 11:33:48 +0200 > >> Subject: [PATCH] kobject: record and send PREV_SEQNUM with uevents > >> > >> Record last uevent seqnum that was used with kobject's uevent and send > >> it with next uevent as PREV_SEQNUM=<num> variable. > >> > >> This enhances the way we are able to track and handle uevents in userspace > >> if we are trying to catch up with all the uevents that were generated > >> while we were not listening or the uevents which were not delivered. > >> This may happen if the userspace listener is not running yet when the > >> uevent is triggered or there's a period of time when it's not listening > >> (e.g. because the userspace listener is being restarted while a new > >> uevent fires). > >> > >> A userspace listener can compare the last uevent seqnum it knows about > >> with the last uevent seqnum the kernel reports through uevents delivered > >> to userspace, both genuine and synthetic ones (the ones generated by > >> writing uevent action name to /sys/.../uevent file). Then, if needed, > >> userspace may execute appropriate actions based on that. We don't need > >> to reevaluate and recheck all items, instead, we can concentrate only > >> on items for which we really and actually need to reflect changed state > >> in kernel while the userspace listener was not listening for uevents. > >> > >> Signed-off-by: Peter Rajnoha <prajnoha@redhat.com> > > > > I don't really understand what userspace can do with this. Why does it > > matter what the sequence number was for a specific kobject? What can > > happen because of this? > > > > Do you have any example libudev code changes to show this in action so I > > can maybe understand the need? > > > > Not a libudev code change for this, at least I don't have this in plan > at this moment. > > It's more related to the way the udev rules are processed and, in > general, the way any other uevent listener can react to uevents. So the > change is in how "end consumer" of the uevent does its processing. > > At the moment, the sequence number is used globally, we don't have > per-kobject sequence numbers. So we can't use current SEQNUM that is > attached to each uevent to tell whether we have missed previous > uevent(s) for concrete device or not, we just know that some uevents for > some devices were not processed. Therefore, when we're triggering > uevents from userspace (udevadm trigger), we need to reevaluate all the > rules for all devices, no matter if it's really needed or not. > > For example, when we're switching from initramfs to root fs, we stop the > first udevd instance which is in initramfs, we keep the database (at the > moment, only for selected devices where we know that it's safe to do so > - same rules used in initramfs as in root fs). Then we switch over to > root fs and start the udevd again and we trigger uevents to fully > refresh the state, taking also the existing database into account. > > Now, it would help if we knew whether there were any uevents issued > after we stopped udevd in initramfs and before we started a new one from > root fs. If there are no uevents in between, we don't need to reevaluate > all the rules, we can just keep the existing database content. This can > become prominent if we have more devices in the system. It helps to > minimize the number of execution during boot, the overall impact of > triggering uevents and related processing, hence also making it more > straightforward to watch the state of the device and possibly debug > related problems. > > The same principle applies to any uevent listener in userspace, not just > udevd: > > - recording last seen SEQNUM > > - when new instance of the listener is running > > - trigger synthetic uevents to refresh state > > - comparing the PREV_SEQNUM from uevent generated with stored SEQNUM: Why can't you just compare SEQNUM with the "last seen" SEQNUM, they should just be 1 apart, right? Isn't that what is done today? How will this differ from looking at PREV_SEQNUM? > - If they differ, refresh the state fully, considering the state > we know of as obsolete as there were already other uevents before we > haven't processed. This may include deeper scans which may take longer, > executing more helper tools to get the state of the device. But that is what "last seen" also shows. > - If the are same, reuse the state we already know about is > up-to-date (there were no uevents we haven't seen), don't reevaluate > everything. Testing for +1 should be the same. > With the new synthetic uevent interface (the extention to > /sys/.../uevent) that accepts arguments (UUID and key=value pairs), we > can make it possible to specify the nature of the triggered uevent, we > can mark it through the key=value pairs which the uevent listener in > userspace can check for, so if we can also: but that doesn't matter for PREV_SEQNUM. > - either refresh only the state for devices where we missed uevents > (that could be used during bootup, only to refresh the state for devices > where we have no current state yet), comparing current PREV_SEQNUM and > stored last SEQNUM in userspace > > - or to force full refresh if that's needed for a reason. > > Also, some devices may have more complex activation scheme than usual > "device addition" to the system. There can be a sequence of uevents > which are expected for a successful change of device state, so we have a > state machine that tracks uevents (as a confirmation from kernel for > userspace that we can hook on in udev rules and uevent listeners). And > here, it's very helpful to know that we have missed uevent(s) before or > not, either they were not delivered or the uevent listener itself has > been restarted and during this period, it may have missed uevents. > > In summary, it's adding more robustness to the uevent interface where we > can directly and quickly detect we missed the uevent or not for a > specific device and hence we can optimize udev rules/uevent listeners > based on that so we don't dully refresh all the state and reexecute all > rules and code for any triggered uevents. It's really not needed all the > time. I'd like to see real, working code that uses this new interface, showing how it is not possible to do what is needed without the PREV_SEQNUM, before considering adding this. Otherwise it's just stuff we have to drag around for forever. thanks, greg k-h ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] kobject: record and send PREV_SEQNUM with uevents 2017-10-17 13:26 ` Greg KH @ 2017-10-18 7:21 ` Peter Rajnoha 2017-10-18 9:42 ` Peter Rajnoha 0 siblings, 1 reply; 8+ messages in thread From: Peter Rajnoha @ 2017-10-18 7:21 UTC (permalink / raw) To: Greg KH; +Cc: linux-kernel On 10/17/2017 03:26 PM, Greg KH wrote: > On Tue, Oct 17, 2017 at 12:04:41PM +0200, Peter Rajnoha wrote: >> On 10/16/2017 04:49 PM, Greg KH wrote: >>> On Mon, Oct 16, 2017 at 04:00:26PM +0200, Peter Rajnoha wrote: >>>> On 10/16/2017 03:55 PM, Greg KH wrote: >>>>> On Mon, Oct 16, 2017 at 02:35:44PM +0200, Peter Rajnoha wrote: >>>>>> Record last uevent seqnum that was used with kobject's uevent and send >>>>>> it with next uevent as PREV_SEQNUM=<num> variable. >>>>>> >>>>>> This enhances the way we are able to track and handle uevents in userspace >>>>>> if we are trying to catch up with all the uevents that were generated >>>>>> while we were not listening or the uevents which were not delivered. >>>>>> This may happen if the userspace listener is not running yet when the >>>>>> uevent is triggered or there's a period of time when it's not listening >>>>>> (e.g. because the userspace listener is being restarted while a new >>>>>> uevent fires). >>>>>> >>>>>> A userspace listener can compare the last uevent seqnum it knows about >>>>>> with the last uevent seqnum the kernel reports through uevents delivered >>>>>> to userspace, both genuine and synthetic ones (the ones generated by >>>>>> writing uevent action name to /sys/.../uevent file). Then, if needed, >>>>>> userspace may execute appropriate actions based on that. We don't need >>>>>> to reevaluate and recheck all items, instead, we can concentrate only >>>>>> on items for which we really and actually need to reflect changed state >>>>>> in kernel while the userspace listener was not listening for uevents. >>>>>> >>>>>> Signed-off-by: Peter Rajnoha <prajnoha@redhat.com> >>>>>> --- >>>>>> include/linux/kobject.h | 1 + >>>>>> lib/kobject.c | 1 + >>>>>> lib/kobject_uevent.c | 15 +++++++++++++-- >>>>>> 3 files changed, 15 insertions(+), 2 deletions(-) >>>>>> >>>>>> diff --git a/include/linux/kobject.h b/include/linux/kobject.h >>>>>> index e0a6205caa71..ee6db28b6186 100644 >>>>>> --- a/include/linux/kobject.h >>>>>> +++ b/include/linux/kobject.h >>>>>> @@ -73,6 +73,7 @@ struct kobject { >>>>>> #ifdef CONFIG_DEBUG_KOBJECT_RELEASE >>>>>> struct delayed_work release; >>>>>> #endif >>>>>> + u64 last_uevent_seqnum; >>>>>> unsigned int state_initialized:1; >>>>>> unsigned int state_in_sysfs:1; >>>>>> unsigned int state_add_uevent_sent:1; >>>>>> diff --git a/lib/kobject.c b/lib/kobject.c >>>>>> index 763d70a18941..2cc809f10ae7 100644 >>>>>> --- a/lib/kobject.c >>>>>> +++ b/lib/kobject.c >>>>>> @@ -190,6 +190,7 @@ static void kobject_init_internal(struct kobject *kobj) >>>>>> return; >>>>>> kref_init(&kobj->kref); >>>>>> INIT_LIST_HEAD(&kobj->entry); >>>>>> + kobj->last_uevent_seqnum = 0; >>>>>> kobj->state_in_sysfs = 0; >>>>>> kobj->state_add_uevent_sent = 0; >>>>>> kobj->state_remove_uevent_sent = 0; >>>>>> diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c >>>>>> index f237a09a5862..f1cbed06a395 100644 >>>>>> --- a/lib/kobject_uevent.c >>>>>> +++ b/lib/kobject_uevent.c >>>>>> @@ -454,8 +454,19 @@ int kobject_uevent_env(struct kobject *kobj, enum kobject_action action, >>>>>> } >>>>>> >>>>>> mutex_lock(&uevent_sock_mutex); >>>>>> - /* we will send an event, so request a new sequence number */ >>>>>> - retval = add_uevent_var(env, "SEQNUM=%llu", (unsigned long long)++uevent_seqnum); >>>>>> + /* >>>>>> + * We will send an event, so request a new sequence number. >>>>>> + * Also, record the number for next uevent's PREV_SEQNUM. >>>>>> + */ >>>>>> + kobj->last_uevent_seqnum = ++uevent_seqnum; >>>>>> + retval = add_uevent_var(env, "SEQNUM=%llu", >>>>>> + (unsigned long long)uevent_seqnum); >>>>>> + if (retval) { >>>>>> + mutext_unlock(&uevent_sock_mutex); >>>>>> + goto exit; >>>>>> + } >>>>>> + retval = add_uevent_var(env, "PREV_SEQNUM=%llu", >>>>>> + (unsigned long long)kobj->last_uevent_seqnum); >>>>> >>>>> I'm confused, why are we storing "last_uevent_seqnum" here at all, if it >>>>> is only just the current one-1? What else do you do with that value? >>>> I'm sorry, I've sent incorrect patch by mistake. This is the correct one: >>> >>> Ok, this is a bit better (but it's not in a format I could apply it in, >>> if I wanted to...) >>> >>>> From 91fa0d0f5c52b959a5ca4cc79d1a4f5970db75e2 Mon Sep 17 00:00:00 2001 >>>> From: Peter Rajnoha <prajnoha@redhat.com> >>>> Date: Thu, 12 Oct 2017 11:33:48 +0200 >>>> Subject: [PATCH] kobject: record and send PREV_SEQNUM with uevents >>>> >>>> Record last uevent seqnum that was used with kobject's uevent and send >>>> it with next uevent as PREV_SEQNUM=<num> variable. >>>> >>>> This enhances the way we are able to track and handle uevents in userspace >>>> if we are trying to catch up with all the uevents that were generated >>>> while we were not listening or the uevents which were not delivered. >>>> This may happen if the userspace listener is not running yet when the >>>> uevent is triggered or there's a period of time when it's not listening >>>> (e.g. because the userspace listener is being restarted while a new >>>> uevent fires). >>>> >>>> A userspace listener can compare the last uevent seqnum it knows about >>>> with the last uevent seqnum the kernel reports through uevents delivered >>>> to userspace, both genuine and synthetic ones (the ones generated by >>>> writing uevent action name to /sys/.../uevent file). Then, if needed, >>>> userspace may execute appropriate actions based on that. We don't need >>>> to reevaluate and recheck all items, instead, we can concentrate only >>>> on items for which we really and actually need to reflect changed state >>>> in kernel while the userspace listener was not listening for uevents. >>>> >>>> Signed-off-by: Peter Rajnoha <prajnoha@redhat.com> >>> >>> I don't really understand what userspace can do with this. Why does it >>> matter what the sequence number was for a specific kobject? What can >>> happen because of this? >>> >>> Do you have any example libudev code changes to show this in action so I >>> can maybe understand the need? >>> >> >> Not a libudev code change for this, at least I don't have this in plan >> at this moment. >> >> It's more related to the way the udev rules are processed and, in >> general, the way any other uevent listener can react to uevents. So the >> change is in how "end consumer" of the uevent does its processing. >> >> At the moment, the sequence number is used globally, we don't have >> per-kobject sequence numbers. So we can't use current SEQNUM that is >> attached to each uevent to tell whether we have missed previous >> uevent(s) for concrete device or not, we just know that some uevents for >> some devices were not processed. Therefore, when we're triggering >> uevents from userspace (udevadm trigger), we need to reevaluate all the >> rules for all devices, no matter if it's really needed or not. >> >> For example, when we're switching from initramfs to root fs, we stop the >> first udevd instance which is in initramfs, we keep the database (at the >> moment, only for selected devices where we know that it's safe to do so >> - same rules used in initramfs as in root fs). Then we switch over to >> root fs and start the udevd again and we trigger uevents to fully >> refresh the state, taking also the existing database into account. >> >> Now, it would help if we knew whether there were any uevents issued >> after we stopped udevd in initramfs and before we started a new one from >> root fs. If there are no uevents in between, we don't need to reevaluate >> all the rules, we can just keep the existing database content. This can >> become prominent if we have more devices in the system. It helps to >> minimize the number of execution during boot, the overall impact of >> triggering uevents and related processing, hence also making it more >> straightforward to watch the state of the device and possibly debug >> related problems. >> >> The same principle applies to any uevent listener in userspace, not just >> udevd: >> >> - recording last seen SEQNUM >> >> - when new instance of the listener is running >> >> - trigger synthetic uevents to refresh state >> >> - comparing the PREV_SEQNUM from uevent generated with stored SEQNUM: > > Why can't you just compare SEQNUM with the "last seen" SEQNUM, they > should just be 1 apart, right? Isn't that what is done today? How will > this differ from looking at PREV_SEQNUM? The SEQNUM is global, not per device (per kobject). It's 1 apart if you have only 1 device, but that doesn't apply if you have more devices with possible uevents interleaved. For example (just provoking uevents here for clear example): echo add > /sys/block/sda/uevent echo add > /sys/block/sda/uevent echo add > /sys/block/sdb/uevent echo add > /sys/block/sda/uevent Generates these uevents (I've removed uninteresting parts): KERNEL[45.739557] add ACTION=add DEVNAME=/dev/sda PREV_SEQNUM=2096 SEQNUM=2112 KERNEL[55.131767] add ACTION=add DEVNAME=/dev/sda PREV_SEQNUM=2112 SEQNUM=2113 SUBSYSTEM=block KERNEL[59.236131] add ACTION=add DEVNAME=/dev/sdb PREV_SEQNUM=2097 SEQNUM=2114 KERNEL[61.708551] add ACTION=add DEVNAME=/dev/sda PREV_SEQNUM=2113 SEQNUM=2115 Here, the PREV_SEQNUM is 1 apart with SEQNUM for the 2nd uevent (sda), because there was no other uevent (for a different device). But if you check the 4th uevent (sda) where previous one was generated for another device (sdb), the PREV_SEQNUM is not 1 apart. So in general, it may be "n" apart from previous SEQNUM for a *particular device*. Globally, for all devices, it is 1 apart, yes, but we don't care about that as that is not quite useful. So without PREV_SEQNUM, we can detect we have missed uevents, but we don't know for which devices exactly, so we need to refresh everything, not just the items we actually need. With PREV_SEQNUM we can identify exact devices. > >> - If they differ, refresh the state fully, considering the state >> we know of as obsolete as there were already other uevents before we >> haven't processed. This may include deeper scans which may take longer, >> executing more helper tools to get the state of the device. > > But that is what "last seen" also shows. Not per device, but only globally. So without PREV_SEQNUM it's either "refresh all or nothing". > >> - If the are same, reuse the state we already know about is >> up-to-date (there were no uevents we haven't seen), don't reevaluate >> everything. > > Testing for +1 should be the same. > >> With the new synthetic uevent interface (the extention to >> /sys/.../uevent) that accepts arguments (UUID and key=value pairs), we >> can make it possible to specify the nature of the triggered uevent, we >> can mark it through the key=value pairs which the uevent listener in >> userspace can check for, so if we can also: > > but that doesn't matter for PREV_SEQNUM. I've mentioned that only for completeness and in advance, for the ones who would argue that there can be a need to force full refresh for all devices even though it may not be needed based on PREV_SEQNUM which will cause selective refresh. So there's still a way to fallback to original behavior, if that's needed. > >> - either refresh only the state for devices where we missed uevents >> (that could be used during bootup, only to refresh the state for devices >> where we have no current state yet), comparing current PREV_SEQNUM and >> stored last SEQNUM in userspace >> >> - or to force full refresh if that's needed for a reason. >> >> Also, some devices may have more complex activation scheme than usual >> "device addition" to the system. There can be a sequence of uevents >> which are expected for a successful change of device state, so we have a >> state machine that tracks uevents (as a confirmation from kernel for >> userspace that we can hook on in udev rules and uevent listeners). And >> here, it's very helpful to know that we have missed uevent(s) before or >> not, either they were not delivered or the uevent listener itself has >> been restarted and during this period, it may have missed uevents. >> >> In summary, it's adding more robustness to the uevent interface where we >> can directly and quickly detect we missed the uevent or not for a >> specific device and hence we can optimize udev rules/uevent listeners >> based on that so we don't dully refresh all the state and reexecute all >> rules and code for any triggered uevents. It's really not needed all the >> time. > > I'd like to see real, working code that uses this new interface, showing > how it is not possible to do what is needed without the PREV_SEQNUM, > before considering adding this. Otherwise it's just stuff we have to > drag around for forever. > OK... well, it's about changing concrete rules to skip various calls, scans (e.g. blkid in case of block devices) and handling which is not necessary if we detect that the uevent is synthetic and it's just used to catch up with any missed uevents. In that case, we'll reuse existing records. The same applies to any uevent listener besides udevd and its records which are managed based on uevents. -- Peter ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] kobject: record and send PREV_SEQNUM with uevents 2017-10-18 7:21 ` Peter Rajnoha @ 2017-10-18 9:42 ` Peter Rajnoha 0 siblings, 0 replies; 8+ messages in thread From: Peter Rajnoha @ 2017-10-18 9:42 UTC (permalink / raw) To: Greg KH; +Cc: linux-kernel On 10/18/2017 09:21 AM, Peter Rajnoha wrote: > On 10/17/2017 03:26 PM, Greg KH wrote: >> On Tue, Oct 17, 2017 at 12:04:41PM +0200, Peter Rajnoha wrote: >>> On 10/16/2017 04:49 PM, Greg KH wrote: >>>> On Mon, Oct 16, 2017 at 04:00:26PM +0200, Peter Rajnoha wrote: >>>>> On 10/16/2017 03:55 PM, Greg KH wrote: >>>>>> On Mon, Oct 16, 2017 at 02:35:44PM +0200, Peter Rajnoha wrote: >>>>>>> Record last uevent seqnum that was used with kobject's uevent and send >>>>>>> it with next uevent as PREV_SEQNUM=<num> variable. >>>>>>> >>>>>>> This enhances the way we are able to track and handle uevents in userspace >>>>>>> if we are trying to catch up with all the uevents that were generated >>>>>>> while we were not listening or the uevents which were not delivered. >>>>>>> This may happen if the userspace listener is not running yet when the >>>>>>> uevent is triggered or there's a period of time when it's not listening >>>>>>> (e.g. because the userspace listener is being restarted while a new >>>>>>> uevent fires). >>>>>>> >>>>>>> A userspace listener can compare the last uevent seqnum it knows about >>>>>>> with the last uevent seqnum the kernel reports through uevents delivered >>>>>>> to userspace, both genuine and synthetic ones (the ones generated by >>>>>>> writing uevent action name to /sys/.../uevent file). Then, if needed, >>>>>>> userspace may execute appropriate actions based on that. We don't need >>>>>>> to reevaluate and recheck all items, instead, we can concentrate only >>>>>>> on items for which we really and actually need to reflect changed state >>>>>>> in kernel while the userspace listener was not listening for uevents. >>>>>>> >>>>>>> Signed-off-by: Peter Rajnoha <prajnoha@redhat.com> >>>>>>> --- >>>>>>> include/linux/kobject.h | 1 + >>>>>>> lib/kobject.c | 1 + >>>>>>> lib/kobject_uevent.c | 15 +++++++++++++-- >>>>>>> 3 files changed, 15 insertions(+), 2 deletions(-) >>>>>>> >>>>>>> diff --git a/include/linux/kobject.h b/include/linux/kobject.h >>>>>>> index e0a6205caa71..ee6db28b6186 100644 >>>>>>> --- a/include/linux/kobject.h >>>>>>> +++ b/include/linux/kobject.h >>>>>>> @@ -73,6 +73,7 @@ struct kobject { >>>>>>> #ifdef CONFIG_DEBUG_KOBJECT_RELEASE >>>>>>> struct delayed_work release; >>>>>>> #endif >>>>>>> + u64 last_uevent_seqnum; >>>>>>> unsigned int state_initialized:1; >>>>>>> unsigned int state_in_sysfs:1; >>>>>>> unsigned int state_add_uevent_sent:1; >>>>>>> diff --git a/lib/kobject.c b/lib/kobject.c >>>>>>> index 763d70a18941..2cc809f10ae7 100644 >>>>>>> --- a/lib/kobject.c >>>>>>> +++ b/lib/kobject.c >>>>>>> @@ -190,6 +190,7 @@ static void kobject_init_internal(struct kobject *kobj) >>>>>>> return; >>>>>>> kref_init(&kobj->kref); >>>>>>> INIT_LIST_HEAD(&kobj->entry); >>>>>>> + kobj->last_uevent_seqnum = 0; >>>>>>> kobj->state_in_sysfs = 0; >>>>>>> kobj->state_add_uevent_sent = 0; >>>>>>> kobj->state_remove_uevent_sent = 0; >>>>>>> diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c >>>>>>> index f237a09a5862..f1cbed06a395 100644 >>>>>>> --- a/lib/kobject_uevent.c >>>>>>> +++ b/lib/kobject_uevent.c >>>>>>> @@ -454,8 +454,19 @@ int kobject_uevent_env(struct kobject *kobj, enum kobject_action action, >>>>>>> } >>>>>>> >>>>>>> mutex_lock(&uevent_sock_mutex); >>>>>>> - /* we will send an event, so request a new sequence number */ >>>>>>> - retval = add_uevent_var(env, "SEQNUM=%llu", (unsigned long long)++uevent_seqnum); >>>>>>> + /* >>>>>>> + * We will send an event, so request a new sequence number. >>>>>>> + * Also, record the number for next uevent's PREV_SEQNUM. >>>>>>> + */ >>>>>>> + kobj->last_uevent_seqnum = ++uevent_seqnum; >>>>>>> + retval = add_uevent_var(env, "SEQNUM=%llu", >>>>>>> + (unsigned long long)uevent_seqnum); >>>>>>> + if (retval) { >>>>>>> + mutext_unlock(&uevent_sock_mutex); >>>>>>> + goto exit; >>>>>>> + } >>>>>>> + retval = add_uevent_var(env, "PREV_SEQNUM=%llu", >>>>>>> + (unsigned long long)kobj->last_uevent_seqnum); >>>>>> >>>>>> I'm confused, why are we storing "last_uevent_seqnum" here at all, if it >>>>>> is only just the current one-1? What else do you do with that value? >>>>> I'm sorry, I've sent incorrect patch by mistake. This is the correct one: >>>> >>>> Ok, this is a bit better (but it's not in a format I could apply it in, >>>> if I wanted to...) >>>> >>>>> From 91fa0d0f5c52b959a5ca4cc79d1a4f5970db75e2 Mon Sep 17 00:00:00 2001 >>>>> From: Peter Rajnoha <prajnoha@redhat.com> >>>>> Date: Thu, 12 Oct 2017 11:33:48 +0200 >>>>> Subject: [PATCH] kobject: record and send PREV_SEQNUM with uevents >>>>> >>>>> Record last uevent seqnum that was used with kobject's uevent and send >>>>> it with next uevent as PREV_SEQNUM=<num> variable. >>>>> >>>>> This enhances the way we are able to track and handle uevents in userspace >>>>> if we are trying to catch up with all the uevents that were generated >>>>> while we were not listening or the uevents which were not delivered. >>>>> This may happen if the userspace listener is not running yet when the >>>>> uevent is triggered or there's a period of time when it's not listening >>>>> (e.g. because the userspace listener is being restarted while a new >>>>> uevent fires). >>>>> >>>>> A userspace listener can compare the last uevent seqnum it knows about >>>>> with the last uevent seqnum the kernel reports through uevents delivered >>>>> to userspace, both genuine and synthetic ones (the ones generated by >>>>> writing uevent action name to /sys/.../uevent file). Then, if needed, >>>>> userspace may execute appropriate actions based on that. We don't need >>>>> to reevaluate and recheck all items, instead, we can concentrate only >>>>> on items for which we really and actually need to reflect changed state >>>>> in kernel while the userspace listener was not listening for uevents. >>>>> >>>>> Signed-off-by: Peter Rajnoha <prajnoha@redhat.com> >>>> >>>> I don't really understand what userspace can do with this. Why does it >>>> matter what the sequence number was for a specific kobject? What can >>>> happen because of this? >>>> >>>> Do you have any example libudev code changes to show this in action so I >>>> can maybe understand the need? >>>> >>> >>> Not a libudev code change for this, at least I don't have this in plan >>> at this moment. >>> >>> It's more related to the way the udev rules are processed and, in >>> general, the way any other uevent listener can react to uevents. So the >>> change is in how "end consumer" of the uevent does its processing. >>> >>> At the moment, the sequence number is used globally, we don't have >>> per-kobject sequence numbers. So we can't use current SEQNUM that is >>> attached to each uevent to tell whether we have missed previous >>> uevent(s) for concrete device or not, we just know that some uevents for >>> some devices were not processed. Therefore, when we're triggering >>> uevents from userspace (udevadm trigger), we need to reevaluate all the >>> rules for all devices, no matter if it's really needed or not. >>> >>> For example, when we're switching from initramfs to root fs, we stop the >>> first udevd instance which is in initramfs, we keep the database (at the >>> moment, only for selected devices where we know that it's safe to do so >>> - same rules used in initramfs as in root fs). Then we switch over to >>> root fs and start the udevd again and we trigger uevents to fully >>> refresh the state, taking also the existing database into account. >>> >>> Now, it would help if we knew whether there were any uevents issued >>> after we stopped udevd in initramfs and before we started a new one from >>> root fs. If there are no uevents in between, we don't need to reevaluate >>> all the rules, we can just keep the existing database content. This can >>> become prominent if we have more devices in the system. It helps to >>> minimize the number of execution during boot, the overall impact of >>> triggering uevents and related processing, hence also making it more >>> straightforward to watch the state of the device and possibly debug >>> related problems. >>> >>> The same principle applies to any uevent listener in userspace, not just >>> udevd: >>> >>> - recording last seen SEQNUM >>> >>> - when new instance of the listener is running >>> >>> - trigger synthetic uevents to refresh state >>> >>> - comparing the PREV_SEQNUM from uevent generated with stored SEQNUM: >> >> Why can't you just compare SEQNUM with the "last seen" SEQNUM, they >> should just be 1 apart, right? Isn't that what is done today? How will >> this differ from looking at PREV_SEQNUM? > > The SEQNUM is global, not per device (per kobject). It's 1 apart if you > have only 1 device, but that doesn't apply if you have more devices with > possible uevents interleaved. > > For example (just provoking uevents here for clear example): > > echo add > /sys/block/sda/uevent > echo add > /sys/block/sda/uevent > echo add > /sys/block/sdb/uevent > echo add > /sys/block/sda/uevent > > Generates these uevents (I've removed uninteresting parts): > > KERNEL[45.739557] add > ACTION=add > DEVNAME=/dev/sda > PREV_SEQNUM=2096 > SEQNUM=2112 > > KERNEL[55.131767] add > ACTION=add > DEVNAME=/dev/sda > PREV_SEQNUM=2112 > SEQNUM=2113 > SUBSYSTEM=block > > KERNEL[59.236131] add > ACTION=add > DEVNAME=/dev/sdb > PREV_SEQNUM=2097 > SEQNUM=2114 > > KERNEL[61.708551] add > ACTION=add > DEVNAME=/dev/sda > PREV_SEQNUM=2113 > SEQNUM=2115 > > Here, the PREV_SEQNUM is 1 apart with SEQNUM for the 2nd uevent (sda), > because there was no other uevent (for a different device). But if you > check the 4th uevent (sda) where previous one was generated for another > device (sdb), the PREV_SEQNUM is not 1 apart. So in general, it may be > "n" apart from previous SEQNUM for a *particular device*. Globally, for > all devices, it is 1 apart, yes, but we don't care about that as that is > not quite useful. > > So without PREV_SEQNUM, we can detect we have missed uevents, but we > don't know for which devices exactly, so we need to refresh everything, > not just the items we actually need. With PREV_SEQNUM we can identify > exact devices. Also, without PREV_SEQNUM, to tell at least whether I have missed some uevents, I need to check /sys/kernel/uevent_seqnum and then compare with what I have in userspace as "last seen seqnum". But that is a race - after reading the uevent_seqnum file, more uevents may trigger... I wouldn't need to read the uevent_seqnum file though if I just read the SEQNUM directly from the uevent. But then, if I detected the SEQNUM is not good (it's not the "last know seqnum" + 1), I'd need to trigger the full refresh for all devices while processing that one particular uevent, and that's becoming messy. While with the PREV_SEQNUM, we don't need to care, we just trigger the synthetic uevents whenever we need and we let the check to be done *per each device* while processing those triggered uevents: - if PREV_SEQNUM matches with what we have in records as "last seen" for that particular device, we skip the uevent (we keep the old records as they don't need any refresh) - if PREV_SEQNUM doesn't match, we refresh ONLY that one device So it's also cleaner way to deal with refreshes, besides the fact it's done per-device, not the "all devices or nothing" like it is with having only the global seqnum view. -- Peter ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2017-10-18 9:42 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-10-16 12:35 [PATCH] kobject: record and send PREV_SEQNUM with uevents Peter Rajnoha 2017-10-16 13:55 ` Greg KH 2017-10-16 14:00 ` Peter Rajnoha 2017-10-16 14:49 ` Greg KH 2017-10-17 10:04 ` Peter Rajnoha 2017-10-17 13:26 ` Greg KH 2017-10-18 7:21 ` Peter Rajnoha 2017-10-18 9:42 ` Peter Rajnoha
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.