All of lore.kernel.org
 help / color / mirror / Atom feed
* kernfs: can read/write method grow buffer size?
@ 2019-03-29  3:09 Marek Behun
  2019-03-29  6:22 ` Greg Kroah-Hartman
  0 siblings, 1 reply; 10+ messages in thread
From: Marek Behun @ 2019-03-29  3:09 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Tejun Heo
  Cc: linux-kernel, Pavel Machek, Jacek Anaszewski

Hello Tejun and Greg,

kernfs_fop_open/read/write allocates a buffer for the ->read, ->write,
or ->seq_read methods. This buffer is either preallocated or allocated
on the spot, with minimum size being PAGE_SIZE, if ->atomic_write_len
is not given.

There is a question/problem currently in the led-trigger API, that the
PAGE_SIZE buffer can in some specific scenarios be too short.
(The trigger file on read returns space separated list of all supported
triggers, and the currently chosen one is marked specially. The cpu
activity trigger lists "cpu%i" for all CPU cores, which actually broke
on some machines with very large number of CPUs. Granted, this could
have been solved another way (and maybe will be), but we are now
discussing API for HW LED triggers, which can raise the problem anyway,
if a specific LED controller supports too many HW LED triggers.)

Is it allowed to grow this buffer if needed, either via krealloc or by
creating a special function in kernfs API which does this so that
led-trigger could use it?

Or is this completely forbidden?

Thank you.

Marek

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kernfs: can read/write method grow buffer size?
  2019-03-29  3:09 kernfs: can read/write method grow buffer size? Marek Behun
@ 2019-03-29  6:22 ` Greg Kroah-Hartman
  2019-03-29  6:51   ` Marek Behun
  2019-03-29  8:34   ` Pavel Machek
  0 siblings, 2 replies; 10+ messages in thread
From: Greg Kroah-Hartman @ 2019-03-29  6:22 UTC (permalink / raw)
  To: Marek Behun; +Cc: Tejun Heo, linux-kernel, Pavel Machek, Jacek Anaszewski

On Fri, Mar 29, 2019 at 04:09:22AM +0100, Marek Behun wrote:
> Hello Tejun and Greg,
> 
> kernfs_fop_open/read/write allocates a buffer for the ->read, ->write,
> or ->seq_read methods. This buffer is either preallocated or allocated
> on the spot, with minimum size being PAGE_SIZE, if ->atomic_write_len
> is not given.
> 
> There is a question/problem currently in the led-trigger API, that the
> PAGE_SIZE buffer can in some specific scenarios be too short.

And that file is in sysfs?  That's a huge abuse of the sysfs api then :(

> (The trigger file on read returns space separated list of all supported
> triggers, and the currently chosen one is marked specially. The cpu
> activity trigger lists "cpu%i" for all CPU cores, which actually broke
> on some machines with very large number of CPUs. Granted, this could
> have been solved another way (and maybe will be), but we are now
> discussing API for HW LED triggers, which can raise the problem anyway,
> if a specific LED controller supports too many HW LED triggers.)
> 
> Is it allowed to grow this buffer if needed, either via krealloc or by
> creating a special function in kernfs API which does this so that
> led-trigger could use it?
> 
> Or is this completely forbidden?

If this is just for kernfs, and you have your own filesystem, sure, we
can probably do something here.  But if this is for sysfs, no, you all
need to keep to the "one value per file" rule that we have there please.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kernfs: can read/write method grow buffer size?
  2019-03-29  6:22 ` Greg Kroah-Hartman
@ 2019-03-29  6:51   ` Marek Behun
  2019-03-29  6:58     ` Greg Kroah-Hartman
  2019-03-29  8:34   ` Pavel Machek
  1 sibling, 1 reply; 10+ messages in thread
From: Marek Behun @ 2019-03-29  6:51 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Tejun Heo, linux-kernel, Pavel Machek, Jacek Anaszewski

> If this is just for kernfs, and you have your own filesystem, sure, we
> can probably do something here.  But if this is for sysfs, no, you all
> need to keep to the "one value per file" rule that we have there
> please.

Greg, the file satisfies the "one value per file" rule. That value is
the currently selected trigger. Writing a trigger name changes this
value. It is just that reading also prints all the supported triggers.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kernfs: can read/write method grow buffer size?
  2019-03-29  6:51   ` Marek Behun
@ 2019-03-29  6:58     ` Greg Kroah-Hartman
  0 siblings, 0 replies; 10+ messages in thread
From: Greg Kroah-Hartman @ 2019-03-29  6:58 UTC (permalink / raw)
  To: Marek Behun; +Cc: Tejun Heo, linux-kernel, Pavel Machek, Jacek Anaszewski

On Fri, Mar 29, 2019 at 07:51:07AM +0100, Marek Behun wrote:
> > If this is just for kernfs, and you have your own filesystem, sure, we
> > can probably do something here.  But if this is for sysfs, no, you all
> > need to keep to the "one value per file" rule that we have there
> > please.
> 
> Greg, the file satisfies the "one value per file" rule. That value is
> the currently selected trigger. Writing a trigger name changes this
> value. It is just that reading also prints all the supported triggers.

"all supported triggers" is not a "single value" :)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kernfs: can read/write method grow buffer size?
  2019-03-29  6:22 ` Greg Kroah-Hartman
  2019-03-29  6:51   ` Marek Behun
@ 2019-03-29  8:34   ` Pavel Machek
  2019-03-29  8:48     ` Marek Behun
  1 sibling, 1 reply; 10+ messages in thread
From: Pavel Machek @ 2019-03-29  8:34 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Marek Behun, Tejun Heo, linux-kernel, Jacek Anaszewski

[-- Attachment #1: Type: text/plain, Size: 1586 bytes --]

On Fri 2019-03-29 07:22:53, Greg Kroah-Hartman wrote:
> On Fri, Mar 29, 2019 at 04:09:22AM +0100, Marek Behun wrote:
> > Hello Tejun and Greg,
> > 
> > kernfs_fop_open/read/write allocates a buffer for the ->read, ->write,
> > or ->seq_read methods. This buffer is either preallocated or allocated
> > on the spot, with minimum size being PAGE_SIZE, if ->atomic_write_len
> > is not given.
> > 
> > There is a question/problem currently in the led-trigger API, that the
> > PAGE_SIZE buffer can in some specific scenarios be too short.
> 
> And that file is in sysfs?  That's a huge abuse of the sysfs api
> then :(

Yes, that's how we do selection from the list in sysfs, and
led-trigger is not alone there:

pavel@amd:~/cip$ cat /sys/power/state
freeze mem disk
pavel@amd:~/cip$ cat /sys/class/leds/phy0-led/trigger
none bluetooth-power rfkill-any rfkill-none kbd-scrolllock kbd-numlock
kbd-capslock kbd-kanalock kbd-shiftlock kbd-altgrlock kbd-ctrllock
kbd-altlock kbd-shiftllock kbd-shiftrlock kbd-ctrlllock kbd-ctrlrlock
AC-online BAT0-charging-or-full BAT0-charging BAT0-full
BAT0-charging-blink-full-solid rfkill0 phy0rx phy0tx phy0assoc
phy0radio [phy0tpt] mmc0 timer heartbeat audio-mute audio-micmute
rfkill1 hci0-power rfkill8
pavel@amd:~/cip$

Now problem is that list grew long over time... and it is too late to
change the API now (and alternatives seem worse than present
solution).
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kernfs: can read/write method grow buffer size?
  2019-03-29  8:34   ` Pavel Machek
@ 2019-03-29  8:48     ` Marek Behun
  2019-03-29 10:24       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 10+ messages in thread
From: Marek Behun @ 2019-03-29  8:48 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Greg Kroah-Hartman, Tejun Heo, linux-kernel, Jacek Anaszewski

> pavel@amd:~/cip$ cat /sys/power/state
> freeze mem disk
> pavel@amd:~/cip$ cat /sys/class/leds/phy0-led/trigger
> none bluetooth-power rfkill-any rfkill-none kbd-scrolllock kbd-numlock
> kbd-capslock kbd-kanalock kbd-shiftlock kbd-altgrlock kbd-ctrllock
> kbd-altlock kbd-shiftllock kbd-shiftrlock kbd-ctrlllock kbd-ctrlrlock
> AC-online BAT0-charging-or-full BAT0-charging BAT0-full
> BAT0-charging-blink-full-solid rfkill0 phy0rx phy0tx phy0assoc
> phy0radio [phy0tpt] mmc0 timer heartbeat audio-mute audio-micmute
> rfkill1 hci0-power rfkill8
> pavel@amd:~/cip$
> 

Yes, and cpufreq governors too list available governosrs as space
separated list.
Maybe the "one value per file" rule was thought-of only after these
were merged?

Marek

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kernfs: can read/write method grow buffer size?
  2019-03-29  8:48     ` Marek Behun
@ 2019-03-29 10:24       ` Greg Kroah-Hartman
  2019-03-29 10:26         ` Greg Kroah-Hartman
  2019-03-29 10:32         ` Pavel Machek
  0 siblings, 2 replies; 10+ messages in thread
From: Greg Kroah-Hartman @ 2019-03-29 10:24 UTC (permalink / raw)
  To: Marek Behun; +Cc: Pavel Machek, Tejun Heo, linux-kernel, Jacek Anaszewski

On Fri, Mar 29, 2019 at 09:48:23AM +0100, Marek Behun wrote:
> > pavel@amd:~/cip$ cat /sys/power/state
> > freeze mem disk
> > pavel@amd:~/cip$ cat /sys/class/leds/phy0-led/trigger
> > none bluetooth-power rfkill-any rfkill-none kbd-scrolllock kbd-numlock
> > kbd-capslock kbd-kanalock kbd-shiftlock kbd-altgrlock kbd-ctrllock
> > kbd-altlock kbd-shiftllock kbd-shiftrlock kbd-ctrlllock kbd-ctrlrlock
> > AC-online BAT0-charging-or-full BAT0-charging BAT0-full
> > BAT0-charging-blink-full-solid rfkill0 phy0rx phy0tx phy0assoc
> > phy0radio [phy0tpt] mmc0 timer heartbeat audio-mute audio-micmute
> > rfkill1 hci0-power rfkill8
> > pavel@amd:~/cip$
> > 
> 
> Yes, and cpufreq governors too list available governosrs as space
> separated list.
> Maybe the "one value per file" rule was thought-of only after these
> were merged?

For small numbers of things, like /sys/power/state, which was the first
file to use this style, it was fine as we "knew" this was going to be a
small, well-bounded list of states that the file could be in.

As you have seen, 'trigger' is not that, and I am pretty sure I have
complained about this in the past.

I suggest you use a different way of "discovering" what types of
triggers are available.  I don't know what would work best for you, but
any time you are ever worried about the size of a sysfs file's buffer,
you know you are doing something wrong.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kernfs: can read/write method grow buffer size?
  2019-03-29 10:24       ` Greg Kroah-Hartman
@ 2019-03-29 10:26         ` Greg Kroah-Hartman
  2019-03-29 10:38           ` Pavel Machek
  2019-03-29 10:32         ` Pavel Machek
  1 sibling, 1 reply; 10+ messages in thread
From: Greg Kroah-Hartman @ 2019-03-29 10:26 UTC (permalink / raw)
  To: Marek Behun; +Cc: Pavel Machek, Tejun Heo, linux-kernel, Jacek Anaszewski

On Fri, Mar 29, 2019 at 11:24:36AM +0100, Greg Kroah-Hartman wrote:
> On Fri, Mar 29, 2019 at 09:48:23AM +0100, Marek Behun wrote:
> > > pavel@amd:~/cip$ cat /sys/power/state
> > > freeze mem disk
> > > pavel@amd:~/cip$ cat /sys/class/leds/phy0-led/trigger
> > > none bluetooth-power rfkill-any rfkill-none kbd-scrolllock kbd-numlock
> > > kbd-capslock kbd-kanalock kbd-shiftlock kbd-altgrlock kbd-ctrllock
> > > kbd-altlock kbd-shiftllock kbd-shiftrlock kbd-ctrlllock kbd-ctrlrlock
> > > AC-online BAT0-charging-or-full BAT0-charging BAT0-full
> > > BAT0-charging-blink-full-solid rfkill0 phy0rx phy0tx phy0assoc
> > > phy0radio [phy0tpt] mmc0 timer heartbeat audio-mute audio-micmute
> > > rfkill1 hci0-power rfkill8
> > > pavel@amd:~/cip$
> > > 
> > 
> > Yes, and cpufreq governors too list available governosrs as space
> > separated list.
> > Maybe the "one value per file" rule was thought-of only after these
> > were merged?
> 
> For small numbers of things, like /sys/power/state, which was the first
> file to use this style, it was fine as we "knew" this was going to be a
> small, well-bounded list of states that the file could be in.
> 
> As you have seen, 'trigger' is not that, and I am pretty sure I have
> complained about this in the past.
> 
> I suggest you use a different way of "discovering" what types of
> triggers are available.  I don't know what would work best for you, but
> any time you are ever worried about the size of a sysfs file's buffer,
> you know you are doing something wrong.

Ok, while writing this out, I realized that to keep things still
working, and to enable you to have an unlimited list of triggers, why
not just turn the file into a binary sysfs file?

Yes, that's not what binary sysfs files are for, they are supposed to be
only used for data that is "pass through" from userspace to hardware,
where the kernel does not touch the information at all.  But I'm willing
to give you an exception here as long as you document the heck out of it
in the code itself, saying that no one else should ever copy this way of
doing things again.

Would that work?

greg k-h

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kernfs: can read/write method grow buffer size?
  2019-03-29 10:24       ` Greg Kroah-Hartman
  2019-03-29 10:26         ` Greg Kroah-Hartman
@ 2019-03-29 10:32         ` Pavel Machek
  1 sibling, 0 replies; 10+ messages in thread
From: Pavel Machek @ 2019-03-29 10:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Marek Behun, Tejun Heo, linux-kernel, Jacek Anaszewski

> On Fri, Mar 29, 2019 at 09:48:23AM +0100, Marek Behun wrote:
> > > pavel@amd:~/cip$ cat /sys/power/state
> > > freeze mem disk
> > > pavel@amd:~/cip$ cat /sys/class/leds/phy0-led/trigger
> > > none bluetooth-power rfkill-any rfkill-none kbd-scrolllock kbd-numlock
> > > kbd-capslock kbd-kanalock kbd-shiftlock kbd-altgrlock kbd-ctrllock
> > > kbd-altlock kbd-shiftllock kbd-shiftrlock kbd-ctrlllock kbd-ctrlrlock
> > > AC-online BAT0-charging-or-full BAT0-charging BAT0-full
> > > BAT0-charging-blink-full-solid rfkill0 phy0rx phy0tx phy0assoc
> > > phy0radio [phy0tpt] mmc0 timer heartbeat audio-mute audio-micmute
> > > rfkill1 hci0-power rfkill8
> > > pavel@amd:~/cip$
> > > 
> > 
> > Yes, and cpufreq governors too list available governosrs as space
> > separated list.
> > Maybe the "one value per file" rule was thought-of only after these
> > were merged?
> 
> For small numbers of things, like /sys/power/state, which was the first
> file to use this style, it was fine as we "knew" this was going to be a
> small, well-bounded list of states that the file could be in.
> 
> As you have seen, 'trigger' is not that, and I am pretty sure I have
> complained about this in the past.
> 
> I suggest you use a different way of "discovering" what types of
> triggers are available.  I don't know what would work best for you, but
> any time you are ever worried about the size of a sysfs file's buffer,
> you know you are doing something wrong.

Are we doing something wrong?

I don't think so. It looks like sysfs has arbitrary limit it should
not have.

Can we get that fixed? Because userland already knows about this
interface, it is one-type-of-value-per-file, and just removing the
limit seems to be the best way forward.

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: kernfs: can read/write method grow buffer size?
  2019-03-29 10:26         ` Greg Kroah-Hartman
@ 2019-03-29 10:38           ` Pavel Machek
  0 siblings, 0 replies; 10+ messages in thread
From: Pavel Machek @ 2019-03-29 10:38 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Marek Behun, Tejun Heo, linux-kernel, Jacek Anaszewski

Hi!

> > > Yes, and cpufreq governors too list available governosrs as space
> > > separated list.
> > > Maybe the "one value per file" rule was thought-of only after these
> > > were merged?
> > 
> > For small numbers of things, like /sys/power/state, which was the first
> > file to use this style, it was fine as we "knew" this was going to be a
> > small, well-bounded list of states that the file could be in.
> > 
> > As you have seen, 'trigger' is not that, and I am pretty sure I have
> > complained about this in the past.
> > 
> > I suggest you use a different way of "discovering" what types of
> > triggers are available.  I don't know what would work best for you, but
> > any time you are ever worried about the size of a sysfs file's buffer,
> > you know you are doing something wrong.
> 
> Ok, while writing this out, I realized that to keep things still
> working, and to enable you to have an unlimited list of triggers, why
> not just turn the file into a binary sysfs file?
> 
> Yes, that's not what binary sysfs files are for, they are supposed to be
> only used for data that is "pass through" from userspace to hardware,
> where the kernel does not touch the information at all.  But I'm willing
> to give you an exception here as long as you document the heck out of it
> in the code itself, saying that no one else should ever copy this way of
> doing things again.
> 
> Would that work?

That would work, I guess.

We'll also want to limit number of triggers in LEDs -- there should be
one trigger "cpu" with parameters, not 1024 triggers "cpu0"
.. "cpu1023". But that will take some time to sort out.

Best regards,

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-03-29 10:38 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-29  3:09 kernfs: can read/write method grow buffer size? Marek Behun
2019-03-29  6:22 ` Greg Kroah-Hartman
2019-03-29  6:51   ` Marek Behun
2019-03-29  6:58     ` Greg Kroah-Hartman
2019-03-29  8:34   ` Pavel Machek
2019-03-29  8:48     ` Marek Behun
2019-03-29 10:24       ` Greg Kroah-Hartman
2019-03-29 10:26         ` Greg Kroah-Hartman
2019-03-29 10:38           ` Pavel Machek
2019-03-29 10:32         ` Pavel Machek

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.