* counting file descriptors with a cgroup controller
From: Łukasz Stelmach @ 2017-02-17 9:37 UTC
To: linux-kernel
Cc: Krzysztof Opasiak, Karol Lewandowski

Hi,

We need to limit and monitor the number of file descriptors processes
keep open. If a process exceeds certain limit we'd like to terminate it
and restart it or reboot the whole system. Currently the RLIMIT API
allows limiting the number of file descriptors but to achieve our goals
we'd need to make sure all programmes we run handle EMFILE errno
properly. That is why we consider developing a cgroup controller that
limits the number of open file descriptors of its members (similar to
memory controller).

Any comments? Is there any alternative that:

+ does not require modifications of user-land code,
+ enables other process (e.g. init) to be notified and apply policy.

Kind regards,
--
Łukasz Stelmach
Samsung R&D Institute Poland
Samsung Electronics

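For illustration, the RLIMIT behaviour referred to above can be sketched in Python. This is a minimal sketch and not code from the thread: the helper name `open_until_emfile` and the limit value used in the usage note are assumptions. It lowers the soft `RLIMIT_NOFILE`, opens `/dev/null` until the kernel refuses with `EMFILE`, and then restores the previous limit.

```python
import errno
import os
import resource


def open_until_emfile(soft_limit: int) -> int:
    """Lower the soft RLIMIT_NOFILE, open fds until EMFILE, return how many opened.

    Hypothetical helper for illustration; the previous limit is restored on exit.
    """
    old_soft, old_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # The soft limit may never exceed the hard limit.
    if old_hard != resource.RLIM_INFINITY:
        soft_limit = min(soft_limit, old_hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, old_hard))
    opened = []
    try:
        while True:
            try:
                opened.append(os.open(os.devnull, os.O_RDONLY))
            except OSError as exc:
                if exc.errno != errno.EMFILE:
                    raise
                # This is the error every program would have to handle properly.
                return len(opened)
    finally:
        for fd in opened:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, old_hard))
```

Calling `open_until_emfile(64)` returns the number of descriptors that could still be opened below the limit. The point of the sketch is that the only signal the process receives is the `EMFILE` error itself; nothing outside the process is notified.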
* Re: counting file descriptors with a cgroup controller
From: Krzysztof Opasiak @ 2017-02-17 11:37 UTC
To: tj, lizefan, hannes
Cc: Łukasz Stelmach, linux-kernel, Karol Lewandowski, cgroups

+ cgroups mailing list
+ cgroup maintainers

On 02/17/2017 10:37 AM, Łukasz Stelmach wrote:
> Hi,
>
> We need to limit and monitor the number of file descriptors processes
> keep open. If a process exceeds certain limit we'd like to terminate it
> and restart it or reboot the whole system. Currently the RLIMIT API
> allows limiting the number of file descriptors but to achieve our goals
> we'd need to make sure all programmes we run handle EMFILE errno
> properly. That is why we consider developing a cgroup controller that
> limits the number of open file descriptors of its members (similar to
> memory controller).
>
> Any comments? Is there any alternative that:
>
> + does not require modifications of user-land code,
> + enables other process (e.g. init) to be notified and apply policy.
>
> Kind regards,
>

--
Krzysztof Opasiak
Samsung R&D Institute Poland
Samsung Electronics

* Re: counting file descriptors with a cgroup controller
From: Tejun Heo @ 2017-03-06 18:58 UTC
To: Krzysztof Opasiak
Cc: lizefan, hannes, Łukasz Stelmach, linux-kernel, Karol Lewandowski, cgroups

Hello,

On Fri, Feb 17, 2017 at 12:37:11PM +0100, Krzysztof Opasiak wrote:
> > We need to limit and monitor the number of file descriptors processes
> > keep open. If a process exceeds certain limit we'd like to terminate it
> > and restart it or reboot the whole system. Currently the RLIMIT API
> > allows limiting the number of file descriptors but to achieve our goals
> > we'd need to make sure all programmes we run handle EMFILE errno
> > properly. That is why we consider developing a cgroup controller that
> > limits the number of open file descriptors of its members (similar to
> > memory controller).
> >
> > Any comments? Is there any alternative that:
> >
> > + does not require modifications of user-land code,
> > + enables other process (e.g. init) to be notified and apply policy.

Hmm... I'm not quite sure fds qualify as an independent system-wide
resource. We did that for pids because pids are globally limited and
can run out way earlier than memory backing it. I don't think we have
similar restrictions for fds, do we?

Thanks.

--
tejun

* Re: counting file descriptors with a cgroup controller
From: Krzysztof Opasiak @ 2017-03-07 11:19 UTC
To: Tejun Heo
Cc: lizefan, hannes, Łukasz Stelmach, linux-kernel, Karol Lewandowski, cgroups

Hi

On 03/06/2017 07:58 PM, Tejun Heo wrote:
> Hello,
>
> On Fri, Feb 17, 2017 at 12:37:11PM +0100, Krzysztof Opasiak wrote:
>>> We need to limit and monitor the number of file descriptors processes
>>> keep open. If a process exceeds certain limit we'd like to terminate it
>>> and restart it or reboot the whole system. Currently the RLIMIT API
>>> allows limiting the number of file descriptors but to achieve our goals
>>> we'd need to make sure all programmes we run handle EMFILE errno
>>> properly. That is why we consider developing a cgroup controller that
>>> limits the number of open file descriptors of its members (similar to
>>> memory controller).
>>>
>>> Any comments? Is there any alternative that:
>>>
>>> + does not require modifications of user-land code,
>>> + enables other process (e.g. init) to be notified and apply policy.
>
> Hmm... I'm not quite sure fds qualify as an independent system-wide
> resource. We did that for pids because pids are globally limited and
> can run out way earlier than memory backing it. I don't think we have
> similar restrictions for fds, do we?

Well I'm not aware of such restrictions...

So maybe let me clarify our use case so we can have some more discussion
about this. We are dealing with the task of monitoring system services on
an IoT system. So this system needs to run as long as possible without
reboot, just like a server. In the server world almost the whole system
state is being monitored by services like nagios. They measure each
parameter (like cpu, memory etc) with some interval. Unfortunately we
cannot use this in an embedded system due to power consumption.

So generally now we consider two approaches:

1) Use rlimits when possible to limit resources for each process.

The problem here is that this creates an implicit requirement that all
system services are well written and able to detect that they, for
example, ran out of fds, and that they will just exit with a suitable
error code instead of hanging forever and responding to clients that
they are unable to handle their request due to lack of fds. This is
hard, especially when a service uses a lot of libraries under the hood,
because they also need to return this error code from each function
which opens some files. It is even harder when using proprietary
services or libraries for which we don't have access to the source code.

2) Use cgroups to limit and monitor resource usage.

Generally systemd creates a cgroup for each service. Controllers like
the memory cgroup have the ability to notify userspace when memory usage
reaches some level. So for example systemd could get a notification that
one of the cgroups is using more memory than it should, but as long as
it's not a hard limit of the cgroup this service is not going to even
notice this. So instead of returning an error from, for example,
malloc() in the service, systemd could just send a signal to that
service, ask it to exit gracefully and then restart it. The disadvantage
of this solution is the need of having a cgroup for each resource we
would like to monitor. For now we have suitable cgroups for everything
we need apart from file descriptors.

What do you think about this? Maybe you have some other ideas how we
could achieve this?

Best regards,
--
Krzysztof Opasiak
Samsung R&D Institute Poland
Samsung Electronics

* Re: counting file descriptors with a cgroup controller
From: Tejun Heo @ 2017-03-07 19:41 UTC
To: Krzysztof Opasiak
Cc: lizefan, hannes, Łukasz Stelmach, linux-kernel, Karol Lewandowski, cgroups

Hello, Krzysztof.

On Tue, Mar 07, 2017 at 12:19:52PM +0100, Krzysztof Opasiak wrote:
> So maybe let me clarify our use case so we can have some more discussion
> about this. We are dealing with task of monitoring system services on an IoT
> system. So this system needs to run as long as possible without reboot just
> like server. In server world almost whole system state is being monitored by
> services like nagios. They measure each parameter (like cpu, memory etc)
> with some interval. Unfortunately we cannot use this in an embedded
> system due to power consumption.

So, we don't add controllers for specific use case scenarios. The
target actually has to be a fundamental resource which can't be
isolated in a different way.

The use case you're describing is more about working around
shortcomings in userspace by implementing a major kernel feature, when
the said shortcomings can easily be controlled and mitigated from
userspace - e.g. if running out of fds can't be handled reliably from
the target application for some reason and the application may lock up
from the condition, protect the base resources so that a monitoring
process can always reliably run and let that take a corrective action
when such condition is detected.

This doesn't really seem to qualify as a dedicated kernel
functionality.

Thanks.

--
tejun

* Re: counting file descriptors with a cgroup controller
From: Krzysztof Opasiak @ 2017-03-07 20:06 UTC
To: Tejun Heo
Cc: lizefan, hannes, Łukasz Stelmach, linux-kernel, Karol Lewandowski, cgroups

On 03/07/2017 08:41 PM, Tejun Heo wrote:
> Hello, Krzysztof.
>
> On Tue, Mar 07, 2017 at 12:19:52PM +0100, Krzysztof Opasiak wrote:
>> So maybe let me clarify our use case so we can have some more discussion
>> about this. We are dealing with task of monitoring system services on an IoT
>> system. So this system needs to run as long as possible without reboot just
>> like server. In server world almost whole system state is being monitored by
>> services like nagios. They measure each parameter (like cpu, memory etc)
>> with some interval. Unfortunately we cannot use this in an embedded
>> system due to power consumption.
>
> So, we don't add controllers for specific use case scenarios. The
> target actually has to be a fundamental resource which can't be
> isolated in a different way.
>
> The use case you're describing is more about working around
> shortcomings in userspace by implementing a major kernel feature, when
> the said shortcomings can easily be controlled and mitigated from
> userspace - e.g. if running out of fds can't be handled reliably from
> the target application for some reason and the application may lock up
> from the condition, protect the base resources so that a monitoring
> process can always reliably run and let that take a corrective action
> when such condition is detected.
>

In theory that's what we plan to do, but we are looking for an efficient
method of detecting that this particular application is using more fds
than it should (as declared by the developer).

Personally, I don't want to use rlimit for this as it ends up returning
an error code from, for example, open() when we hit the limit. This may
lead to some unpredictable crashes in services (esp. those poor
proprietary binary blobs). Instead of injecting errors into the service
we would like to just get a notification that this service has more
open fds than it should and ask it to restart in a polite way.

For memory this seems to be quite easy to achieve as we can just get an
eventfd notification when the application passes a given memory usage
using the memory cgroup controller. Maybe you know some efficient method
to do the same for fds?

Best regards,
--
Krzysztof Opasiak
Samsung R&D Institute Poland
Samsung Electronics

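The eventfd notification mentioned above corresponds to the usage threshold interface of the cgroup v1 memory controller, where a line of the form `<event_fd> <fd of memory.usage_in_bytes> <threshold>` is written to `cgroup.event_control`. The following is a minimal sketch under stated assumptions, not code from the thread: it assumes a cgroup v1 memory hierarchy, Python 3.10 or newer (for `os.eventfd`), sufficient privileges, and hypothetical helper names.

```python
import os


def threshold_registration(event_fd: int, usage_fd: int, threshold_bytes: int) -> str:
    """Build the registration line cgroup v1 expects in cgroup.event_control."""
    return f"{event_fd} {usage_fd} {threshold_bytes}"


def register_memory_threshold(cgroup_dir: str, threshold_bytes: int) -> tuple[int, int]:
    """Arm a memory usage threshold for one cgroup; returns (event_fd, usage_fd).

    Hypothetical helper: cgroup_dir is assumed to be a cgroup v1 memory
    directory. Both descriptors are left open for the caller to close.
    """
    event_fd = os.eventfd(0)
    usage_fd = -1
    try:
        usage_fd = os.open(os.path.join(cgroup_dir, "memory.usage_in_bytes"), os.O_RDONLY)
        with open(os.path.join(cgroup_dir, "cgroup.event_control"), "w") as ctl:
            ctl.write(threshold_registration(event_fd, usage_fd, threshold_bytes))
    except OSError:
        # Do not leak descriptors if registration fails.
        if usage_fd >= 0:
            os.close(usage_fd)
        os.close(event_fd)
        raise
    return event_fd, usage_fd


def wait_for_threshold(event_fd: int) -> int:
    """Block until the kernel signals the eventfd; returns the event counter."""
    return os.eventfd_read(event_fd)
```

A supervisor could call `register_memory_threshold()` once per service cgroup and then sleep in `wait_for_threshold()` (or poll the eventfd) without periodic sampling, which is the property the thread is asking for in the file descriptor case.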
* Re: counting file descriptors with a cgroup controller
From: Tejun Heo @ 2017-03-07 20:48 UTC
To: Krzysztof Opasiak
Cc: lizefan, hannes, Łukasz Stelmach, linux-kernel, Karol Lewandowski, cgroups

Hello,

On Tue, Mar 07, 2017 at 09:06:49PM +0100, Krzysztof Opasiak wrote:
> Personally, I don't want to use rlimit for this as it ends up returning
> error code from for example open() when we hit the limit. This may lead to
> some unpredictable crashes in services (esp. those poor proprietary binary
> blobs). Instead of injecting errors to service we would like to just get
> notification that this service has more opened fds than it should and ask it
> to restart in a polite way.
>
> For memory this seems to be quite easy to achieve as we can just get eventfd
> notification when application passes given memory usage using memory cgroup
> controller. Maybe you know some efficient method to do the same for fds?

So, if all you wanna do is reliably detecting open(2) failures, can't
you do that with bpf tracing?

Thanks.

--
tejun

* Re: counting file descriptors with a cgroup controller
From: Parav Pandit @ 2017-03-08 2:59 UTC
To: Tejun Heo
Cc: Krzysztof Opasiak, Li Zefan, Johannes Weiner, Łukasz Stelmach, Linux Kernel Mailing List, Karol Lewandowski, cgroups

Hi,

On Tue, Mar 7, 2017 at 2:48 PM, Tejun Heo <tj@kernel.org> wrote:
>
> Hello,
>
> On Tue, Mar 07, 2017 at 09:06:49PM +0100, Krzysztof Opasiak wrote:
> > Personally, I don't want to use rlimit for this as it ends up returning
> > error code from for example open() when we hit the limit. This may lead to
> > some unpredictable crashes in services (esp. those poor proprietary binary
> > blobs). Instead of injecting errors to service we would like to just get
> > notification that this service has more opened fds than it should and ask it
> > to restart in a polite way.
> >

How do those poor proprietary binary blobs remain polite after restart?
Do you mean you want to keep restarting them when they reach the limit?

> > For memory this seems to be quite easy to achieve as we can just get eventfd
> > notification when application passes given memory usage using memory cgroup
> > controller. Maybe you know some efficient method to do the same for fds?
>
> So, if all you wanna do is reliably detecting open(2) failures, can't
> you do that with bpf tracing?
>
> Thanks.
>
> --
> tejun

* Re: counting file descriptors with a cgroup controller
From: Krzysztof Opasiak @ 2017-03-08 10:19 UTC
To: Parav Pandit, Tejun Heo
Cc: Li Zefan, Johannes Weiner, Łukasz Stelmach, Linux Kernel Mailing List, Karol Lewandowski, cgroups

On 03/08/2017 03:59 AM, Parav Pandit wrote:
> Hi,
>
> On Tue, Mar 7, 2017 at 2:48 PM, Tejun Heo <tj@kernel.org> wrote:
>>
>> Hello,
>>
>> On Tue, Mar 07, 2017 at 09:06:49PM +0100, Krzysztof Opasiak wrote:
>>> Personally, I don't want to use rlimit for this as it ends up returning
>>> error code from for example open() when we hit the limit. This may lead to
>>> some unpredictable crashes in services (esp. those poor proprietary binary
>>> blobs). Instead of injecting errors to service we would like to just get
>>> notification that this service has more opened fds than it should and ask it
>>> to restart in a polite way.
>>>
>
> How do those poor proprietary binary blobs remain polite after restart?

They won't.

> Do you mean you want to keep restarting them when they reach the limit?

We'd like to restart them each time they reach the limit declared by
the developer.

Best regards,
--
Krzysztof Opasiak
Samsung R&D Institute Poland
Samsung Electronics

* Re: counting file descriptors with a cgroup controller
From: Krzysztof Opasiak @ 2017-03-08 9:52 UTC
To: Tejun Heo
Cc: lizefan, hannes, Łukasz Stelmach, linux-kernel, Karol Lewandowski, cgroups

On 03/07/2017 09:48 PM, Tejun Heo wrote:
> Hello,
>
> On Tue, Mar 07, 2017 at 09:06:49PM +0100, Krzysztof Opasiak wrote:
>> Personally, I don't want to use rlimit for this as it ends up returning
>> error code from for example open() when we hit the limit. This may lead to
>> some unpredictable crashes in services (esp. those poor proprietary binary
>> blobs). Instead of injecting errors to service we would like to just get
>> notification that this service has more opened fds than it should and ask it
>> to restart in a polite way.
>>
>> For memory this seems to be quite easy to achieve as we can just get eventfd
>> notification when application passes given memory usage using memory cgroup
>> controller. Maybe you know some efficient method to do the same for fds?
>
> So, if all you wanna do is reliably detecting open(2) failures, can't
> you do that with bpf tracing?
>

Well, detecting failures of open is not enough and it has a couple of
problems:

1) open(2) is not the only syscall which creates an fd. In addition to
other syscalls like socket(2) and dup(2), some ioctl() calls on drivers
(for example video) also create fds. I'm not sure if we have any other
mechanism than grepping through the kernel source to find out which
ioctl() calls create fds and which do not.

2) As far as I know (I'm not a bpf specialist so please correct me if
I'm wrong), with bpf we are only able to detect such events but we are
unable to prevent them from getting to the caller. It means that the
service will know that it ran out of fds and will need to handle this
properly. If there is a bug in this error path the service may crash.
What we would like to get is just a notification to an external process
that some limit has been reached, without returning an error to the
service itself.

3) Theoretically we could do this using bpf or syscall auditing and
count fds for each userspace process, or check /proc/<PID> after each
notification, but it's getting very heavy for a production environment.

Best regards,
--
Krzysztof Opasiak
Samsung R&D Institute Poland
Samsung Electronics

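To make the `/proc/<PID>` check from point 3 concrete, here is a minimal Python sketch. It is an illustration only, not code from the thread: the helper names `count_open_fds`, `over_limit` and `demo_fd_growth` are hypothetical, and it assumes a Linux system with `/proc` mounted. It also shows point 1 in practice, since `socket()` and `dup()` each add a descriptor without any call to `open(2)`.

```python
import os
import socket


def count_open_fds(pid: int) -> int:
    """Count the entries in /proc/<pid>/fd, one per open descriptor (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def over_limit(pid: int, declared_limit: int) -> bool:
    """True if the process holds more fds than the limit declared for it."""
    return count_open_fds(pid) > declared_limit


def demo_fd_growth() -> int:
    """Return how many fds one socket() plus one dup() add to this process.

    In a single-threaded process the difference is expected to be 2.
    """
    pid = os.getpid()
    before = count_open_fds(pid)
    sock = socket.socket()
    dup_fd = os.dup(sock.fileno())
    try:
        return count_open_fds(pid) - before
    finally:
        os.close(dup_fd)
        sock.close()
```

A monitor built on `over_limit()` has to wake up and walk a directory for every supervised process on every check, which is the polling cost the message objects to for a power-constrained device.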
* Re: counting file descriptors with a cgroup controller
From: Tejun Heo @ 2017-03-08 18:59 UTC
To: Krzysztof Opasiak
Cc: lizefan, hannes, Łukasz Stelmach, linux-kernel, Karol Lewandowski, cgroups

On Wed, Mar 08, 2017 at 10:52:18AM +0100, Krzysztof Opasiak wrote:
> Well detecting failures of open is not enough and it has couple of problems:
>
> 1) open(2) is not the only syscall which creates fd. In addition to other
> syscalls like socket(2), dup(2), some ioctl() on drivers (for example video)
> also creates fds. I'm not sure if we have any other mechanism than grep
> through kernel source to find out which ioctl() creates fd and which not.
>
> 2) As far as I know (I'm not a bpf specialist so please correct me if I'm
> wrong), with bpf we are able only to detect such events but we are unable to
> prevent them from getting to caller. It means that service will know that it
> ran out of fds and will need to handle this properly. If there is a bug in
> this error path service may crash.
> What we would like to get is just a notification to external process that
> some limit has been reached without returning error to service itself.
>
> 3) Theoretically we could do this using bpf or syscall auditing and count
> fds for each userspace process or check /proc/<PID> after each notification
> but it's getting very heavy for production environment.

We simply can't design the kernel to accommodate bandaid workarounds
for grossly misbehaving applications. If you can find something which
can solve the problem using wider scope tools like bpf, seccomp, and
what not, great. If not, too bad, but we can't burden everyone else
with workarounds for the extremely specific and contrived issues that
you're seeing.

Thanks.

--
tejun