* [BUG] NULL pointer de-ref when setting io.cost.qos on LUKS devices
From: Benjamin Berg @ 2020-03-03 14:13 UTC
To: cgroups-u79uwXL29TY76Z2rM5mHXA

Hello,

so, I tried to set io.latency for some cgroups for the root device,
which is ext4 inside LVM inside LUKS. I tried doing so with systemd (it
has the IODeviceLatencyTargetSec option). What was interesting is that
it selects the LUKS device if I ask it to set the latency for the root
partition, i.e. setting:

  IODeviceLatencyTargetSec=/usr/bin 20ms

results in io.latency:

  253:0 target=20000

and 253:0 corresponds to the LUKS device. 253:1 would be the root
partition itself, 8:2 the partition on disk and 8:0 the disk.

I am not sure the systemd selection works as intended. And I wonder
which device systemd should select in this scenario.

Anyway, I then thought I might need to enable the QOS controller in
io.cost.qos before it will take effect. Unfortunately, trying to do so
reliably results in an oops on my system, i.e.:

  # echo 253:1 enable=1 >io.cost.qos
or
  # echo 253:0 enable=1 >io.cost.qos

results in the below oops. A similar setup on a different machine
without LUKS seems to work fine.

The kernel version was 5.5.6-201.fc31.x86_64

Benjamin

Mar 02 15:06:31 ben-x1 kernel: BUG: kernel NULL pointer dereference, address: 0000000000000138
Mar 02 15:06:31 ben-x1 kernel: #PF: supervisor read access in kernel mode
Mar 02 15:06:31 ben-x1 kernel: #PF: error_code(0x0000) - not-present page
Mar 02 15:06:31 ben-x1 kernel: PGD 0 P4D 0
Mar 02 15:06:31 ben-x1 kernel: Oops: 0000 [#1] SMP PTI
Mar 02 15:06:31 ben-x1 kernel: CPU: 3 PID: 3413 Comm: bash Tainted: G OE 5.5.6-201.fc31.x86_64 #1
Mar 02 15:06:31 ben-x1 kernel: Hardware name: LENOVO 20FCS0RV0G/20FCS0RV0G, BIOS N1FET36W (1.10 ) 03/09/2016
Mar 02 15:06:31 ben-x1 kernel: RIP: 0010:ioc_pd_init+0x126/0x190
Mar 02 15:06:31 ben-x1 kernel: Code: 48 8b 45 28 48 8b 00 8b 80 f8 00 00 00 41 89 84 24 38 01 00 00 48 85 ed 74 28 48 63 0d d3 40 25 01 48 83 c1 1c 48 8b 44 cd 08 <48> 63 90 38 01 00 00 49 89 84 d4 40 01 00 00 48 8b 6d 38 48 85 ed
Mar 02 15:06:31 ben-x1 kernel: RSP: 0018:ffffb0c3880ffcd8 EFLAGS: 00010086
Mar 02 15:06:31 ben-x1 kernel: RAX: 0000000000000000 RBX: ffff89609be87e00 RCX: 000000000000001e
Mar 02 15:06:31 ben-x1 kernel: RDX: 0000000000000003 RSI: 0000000000000001 RDI: ffff89609b8e0d28
Mar 02 15:06:31 ben-x1 kernel: RBP: ffff896099e2be00 R08: ffff8960c21b0140 R09: ffff89609b8e0000
Mar 02 15:06:31 ben-x1 kernel: R10: ffff8960c1002e00 R11: ffff896083bdc000 R12: ffff89609b8e0c00
Mar 02 15:06:31 ben-x1 kernel: R13: 0000000000000000 R14: 0000000000000000 R15: ffffffff8a745e40
Mar 02 15:06:31 ben-x1 kernel: FS: 00007fba669b4740(0000) GS:ffff8960c2180000(0000) knlGS:0000000000000000
Mar 02 15:06:31 ben-x1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 02 15:06:31 ben-x1 kernel: CR2: 0000000000000138 CR3: 000000021a52c003 CR4: 00000000003606e0
Mar 02 15:06:31 ben-x1 kernel: Call Trace:
Mar 02 15:06:31 ben-x1 kernel: blkcg_activate_policy+0x11d/0x2b0
Mar 02 15:06:31 ben-x1 kernel: blk_iocost_init+0x15c/0x1e0
Mar 02 15:06:31 ben-x1 kernel: ioc_qos_write+0x2d1/0x3e0
Mar 02 15:06:31 ben-x1 kernel: ? do_filp_open+0xa5/0x100
Mar 02 15:06:31 ben-x1 kernel: cgroup_file_write+0x8a/0x150
Mar 02 15:06:31 ben-x1 kernel: ? __check_object_size+0x136/0x147
Mar 02 15:06:31 ben-x1 kernel: kernfs_fop_write+0xce/0x1b0
Mar 02 15:06:31 ben-x1 kernel: vfs_write+0xb6/0x1a0
Mar 02 15:06:31 ben-x1 kernel: ksys_write+0x5f/0xe0
Mar 02 15:06:31 ben-x1 kernel: do_syscall_64+0x5b/0x1c0
Mar 02 15:06:31 ben-x1 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
Mar 02 15:06:31 ben-x1 kernel: RIP: 0033:0x7fba66aa94b7
Mar 02 15:06:31 ben-x1 kernel: Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
Mar 02 15:06:31 ben-x1 kernel: RSP: 002b:00007ffc71074178 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
Mar 02 15:06:31 ben-x1 kernel: RAX: ffffffffffffffda RBX: 000000000000000f RCX: 00007fba66aa94b7
Mar 02 15:06:31 ben-x1 kernel: RDX: 000000000000000f RSI: 00005612c8793640 RDI: 0000000000000001
Mar 02 15:06:31 ben-x1 kernel: RBP: 00005612c8793640 R08: 000000000000000a R09: 0000000000000008
Mar 02 15:06:31 ben-x1 kernel: R10: 00005612c90f56b0 R11: 0000000000000246 R12: 000000000000000f
Mar 02 15:06:31 ben-x1 kernel: R13: 00007fba66b7a500 R14: 000000000000000f R15: 00007fba66b7a700
Mar 02 15:06:31 ben-x1 kernel: Modules linked in: uinput ccm xt_CHECKSUM xt_MASQUERADE nf_nat_tftp nf_conntrack_tftp rfcomm tun bridge stp llc nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_REJECT nf_reject_ipv6 ip6t_rpfilter ipt_REJECT nf_reject_ipv4 xt_conntrack ebtab>
Mar 02 15:06:31 ben-x1 kernel: snd_hda_codec_hdmi mc snd_hda_codec_conexant snd_hda_codec_generic snd_compress ac97_bus snd_pcm_dmaengine cfg80211 snd_hda_intel ecdh_generic ecc snd_intel_dspcfg snd_hda_codec snd_hda_core snd_hwdep irqbypass intel_cstate snd_seq intel_uncore snd_>
Mar 02 15:06:31 ben-x1 kernel: CR2: 0000000000000138
Mar 02 15:06:31 ben-x1 kernel: ---[ end trace d1bdee4e9a482594 ]---
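
A note on reading the oops above: in the kernel "Code:" line (the one following RIP: 0010:ioc_pd_init), the byte marked <48> is where the faulting instruction starts. Assuming standard x86-64 encoding, the bytes 48 63 90 38 01 00 00 decode to movslq 0x138(%rax),%rdx, and since RAX is zero in the register dump, the load address equals the CR2 value. The Python sketch below only redoes that arithmetic; the helper name is made up for illustration and it handles just this one addressing form.

```python
# Sketch: recompute the faulting address from the oops "Code:" bytes.
# Assumption: the instruction is REX.W + 0x63 (MOVSXD) with a ModRM byte
# whose mod field is 0b10 (register base + 32-bit displacement, no SIB).

def fault_address(insn: bytes, base_reg_value: int) -> int:
    """Return base + disp32 for a `48 63 /r` instruction with mod == 0b10."""
    rex, opcode, modrm = insn[0], insn[1], insn[2]
    if rex != 0x48 or opcode != 0x63:
        raise ValueError("not a REX.W MOVSXD instruction")
    mod, rm = modrm >> 6, modrm & 0b111
    if mod != 0b10 or rm == 0b100:
        raise ValueError("only the plain base+disp32 form is handled")
    # The displacement is the little-endian 32-bit value after ModRM.
    disp32 = int.from_bytes(insn[3:7], "little", signed=True)
    return (base_reg_value + disp32) & 0xFFFFFFFFFFFFFFFF

# Bytes starting at the <48> marker in the kernel "Code:" line.
FAULTING_INSN = bytes.fromhex("48 63 90 38 01 00 00")
RAX = 0x0  # value of RAX in the register dump

print(hex(fault_address(FAULTING_INSN, RAX)))  # prints 0x138
```

The result matches "address: 0000000000000138" in the first line of the oops, which is consistent with a field being read at offset 0x138 through a NULL pointer.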
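
For readers following the device numbers in the report (253:0, 253:1, 8:2, 8:0), here is a minimal Python sketch of how a path can be mapped to a MAJOR:MINOR pair and how a millisecond target becomes an io.latency line. It is not how systemd resolves the device. It assumes that st_dev of the path identifies the block device backing the filesystem, which does not hold on every filesystem (btrfs, for example), and both function names are hypothetical.

```python
import os

def devno_of_path(path: str) -> str:
    """Return "MAJOR:MINOR" of the device holding `path`.

    Assumption: st_dev names a real block device (not true for btrfs etc.).
    """
    st = os.stat(path)
    return f"{os.major(st.st_dev)}:{os.minor(st.st_dev)}"

def io_latency_line(devno: str, target_ms: float) -> str:
    """Format an io.latency entry; the target is written in microseconds."""
    return f"{devno} target={round(target_ms * 1000)}"

# The 20ms target from the report, applied to the device systemd picked.
print(io_latency_line("253:0", 20))  # prints 253:0 target=20000
```

On the reporter's layout, a lookup of this kind would be expected to land on the filesystem's own device rather than on the disk underneath, which is why the question of which layer to configure comes up at all.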
* Re: [BUG] NULL pointer de-ref when setting io.cost.qos on LUKS devices
From: Tejun Heo @ 2020-03-03 14:19 UTC
To: Benjamin Berg; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA

Hello,

On Tue, Mar 03, 2020 at 03:13:10PM +0100, Benjamin Berg wrote:
> so, I tried to set io.latency for some cgroups for the root device,
> which is ext4 inside LVM inside LUKS.

It's pointless on compound devices. I think the right thing to do here
is to disallow enabling it on those devices.

Thanks.

--
tejun

* Re: [BUG] NULL pointer de-ref when setting io.cost.qos on LUKS devices
From: Benjamin Berg @ 2020-03-03 14:40 UTC
To: Tejun Heo; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA

On Tue, 2020-03-03 at 09:19 -0500, Tejun Heo wrote:
> Hello,
>
> On Tue, Mar 03, 2020 at 03:13:10PM +0100, Benjamin Berg wrote:
> > so, I tried to set io.latency for some cgroups for the root device,
> > which is ext4 inside LVM inside LUKS.
>
> It's pointless on compound devices. I think the right thing to do here
> is to disallow enabling it on those devices.

I believe systemd tries to resolve to /dev/sda but that seems to fail
for me. So I think there is a bug in that code; I'll verify that and
submit a fix if so.

Which device should actually be selected? Is it /dev/sda or the mapper
device that / is mounted from?

Benjamin

* Re: [BUG] NULL pointer de-ref when setting io.cost.qos on LUKS devices
From: Tejun Heo @ 2020-03-04 16:42 UTC
To: Benjamin Berg; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA

Hello, Benjamin.

On Tue, Mar 03, 2020 at 03:40:38PM +0100, Benjamin Berg wrote:
> I believe systemd tries to resolve to /dev/sda but that seems to fail
> for me. So I think there is a bug in that code; I'll verify that and
> submit a fix if so.
>
> Which device should actually be selected? Is it /dev/sda or the mapper
> device that / is mounted from?

Right now, the situation isn't great with dm. When pagecache
writebacks go through dm, in some cases including dm-crypt, the cgroup
ownership information is completely lost and all writes end up being
issued as the root cgroup, so it breaks down when dm is in use.

In the longer term, what we want to do is control at the physical
devices (sda here) and then update dm so that it can maintain and
propagate the ownership correctly, but we aren't there yet.

Thanks.

--
tejun

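
To make "control at the physical devices (sda here)" concrete, the sketch below walks a stacked device down to the disks underneath it, using the slaves/ directories and the partition attribute that sysfs exposes for block devices. It is an illustration only, written in Python with a hypothetical function name. The sysfs_root parameter exists so the function can be pointed at a test tree, and nothing here is taken from systemd or the kernel.

```python
import os

def physical_devices(devno: str, sysfs_root: str = "/sys") -> set:
    """Return the "MAJOR:MINOR" numbers of the whole disks under `devno`."""
    node = os.path.realpath(os.path.join(sysfs_root, "dev", "block", devno))
    slaves_dir = os.path.join(node, "slaves")
    slaves = sorted(os.listdir(slaves_dir)) if os.path.isdir(slaves_dir) else []
    if slaves:
        # Stacked device (dm, md, ...): recurse into every lower device.
        found = set()
        for name in slaves:
            with open(os.path.join(slaves_dir, name, "dev")) as f:
                found |= physical_devices(f.read().strip(), sysfs_root)
        return found
    if os.path.exists(os.path.join(node, "partition")):
        # A partition's sysfs directory sits inside its parent disk's.
        with open(os.path.join(os.path.dirname(node), "dev")) as f:
            return {f.read().strip()}
    # No lower devices and not a partition: treat it as a whole disk.
    return {devno}
```

For the layout described in this thread (253:1 on 253:0 on 8:2 on 8:0), a call such as physical_devices("253:1") would be expected to return {"8:0"}, the disk where the controllers can actually attribute IO.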
* Re: [BUG] NULL pointer de-ref when setting io.cost.qos on LUKS devices
From: Benjamin Berg @ 2020-03-05 10:31 UTC
To: Tejun Heo; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA

On Wed, 2020-03-04 at 11:42 -0500, Tejun Heo wrote:
> [SNIP]
>
> Right now, the situation isn't great with dm. When pagecache
> writebacks go through dm, in some cases including dm-crypt, the cgroup
> ownership information is completely lost and all writes end up being
> issued as the root cgroup, so it breaks down when dm is in use.

Fair enough.

> In the longer term, what we want to do is control at the physical
> devices (sda here) and then update dm so that it can maintain and
> propagate the ownership correctly, but we aren't there yet.

Perfect, so what I am seeing is really just a small systemd bug. Thanks
for confirming, I'll submit a patch to fix it.

Benjamin

* Re: [BUG] NULL pointer de-ref when setting io.cost.qos on LUKS devices
From: Tejun Heo @ 2020-03-05 15:20 UTC
To: Benjamin Berg; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA

Hello,

On Thu, Mar 05, 2020 at 11:31:29AM +0100, Benjamin Berg wrote:
> > In the longer term, what we want to do is control at the physical
> > devices (sda here) and then update dm so that it can maintain and
> > propagate the ownership correctly, but we aren't there yet.
>
> Perfect, so what I am seeing is really just a small systemd bug. Thanks
> for confirming, I'll submit a patch to fix it.

IO control is a bit confusing right now. Here's the breakdown.

* There are four controllers - io.latency, io.cost, io.max and bfq's
  weight implementation.

* io.latency and io.cost when combined with btrfs can control all IOs
  including metadata IOs and writebacks while avoiding priority
  inversions.

* wbt may interfere with IO control. It can be disabled with
  "echo 0 > /sys/block/DEV/queue/wbt_lat_usec".

* io.latency is useful to protect one thing against everything else
  but it gets tricky when multiple entities are competing at different
  priority levels.

* io.cost is what we're verifying against and deploying. While
  system-level configuration is a bit involved
  (/sys/fs/cgroup/io.cost.model and /sys/fs/cgroup/io.cost.qos),
  actual cgroup configuration is really simple. Simply enabling the IO
  controller and leaving all weights at default can often achieve most
  of what's needed.

Thanks.

--
tejun

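
The two knobs mentioned in the breakdown can also be scripted. The following Python sketch writes 0 to a disk's wbt_lat_usec file and formats the enable=1 line for io.cost.qos, mirroring the echo commands quoted in this thread. The function names and the sysfs_root and cgroup_root parameters are assumptions made for illustration. Writing the real files needs root, and per the discussion above the device should be a physical disk rather than a dm device.

```python
import os

def disable_wbt(disk: str, sysfs_root: str = "/sys") -> str:
    """Write 0 to <sysfs_root>/block/<disk>/queue/wbt_lat_usec; return the path."""
    path = os.path.join(sysfs_root, "block", disk, "queue", "wbt_lat_usec")
    with open(path, "w") as f:
        f.write("0\n")
    return path

def io_cost_qos_enable_line(devno: str) -> str:
    """Format the line that enables io.cost QoS for a "MAJOR:MINOR" device."""
    return f"{devno} enable=1"

def enable_io_cost(devno: str, cgroup_root: str = "/sys/fs/cgroup") -> None:
    """Write the enable line to the root cgroup's io.cost.qos file."""
    with open(os.path.join(cgroup_root, "io.cost.qos"), "w") as f:
        f.write(io_cost_qos_enable_line(devno) + "\n")
```

With the device numbers from the original report, disable_wbt("sda") followed by enable_io_cost("8:0") would target the disk itself instead of the 253:x mapper devices that triggered the oops.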